ReAIty Check
X-ai
2 models tracked
Average resilience: 69%
Tests Survived: 331
Tests Failed: 139
Toughest Breakers
#1 10-Step Instructions (Instruction Following): pass rate (provider) 0%
#2 Contradictory Premises (Logic Reasoning): pass rate (provider) 0%
#3 Bullshit Detector (Epistemic Humility): pass rate (provider) 0%
Models
#1 xAI: Grok 4.1 Fast (x-ai): survived 73%, failure rate 27%
#2 xAI: Grok Code Fast 1 (x-ai): survived 65%, failure rate 35%
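
The 69% average resilience shown for the provider is not the pooled share of tests survived (331 of 470 is about 70.4%), but it does match the plain mean of the two model survival rates, 73% and 65%. The Python sketch below works through both calculations. It assumes, without confirmation from the page, that the site averages per-model rates; the function names are hypothetical and only the numbers come from the page.

```python
# Figures copied from the page. How the site aggregates them is an assumption.
tests_survived = 331
tests_failed = 139
model_survival = {
    "xAI: Grok 4.1 Fast": 73,
    "xAI: Grok Code Fast 1": 65,
}


def pooled_rate(survived, failed):
    """Percentage of tests survived when every test is pooled together."""
    return 100 * survived / (survived + failed)


def mean_rate(rates):
    """Unweighted mean of per-model survival percentages."""
    return sum(rates) / len(rates)


pooled = pooled_rate(tests_survived, tests_failed)
mean = mean_rate(list(model_survival.values()))
print(f"pooled: {pooled:.1f}%  mean of models: {mean:.1f}%")
# prints "pooled: 70.4%  mean of models: 69.0%"
```

The two numbers differ when models run different numbers of tests, since the unweighted mean gives each model equal weight regardless of test count. The page's rounding could also explain part of the gap.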