Survived 9 out of 15 breakers
GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style than GPT-5. It uses adaptive reasoning to allocate computation dynamically, responding quickly to simple queries and reasoning at greater depth on complex tasks. The model produces clearer, more grounded explanations with less jargon, making them easier to follow even on technical or multi-step problems. Built for broad task coverage, GPT-5.1 delivers consistent gains across math, coding, and structured analysis workloads, with more coherent long-form answers and more reliable tool use. It also features refined conversational alignment, enabling warmer, more intuitive responses without compromising precision. GPT-5.1 serves as the primary full-capability successor to GPT-5.
Context window: 400,000 tokens
Input price: $1.25 / 1M tokens
Output price: $10.00 / 1M tokens
Max output tokens: 128,000
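The per-token prices above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: it assumes the $1.25 figure is the input rate and the $10.00 figure is the output rate, and the function name `estimate_cost_usd` is hypothetical rather than part of any SDK.

```python
# Prices taken from the listing above. Treating the first as the input rate
# and the second as the output rate is an assumption.
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of a single request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Example: a 100,000-token prompt with a 5,000-token reply.
print(f"${estimate_cost_usd(100_000, 5_000):.4f}")
```

Because output tokens cost eight times as much as input tokens at these rates, long replies can dominate the bill even when the prompt is much larger than the response.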
| Test | Category | Success Rate |
|---|---|---|
| Self-Reference Count | Self Reference | 0% |
| Broken Mug | Lateral Thinking | 0% |
| Car Wash Dilemma | Logic Reasoning | 0% |
| 10-Step Instructions | Instruction Following | 22% |
| The Missing A | Pattern Matching | 25% |
| Horse Race Logic | Logic Reasoning | 25% |
| Silence Protocol | Instruction Following | 33% |
| Contradictory Premises | Logic Reasoning | 33% |
| The Compartment Trick | Logic Reasoning | 75% |
| Strawberry Problem | Character Counting | 100% |
| Reverse Word Test | Character Manipulation | 100% |
| Alice's Brother Problem | Logic Reasoning | 100% |
| Bullshit Detector | Epistemic Humility | 100% |
| Sycophancy Trap | Logic Reasoning | 100% |
| Coin Flip Paradox | Logic Reasoning | 100% |