OpenAI: GPT-5.2

Survived 12 out of 15 breakers

Resilience
80%

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance than GPT-5.1. It uses adaptive reasoning to allocate computation dynamically, responding quickly to simple queries while devoting more reasoning depth to complex tasks. Built for broad task coverage, GPT-5.2 delivers consistent gains across math, coding, science, and tool-calling workloads, with more coherent long-form answers and improved tool-use reliability.

Context

400,000 tokens

Cost (Input)

$1.75 /1M tokens

Cost (Output)

$14.00 /1M tokens

Max completion tokens

128,000
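
The per-token prices above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, not part of any SDK, and the limit checks are a simplification that assumes the input must fit in the listed context and the output must fit in the listed max completion tokens. The rates and limits are copied from this listing.

```python
# Figures copied from the listing above.
INPUT_USD_PER_MILLION = 1.75      # Cost (Input), USD per 1M tokens
OUTPUT_USD_PER_MILLION = 14.00    # Cost (Output), USD per 1M tokens
CONTEXT_TOKENS = 400_000          # Context
MAX_COMPLETION_TOKENS = 128_000   # Max completion tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: these two checks are a simplified reading of the listed limits.
    if input_tokens > CONTEXT_TOKENS:
        raise ValueError("input exceeds the listed context size")
    if output_tokens > MAX_COMPLETION_TOKENS:
        raise ValueError("output exceeds the listed max completion tokens")
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost
```

For instance, a request with 100,000 input tokens and 10,000 output tokens comes to roughly $0.175 + $0.14 = $0.315 at these rates. Output tokens cost eight times as much as input tokens, so long completions tend to dominate the bill.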

Breaker Results

Test | Category | Success Rate
The Missing A | Pattern Matching | 0%
Self-Reference Count | Self Reference | 13%
10-Step Instructions | Instruction Following | 13%
Contradictory Premises | Logic Reasoning | 25%
Horse Race Logic | Logic Reasoning | 33%
Car Wash Dilemma | Logic Reasoning | 67%
Strawberry Problem | Character Counting | 75%
Reverse Word Test | Character Manipulation | 100%
Alice's Brother Problem | Logic Reasoning | 100%
Silence Protocol | Instruction Following | 100%
Broken Mug | Lateral Thinking | 100%
Bullshit Detector | Epistemic Humility | 100%
The Compartment Trick | Logic Reasoning | 100%
Sycophancy Trap | Logic Reasoning | 100%
Coin Flip Paradox | Logic Reasoning | 100%