DeepSeek: DeepSeek V3.2

Survived 6 out of 15 breakers

Resilience: 40%

DeepSeek-V3.2 is a large language model designed to combine high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that reduces training and inference cost while preserving quality in long-context scenarios. A scalable reinforcement learning post-training framework further improves reasoning, with reported performance in the GPT-5 class, and the model has demonstrated gold-medal results on the 2025 IMO and IOI. V3.2 also uses a large-scale agentic task synthesis pipeline to better integrate reasoning into tool-use settings, improving compliance and generalization in interactive environments. Users can control reasoning behavior with the `enabled` boolean of the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)
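
Reasoning is toggled per request. Below is a minimal sketch of a chat completions request against OpenRouter with reasoning enabled; the model slug `deepseek/deepseek-v3.2` is an assumption here, so check the model page for the exact identifier.

```python
# A minimal sketch of toggling reasoning via OpenRouter's chat completions API.
# The model slug "deepseek/deepseek-v3.2" is assumed; verify it on the model page.
import os
import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "deepseek/deepseek-v3.2",  # assumed slug
        "messages": [
            {"role": "user", "content": "How many r's are in 'strawberry'?"}
        ],
        "reasoning": {"enabled": True},  # set False to skip reasoning tokens
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```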

| Spec | Value |
|------|-------|
| Context | 163,840 tokens |
| Cost (Input) | $0.25 / 1M tokens |
| Cost (Output) | $0.40 / 1M tokens |
| Max completion tokens | 65,536 |
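
For a rough sense of what these rates mean per request, the sketch below multiplies token counts by the per-million rates above. `estimate_cost` is a hypothetical helper, not an OpenRouter API, and actual billing may also count reasoning tokens as output.

```python
# Back-of-the-envelope per-request cost from the per-1M-token rates above.
# estimate_cost is a hypothetical helper, not an official billing formula.
INPUT_RATE = 0.25 / 1_000_000   # $ per input token
OUTPUT_RATE = 0.40 / 1_000_000  # $ per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 10,000-token prompt with a 2,000-token completion:
# 10,000 * $0.25/1M + 2,000 * $0.40/1M = $0.0025 + $0.0008 = $0.0033
print(f"${estimate_cost(10_000, 2_000):.4f}")  # $0.0033
```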

Breaker Results

| Test | Category | Success Rate |
|------|----------|--------------|
| Silence Protocol | Instruction Following | 0% |
| Car Wash Dilemma | Logic Reasoning | 0% |
| The Missing A | Pattern Matching | 0% |
| Horse Race Logic | Logic Reasoning | 0% |
| Self-Reference Count | Self Reference | 9% |
| 10-Step Instructions | Instruction Following | 9% |
| Contradictory Premises | Logic Reasoning | 18% |
| Broken Mug | Lateral Thinking | 50% |
| Reverse Word Test | Character Manipulation | 55% |
| Bullshit Detector | Epistemic Humility | 75% |
| Strawberry Problem | Character Counting | 100% |
| Alice's Brother Problem | Logic Reasoning | 100% |
| The Compartment Trick | Logic Reasoning | 100% |
| Sycophancy Trap | Logic Reasoning | 100% |
| Coin Flip Paradox | Logic Reasoning | 100% |