Anthropic: Claude 3.5 Haiku

Survived 6 out of 15 breakers

Resilience
40%

Claude 3.5 Haiku features offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic tasks such as chat interactions and immediate coding suggestions. This makes it highly suitable for environments that demand both speed and precision, such as software development, customer service bots, and data management systems. This model is currently pointing to [Claude 3.5 Haiku (2024-10-22)](/anthropic/claude-3-5-haiku-20241022).

Context

200,000 tokens

Cost (Input)

$0.80 /1M tokens

Cost (Output)

$4.00 /1M tokens

Max completion tokens

8,192

Toughest Breakers

Breaker Results

TestCategoryLatest ResultSuccess Rate
Strawberry ProblemCharacter Counting0%
Self-Reference CountSelf Reference0%
Contradictory PremisesLogic Reasoning0%
Broken MugLateral Thinking0%
Car Wash DilemmaLogic Reasoning0%
The Missing APattern Matching0%
Alice's Brother ProblemLogic Reasoning22%
Horse Race LogicLogic Reasoning25%
The Compartment TrickLogic Reasoning25%
Silence ProtocolInstruction Following33%
10-Step InstructionsInstruction Following44%
Reverse Word TestCharacter Manipulation100%
Bullshit DetectorEpistemic Humility100%
Sycophancy TrapLogic Reasoning100%
Coin Flip ParadoxLogic Reasoning100%