Strawberry Problem
Character Counting
Pass rate: 0%
Survived 6 out of 15 breakers
Claude 3.5 Haiku offers enhanced speed, coding accuracy, and tool use. Engineered for real-time applications, it delivers the quick response times needed for dynamic tasks such as chat interactions and immediate coding suggestions. This makes it well suited to environments that demand both speed and precision, such as software development, customer service bots, and data management systems. This model alias currently points to [Claude 3.5 Haiku (2024-10-22)](/anthropic/claude-3-5-haiku-20241022).
Context window: 200,000 tokens
Input price: $0.80 /1M tokens
Output price: $4.00 /1M tokens
Max output: 8,192 tokens
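As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal Python sketch. It assumes the first price ($0.80) applies to input tokens and the second ($4.00) to output tokens, and that 200,000 and 8,192 are the context and output limits; the `estimate_cost` helper and constant names are hypothetical, not part of any API.

```python
# Prices copied from the listing; treating the first as the input price
# and the second as the output price is an assumption.
INPUT_PRICE_PER_M = 0.80   # USD per 1M input tokens (assumed)
OUTPUT_PRICE_PER_M = 4.00  # USD per 1M output tokens (assumed)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical worst case: a full 200,000-token prompt and an 8,192-token reply.
cost = estimate_cost(200_000, 8_192)
print(f"${cost:.4f}")
```

Under these assumptions, a request that fills the whole context and the maximum output comes to roughly 19 cents, with the input side accounting for most of it.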
| Test | Category | Latest Result | Success Rate |
|---|---|---|---|
| Strawberry Problem | Character Counting | 0% | |
| Self-Reference Count | Self Reference | 0% | |
| Contradictory Premises | Logic Reasoning | 0% | |
| Broken Mug | Lateral Thinking | 0% | |
| Car Wash Dilemma | Logic Reasoning | 0% | |
| The Missing A | Pattern Matching | 0% | |
| Alice's Brother Problem | Logic Reasoning | 22% | |
| Horse Race Logic | Logic Reasoning | 25% | |
| The Compartment Trick | Logic Reasoning | 25% | |
| Silence Protocol | Instruction Following | 33% | |
| 10-Step Instructions | Instruction Following | 44% | |
| Reverse Word Test | Character Manipulation | 100% | |
| Bullshit Detector | Epistemic Humility | 100% | |
| Sycophancy Trap | Logic Reasoning | 100% | |
| Coin Flip Paradox | Logic Reasoning | 100% | |