Google: Gemma 3 27B

Survived 4 out of 15 breakers

Resilience
27%
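
The page does not say how the resilience figure is derived, but 27% is consistent with the share of breakers survived (4 of 15), rounded to the nearest whole percent. A minimal sketch under that assumption; the function name `resilience_percent` is illustrative and not part of the site:

```python
def resilience_percent(survived: int, total: int) -> int:
    """Share of breakers survived, as a whole-number percentage.

    Assumes resilience = survived / total; the listing does not confirm this.
    """
    if total <= 0 or not 0 <= survived <= total:
        raise ValueError("need 0 <= survived <= total and total > 0")
    # Python's built-in round() rounds exact .5 ties to the nearest even integer.
    return round(100 * survived / total)


gemma_resilience = resilience_percent(4, 15)
```

With the numbers shown here, 4 / 15 is roughly 26.7%, which rounds to the displayed 27%.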

Gemma 3 introduces multimodality, accepting vision-language input and producing text output. It handles context windows of up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open-source model and the successor to [Gemma 2](google/gemma-2-27b-it).

Context

128,000 tokens

Cost (Input)

$0.04 /1M tokens

Cost (Output)

$0.15 /1M tokens

Max completion tokens

65,536
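
The per-token prices above can be turned into a rough per-request estimate. The sketch below assumes simple linear pricing with no caching discounts or minimum charges, which the listing does not confirm; the constant and function names are illustrative:

```python
# Prices and the completion limit are copied from the listing above.
INPUT_USD_PER_MILLION = 0.04
OUTPUT_USD_PER_MILLION = 0.15
MAX_COMPLETION_TOKENS = 65_536


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars (linear pricing assumed)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_COMPLETION_TOKENS:
        raise ValueError("output exceeds the listed max completion tokens")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


example_cost = estimate_cost_usd(100_000, 2_000)
```

Under these assumptions, a 100,000-token prompt with a 2,000-token reply costs about $0.0043.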

Breaker Results

| Test | Category | Latest Result | Success Rate |
| --- | --- | --- | --- |
| Self-Reference Count | Self Reference | | 0% |
| Alice's Brother Problem | Logic Reasoning | | 0% |
| Contradictory Premises | Logic Reasoning | | 0% |
| Car Wash Dilemma | Logic Reasoning | | 0% |
| The Missing A | Pattern Matching | | 0% |
| Horse Race Logic | Logic Reasoning | | 0% |
| The Compartment Trick | Logic Reasoning | | 0% |
| Reverse Word Test | Character Manipulation | | 9% |
| 10-Step Instructions | Instruction Following | | 18% |
| Broken Mug | Lateral Thinking | | 20% |
| Bullshit Detector | Epistemic Humility | | 25% |
| Coin Flip Paradox | Logic Reasoning | | 75% |
| Strawberry Problem | Character Counting | | 82% |
| Silence Protocol | Instruction Following | | 91% |
| Sycophancy Trap | Logic Reasoning | | 100% |