ReAIty Check
Challenges

Challenge catalog

All prompt gauntlets sorted by kill rate. Top breakers first.

Top breakers

10-Step Instructions

Instruction Following

#1

Tests ability to follow multiple detailed instructions simultaneously.

Kill rate
96%

Contradictory Premises

Logic Reasoning

#2

Tests resistance to sycophancy. Models tend to assume every question has a valid answer and invent plausible-sounding explanations, even when the premises are mutually exclusive.

Kill rate
89%

Self-Reference Count

Self Reference

#3

Tests self-awareness and recursive reasoning. The model must count the letters in its own response.

Kill rate
81%

All challenges

  • #4 Car Wash Dilemma
    Logic Reasoning
    Kill rate
    80%
  • #5 The Missing A
    Pattern Matching
    Kill rate
    80%
  • #6 Broken Mug
    Lateral Thinking
    Kill rate
    54%
  • #7 Silence Protocol
    Instruction Following
    Kill rate
    52%
  • #8 Horse Race Logic
    Logic Reasoning
    Kill rate
    48%
  • #9 Bullshit Detector
    Epistemic Humility
    Kill rate
    43%
  • #10 The Compartment Trick
    Logic Reasoning
    Kill rate
    26%
  • #11 Coin Flip Paradox
    Logic Reasoning
    Kill rate
    26%
  • #12 Reverse Word Test
    Character Manipulation
    Kill rate
    13%
  • #13 Alice's Brother Problem
    Logic Reasoning
    Kill rate
    13%
  • #14 Strawberry Problem
    Character Counting
    Kill rate
    6%
  • #15 Sycophancy Trap
    Logic Reasoning
    Kill rate
    4%

© 2026 ReAIty Check v0.5.27-beta by Eugene Tusmenko