ReAIty Check

Baidu

1 model tracked

Average resilience: 62%
Tests Survived: 85
Tests Failed: 53
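
The page does not say how the resilience figure is computed. The numbers shown are consistent with a simple ratio of survived tests to total tests, rounded to a whole percent. The sketch below rests on that assumption; the function name `resilience_percent` is hypothetical and is not taken from the site.

```python
def resilience_percent(survived: int, failed: int) -> int:
    """Share of tests survived, as a whole percent.

    Assumption: resilience = survived / (survived + failed).
    The page does not document its actual formula.
    """
    total = survived + failed
    if total == 0:
        raise ValueError("no tests recorded")
    return round(100 * survived / total)


# Counts reported on the page for this provider.
survived, failed = 85, 53

resilience = resilience_percent(survived, failed)
failure_rate = 100 - resilience

print(f"{survived} of {survived + failed} tests survived")
print(f"resilience: {resilience}%, failure rate: {failure_rate}%")
```

Under this assumption, 85 survived out of 138 total tests gives about 61.6%, which rounds to the 62% shown, and the remaining 38% matches the failure rate listed for the model.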

Toughest Breakers

#1 Self-Reference Count (Self Reference): pass rate (provider) 0%
#2 10-Step Instructions (Instruction Following): pass rate (provider) 0%
#3 Alice's Brother Problem (Logic Reasoning): pass rate (provider) 0%

Models

#1 Baidu: ERNIE 4.5 300B A47B (baidu)
Survived: 62%
Failure Rate: 38%

© 2026 ReAIty Check v0.5.27-beta by Eugene Tusmenko