Humanity's Last Exam score in 2025?
Expected: 48%

  • Above 10%: 95%

  • Above 25%: 74%

  • Above 50%: 45%

  • Above 75%: 22%

  • Above 90%: 14%
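The threshold probabilities above are cumulative ("above X%"), so the probability of the score landing between two thresholds is the difference of neighboring values. The sketch below works through that arithmetic. The function names, the 0% and 100% endpoints, and the midpoint-based expectation are illustrative assumptions, not Manifold's actual method for computing the displayed expected value.

```python
# Market-implied probability (in percent) that the final score is above each
# threshold, copied from the market display.
above = {10: 95, 25: 74, 50: 45, 75: 22, 90: 14}


def bucket_probabilities(above):
    """Convert P(score > t) values into probabilities for the ranges between thresholds."""
    thresholds = sorted(above)
    edges = [0] + thresholds + [100]
    # Assumed endpoints: the score is treated as certainly above 0 and never above 100.
    survival = [100] + [above[t] for t in thresholds] + [0]
    return [
        ((edges[i], edges[i + 1]), survival[i] - survival[i + 1])
        for i in range(len(edges) - 1)
    ]


def midpoint_expectation(buckets):
    """Rough expected score, assuming each bucket's mass sits at its midpoint."""
    return sum((lo + hi) / 2 * p for (lo, hi), p in buckets) / 100


buckets = bucket_probabilities(above)
for (lo, hi), p in buckets:
    print(f"{lo:>3}-{hi:<3}%: {p}%")
print(f"midpoint estimate: {midpoint_expectation(buckets):.1f}%")
```

Under these assumptions the buckets carry 5%, 21%, 29%, 23%, 8%, and 14% of the probability, and the midpoint estimate is about 49%, close to the displayed expected value of 48%.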

This market resolves to the highest accuracy score (as a percentage) achieved by any AI model on the full, multi-modal Humanity's Last Exam on or before December 31, 2025, as reported on the official Scale AI leaderboard (https://scale.com/leaderboard/humanitys_last_exam) or by other credible sources.

Background

Humanity's Last Exam is a challenging AI benchmark designed to test the limits of AI knowledge at the frontiers of human expertise. The exam consists of 3,000 questions spanning more than 100 subjects, contributed by experts from over 500 institutions worldwide. As of early 2025, top-performing models include:

  • o1 (December 2024): 8.81% accuracy, 92.79% calibration error

  • Claude 3.7 Sonnet Thinking (February 2025): 8.93% accuracy

  • Gemini 2.0 Flash Thinking (January 2025): 7.22% accuracy, 90.58% calibration error

Other models like GPT-4o and Grok-2 have significantly lower accuracy scores, typically below 5%. The exam highlights the gap between current AI capabilities and expert-level human knowledge, with most models answering fewer than 10% of the questions correctly.
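As a minimal sketch of the resolution rule, the market value is the maximum accuracy among scores reported on or before the cutoff. The example applies that rule to the three scores listed in this Background section. The `resolution_value` helper is hypothetical, and the day-of-month values are placeholders, since the source gives only a month and year for each model and does not say when each score was reported.

```python
from datetime import date

# (model, date, accuracy in percent) from the Background list.
# Day of month is a placeholder: the source gives only month and year.
reports = [
    ("o1", date(2024, 12, 1), 8.81),
    ("Claude 3.7 Sonnet Thinking", date(2025, 2, 1), 8.93),
    ("Gemini 2.0 Flash Thinking", date(2025, 1, 1), 7.22),
]

CUTOFF = date(2025, 12, 31)


def resolution_value(reports, cutoff=CUTOFF):
    """Return the (model, date, accuracy) entry with the highest accuracy
    among entries dated on or before the cutoff, or None if there are none."""
    eligible = [r for r in reports if r[1] <= cutoff]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r[2])


best = resolution_value(reports)
print(f"{best[0]}: {best[2]}%")
```

With only these three entries, the rule selects Claude 3.7 Sonnet Thinking at 8.93%, which is below the market's lowest threshold of 10%.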
