Will o3's score on the Last Exam be above 30%?
8
Ṁ744
2026
30% chance
bought Ṁ10 NO

The Last Exam appears to be primarily a knowledge benchmark, rather than a problem-solving benchmark. All frontier models score very highly on other knowledge benchmarks, but score poorly on The Last Exam. o3 is unlikely to be significantly more knowledgeable than other frontier models.

@Haiku I don't fully agree. The benchmark was created mostly by filtering for questions that none of the frontier models (at that time) could answer.

In math, a lot of these questions are problem-solving questions. I assume o3 is very good at problem solving.

@mathvc sounds like you should bet on the market about this very topic then

@Ziddletwix I disagree on the nature of the benchmark, not on the probability in this market 😜
