FrontierMath: LLM Benchmark for Advanced AI Math Reasoning | Epoch AI
This market resolves YES if any LLM scores 85.0% or higher on the FrontierMath Tier 4 leaderboard before Jan 1, 2028.
Note that confidence intervals are ignored for the purpose of this market. For example, 85.0%±10.0% counts and the market will be resolved YES, whereas 80.0%±10.0% doesn't count. Also, Tiers 1-3 are not relevant for this market.
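For anyone who wants the rule spelled out, here is a minimal sketch of the resolution check as described above. The entry format `(score, ci_half_width, date_posted)` and the function name `resolves_yes` are assumptions made for illustration; they are not taken from Epoch AI's leaderboard or from the market itself.

```python
from datetime import date

THRESHOLD = 85.0             # percent, from the market description
DEADLINE = date(2028, 1, 1)  # a qualifying score must appear before this date


def resolves_yes(entries):
    """Return True if any Tier 4 entry meets the threshold before the deadline.

    `entries` is a hypothetical list of (score_percent, ci_half_width, date_posted)
    tuples. The confidence interval is deliberately ignored, as the market states.
    """
    return any(
        score >= THRESHOLD and posted < DEADLINE
        for score, _ci_half_width, posted in entries
    )
```

Under this sketch, an entry of 85.0%±10.0% posted in 2026 qualifies, while 80.0%±10.0% does not, even though its interval reaches above 85%.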
Update 2026-06-13 (PST) (AI summary of creator comment): The creator is resolving this market YES based on Claude Fable 5 (max) achieving >85% on Tier 4 v2 (a revised version of FrontierMath introduced after methodology changes corrected errors in the original problems).
🏅 Top traders
| # | Trader | Total profit |
|---|---|---|
| 1 | | Ṁ27 |
| 2 | | Ṁ19 |
| 3 | | Ṁ14 |
| 4 | | Ṁ12 |
| 5 | | Ṁ10 |
I will be resolving this market as "YES". This market was made before the change in FrontierMath's methodology, and back then the highest score on Tier 4 was <50%. However, many problems had errors, which is why Tier 4 v2 was introduced, and Claude Fable 5 (max) achieves >85% on Tier 4 v2.
Epoch AI (@EpochAIResearch) on X
