Will any AI model score 80% on Epoch's Frontier Math Benchmark in 2025?

Question

Background

The FrontierMath benchmark, created by Epoch AI, is designed to test AI models' mathematical reasoning capabilities. As of December 2024, OpenAI's o3 reasoning model holds the current record with a score of 25.2%, while most other models score around 2% or less. This benchmark represents a significant challenge for current AI systems.

Resolution Criteria

This market will resolve YES if any AI model achieves a score greater than 80% on Epoch's FrontierMath benchmark at any point during the 2025 calendar year (January 1, 2025 - December 31, 2025). The score must be:

Officially reported or acknowledged by Epoch AI

Achieved on the full benchmark test, not a subset

Achieved in a single run without human assistance

The market will resolve NO if no AI model achieves a score above 80% during 2025.
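The criteria above amount to a simple decision rule, which can be sketched in Python as a rough illustration. Everything named here (`ScoreReport`, its fields, `qualifies`, `resolve`) is hypothetical and invented for this sketch; it is not how Manifold or Epoch AI actually record or adjudicate results, and the real resolution is a human judgment call.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical constants taken from the resolution criteria above.
WINDOW_START = date(2025, 1, 1)
WINDOW_END = date(2025, 12, 31)
THRESHOLD_PCT = 80.0


@dataclass(frozen=True)
class ScoreReport:
    """Hypothetical record of one reported FrontierMath result."""
    model: str
    score_pct: float             # score in percent, e.g. 25.2
    achieved_on: date            # date the score was achieved
    acknowledged_by_epoch: bool  # officially reported or acknowledged by Epoch AI
    full_benchmark: bool         # full benchmark, not a subset
    single_unassisted_run: bool  # single run without human assistance


def qualifies(report: ScoreReport) -> bool:
    """True only if the report meets every listed criterion."""
    return (
        report.score_pct > THRESHOLD_PCT  # strictly greater than 80%
        and WINDOW_START <= report.achieved_on <= WINDOW_END
        and report.acknowledged_by_epoch
        and report.full_benchmark
        and report.single_unassisted_run
    )


def resolve(reports: list[ScoreReport]) -> str:
    """YES if any report qualifies, otherwise NO."""
    return "YES" if any(qualifies(r) for r in reports) else "NO"


# Made-up example: a score of exactly 80% does not clear a "greater than 80%" bar.
example = ScoreReport("hypothetical-model", 80.0, date(2025, 6, 1), True, True, True)
print(resolve([example]))
```

Note that the title asks whether a model will "score 80%", while the criteria require a score strictly above 80%; the sketch follows the criteria.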

Manifold Markets · Accepted Answer

No. Resolved on Feb 17, 2026 by the Manifold Markets prediction market.

🏅 Top traders

#	Trader	Total profit
1		Ṁ34
2		Ṁ32
3		Ṁ22
4		Ṁ15
5		Ṁ13