This question is meant to measure the gap between solving the main math benchmarks that existed at the time of market creation and contributing to real-world mathematics.
FrontierMath Tier-4 is an even harder version of FrontierMath; do we need something harder still to fully close the benchmark gap?
I will accept the AI being a (co-)first author, or an AI being credited with significant contributions both to deciding what to prove and to the actual proof (merely contributing to the proof is not enough; I am trying to get at "the AI does the work of a mathematician", not "the AI does the work of a proof assistant"). I would also accept, for instance, the human author of the paper stating that they would have named the AI as a co-first author if it were human, or saying that the result could not have been obtained without the assistance of the AI.
If a model publishes a paper before it achieves this score, I'll resolve to the 0 bucket.
Update 2025-07-16 (PST) (AI summary of creator comment): In response to user feedback, the creator has acknowledged that the resolution criterion "or saying that the result could not have been obtained without the assistance of the AI" may be interpreted differently from what its literal wording implies.