Benchmark Gap #4: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, how many months will it be before an AI is listed as a (co) first author on a published math paper?
37 expected
This question is meant to measure the gap between solving the main math-based benchmarks at the time of market creation and contributing to real-world mathematics.
The co-first-author requirement is loose: I will also accept an AI being credited with significant contributions to both deciding what to prove and the actual proof. Merely contributing to the proof is not enough; I am trying to get at "the AI does the work of a mathematician", not "the AI does the work of a proof assistant". I would also accept, for instance, the human author of the paper saying that they would have named the AI as a coauthor if it were human, or that the result could not have been obtained without the AI's assistance.
This question is managed and resolved by Manifold.
Related questions
Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?
70% chance
When will an AI win the $5 million AI Math Olympiad Prize?
Which AI company first solves FrontierMath 85%?
Benchmark Gap #5: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, will it be less than two years before AI models are used as entry-level data science / data analysis / statistics workers?
67% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2027?
62% chance
Will an AI co-author a mathematics research paper published in a reputable journal before the end of 2026?
42% chance
Will any AI model score >80% on Epoch's Frontier Math Benchmark in 2025?
43% chance
Will OpenAI Release a Model Capable of Reliably performing Gradeschool Math from Reasoning by Jan 1, 2025?
79% chance
Will I get a first paper author in a top ML conference in 2024?
42% chance
Will AIs be widely recognized as having developed a new, innovative, foundational mathematical theory before 2030?
32% chance