Benchmark Gap #4: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, how many months will it be before an AI is listed as a (co) first author on a published math paper?
Expected: 37 months
This question is meant to measure the gap between solving the main math-based benchmarks at the time of market creation and contributing to real-world mathematics.
The co-first-author requirement is loose: I will also accept an AI being credited with significant contributions to both deciding what to prove and the actual proof. Merely contributing to the proof is not enough: I am trying to get at "the AI does the work of a mathematician", not "the AI does the work of a proof assistant". I would also accept, for instance, the human author of the paper saying that they would have named the AI as a coauthor if it were human, or that the result could not have been obtained without the AI's assistance.
This question is managed and resolved by Manifold.
Related questions
Will an AI get gold on any International Math Olympiad by the end of 2025?
72% chance
Will an AI model write the proof to the Riemann Hypothesis by the end of 2025?
8% chance
Benchmark Gap #5: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, will it be less than two years before AI models are used as entry-level data science / data analysis / statistics workers?
67% chance
Will an AI co-author a mathematics research paper published in a reputable journal before the end of 2026?
38% chance
Will an AI get bronze or silver on any International Math Olympiad by end of 2025?
78% chance
When will an AI win the $5 million AI Math Olympiad Prize?
Will AIs be widely recognized as having developed a new, innovative, foundational mathematical theory before 2030?
30% chance
Will an AI get bronze on any International Math Olympiad by 2025?
83% chance
What year will the first AI exceed 80% on MLE-bench?
Benchmark Gap #3: Once a model achieves superhuman performance on a competitive programming benchmark, will it be less than 2 years before there are "entry level" AI programmers in industry use?
73% chance