Benchmark Gap #4: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, how many months will it be before an AI is listed as a (co) first author on a published math paper?
2050
37
expected
This question is meant to measure the gap between solving the main math benchmarks that existed at the time of market creation and contributing to real-world mathematics.
The co-first-author requirement is loose: I will also accept an AI being credited with significant contributions to both deciding what to prove and the actual proof. Merely contributing to the proof is not enough; I am trying to get at "the AI does the work of a mathematician", not "the AI does the work of a proof assistant". I would also accept, for instance, the human author of the paper saying that they would have named the AI as a coauthor if it were human, or that the result could not have been obtained without the AI's assistance.
Related questions
Will an AI win a Gold Medal on the International Math Olympiad by 2027?
45% chance
When will an AI win the $5 million AI Math Olympiad Prize?
Will AIs be widely recognized as having developed a new, innovative, foundational mathematical theory before 2030?
42% chance
Will an AI win a Gold Medal on the International Math Olympiad by 2029?
70% chance
Will an AI get bronze or silver on any International Math Olympiad by end of 2025?
29% chance
Will an AI co-author a mathematics research paper published in a reputable journal before the end of 2026?
59% chance
Will an AI win a Gold Medal on the International Math Olympiad by 2032?
75% chance
Will we have an AI generated research paper accepted to > 1 top ML conference by 2026?
46% chance
In 2029, will any AI be able to take an arbitrary proof in the mathematical literature and translate it into a form suitable for symbolic verification? (Gary Marcus benchmark #5)
39% chance
Will an AI be able to convert recent mathematical results into a fully formal proofs that can be verified by a mainstream proof assistant by 2025?
23% chance