Will Gemini outperform GPT-4 at mathematical theorem-proving?
2025
56% chance
This question is based on speculation from https://youtu.be/tkqD9W5U9F4?t=468
To operationalize this, the question will resolve based on the LeanDojo benchmark (https://leandojo.org/), specifically the Pass@1 metric, under which "The prover is given only one attempt and must find the proof within a wall time limit of 10 minutes."
GPT-4 is reported to achieve an accuracy of 28.8% on the "random" split of the test data in Table 2 of the LeanDojo paper (https://arxiv.org/pdf/2306.15626.pdf).
This question closes when an evaluation of Gemini's performance on this task is brought to my attention.
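
As a rough illustration of the resolution criterion, the sketch below computes a Pass@1 score from per-theorem results and compares it with the 28.8% GPT-4 figure quoted above. The `Attempt` record, the function names, and the reading of "outperform" as a strictly higher score are assumptions made for illustration; none of this is LeanDojo's actual evaluation code.

```python
from dataclasses import dataclass

# Figures taken from the question text above.
GPT4_PASS_AT_1 = 0.288      # GPT-4 accuracy on the "random" split
WALL_TIME_LIMIT_S = 10 * 60  # 10-minute wall time limit per theorem


@dataclass
class Attempt:
    """Hypothetical record of a prover's single attempt on one theorem."""
    theorem: str
    proved: bool
    wall_time_s: float


def pass_at_1(attempts: list[Attempt]) -> float:
    """Fraction of theorems proved on the one attempt, within the time limit."""
    if not attempts:
        raise ValueError("need at least one attempt")
    solved = sum(
        1 for a in attempts
        if a.proved and a.wall_time_s <= WALL_TIME_LIMIT_S
    )
    return solved / len(attempts)


def resolves_yes(gemini_attempts: list[Attempt]) -> bool:
    """Assumption: 'outperform' means a strictly higher Pass@1 than GPT-4."""
    return pass_at_1(gemini_attempts) > GPT4_PASS_AT_1
```

A proof found after the 10-minute limit counts as a failure in this sketch, which matches the quoted definition of the metric.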