What will be the best score (5/5 reliability) on ZeroBench by December 31st 2025?
0 - 10: 45%
11 - 20: 19%
21 - 30: 5%
31 - 40: 4%
41 - 50: 4%
51 - 60: 4%
61 - 70: 4%
71 - 80: 4%
81 - 90: 4%
91 - 100: 5%
This market will use the variant of the benchmark frozen one week after the initial release (following the public benchmark red-teaming stage to identify flawed/ambiguous questions).
The temperature used for the 5/5 reliability evaluation will be the default setting provided by each LLM API provider. Where that default cannot be determined unambiguously, we will use a temperature of 0.7.
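For traders who want to sanity-check a reported number, here is a minimal sketch of how the resolution rule might be computed. It assumes that "5/5 reliability" means a question counts only if all five sampled answers are correct, and that pass@5 means at least one of five is correct; the function names, the input layout (one list of five booleans per question), and the integer-score bucketing are illustrative assumptions, not the benchmark's official tooling.

```python
from typing import Optional, Sequence

FALLBACK_TEMPERATURE = 0.7  # fallback stated in the market description


def resolve_temperature(provider_default: Optional[float]) -> float:
    """Use the provider's default temperature; fall back to 0.7 if unknown."""
    return FALLBACK_TEMPERATURE if provider_default is None else provider_default


def k_of_k_reliability(results: Sequence[Sequence[bool]]) -> float:
    """Percent of questions where every sampled answer is correct (assumed definition)."""
    if not results:
        raise ValueError("no questions given")
    solved = sum(1 for samples in results if samples and all(samples))
    return 100.0 * solved / len(results)


def pass_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Percent of questions where at least one sampled answer is correct (assumed definition)."""
    if not results:
        raise ValueError("no questions given")
    solved = sum(1 for samples in results if any(samples))
    return 100.0 * solved / len(results)


def bucket(score: int) -> str:
    """Map an integer percentage score (0..100) to this market's answer ranges."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 10:
        return "0 - 10"
    low = ((score - 1) // 10) * 10 + 1
    return f"{low} - {low + 9}"


# Hypothetical data: 4 questions, 5 samples each.
example = [
    [True] * 5,
    [True, True, False, True, True],
    [False] * 5,
    [True] * 5,
]
reliability = k_of_k_reliability(example)
at_least_once = pass_at_k(example)
```

In this made-up example the second question is answered correctly four times out of five, so it counts toward pass@5 but not toward 5/5 reliability, which is why the reliability figure is the stricter of the two.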
This question is managed and resolved by Manifold.
As of May 24th 2025, Claude 4 Opus is the new SotA:
https://x.com/JRobertsAI/status/1926325748303872203
4% pass@1
As of March 28th 2025, Gemini 2.5 Pro is the new SotA: https://x.com/JRobertsAI/status/1905577784300183653
3% pass@1
5% pass@5
1% 5/5 reliability