Top Multi-SWE-bench score in 2025?
Dec 31
47.1% expected
0 - 19%: 3%
20 - 39%: 39%
40 - 59%: 36%
60 - 79%: 15%
80 - 100%: 7%
SWE-bench is a great AI benchmark, but it is Python-only. Multi-SWE-bench applies the same idea to multiple programming languages: C, C++, Java, JavaScript, TypeScript, Go, and Rust.
A Claude 3.7 Sonnet-based agent achieved a score of 19% on 2025-03-29, which is currently the best score. The score will be rounded ("rounding half up", to be exact; see Rounding).
Resolution will be based primarily on the official leaderboard, but announcements from other reputable organizations will also be considered.
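For scores that land near a bucket boundary, the rounding rule matters. Here is a minimal sketch of how a reported score might map to an answer bucket, assuming the score is rounded to the nearest whole percent before the bucket is chosen (the description says "rounding half up" but does not spell out the precision). The names `round_half_up` and `bucket_for` are made up for illustration and are not part of any official tooling.

```python
from decimal import Decimal, ROUND_HALF_UP

# Answer buckets as listed in this market (inclusive integer bounds).
BUCKETS = [(0, 19), (20, 39), (40, 59), (60, 79), (80, 100)]


def round_half_up(score: str) -> int:
    """Round a non-negative percentage to a whole number, with .5 going up.

    Assumption: rounding is to the nearest integer percent. The score is
    passed as a string so that Decimal sees the exact reported digits.
    """
    return int(Decimal(score).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def bucket_for(score: str) -> str:
    """Return the label of the bucket that the rounded score falls into."""
    rounded = round_half_up(score)
    for low, high in BUCKETS:
        if low <= rounded <= high:
            return f"{low} - {high}%"
    raise ValueError(f"score out of range: {score}")


if __name__ == "__main__":
    for reported in ["19", "19.49", "19.5", "21.62"]:
        print(reported, "->", round_half_up(reported), "->", bucket_for(reported))
```

Note that Python's built-in `round` uses "round half to even" (so `round(0.5)` is `0`), which is why this sketch uses `decimal` with `ROUND_HALF_UP` instead.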
See also /SG/top-swebench-verified-score-in-2025
This question is managed and resolved by Manifold.
@ian The leaderboard on the website shows something with Gemini 2.5 Pro at 21.62%:
https://multi-swe-bench.github.io/#/
(Not sure what Mopenhands is...)