Top Multi-SWE-bench score in 2025?
19
10k · Ṁ28k · Dec 31
Expected: 46.6%
0 - 19%: 3%
20 - 39%: 40%
40 - 59%: 36%
60 - 79%: 14%
80 - 100%: 7%
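As a rough sanity check on the expected value, the displayed bucket probabilities can be combined with bucket midpoints. This is only a sketch: it assumes each bucket is a 20-point-wide interval (midpoints 10, 30, 50, 70, 90), and the name `midpoint_expectation` is hypothetical. How Manifold actually computes the expected value is not stated on this page, and the displayed probabilities are rounded, so the result lands near 46.6% without matching it exactly.

```python
# Displayed probabilities per bucket, keyed by assumed (low, high) bounds in percent.
# The 20-point-wide bounds are an assumption, not something the page specifies.
DISPLAYED = {
    (0, 20): 0.03,
    (20, 40): 0.40,
    (40, 60): 0.36,
    (60, 80): 0.14,
    (80, 100): 0.07,
}


def midpoint_expectation(dist: dict[tuple[int, int], float]) -> float:
    """Weight each bucket's midpoint by its probability and sum the results."""
    return sum(prob * (low + high) / 2 for (low, high), prob in dist.items())


estimate = midpoint_expectation(DISPLAYED)
print(f"Midpoint estimate: {estimate:.1f}%")
```

The estimate comes out slightly below the displayed figure, which is consistent with rounding in the displayed probabilities or a different weighting inside each bucket.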
SWE-bench is a great AI benchmark, but it is Python-only. Multi-SWE-bench is the same thing with multiple programming languages: C, C++, Java, JavaScript, TypeScript, Go, Rust.
A Claude 3.7 Sonnet-based agent achieved a score of 19% on 2025-03-29, which is currently the best score. The score will be rounded ("rounding half up", to be exact; see Rounding).
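To illustrate what "rounding half up" means for resolution, here is a minimal Python sketch. The helper names `round_half_up` and `bucket_for` are hypothetical, and mapping the rounded score onto the answer buckets is an assumption about how the market resolves. Python's built-in `round()` rounds halves to the nearest even number, so the sketch uses `decimal` instead.

```python
from decimal import Decimal, ROUND_HALF_UP

# Answer buckets as listed on the market page, inclusive bounds in whole percent.
BUCKETS = [(0, 19), (20, 39), (40, 59), (60, 79), (80, 100)]


def round_half_up(score_percent: str) -> int:
    """Round a percentage to a whole number, with exact halves always going up."""
    rounded = Decimal(score_percent).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(rounded)


def bucket_for(score_percent: str) -> str:
    """Return the label of the bucket containing the rounded score (assumed rule)."""
    rounded = round_half_up(score_percent)
    for low, high in BUCKETS:
        if low <= rounded <= high:
            return f"{low} - {high}%"
    raise ValueError(f"score out of range: {score_percent}")


for raw in ["19.23", "19.5", "39.49", "59.5"]:
    print(raw, "->", round_half_up(raw), "->", bucket_for(raw))
```

Scores are passed as strings so that `Decimal` sees the exact decimal value rather than a binary floating-point approximation. Under this rule a leaderboard score of 19.5% would round to 20% and fall in the 20 - 39% bucket.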
Resolution will be based primarily on the official leaderboard, but announcements from other reputable organizations will also be considered.
See also /SG/top-swebench-verified-score-in-2025
This question is managed and resolved by Manifold.
What is this?
What is Manifold?
Manifold is the world's largest social prediction market.
Get accurate real-time odds on politics, tech, sports, and more.
Or create your own play-money betting market on any question you care about.
Are our predictions accurate?
Yes! Manifold is very well calibrated, with forecasts on average within 4 percentage points of the true probability. Our probabilities are created by users buying and selling shares of a market.
In the 2022 US midterm elections, we outperformed all other prediction market platforms and were in line with FiveThirtyEight’s performance. Many people who don't like betting still use Manifold to get reliable news.
Why use play money?
Mana (Ṁ) is the play-money currency used to bet on Manifold. It cannot be converted to cash. All users start with Ṁ1,000 for free.
Play money means it's much easier for anyone anywhere in the world to get started and try out forecasting without any risk. It also means there's more freedom to create and bet on any type of question.
Related questions
Top SWE-Bench Verified score in 2025?
85.4
What will be the highest score achieved on SWE-Bench Verified in 2025?
What will be the best performance on SWE-bench Verified by December 31st 2025?
When will SWE-bench be solved?
AI resolves at least X% on SWE-bench WITH assistance, by 2028?
AI resolves at least X% on SWE-bench without any assistance, by 2028?
What will be the best score on Cybench by December 31st 2025?
Will any model get above human level on the Simple Bench benchmark before September 1st, 2025.
42% chance
What will be the best score (5/5 reliability) on ZeroBench by December 31st 2025?
What will be the best normalized score achieved on the original 7 RE-Bench tasks by December 31st 2025?