Will an autonomous agent resolve 90% of tasks on SWE-bench by 2025?
Resolved NO on Jan 1.
Resolves "Yes" if, at time of closure, there is an entry on the SWE-bench leaderboard (https://www.swebench.com/) with a score greater than or equal to 90%.
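The resolution rule above can be written as a simple threshold check. This is only a minimal sketch: the `LeaderboardEntry` type, its field names, and the `resolves_yes` helper are hypothetical and do not correspond to any real SWE-bench or Manifold API. The scores would have to be read from the leaderboard by hand at closing time.

```python
from dataclasses import dataclass

# Threshold taken from the market description (percent of tasks resolved).
THRESHOLD_PERCENT = 90.0


@dataclass
class LeaderboardEntry:
    """Hypothetical record for one leaderboard row (names are assumptions)."""
    name: str
    percent_resolved: float


def resolves_yes(entries, threshold=THRESHOLD_PERCENT):
    """Return True if any entry scores greater than or equal to the threshold."""
    return any(entry.percent_resolved >= threshold for entry in entries)
```

Note that the comparison is inclusive, so an entry at exactly 90% would count, and an empty leaderboard resolves "No".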
This question is managed and resolved by Manifold.
🏅 Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ54 |
2 | | Ṁ32 |
3 | | Ṁ21 |
4 | | Ṁ16 |
5 | | Ṁ9 |
@DavidFWatson That's an excellent question. Let's explore possibilities:
This could be included in the question, i.e. what matters is only the number on the benchmark, regardless of whether it was gamed.
I could wait a certain amount of time to check whether any controversy emerges. One month feels safe. The question would then resolve yes if, one month after the deadline, I judge that there is no consensus that the number was gamed. This makes the question more informative.