80% on SWE-Bench Verified by Jan 1 2025
17
Ṁ3944
Jan 2
14% chance

Resolves YES if someone achieves 80% on SWE-Bench by Jan 1, 2025. Current SoTA is ~20%. The result must be announced by Jan 1.

Update (Aug 12): current SoTA is now ~30%.

https://arxiv.org/pdf/2310.06770

Update: this market is now about reaching 80% on SWE-Bench Verified by end of year.

https://openai.com/index/introducing-swe-bench-verified/

Given the uncertainty in this market with respect to resolution criteria, I have sold all my shares and will merely judge it.

bought Ṁ50 NO

80% in March 2025

The OpenAI system card for GPT-4o shows 20% on SWE-bench, but did it use an open-source scaffold?

bought Ṁ250 NO

If we are talking about SWE-Bench Full (not SWE-Bench Lite), this is impossible in the benchmark's current state. It contains a number of unsolvable tasks, either because the GitHub issue is ambiguously written or because the unit tests fail due to bugs unrelated to the issue itself. Only the Lite leaderboard issues are properly vetted.

We are talking about SWE-Bench Full, not SWE-Bench Lite. I'll update with a resolution strategy in the event that the benchmark itself is found to be faulty in some way.

(I am aware of the recent paper out of Chris Ré's group that discussed this.)

https://openai.com/index/introducing-swe-bench-verified/

SWE-Bench Verified was announced today.

market updated!!