Will >50% of the tasks in the WebArena benchmark be solved by EOY 2024?
Resolved YES on Dec 18
In this tweet (https://twitter.com/ajeya_cotra/status/1684358475416064001?s=20), Ajeya Cotra (admirably) predicted a >50% chance that >50% of the tasks in the newly announced WebArena benchmark will be solved by a single agent. Note that Ajeya didn't specify that a single agent had to solve all of them, but I will resolve on that basis, so this market could diverge from Ajeya's prediction.
This question is managed and resolved by Manifold.
🏅 Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ481 |
2 | | Ṁ25 |
3 | | Ṁ9 |
4 | | Ṁ8 |
5 | | Ṁ7 |