Will >50% of the tasks in the WebArena benchmark be solved by EOY 2024?

Resolved YES on Dec 18

In this tweet (https://twitter.com/ajeya_cotra/status/1684358475416064001?s=20), Ajeya Cotra (admirably) predicted a >50% chance that >50% of the tasks in the newly announced WebArena benchmark will be solved. Note that Ajeya didn't specify that a single agent had to solve all of those tasks, but I will resolve on that basis (a single agent must solve >50% of the tasks), so this market could diverge from her prediction.

Technical AI Timelines

People are also trading

Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?

Will an autonomous agent resolve 90% of tasks on SWE-bench by 2026?

Will an AI score over 80% on FrontierMath Benchmark in 2025

Will an autonomous agent resolve 90% of tasks on SWE-bench by 2027?

Will an LLM agent complete >50% of the lab tasks on the Factorio Learning Environment benchmark in 2025?

Will any AI solve more than four of AI 2027 Marcus-Brundage tasks in 2025?

Will any AI model score >80% on Epoch's Frontier Math Benchmark in 2025?

Will an AI achieve >80% performance on the FrontierMath benchmark before 2027?

Will an AI achieve >85% performance on the FrontierMath benchmark before 2027?

Will an AI solve a Millennium problem by EOY 2027?
