Will an AI agent system be able to score at least 40% on level 3 tasks in the GAIA benchmark before 2025?
Resolved YES on Jan 1

GAIA, a benchmark for General AI Assistants, was introduced in Nov. 2023. It contains tasks for AI agents that test their abilities.

An example task is:
"Assuming scientists in the famous youtube video The Thinking Machine (Artificial Intelligence in the 1960s) were interviewed the same year, what is the name of the scientist predicting the sooner thinking machines or robots? Answer using the format First name Last name"

Currently, the strongest AI systems, such as GPT-4 with plugins or AutoGPT, fail to solve any of the level 3 tasks. This market will resolve to "yes" as soon as an AI/agent system scores >=40% on level 3 tasks.
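
To make the resolution rule concrete, here is a minimal sketch of the >=40% check and of how far a given level 3 score is from the threshold. The function names are hypothetical, and it assumes that scores are reported as percentages; the sample values are the level 3 figures quoted in the comments on this market.

```python
# Sketch of the market's resolution rule (hypothetical helper names).
# Assumption: level 3 scores are expressed as percentages (0 to 100).

THRESHOLD_PERCENT = 40.0  # from the market description: ">=40% on level 3 tasks"


def resolves_yes(level3_score_percent: float,
                 threshold: float = THRESHOLD_PERCENT) -> bool:
    """Return True if the score meets or exceeds the resolution threshold."""
    return level3_score_percent >= threshold


def progress_toward_threshold(level3_score_percent: float,
                              threshold: float = THRESHOLD_PERCENT) -> float:
    """Return the score as a fraction of the threshold (1.0 means reached)."""
    return level3_score_percent / threshold


if __name__ == "__main__":
    # Level 3 scores mentioned in the comments, plus the threshold itself.
    for score in (18.75, 6.0, 40.0):
        fraction = progress_toward_threshold(score)
        print(f"{score:.2f}% -> resolves YES: {resolves_yes(score)}, "
              f"{fraction:.1%} of the way to the threshold")
```

Under these assumptions, a level 3 score of 18.75% is a little under half of the required 40%.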

Currently, the highest-scoring agent reaches 18.75% on level 3. So we are about halfway there, with a couple of months to go...

predicted YES

Important new submission. It is not 100% clear yet whether it is legit or not.
But "Friday" already claims to reach 45% on level 1 and 6% on level 3.
https://twitter.com/mialon_gregoire/status/1750110058090782906
