What will be the human baseline for the Abstraction and Reasoning Corpus (ARC-AGI)?

resolved Sep 10

90-100%: Resolved
75-90%: Resolved
60-75%: Resolved YES
35-60%: Resolved
15-35%: Resolved
0-15%: Resolved

There is not currently an established human baseline for François Chollet's Abstraction and Reasoning Corpus (ARC-AGI).

On the public training set humans solve 84% of the tasks.

It is known that the public training set is easier than both the public evaluation set and the private evaluation set. The public and private evaluation sets are apparently of roughly the same difficulty.

For the first credible human baseline study, what fraction of evaluation set tasks will humans successfully solve?

Note that this can include tasks from either the public or private evaluation sets.

In the extremely unlikely case that the number falls on a boundary shared by two intervals, the lower interval will be chosen.
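As an illustration of the resolution rule, here is a minimal sketch that maps a reported solve rate to one of the answer intervals listed for this market. The function name `resolve_interval` and the choice to treat both endpoints of every interval as inclusive are assumptions made for this example, not part of the market's official resolution procedure.

```python
# Answer intervals from the market, as (lower, upper, label), in percent.
# Listed from lowest to highest so that the first match is the lower interval.
INTERVALS = [
    (0, 15, "0-15%"),
    (15, 35, "15-35%"),
    (35, 60, "35-60%"),
    (60, 75, "60-75%"),
    (75, 90, "75-90%"),
    (90, 100, "90-100%"),
]


def resolve_interval(score: float) -> str:
    """Return the label of the interval that a solve rate (in percent) falls into.

    Assumption: endpoints are inclusive on both sides, so a boundary value
    such as 75 matches two intervals and the lower one is returned.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower, upper, label in INTERVALS:
        if lower <= score <= upper:
            return label  # first match wins, i.e. the lower interval on a tie
    raise AssertionError("unreachable: the intervals cover 0 to 100")


print(resolve_interval(64.2))  # 60-75%
print(resolve_interval(75.0))  # 60-75% (boundary value, lower interval chosen)
```

Scanning the intervals from lowest to highest is what implements the tie-break: a value that sits exactly on a shared boundary is claimed by the lower interval before the higher one is ever checked.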

5 Comments

We have a new preprint estimating human performance on the full training and evaluation sets of ARC: https://arxiv.org/abs/2409.01374. The empirical average performance on all tasks from the training set is 76.2%, while for the evaluation set it is 64.2%.