ARC-AGI-2 Top Score >=50% in 2025?

Dec 31

https://arcprize.org/

This market resolves according to the score of the submission that receives the Top Score Prize ($75k) in 2025.

https://arcprize.org/blog/announcing-arc-agi-2-and-arc-prize-2025

would you look at that

@Bayesian The only news I see so far is this:

@TimothyJohnson5c16 what do commercial LLMs have to do with this market?

@Usaar33 I assumed Bayesian saw something with GPT-5. I don't see any major improvements on the Kaggle leaderboard either.

The linked market was trending above 50% in anticipation of GPT-5. Then GPT-5 underperformed and I got owned.

Darn, these seem harder than ARC-AGI-1. Seems weird that he says it's a similar success rate (60%) if it takes something like 20x as long to get that success rate.

@Bayesian Chollet says most humans can still solve them in under five minutes and at most two attempts. I've done several of the ones in the public set myself, and my score is above 90% by that standard.

But they kept some of the puzzles from ARC AGI 1 and eliminated the ones that had been easily solved, so what's left is mostly much harder for AIs.

opened a Ṁ5,000 NO at 50% order

Big limit orders up for any takers!

@bens @Bayesian 5k more at 50% if you want it

Some more details on what "human performance" for these problems looks like:

https://x.com/fchollet/status/1904273411897168198

I tried several problems from the harder public set, and I was able to solve them all in five minutes. So I think they're mostly straightforward, though some are a bit tedious to do by hand.

opened a Ṁ1,000 NO at 60% order

Just clarifying: I'm not sure the title and description match (although I bet NO on the interpretation I assumed at first, which was more lenient, so I'm not unhappy about this).

@bens didn't finish my thought:

I think the prize-winning entry has to be open-source and can only spend $50 in compute? Whereas there may be top models that ARC reports as doing well (O4 or whatever) that might score much higher.

@bens The Top Score Prize seems to be given to the top submission that is open-source and spends only $50 in compute, so I think they match? I could be wrong, let me see.

@Bayesian Yes, the compute limits on Kaggle are pretty strict: https://arcprize.org/competition

@Bayesian OK, so this market resolves on whether the top score meeting those criteria (open-source and $50) reaches 50%, not on whether ANY AI lab can get 50% under ANY conditions?

@bens If I understand correctly that the Top Score Prize is given to the best open-source model under the $50 compute requirement, then yes, that's correct. If I misunderstand and the Top Score Prize is given to some model under a different set of requirements, then that might not be correct.
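
For anyone skimming the thread, the resolution rule as discussed above could be sketched roughly like this. It is only an illustration of the interpretation in these comments, not the official ARC Prize rules: the `Submission` fields, the helper names, and the $50 compute cap (taken from the comments, unverified) are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumptions from the thread, not verified against the official rules.
COMPUTE_CAP_USD = 50.0   # compute limit mentioned by commenters
THRESHOLD_PCT = 50.0     # the ">=50%" in the market title


@dataclass
class Submission:
    """Hypothetical record for a leaderboard entry."""
    name: str
    score_pct: float
    open_source: bool
    compute_usd: float


def top_eligible(subs: Iterable[Submission]) -> Optional[Submission]:
    """Return the best-scoring entry that is open-source and within the cap."""
    eligible = [
        s for s in subs
        if s.open_source and s.compute_usd <= COMPUTE_CAP_USD
    ]
    return max(eligible, key=lambda s: s.score_pct, default=None)


def resolves_yes(subs: Iterable[Submission]) -> bool:
    """YES only if the top eligible entry scores at least the threshold."""
    winner = top_eligible(subs)
    return winner is not None and winner.score_pct >= THRESHOLD_PCT
```

Under this reading, a closed commercial model scoring well above 50% would not move the market unless an eligible submission also reached 50%.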