ARC-AGI-2 Top Score >=50% in 2025?
Dec 31 · 33% chance

https://arcprize.org/

This market resolves according to the score of the submission that receives the Top Score Prize ($75k) in 2025.

https://arcprize.org/blog/announcing-arc-agi-2-and-arc-prize-2025

Darn, these seem harder than ARC-AGI-1. It seems weird that he says it's a similar success rate (60%) if it takes something like 20x as long to reach that success rate.

@Bayesian Chollet says most humans can still solve them in under five minutes and at most two attempts. I've done several of the ones in the public set myself, and my score is above 90% by that standard.

But they kept some of the puzzles from ARC-AGI-1 and eliminated the ones that had been easily solved, so what's left is mostly much harder for AIs.

opened a Ṁ5,000 NO at 50% order

Big limit orders up for any takers!

@bens @Bayesian 5k more at 50% if you want it

Some more details on what "human performance" for these problems looks like:

https://x.com/fchollet/status/1904273411897168198

I tried several problems from the harder public set, and I was able to solve them all in five minutes. So I think they're mostly straightforward, though some are a bit tedious to do by hand.

opened a Ṁ1,000 NO at 60% order

Just clarifying: I'm not sure the title and description match (although I bet NO on the more lenient interpretation I assumed at first, so I'm not unhappy about this).

@bens didn't finish my thought:

I think the prize-winning entry has to be open-source and spend only $50 in compute? Meanwhile, there may be top models that ARC reports as doing well (O4 or whatever) that score much higher.

@bens The Top Score Prize seems to be given to the top submission that is open-source and spends only $50 in compute, so I think they match? I could be wrong, let me check.

@Bayesian Yes, the compute limits on Kaggle are pretty strict: https://arcprize.org/competition

@Bayesian OK, so this market resolves on whether the top score meeting those criteria (open-source and $50) reaches 50%, not on whether ANY AI lab can get 50% under ANY conditions?

@bens If I understand correctly that the Top Score Prize goes to the best open-source model under the $50 compute requirement, then yes, that's correct. If I've misunderstood and the Top Score Prize is awarded under a different set of requirements, then that might not be correct.

@Bayesian got it!
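
For anyone who wants the resolution rule discussed above spelled out, here is a rough sketch in Python. It assumes the reading agreed on in this thread (the Top Score Prize goes to the best open-source submission within the $50 compute limit), which the thread itself flags as possibly mistaken. The `Submission` type, its field names, and the example entries are all hypothetical and only there for illustration; they are not ARC Prize data or an official API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Submission:
    """Hypothetical record for one leaderboard entry (illustrative only)."""
    name: str
    score: float             # fraction of tasks solved, 0.0 to 1.0
    open_source: bool
    compute_cost_usd: float


# Assumptions taken from the thread's reading of the rules, not verified here.
COMPUTE_LIMIT_USD = 50.0
THRESHOLD = 0.50


def top_score_prize_winner(subs: Iterable[Submission]) -> Optional[Submission]:
    """Return the best-scoring entry that is open-source and within the compute limit."""
    eligible = [
        s for s in subs
        if s.open_source and s.compute_cost_usd <= COMPUTE_LIMIT_USD
    ]
    return max(eligible, key=lambda s: s.score, default=None)


def resolves_yes(subs: Iterable[Submission]) -> bool:
    """Market resolves YES only if the prize-eligible top score is at least 50%."""
    winner = top_score_prize_winner(subs)
    return winner is not None and winner.score >= THRESHOLD


# Made-up entries: a high-scoring closed model does not count toward resolution.
example = [
    Submission("closed-frontier-model", 0.70, open_source=False, compute_cost_usd=5000.0),
    Submission("open-kaggle-entry", 0.30, open_source=True, compute_cost_usd=50.0),
]
print(resolves_yes(example))
```

Under this reading, the closed model's score is ignored, so the market in the made-up example would resolve NO even though some model scored above 50%.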
