Will the ARC-AGI grand prize be claimed by an LLM?

2031

38% chance

See this page for information about the competition: https://lab42.global/arcathon/. See also this podcast for an interview with Francois Chollet about the challenge and his predictions: https://www.dwarkeshpatel.com/p/francois-chollet

The fundamental characteristics of an "LLM" for the purposes of this question:

Sequence-to-sequence type model. (State-space and transformer models would both count, for example.)
No substantial post-hoc computation (like tree search). Sampling as it is practiced now is allowed. Prompting as it is practiced now is allowed.
I will use my best judgement if it’s ambiguous. The main point is that the model should be in the class of models that LLM-naysayers (Chollet especially) refer to when they assert that LLMs cannot solve ARC narrowly and are off-pathway for AGI generally.

Goalposts moving!

https://x.com/fchollet/status/1865865271728390515

@Tossup Will this resolve YES if the LLM system is not wrapped in a tree search algorithm from the outside (e.g. Tree of Thoughts), but something like tree search was still used in its training/fine-tuning regime, as some people speculated about e.g. Q*? That is, the result is still an LLM that runs inference in the regular way current LLMs do, but the training/fine-tuning might be somewhat or much more advanced.

Said another way: if training is advanced and inference is simple for a system that wins the prize, will this still resolve YES?

Yes, I'm okay with novel "advanced" training techniques. Only the inference needs to be "standard" for LLMs. I think it would be too hard to determine if a training technique is too "advanced" given how little is public about frontier LLM training.

How does this resolve if no one gets the grand prize?

This resolves when the grand prize is awarded or the competition is shut down. To be clear, the market does not necessarily resolve NO if the grand prize is unclaimed in the 2024 round of the competition.

By "single forward pass," do you just mean it can't do any chain-of-thought before beginning its answer? I would expect that there would be one forward pass per pixel produced in the LLM's response.

Good point. I want to express something like “it’s a feed forward network or can be unrolled into a feed forward network (as for some SSMs)”, but I can’t think of a precise statement. I will remove this criterion.
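
To make the "one forward pass per pixel" point concrete, here is a minimal sketch of greedy autoregressive decoding. It assumes a toy stand-in model (`toy_forward` is hypothetical, not any real LLM API) and assumes each grid cell is serialized as exactly one token; real tokenizers and ARC serializations may differ.

```python
from typing import Callable, List, Tuple


def toy_forward(tokens: List[int]) -> int:
    """Hypothetical stand-in for one forward pass: maps the context to the next token."""
    return (sum(tokens) + 1) % 10


def generate(
    prompt: List[int],
    n_new: int,
    forward: Callable[[List[int]], int] = toy_forward,
) -> Tuple[List[int], int]:
    """Greedy autoregressive decoding: one forward pass per generated token."""
    tokens = list(prompt)
    passes = 0
    for _ in range(n_new):
        # Each new token requires a full pass over the current context.
        tokens.append(forward(tokens))
        passes += 1
    return tokens[len(prompt):], passes


# Assumption: a 3x3 output grid is serialized as 9 cell tokens,
# so producing it takes 9 forward passes, not one.
cells, passes = generate([1, 2, 3], n_new=9)
print(len(cells), passes)
```

Under these assumptions, the number of forward passes scales with the length of the answer, which is why a literal "single forward pass" criterion would rule out ordinary LLM decoding.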