Are LLMs capable of reaching AGI?
75% chance · closes 2100

This resolves YES if there exists an architecture that would unambiguously count as both an LLM and AGI, and could be trained and run on all the world's computing power combined as of market creation.

This market resolves after there's a broad consensus as to the correct answer, which likely won't be until after AGI has been reached and humanity has a much better conceptual understanding of what intelligence is and how it works. In the event of disagreements over what constitutes an LLM or AGI, I'll defer to a vote among Manifold users.

(In order to count as an AGI, it needs to be usefully intelligent. If it would take 1000 years to answer a question, that doesn't count.)

(Note that there are two forms of non-predictive bias at play here. If your P(doom) is high, you'll value mana lower in worlds where LLMs can reach AGI, since we're more likely to die in those worlds than if we don't obtain AGI until much later. But if your P(doom) is low, this market probably resolves sooner if the answer is YES, so due to your discount rate there's a bias towards betting on YES.)
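
A toy illustration of those two biases, using made-up numbers that are not taken from this market:

```python
# Made-up numbers, purely to illustrate the two biases described above.
# A trader's subjective value of a payout depends on (a) how much they value
# mana in the world where that outcome occurs, and (b) how far away
# resolution is, via their discount rate.

def subjective_value(payout, years_to_resolution, annual_discount, world_value_weight):
    return payout * world_value_weight / (1 + annual_discount) ** years_to_resolution

# Hypothetical low-P(doom) trader: both worlds valued equally, but YES resolves sooner.
print(subjective_value(100, years_to_resolution=10, annual_discount=0.05, world_value_weight=1.0))  # YES payout
print(subjective_value(100, years_to_resolution=40, annual_discount=0.05, world_value_weight=1.0))  # NO payout

# Hypothetical high-P(doom) trader: mana is worth less in worlds where LLMs reach AGI.
print(subjective_value(100, years_to_resolution=10, annual_discount=0.05, world_value_weight=0.3))  # YES payout
```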


This effectively cannot resolve NO and will just resolve YES as soon as AGI exists, whether that's in 5 years or 50 lol

EDIT: my point is about the wording logic of the market, not about LLMs or AGI

@FriendlyMerc this question should not be resolved as soon as AGI exists, but no sooner than its architecture is known.

Are OpenAI's new agents purely an LLM? I don't know their architecture, but I highly doubt they are "unambiguously an LLM".

Surely an LLM is an important part. But LLMs - I guess - will just be one important part among many. (A car is not unambiguously a motor with seats either. Without a gearbox, the car will not be faster than a bike.)

@bbb Pretraining on language is the primary way large language models reach the level of intelligence (and learning capability) that they already demonstrate in practice. However, the question is: in the limit, can LLMs reach AGI level, with the required efficiency? This is a matter of understanding universality (oversimplifying a bit here for brevity) and complexity classes. Current LLMs are constant-time token predictors, so it's extremely unlikely that inference efficiency will be a problem large enough to cause this to resolve NO (there are also alternative samplers with different levels of efficiency, which I won't discuss). As for universality, LLMs can learn to use tools in the same way humans do, applying their general intelligence to solve problems. LLMs learn to search over spaces of circuits that are able to solve problems they haven't seen before; it's a general problem-solving method, and successes and failures can be used to inform future searches.
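
To make the "constant-time token predictor" point concrete, here is a minimal sketch assuming a hypothetical model.forward interface (not any lab's real API): greedy autoregressive decoding does one bounded forward pass per emitted token, so compute grows with the number of tokens generated (plus context-length effects), not with how conceptually hard the question is.

```python
# Illustrative sketch only: model.forward is a hypothetical interface standing
# in for a real model call. The structural point is that each new token costs
# one forward pass, regardless of the difficulty of the underlying question.

def generate(model, prompt_tokens, max_new_tokens):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model.forward(tokens)                             # one pass per new token
        last = logits[-1]                                          # scores over the next token
        next_token = max(range(len(last)), key=last.__getitem__)   # greedy argmax
        tokens.append(next_token)
    return tokens
```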

If you're familiar with the rate of improvement of LLM coding abilities (i.e. program search), you'll appreciate how much territory search abilities can cover already in practice, and that the limits of those abilities are far from saturated. Just taking the rate of improvement of coding abilities of Claude Sonnet 3.5 (old), Sonnet 3.5 (new) and then Sonnet 3.7, there was major ability improvement between each release, and each release was only 4 months after its predecessor. For those requiring direct evidence of trajectories, that should be a significant update.

There are many papers demonstrating that transformers can be improved in many ways, without needing to train on other modalities, but I'm making trades based on my own independent awareness of what's feasible. While this market is not really an open question from my perspective, you're right that the question would not likely resolve as soon as AGI exists, because it's unlikely that the first lab to produce a broadly general model will detail its architecture and the training methods used. I'm betting at the rate I am to signal that I have knowledge of the way the market would resolve, independent of valuing the payoff.

Emmett Shear: "It has been increasingly obvious that "just scale up transformers bigger" is not going to lead to human level general intelligence. [...]"

https://x.com/eshear/status/1858660987530023148?t=0t5lYNS07G1Txp1uGa_E2Q&s=19


"Ilya Sutskever, co-founder of AI labs Safe Superintelligence (SSI) and OpenAI, told Reuters recently that results from scaling up pre-training - the phase of training an AI model that use s a vast amount of unlabeled data to understand language patterns and structures - have plateaued."

from https://www.reuters.com/technology/artificial-intelligence/openai-rivals-seek-new-path-smarter-ai-current-methods-hit-limitations-2024-11-11/

@bbb He said pre-training has plateaued, not LLMs.

could be trained and run on all the world's computing power combined as of market creation

Given arbitrary training data?

@MartinRandall imo giving it training data like "these are the thousand shortest ways to create an AGI" would not make the LLM itself an AGI.

What hypothetical data do you have in mind?


Does this count LMMs like GPT-4o as LLMs?

i.e. is the question more: are autoregressive transformers capable of reaching AGI? Or is the transformer architecture capable of reaching AGI? (including things like Sora)

This would probably never resolve to no.

How does this question resolve if the architecture uses LLMs as the crucial subcomponent behind its intelligence, but nonetheless its overall architecture isn't an LLM? Specifically, I'm thinking of agentic systems like AutoGPT, which have a state-machine architecture with explicitly coded elements like short-term and long-term memory, but use LLMs to form (natural-language) plans and decide on which state transitions should be made. If these systems become AGI when LLMs are scaled up, how does the question resolve?
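
For concreteness, a rough sketch of that kind of agent loop, with hypothetical names and a stubbed-out model call (not AutoGPT's actual code): the state machine and memory are explicitly coded, while the LLM produces plans and picks the next state transition.

```python
# Rough sketch only; call_llm is a stub standing in for a real language-model API.
def call_llm(prompt: str) -> str:
    """Placeholder for a real language-model API call."""
    return "PLAN: do the thing. NEXT_STATE: DONE"

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    short_term_memory: list[str] = []   # recent plans/observations
    long_term_memory: list[str] = []    # summaries persisted across steps
    state = "PLAN"
    transcript: list[str] = []
    for _ in range(max_steps):
        if state == "DONE":
            break
        response = call_llm(
            f"Goal: {goal}\nState: {state}\n"
            f"Short-term memory: {short_term_memory}\n"
            f"Long-term memory: {long_term_memory}\n"
            "Respond with a plan and the next state (PLAN, ACT, or DONE)."
        )
        transcript.append(response)
        short_term_memory.append(response)
        if len(short_term_memory) > 5:                 # spill oldest items to long-term memory
            long_term_memory.append(short_term_memory.pop(0))
        # The explicitly coded controller only parses the LLM's chosen transition.
        state = response.rsplit("NEXT_STATE:", 1)[-1].strip() if "NEXT_STATE:" in response else "PLAN"
    return transcript

print(run_agent("summarize a document"))
```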

What counts as AGI here? Is it sufficient for it to do all text-based tasks as well as the average human?

Hmm, what if I implement a dovetailer by tweaking the weights of a transformer architecture and clocking it with a loop? Then it implements all programs simultaneously, including AGIs.

@Mira Each sub-program may be an LLM, but I think you'd be hard-pressed to say that the overarching one is. Also, it would be too slow to qualify as an AGI. Same problem faced by the computable variations of AIXI.

@IsaacKing Oh no, I meant a single model, frozen and unchanging during the whole process, which when clocked implements a universal dovetailer. So there would be only one program.
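
For illustration, a toy dovetailer in ordinary code (Python generators stand in for programs; this isn't a claim about how a transformer would encode it): each round admits one new program and gives every admitted program one more step, so all programs make progress "simultaneously".

```python
# Toy illustration of dovetailing; the program enumeration here is trivial
# counters, purely to show the interleaving schedule.
from itertools import count

def enumerate_programs():
    """Stand-in for an enumeration of all programs; here, trivial counters."""
    for n in count():
        def program(n=n):
            step = 0
            while True:
                yield (n, step)   # "program n has executed `step` steps"
                step += 1
        yield program()

def dovetail(rounds: int) -> list[tuple[int, int]]:
    trace = []
    source = enumerate_programs()
    running = []
    for _ in range(rounds):
        running.append(next(source))   # admit one new program each round
        for prog in running:           # give every admitted program one more step
            trace.append(next(prog))
    return trace

print(dovetail(3))  # [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```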

But it would take more than 1000 years to destroy humanity, so your update wouldn't count it...


@Mira Oh, I see. Yeah that's not what I had in mind, so I've edited the description to fix that.

@IsaacKing Also Mira's proposal would not work in the real world, not even after 1000 years. The machinery / memory / whatever would fail long before anything intelligent happened.
