Will LLMs such as GPT-4 be seen as at most just a part of the solution to AGI? (Gary Marcus GPT-4 prediction #7)

2044

91% chance

This market is about prediction #7 from Gary Marcus's predictions for GPT-4. It resolves based on my interpretation of whether that prediction has been met, strongly taking into account arguments from other traders in this market. The full prediction is:

When AGI (artificial general intelligence) comes, large language models like GPT-4 may be seen in hindsight as part of the eventual solution, but only as part of the solution. “Scaling” alone (building bigger and bigger models until they absorb the entire internet) will prove useful, but only to a point. Trustworthy, general artificial intelligence, aligned with human values, will come, when it does, from systems that are more structured, with more built-in knowledge, and will incorporate at least some degree of explicit tools for reasoning and planning, as well as explicit knowledge, that are lacking in systems like GPT. Within a decade, maybe much less, the focus of AI will move from a pure focus on scaling large language models to a focus on integrating them with a wide range of other techniques. In retrospectives written in 2043, intellectual historians will conclude that there was an initial overemphasis on large language models, and a gradual but critical shift of the pendulum back to more structured systems with deeper comprehension.

Unlike the first 6 predictions, this one only resolves once we get AGI, not shortly after GPT-4 releases.

GPT-4 speculation

Gary Marcus GPT-4 predictions

Gary Marcus is a fool who should have been booed out of any public discourse years ago.

“systems that are more structured, with more built-in knowledge, and will incorporate at least some degree of explicit tools for reasoning and planning,” = nonsense

Structural systems will never matter.

That said, multimodal systems, online learning, and explicit memory (all three of which have been added to other transformers) are clearly the future, so this is still YES.

Does this resolve NO if LLMs are not part of the solution for AGI at all?

@MartinRandall No, that would resolve to yes. I've fixed the title to say at most a part of the solution.

@IsaacKing Thanks. Seems like an easy yes then. Even if LLMs take us to super-intelligence, which would be terrifying, there'd need to be some prompting systems to let it perform arbitrary intelligent tasks.

@MartinRandall What do you mean by a prompting system?

predicted YES

@IsaacKing I mean something that turns the problem (e.g., playing optimal chess) into a token prediction problem (e.g., predicting the next chess move in some notation, given a board position and the context that white is being played by a super-intelligence).

@MartinRandall Such a prompting system leading to AGI seems clearly still in the LLM paradigm to me.

predicted YES

@IsaacKing I would expect a prompting system like that to be at least a non-general intelligence: it's a pretty complex task, and humans aren't very good at it yet.

@MartinRandall I must be misunderstanding what you mean. Providing prompts to LLMs is how they're used at all. Getting them to play chess is extremely easy, you just interpret their output as chess notation.

predicted YES

@IsaacKing In the linked example, a general intelligence (Shawn Presser) took GPT-2 and fine-tuned it on a corpus of 2.4M chess games and added some code to remove invalid moves and interpreted the output as chess moves. They were able to mostly solve the problem of playing probable chess, based on the corpus. That also doesn't solve the problem of playing optimal chess - the resulting system is not trying to play the best move, just the most likely move.
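
As a rough illustration of the kind of wrapper described above, here is a minimal sketch. It is not the code from the linked example: `toy_model`, `build_prompt`, and `pick_move` are hypothetical names, and the "model" is a hard-coded stub standing in for a fine-tuned LLM that returns candidate moves ranked from most to least likely.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical stand-in for a fine-tuned language model: given a prompt,
# it returns candidate continuations ordered from most to least likely.
RankFn = Callable[[str], List[str]]


def build_prompt(moves_so_far: Iterable[str]) -> str:
    """Turn the game so far into plain text for the model to continue."""
    return " ".join(moves_so_far)


def pick_move(rank_moves: RankFn, moves_so_far: List[str],
              legal_moves: Iterable[str]) -> Optional[str]:
    """Return the model's most likely candidate that is also a legal move."""
    legal = set(legal_moves)
    for candidate in rank_moves(build_prompt(moves_so_far)):
        if candidate in legal:
            return candidate
    return None  # the model proposed nothing legal


def toy_model(prompt: str) -> List[str]:
    # Stub (assumption): always proposes the same ranking, and its
    # first guess is not a legal move in the position used below.
    return ["Ke9", "e5", "c5"]


move = pick_move(toy_model, ["e4"], legal_moves=["e5", "c5", "Nf6"])
print(move)  # e5
```

Note that the wrapper selects the most probable legal move, not the strongest one, which is the gap between probable chess and optimal chess.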

I'm imagining that if we have a super-intelligent LLM, it won't solve general intelligent tasks unless it is fine-tuned and prompted correctly. The task of fine-tuning and prompting such an LLM will itself be an intelligent task. If we combine the LLM with an intelligent prompt engineer (human or artificial) the resulting system will be an AGI, but the LLM itself won't, because it isn't general by itself.

Possible counter-arguments:

- Humans also won't solve arbitrary intelligent tasks without correct prompts. Motivating humans to do things requires intelligence.
- Maybe it is easier to fine-tune and prompt more intelligent LLMs. In the limit, fine-tuning and prompting could be so simple and automated that it's clear that the general intelligence is in the LLM, not in the system.
- Maybe we can make an intelligent prompt engineer by fine-tuning and prompting an LLM. Then the system is LLM + LLM.

@MartinRandall I think the assumption is that an AGI LLM would not need fine-tuning; it would already be suited to any task by virtue of being a general intelligence. And the prompts would just be normal human language, the same way you'd ask another human for something.

predicted YES

@IsaacKing Take the example of playing optimal chess again, but replace chess with a new board game that was invented after the super-intelligent LLM was trained. An LLM uses some amount of compute (C1) per token generated/predicted. Solving the new board game requires some fixed amount of compute (C2). If C2 > C1, then I can't just prompt the LLM with the board game rules and get optimal play out starting with the first move.

Possible counter-argument: the LLM output would be something like "Oh, this is a complicated game, let's think through this step by step. <several pages of text omitted> Ok, then I think I have a winning strategy. First move: Axe to 4G".
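
The C1/C2 argument and its counter-argument can be put into a small back-of-the-envelope sketch. It rests on two loud assumptions that the comments do not establish: compute per token is constant, and compute spent on earlier "thinking" tokens counts fully toward solving the game. The function names and the numbers are made up for illustration.

```python
def fits_in_one_token(c2: int, c1: int) -> bool:
    """True if the task's compute (c2) fits within one token's compute (c1)."""
    return c2 <= c1


def tokens_needed(c2: int, c1: int) -> int:
    """Minimum number of generated tokens whose combined compute reaches c2.

    Assumes constant compute per token and that all of it is usable.
    """
    if c1 <= 0 or c2 < 0:
        raise ValueError("c1 must be positive and c2 non-negative")
    return max(1, -(-c2 // c1))  # integer ceiling division


c1 = 10**12      # hypothetical compute per token
c2 = 5 * 10**15  # hypothetical compute to solve the new game

print(fits_in_one_token(c2, c1))  # False
print(tokens_needed(c2, c1))      # 5000
```

Under these assumptions the model cannot answer with its first token, but several pages of step-by-step text before the first move would supply enough compute.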

@MartinRandall Is it a safe assumption that the LLM must use the same amount of compute per token regardless of the input?

But regardless, I think I now see the difficulty: the LLM can't learn; it can only react to the information given in the prompt and nothing else. Any new developments in the world would have to be supplied to it in the prompt if the user wants it to take those things into account.

I still don't think that necessarily makes this question degenerate. If it accepts arbitrary prompt lengths, then you could just prepend all of Wikipedia to each prompt, no fancy engineering required.

predicted YES

@IsaacKing That convinces me. Thanks for talking it through. :)

People are also trading

Will GPT-6 be considered to be AGI? (17% chance)
Will LLMs such as GPT4 be considered a solution to Moravec’s paradox by 2030? (20% chance)
Will xAI develop a more capable LLM than GPT-5 before 2026 (68% chance)
Will GPT-5 make Manifold think very near-term AGI is more likely? (9% chance)
Could GPT-4 recursively self-improve to AGI with the right cognitive architecture? [@Altimor, twitter] (9% chance)