GPT-4 #2: Will GPT-4 be at least partially trained with RL?

Resolved YES on Mar 14

The obvious YES resolution is if some form of RLHF (https://arxiv.org/abs/1706.03741) is used, but other forms of RL would also count.

The RL loop must directly affect the weights. If RL is used only as an outer loop, say for architecture search or hyperparameter optimization, that doesn't count.
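To illustrate what "directly affects the weights" could mean, here is a minimal sketch, not anything from OpenAI's training setup. It runs a REINFORCE-style policy-gradient update on a toy two-armed bandit, where the reward signal changes the policy's parameters at every step. The function names, reward rates, learning rate, and step count are all hypothetical choices made for this example.

```python
import math
import random


def softmax(logits):
    """Convert logits to action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(steps=2000, lr=0.1, seed=0):
    """Toy REINFORCE loop: the reward directly updates the policy weights."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]        # the "weights" of the toy policy
    reward_prob = [0.2, 0.8]   # hypothetical reward rate of each arm
    for _ in range(steps):
        probs = softmax(logits)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        # Gradient of log pi(action) w.r.t. the logits is onehot(action) - probs.
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad   # RL signal changes the weights
    return logits


if __name__ == "__main__":
    final = reinforce_bandit()
    print("final logits:", final)
    print("action probabilities:", softmax(final))
```

By contrast, an outer loop that only used rewards to pick a learning rate or an architecture, and then trained the weights with ordinary supervised learning, would fall under the case that doesn't count.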

Nov 25, 11:10pm: ~~Will GPT-4 be at least partially trained with RL?~~ → GPT-4 #2: Will GPT-4 be at least partially trained with RL?

Market context

Technical AI Timelines

GPT-4 speculation

People are also trading

Will OpenAI's autonomous agent be based on GPT-4?

Will LLMs such as GPT-4 be seen as at most just a part of the solution to AGI? (Gary Marcus GPT-4 prediction #7)

Will it be possible to disentangle most of the features learned by a model comparable to GPT-4 this decade?

Was GPT-4 trained in 4 months or less?

Will GPT-4 escape?

Will GPT-5 have "the ability ... to autonomously replicate and acquire resources" per an ARC-like eval?

Could GPT-4 recursively self-improve to AGI with the right cognitive architecture? [@Altimor, twitter]

Resolving YES based on the content of the technical report: https://cdn.openai.com/papers/gpt-4.pdf

Training with human feedback
We incorporated more human feedback, including feedback submitted by ChatGPT users, to improve GPT-4’s behavior. We also worked with over 50 experts for early feedback in domains including AI safety and security.

https://openai.com/product/gpt-4

"GPT-4 will be multimodal":

https://www.heise.de/news/GPT-4-is-coming-next-week-and-it-will-be-multimodal-says-Microsoft-Germany-7540972.html

This may not be reliable information, or it might be poorly phrased, but it makes this market a bit murky, so I've sold my shares.

To be clear, this would break the trend of previous GPT-X releases, which were purely pretrained for next-word prediction.
