MANIFOLD
GPT-4 #2: Will GPT-4 be at least partially trained with RL?
Resolved YES on Mar 14

The obvious YES resolution is if some form of RLHF (https://arxiv.org/abs/1706.03741) is used, but other forms of RL would count.

The RL loop must actually directly affect the weights. RL used only as an outer loop, say for architecture search or hyperparameter optimization, doesn't count.

Nov 25, 11:10pm: Will GPT-4 be at least partially trained with RL? → GPT-4 #2: Will GPT-4 be at least partially trained with RL?

Resolving YES based on the content of the technical report: https://cdn.openai.com/papers/gpt-4.pdf

Training with human feedback

We incorporated more human feedback, including feedback submitted by ChatGPT users, to improve GPT-4’s behavior. We also worked with over 50 experts for early feedback in domains including AI safety and security.

https://openai.com/product/gpt-4

"GPT-4 will be multimodal":

https://www.heise.de/news/GPT-4-is-coming-next-week-and-it-will-be-multimodal-says-Microsoft-Germany-7540972.html

May not be reliable information, or might be poorly phrased, but makes this market a bit murky, so I've sold my shares.

To be clear, this would break the trend of previous GPT-X releases, which were purely pretrained for next-word prediction.