GPT-4 #2: Will GPT-4 be at least partially trained with RL?
Resolved YES on Mar 14

The obvious YES resolution is if some form of RLHF (https://arxiv.org/abs/1706.03741) is used, but other forms of RL would also count.

The RL loop must directly affect the weights. If RL is used only in an outer loop, say for architecture search or hyperparameter optimization, that doesn't count.

Nov 25, 11:10pm: Will GPT-4 be at least partially trained with RL? → GPT-4 #2: Will GPT-4 be at least partially trained with RL?


Resolving YES based on the content of the technical report: https://cdn.openai.com/papers/gpt-4.pdf

Training with human feedback

We incorporated more human feedback, including feedback submitted by ChatGPT users, to improve GPT-4’s behavior. We also worked with over 50 experts for early feedback in domains including AI safety and security.

https://openai.com/product/gpt-4

"GPT-4 will be multimodal":

https://www.heise.de/news/GPT-4-is-coming-next-week-and-it-will-be-multimodal-says-Microsoft-Germany-7540972.html

This may not be reliable information, or it might be poorly phrased, but it makes this market a bit murky, so I've sold my shares.

To be clear, this would break the trend of previous GPT-X releases, which were purely pretrained for next-word prediction.
