Will DPO variants mostly replace RLHF before EOY 2024?
13
710Ṁ511Dec 31
34% chance
Resolves YES if ≥3 out of the 5 top LLMs on Chatbot Arena use (a variant of) DPO on 2024/12/31.
This question is managed and resolved by Manifold.
I think this market fundamentally cannot resolve as stated, because Chatbot Arena includes closed LLMs that do not disclose the details of how they have been finetuned. GPT-4 and Bard might already be using DPO and we wouldn't know about it. To my chagrin, some people (for example Nous Research) categorize DPO as a variant of RLHF, so there's plausible deniability whenever OpenAI or Google refer to their finetuning as "RLHF".
@NoraBelrose that's a good point. If there's uncertainty about closed-source models at the end of the year, I might resolve N/A or push the deadline in the hope that the information will surface later.
Related questions
Will RL work for LLMs "spill over" to the rest of RL by 2026?
40% chance
Will DPO or an Explicitly DPO-based Technique be Used to Train a Public Frontier Lab LLM Before Jan 1 2025?
84% chance
Will the ωB97M-V functional (DFT) be widely regarded as obsolete by EOY 2027?
50% chance
What will Manifolders mostly use LLMs for, by EOY 2025?
Will we get a new LLM paradigm by EOY?
24% chance
Who will die by EOY 2025
Will everyone “Return To Office” (RTO) in 2025?
29% chance
Will Rung 6 density functionals be widespread by EOY 2025?
36% chance
Will a single model have all the upsides o1-style RL with none of the downsides at 2027?
58% chance
Will I still never get COVID though EOY 2025?
59% chance