
Will reinforcement learning overtake LMs on math before 2028?
70% chance
Will a state-of-the-art model on Hendrycks' MATH be trained with more FLOP spent on RL than on LM objectives? A purely RL model also counts.
RL encompasses anything involving online learning, expert iteration, or similar methods. If this ends up being difficult to call because of a breakthrough in decision-transformer-style conditional imitation learning (i.e., something between RL and LMs), I will probably cancel the market as ambiguous.
When models approach 100% accuracy on MATH, a similar successor natural-language math dataset will be used instead.
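As a rough illustration of the core criterion, the FLOP comparison can be sketched as below. This is only a sketch: the function name `resolves_yes` and the FLOP figures in the usage lines are hypothetical, and it does not model the ambiguous-cancellation case or the switch to a successor dataset.

```python
def resolves_yes(rl_flop: float, lm_flop: float) -> bool:
    """Hypothetical helper: True if the state-of-the-art model spent
    more training FLOP on RL than on LM objectives."""
    if rl_flop < 0 or lm_flop < 0:
        raise ValueError("FLOP counts must be non-negative")
    # Strictly more RL compute is required; a tie does not count.
    return rl_flop > lm_flop


# Illustrative, made-up numbers (not measurements of any real model).
print(resolves_yes(rl_flop=3e24, lm_flop=1e24))  # RL-dominated training
print(resolves_yes(rl_flop=1e23, lm_flop=0.0))   # purely RL model
print(resolves_yes(rl_flop=1e22, lm_flop=1e24))  # LM-dominated training
```

Under this reading, a purely RL model trivially satisfies the condition because its LM FLOP is zero.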
This question is managed and resolved by Manifold.
Related questions
Will any AI model achieve > 40% on Frontier Math before 2026?
68% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?
59% chance
Will any language model trained without large number arithmetic be able to generalize to large number arithmetic by 2026?
46% chance
Will aesop be able to replace >50% of mathlib proofs by 2025-11-26?
41% chance
What tactic will prove the most mathlib lemmas at the end of 2026?
Will end-to-end neural networks such as LLMs can beat the best human player in chess by 2028?
66% chance
Will any AI model score >80% on Epoch's Frontier Math Benchmark in 2025?
10% chance
Next year will I think that AI is better than me at math?
67% chance
Will RL work for LLMs "spill over" to the rest of RL by 2026?
34% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2027?
55% chance