Will reinforcement learning overtake LMs on math before 2028?
70% chance

Will a state-of-the-art model on Hendrycks' MATH be trained with more FLOP spent on RL than on LM objectives? A purely RL model counts as well, of course.

RL here encompasses anything involving online learning, expert iteration, or similar methods. If this ends up being difficult to call because of some breakthrough in decision-transformer-style conditional imitation learning (i.e., something between RL and LMs), I will probably cancel the market as ambiguous.

When models approach 100% accuracy on MATH, a similar successor natural-language math dataset will be used instead.
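
As a rough illustration, the resolution rule above can be read as a simple compute comparison. This is only a sketch: the `resolve` function, the `Resolution` enum, and the `hard_to_classify` flag are hypothetical names introduced here, and the FLOP figures in the usage lines are made-up placeholders, not estimates for any real model. It also assumes that equal RL and LM compute does not count as "more".

```python
from enum import Enum


class Resolution(Enum):
    YES = "YES"
    NO = "NO"
    AMBIGUOUS = "N/A"


def resolve(rl_flop: float, lm_flop: float, hard_to_classify: bool = False) -> Resolution:
    """Hypothetical helper mirroring the market description.

    rl_flop: training FLOP spent on RL (online learning, expert iteration, ...)
    lm_flop: training FLOP spent on LM objectives
    hard_to_classify: True if the training method sits between RL and LMs
    """
    if rl_flop < 0 or lm_flop < 0:
        raise ValueError("FLOP counts must be non-negative")
    if hard_to_classify:
        # The description says the market would probably be cancelled here.
        return Resolution.AMBIGUOUS
    # Assumption: "more FLOP on RL" is a strict inequality, so a tie is NO.
    return Resolution.YES if rl_flop > lm_flop else Resolution.NO


# Illustrative placeholder numbers only.
print(resolve(rl_flop=3e24, lm_flop=1e24))  # RL-dominated training
print(resolve(rl_flop=1e23, lm_flop=0.0))   # purely RL model
print(resolve(rl_flop=1e23, lm_flop=5e24))  # LM-dominated training
```

In practice the hard part is attributing FLOP to one category or the other, which the sketch leaves entirely to the caller.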
