Is “reasoning” mostly elicitation?
16% chance

One interesting research programme from 2025 suggests that RL on verifiable rewards (RLVR) doesn't add capabilities to a base model, but instead makes its existing capabilities easier to elicit.

Resolution: at the end of next year (2026), will I put >66% credence on the claim that RLVR is bottlenecked on capabilities learned during pretraining?

My current credence (Dec 2025): 30%

If you want to use a model of me as well as your model of RLVR to answer, here are some of my views.
