Is “reasoning” mostly elicitation?
16% chance
One interesting research programme in 2025 suggests that RL on verifiable rewards (RLVR) doesn't actually add new capabilities to a base model, but instead makes existing capabilities easier to elicit.
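The elicitation claim is often operationalized by comparing pass@k curves: if RLVR only sharpens elicitation, the base model should match or exceed the RL-tuned model at large k. A minimal sketch of the standard unbiased pass@k estimator (numbers below are hypothetical, not from any paper):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n sampled attempts, c of them correct.

    Returns the probability that at least one of k draws (without
    replacement) from the n samples is correct.
    """
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical comparison: a base model with a low per-sample success
# rate can still overtake an RL-tuned model once k is large enough.
base = pass_at_k(n=100, c=5, k=50)    # weak per-sample, many tries
tuned = pass_at_k(n=100, c=30, k=1)   # strong per-sample, one try
```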
Resolution: at the end of next year, will I assign >66% credence to the claim that RLVR is bottlenecked by capabilities learned during pretraining?
My current credence (Dec 2025): 30%
If you want to use a model of me as well as your model of RLVR to answer, here are some of my views.
This question is managed and resolved by Manifold.