By when will Redwood Research publish a paper on sandbagging?
resolved Jun 1
2024-03-01: Resolved NO
2024-04-01: Resolved NO
2024-05-01: Resolved NO
We give up on this project: Resolved NO
2024-10-01: Resolved YES
2024-08-01: Resolved YES
2024-06-01: Resolved YES

We're doing some empirical work on sandbagging, focused mainly on exploration hacking (https://www.lesswrong.com/posts/dBmfb76zx6wjPsBC7/when-can-we-trust-model-evaluations#2__Behavioral_RL_Fine_Tuning_Evaluations).

We're targeting a somewhat shorter project than prior projects.

If Redwood disbands but we finish a project at another organization that is basically continuous with this one, that counts. Anything that is, in my judgement, continuous with our current work counts.
