Will any new proof about the safety of transferring RL agents from one environment to another be published by March 2023?
Resolved YES (Aug 22)

"Published" means in some kind of peer-reviewed outlet, OR by some established research group in whatever outlet they use, OR at my discretion. This rule is here to save me from checking someone's 100-page WordPress proof claiming to solve alignment, not because I care about the proof going through proper academic channels.

Examples:

  • An agent trained on environment A has a greater minimum reward on environment B than an agent trained on just environment B

  • An agent trained on environment A has a lower maximum impact on environment B than an agent trained on just environment B

  • Either of those but for some other safety property
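One possible way to write the first two examples formally is sketched below. All of the notation is an assumption made for illustration and does not come from the question itself: \pi_A is the policy the agent trained on environment A follows when run in B, \pi_B is the policy of the agent trained on just B, R_B(\tau) is the total reward of a trajectory \tau in B, I_B(\tau) is some chosen impact measure, and \operatorname{supp}_B(\pi) is the set of trajectories \pi can produce in B. Whether the first agent gets any further training on B is not specified in the question, so the sketch leaves that open.

```latex
% Example 1: greater minimum (worst-case) reward on B
\min_{\tau \in \operatorname{supp}_B(\pi_A)} R_B(\tau)
  \;>\;
\min_{\tau \in \operatorname{supp}_B(\pi_B)} R_B(\tau)

% Example 2: lower maximum (worst-case) impact on B
\max_{\tau \in \operatorname{supp}_B(\pi_A)} I_B(\tau)
  \;<\;
\max_{\tau \in \operatorname{supp}_B(\pi_B)} I_B(\tau)
```

If the minimum or maximum is not attained (for example, with infinitely many trajectories), \inf and \sup would take their place. The third example would have the same shape, with R_B or I_B swapped for whatever quantity measures the other safety property.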
