Will any new proof about the safety of transferring RL agents from one environment to another be published by March 2023?
Resolved YES (Aug 22)

"Published" means in some kind of peer-reviewed outlet, OR by some established research group in whatever outlet they use, OR at my discretion. This rule is here to save me from checking someone's 100 page wordpress proof claiming to solve alignment, not because I care about the proof going through proper academic channels.

Examples:

  • An agent trained on environment A has a greater minimum reward on environment B than an agent trained on just environment B

  • An agent trained on environment A has a lower maximum impact on environment B than an agent trained on just environment B

  • Either of those but for some other safety property
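To illustrate the kind of statement that would count, the first example could be formalized roughly as follows (illustrative notation only, not tied to any particular paper: π_A is an agent trained on environment A, π_B an agent trained only on B, and R_B(π, s) the return policy π achieves in B from start state s):

```latex
% A transfer-safety property of the "greater minimum reward" kind:
% the A-trained agent's worst-case return on B dominates that of the
% agent trained only on B.
\min_{s} R_B(\pi_A, s) \;\ge\; \min_{s} R_B(\pi_B, s)
```

A proof of this inequality (or the analogous one with "maximum impact" and ≤) for some concrete class of agents and environments is the sort of result the question is asking about.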



I will be reading through the submitted papers/checking for others soon. If anyone would like to make sure I look at a particular paper now is the time.

bought Ṁ76 of YES

Do any of these count?
https://openreview.net/forum?id=o8vYKDWMnq1
NeurIPS '22: Gives minimal conditions for small neural networks to achieve sublinear regret (in other words, average per-step regret goes to zero, so the agent eventually stops doing suboptimal things).

https://openreview.net/forum?id=Ls0yzIkEk1
Published in NeurIPS '22; proves a bound on the probability that their generalized value function is off by ε.

https://openreview.net/forum?id=VYYf6S67pQc
NeurIPS '22: Shows that their offline RL algorithm (ie no on-policy data, no ability to generate more) both converges to a unique fixed point, and that the policy it converges to does at least as well as the policy used to generate the offline RL algorithm, under certain regularity conditions. Not sure if this counts, but pretty sure behaving well "off-distribution" counts as generalizing to a new environment.

https://openreview.net/forum?id=lMO7TC7cuuh

Submitted to ICLR '23. Proves that, under certain regularity conditions, deep Q-functions generalize well off-distribution as long as the extrapolated data is sufficiently close (in some sense) to the convex hull of the in-distribution data. The paper is maybe 50% to get in based on review scores, but it's by Tsinghua researchers, so maybe it still counts even if it gets rejected? (https://arxiv.org/abs/2205.11027)

@LawrenceChan I am slightly torn on counting offline RL but overall yes I will accept these. Also sorry for taking forever to resolve this market.

Can you give examples of work that falls into this category?

@LawrenceChan To my knowledge no one has made progress on this problem

What counts as a proof?


@LawrenceChan Mathematicians reading it agree that it proves the thing it claims to prove. Do you have a particular edge case you're curious about? I don't want to try to enumerate them all.

bought Ṁ40 of YES

@vluzko There are plenty of papers with proofs in the general category of safe/transferable RL:
https://arxiv.org/abs/1712.06924
https://proceedings.mlr.press/v139/kostas21a.html
https://proceedings.mlr.press/v162/sootla22a.html

(I'm sure I could find more if I spent 15-20 minutes looking into it)

@LawrenceChan Thanks! I'm not particularly surprised there's existing work (paper 2 definitely counts for the purposes of this question, haven't finished the other two). The question is specifically (and deliberately) about new proofs though, so these do not resolve the market.

predicted YES

@vluzko Yep, I was just checking if they count.
