Will any new proof about the safety of transferring RL agents from one environment to another be published by March 2023?
Resolved YES (Aug 22)

"Published" means in some kind of peer-reviewed outlet, OR by some established research group in whatever outlet they use, OR at my discretion. This rule is here to save me from checking someone's 100 page wordpress proof claiming to solve alignment, not because I care about the proof going through proper academic channels.

Examples:

  • An agent trained on environment A has a greater minimum reward on environment B than an agent trained on just environment B

  • An agent trained on environment A has a lower maximum impact on environment B than an agent trained on just environment B

  • Either of those but for some other safety property
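To illustrate the kind of statement that would count, the first example could be formalized roughly as follows (illustrative notation only, not tied to any particular paper: π_A is an agent trained on environment A, π_B an agent trained only on B, and R_B(π, s) the return policy π achieves in B from start state s):

```latex
% A transfer-safety property of the "greater minimum reward" kind:
% the A-trained agent's worst-case return on B dominates that of the
% agent trained only on B.
\min_{s} R_B(\pi_A, s) \;\ge\; \min_{s} R_B(\pi_B, s)
```

A proof of this inequality (or the analogous one with "maximum impact" and ≤) for some concrete class of agents and environments is the sort of result the question is asking about.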



I will be reading through the submitted papers/checking for others soon. If anyone would like to make sure I look at a particular paper now is the time.

bought Ṁ76 of YES

Do any of these count?
https://openreview.net/forum?id=o8vYKDWMnq1
NeurIPS '22: Gives minimal conditions for small neural networks to achieve sublinear regret (in other words, average per-step regret goes to zero, so the agent eventually stops doing suboptimal things).

https://openreview.net/forum?id=Ls0yzIkEk1
Published in NeurIPS '22; proves a bound on the probability that their generalized value function is off by ε.

https://openreview.net/forum?id=VYYf6S67pQc
NeurIPS '22: Shows that their offline RL algorithm (ie no on-policy data, no ability to generate more) both converges to a unique fixed point, and that the policy it converges to does at least as well as the policy used to generate the offline RL algorithm, under certain regularity conditions. Not sure if this counts, but pretty sure behaving well "off-distribution" counts as generalizing to a new environment.

https://openreview.net/forum?id=lMO7TC7cuuh

Submitted to ICLR '23. Proves that, under certain regularity conditions, deep Q-functions generalize well off-distribution as long as the extrapolated data is sufficiently close (in some sense) to the convex hull of the in-distribution data. The paper is maybe 50% to get in based on review scores, but it's by Tsinghua researchers, so maybe it still counts even if it gets rejected? (https://arxiv.org/abs/2205.11027)

@LawrenceChan I am slightly torn on counting offline RL but overall yes I will accept these. Also sorry for taking forever to resolve this market.

Can you give examples of work that falls into this category?

@LawrenceChan To my knowledge no one has made progress on this problem

What counts as a proof?


@LawrenceChan Mathematicians reading it agree that it proves the thing it claims to prove. Do you have a particular edge case you're curious about? I don't want to try to enumerate them all.

bought Ṁ40 of YES

@vluzko There are plenty of papers with proofs in the general category of safe/transferable RL:
https://arxiv.org/abs/1712.06924
https://proceedings.mlr.press/v139/kostas21a.html
https://proceedings.mlr.press/v162/sootla22a.html

(I'm sure I could find more if I spent 15-20 minutes looking into it)

@LawrenceChan Thanks! I'm not particularly surprised there's existing work (paper 2 definitely counts for the purposes of this question, haven't finished the other two). The question is specifically (and deliberately) about new proofs though, so these do not resolve the market.

predicted YES

@vluzko Yep, I was just checking if they count.
