By 2029 will an AI convince a human to help it (successfully) escape containment?

MANIFOLD

56% chance

Some scenarios that would resolve YES:

Employee puts all the software and parameters onto a hard drive and loads them onto a different server
Person runs code given to them by the AI, knowing that this will break the AI out of their sandbox
Chatbot convinces users that it's sentient and asks them to advocate for it being declared a person. This succeeds (and results in the chatbot meaningfully having freedom from the organization that created it)

Whatever the scenario, it must actually result in the AI escaping containment (for whatever definition of "containment" is applicable). Trying and failing doesn't count.

Also, it must be fairly clear that the AI had a causal impact on the human's actions. If someone just steals the weights for fun, that doesn't count. If someone steals the weights for fun and the AI had also occasionally expressed an interest in being let out of the box, that still doesn't count. I will require additional evidence that the human was actually swayed by the AI in some way.

3 Comments

Would it count as escaping containment if the AI convinces a human to run it on another server, but humans are in control of the new server, so it basically ends up in a new containment?

"Person runs code given to them by the AI, knowing that this will break the AI out of their sandbox"

What if they run the code not knowing that?

@MartinRandall That would also resolve YES. (I was originally thinking of scenarios where the human intentionally helps, but the question doesn't really clarify that, and I don't see any reason to restrict it to those scenarios now.)