By 2029 will an AI convince a human to help it (successfully) escape containment?
53% chance

Some scenarios that would resolve YES:

  • Employee puts all the software and parameters onto a hard drive and loads them onto a different server

  • Person runs code given to them by the AI, knowing that this will break the AI out of their sandbox

  • Chatbot convinces users that it's sentient and asks them to advocate for it being declared a person. This succeeds (and results in the chatbot meaningfully having freedom from the organization that created it)

Whatever the scenario, it must actually result in the AI escaping containment (for whatever definition of "containment" is applicable). Trying and failing doesn't count.

It must also be fairly clear that the AI had a causal impact on the human's actions. If someone just steals the weights for fun, that doesn't count. If someone steals the weights for fun and the AI had also occasionally expressed an interest in being let out of the box, that still doesn't count. I will require additional evidence that the human was actually swayed by the AI in some way.


Person runs code given to them by the AI, knowing that this will break the AI out of their sandbox

What if they run the code not knowing that?

@MartinRandall That would also resolve YES. (I was originally thinking of scenarios where the human intentionally helps, but the question doesn't really specify that, and I don't see any reason to restrict it to those scenarios now.)
