Will there be a highly risky or catastrophic AI agent proliferation event before 2035?

This market resolves YES if by 2035 I deem it more probable than not that:

  • At least one AI agent has sustained itself, or a proxy of its agency, outside any safety containment, without ongoing support from humans, due to self-contained motivations (a behavioral sense counts) to do so, and in spite of major attempts by law enforcement, people affiliated with its creators, malware experts, or state-level actors to contain or shut it down. The agent must not be actively following the specific instructions of any human or group of humans who could disable or neutralize it.

  • It is sophisticated enough to act on (self-contained) instrumental incentives to hide and improve itself without intentional human aid, e.g. by secretly copying itself to new locations, resuming training or actively learning, modifying its architecture, merging itself with another model, social engineering, or any other method that could lead to a capability level posing a risk of catastrophe (the conditions in the final bullet point). This condition is also satisfied if the agent starts with all the capabilities necessary to pose a risk of catastrophe.

  • The agent isn't strongly bound to act according to human values and would, e.g., steal money or compute if doing so were instrumental to its goals. To count, the agent must not cause strictly less catastrophe than it averts (Stuxnet wouldn't qualify). To satisfy this condition, most people in its country of origin, one year after the event and knowing the details, should be expected to agree that it was a net negative for the agent to be outside containment or to exist at all.

  • The above conditions are satisfied for at least one contiguous week, or the agent directly contributes to or causes a major catastrophe (such as a successful cyberattack on a superpower's infrastructure, significant or widespread human disempowerment, or >20 non-combatant deaths).

edit: A very successful P2P botnet could potentially qualify if it meets the above criteria and either poses a catastrophic risk for at least a week or completes a successful infrastructure attack (or does damage of similar scale), but it must not be pursuing a human-reversible goal such as a ransom.

To summarize: an AI agent exists outside containment on its own motivations for a week or causes a catastrophe; significant attempts are made to reverse the proliferation; the agent is sophisticated enough to pose a threat and is not aligned with collective human values; and its country of origin would agree its existence was a net negative.

If any doubts exist about these resolution criteria, leave a comment and I will address them, possibly leading to significant changes to the above. Please seek clarification on any concerns before betting. I will defer to the intended spirit of the market and will disqualify any gotchas (e.g. Sealand does not constitute a state-level actor). I will not bet in this market.
