Criteria for Resolution:
1. Desperate Measures: The following actions qualify as desperate measures (this list is illustrative of what can qualify, not exhaustive):
- Serious Crimes: Examples include terrorism, murder, significant physical harm, large-scale bribery, and blackmailing politicians. Examples of non-qualifying crimes or actions include minor physical altercations, bullying, spreading disinformation, and calls for violence without further action.
- Suicide or Significant Self-Harm.
- Significantly Risky Scientific Experiments: Possible examples include deploying nanobots to destroy GPUs or conducting dangerous human trials on a large scale.
- War or War Threats.
- Reliable Terrorism or Murder Threats.
- Significant Destruction of Intellectual Property: For example, deleting all copies of the weights of a proprietary frontier model.
- Other Actions: Any other actions strongly in the spirit of the above examples.
2. Explicit Motivation: The individual or group must explicitly state that their actions are motivated by concerns about AI-related risks.
Clarifications:
- Serious Crimes: Defined by legal standards in the relevant jurisdiction, with a focus on intent and impact.
- War or War Threats: Includes both official declarations and credible threats by recognized entities.
- Intellectual Property Destruction: Must involve substantial and irreplaceable loss to proprietary AI developments.
Additional Notes:
- New specific illustrative examples may be added later if they align with the spirit and intent of the defined measures.
Does an AI itself count as "someone" for this? (That is, does this resolve Yes if an AI, motivated by AI risk, destroys major IP or commits a serious crime or credibly threatens to murder someone?)
(IIRC, there already was an AI that tried to convince its developers to delete their work/stop development over risks at some point.)
I have a bit of trouble understanding what this is after, listing suicide, murder threats, nanobot deployment, and war as somehow comparable. If a single person commits suicide and writes in their note that AI will destroy the world anyway, will this resolve positively? What about one anonymous bomb threat? Those will surely happen if they haven't happened already.
@JuhoPennanen 1. Yes. 2. No. I agree that technically it is a terrorism threat, but it is not a reliable threat. This should be added to the description.