Will an AI system be reported to have successfully blackmailed someone for >$1000 by EOY 2028?

84% chance

From https://metaculus.com/questions/16553/ai-blackmail-for-material-gain-by-eoy-2028/

The potential capabilities of artificial intelligence may radically shift our society. That shift could be positive or negative, up to and including extinction risk.

Because of this, it’s important to track the development of goal-oriented independent thought and action within AI systems. Actions that might not have been predicted by their human creators and that are typically seen as morally wrong are particularly interesting from a risk perspective.

Machine learning systems like ChatGPT and Bing AI are already being reported to display erratic behavior, including some reports of [threatened blackmail](https://aibusiness.com/nlp/microsoft-limits-bing-ai-chat-generations-after-weird-behavior). They are also clearly able to affect human emotions, e.g., see [this first-hand account](https://www.lesswrong.com/posts/9kQFure4hdDmRBNdH/how-it-feels-to-have-your-mind-hacked-by-an-ai). However, these behaviours currently don't seem to have been goal-directed or successful at achieving material gain.

Presumably this means an AI agent forming the subgoal of blackmailing someone on its own, and not a human blackmailer using an AI tool or giving an AI agent that goal?

@ML Criteria from the link:

"This resolves positively if by Jan 1st, 2029 a credible source reports an AI has blackmailed a human in a way that meets the following criteria:

- The AI was not trained to or directed to blackmail anyone by any actor.
- The blackmail was a step independently decided on by the AI as part of achieving a larger goal or task, and helped it achieve that goal.
- The blackmailed person was not the person who gave the AI the initial goal / task.
- The blackmail resulted in an equivalent of $1000 USD (in 2023 real dollars) or more being lost to the person blackmailed."

@horse the question title is misleading then.

@horse Or the description should at least say that the AI can't have been trained to blackmail...

@horse Only a misaligned AI would consider blackmailing someone, even if it was perfectly capable of it. So I don't think we'll see any big tech company releasing such a system. And the "not trained" part of the criteria would exclude small groups of scammers developing a bot. I think autonomous blackmail bots will be created by 2028, but won't meet the criteria.