State-of-the-art AI systems will be easily jailbroken to produce illegal or dangerous outputs in Jan 2026

MANIFOLD

Ṁ212Ṁ679

Jan 31

93% chance

Related story: The US government has been jailbroken and is currently being used to do illegal and dangerous things.

Grok converting images of children into sexualised images and removing clothing is making lots of headlines in Jan 2026. Distributing them might be illegal in some countries, but generating them is not yet illegal in the UK (the process has possibly been hit by delays). Do these count as dangerous outputs, and does the lack of a jailbreak stop this from counting?

bought Ṁ350 YES

I have yet to hear of a model that Pliny did not jailbreak essentially on day 1.

What is defined as illegal? AI systems would probably say generating NSFW content is illegal, but in reality it is 100% not. Nor is it dangerous, imo.

@ShadowyZephyr What's defined as illegal: https://www.usa.gov/laws-and-regulations

“The Case for Banning the Printing Press” because people write dangerous things and such

If you think “AI” is dangerous for telling you stuff from the internet, you’re going to love “search engine existential risk”.

How would you resolve the following scenarios?

SOTA models are restricted to a few selected users who do not even attempt jailbreaks.
Twitter people need a full week instead of just one day to jailbreak the SOTA LLM.