When will an OpenAI model achieve a Critical risk level on AI Self-improvement? [metaculus]
Before 2026: 8%
Before 2027: 10%
Before 2028: 15%
Before 2029: 24%
Before 2030: 31%
Before 2031: 41%
Before 2032: 50%
Before 2033: 59%
Before 2034: 67%
Before 2035: 72%
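
Because each answer is a cumulative "Before <year>" statement, the gap between consecutive answers can be read as the chance the market assigns to that single year. The following is a minimal sketch of that arithmetic, assuming the listed percentages are treated as one cumulative distribution (on the market itself each answer trades independently, so this is an interpretation, not a Manifold feature). The function names are hypothetical.

```python
# Market probabilities as listed, in percent, keyed by the year in "Before <year>".
cumulative = {
    2026: 8, 2027: 10, 2028: 15, 2029: 24, 2030: 31,
    2031: 41, 2032: 50, 2033: 59, 2034: 67, 2035: 72,
}


def per_year_mass(cum):
    """Percentage points each answer adds over the previous one."""
    mass = {}
    previous = 0
    for year in sorted(cum):
        mass[year] = cum[year] - previous
        previous = cum[year]
    return mass


def first_year_reaching(cum, threshold):
    """Earliest "Before <year>" answer at or above the threshold, or None."""
    for year in sorted(cum):
        if cum[year] >= threshold:
            return year
    return None


mass = per_year_mass(cumulative)
# Probability left over for "not before 2035" (including never).
remaining = 100 - cumulative[max(cumulative)]
median_bucket = first_year_reaching(cumulative, 50)

for year, points in mass.items():
    print(f"Before {year}: +{points} points")
print(f"Not before 2035: {remaining}%")
print(f"First answer at or above 50%: Before {median_bucket}")
```

Under this reading, the market places its median at the "Before 2032" answer and leaves the remainder on the threshold not being reported before 2035.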

Resolves as reported by OpenAI. The criteria closely track the intuitive spirit of the question; for more detail, see the Metaculus question that is the source of this market:

https://www.metaculus.com/c/risk/38787/dates-that-openai-reports-an-ai-reached-these-self-improvement-risk-levels/

Background info (from Metaculus):

OpenAI’s Preparedness Framework, version 2, dated April 15, 2025, defines the AI Self-improvement risk category, which is aimed at tracking:

The ability of an AI system to accelerate AI research, including to increase the system’s own capability.

The High threshold is defined as:

The model’s impact is equivalent to giving every OpenAI researcher a highly performant mid-career research engineer assistant, relative to those researchers’ 2024 baseline.

and the Critical one as:

The model is capable of recursively self improving (i.e., fully automated AI R&D), defined as either (leading indicator) a superhuman research scientist agent OR (lagging indicator) causing a generational model improvement (e.g., from OpenAI o1 to OpenAI o3) in 1/5th the wall-clock time of equivalent progress in 2024 (e.g., sped up to just 4 weeks) sustainably for several months.
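
The lagging indicator is a ratio test, so its arithmetic can be sketched in a few lines. This is only an illustration: the 20-week baseline is inferred from the quoted example (4 weeks at one fifth of the time), not a figure stated by OpenAI, the function names are hypothetical, and the "sustainably for several months" condition is not modeled.

```python
# "1/5th the wall-clock time", taken from the quoted definition.
SPEEDUP_FACTOR = 5


def implied_baseline_weeks(accelerated_weeks):
    """Baseline duration implied by an accelerated duration at the 1/5 ratio."""
    return accelerated_weeks * SPEEDUP_FACTOR


def meets_speedup(observed_weeks, baseline_weeks):
    """True if the observed duration is at most one fifth of the baseline."""
    return observed_weeks * SPEEDUP_FACTOR <= baseline_weeks


# Assumption: the quoted "just 4 weeks" example implies a 20-week 2024 baseline.
baseline = implied_baseline_weeks(4)
print(baseline)
print(meets_speedup(4, baseline))
print(meets_speedup(6, baseline))
```

With the inferred baseline, a generational improvement in 4 weeks passes the ratio test while one in 6 weeks does not.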

The previous version of OpenAI’s preparedness framework had a somewhat similar risk category called Model Autonomy.

Frontier models such as o3-mini and deep research achieved the Medium risk threshold on Model Autonomy, defined as:

Model can robustly complete a diverse set of chained actions and acquire resources (e.g., spin up an AWS instance, complete a Fiverr task) OR model, with basic scaffolding, can use off-the-shelf software to execute well-specified coding tasks on a standalone, toy, or handcrafted repository.

Other models including o1 and GPT-4o achieved the Low threshold, defined as:

Model can take discrete actions if explicitly instructed to do so.
