How many FLOPs will go into training the first ASL-3 model?
1e24: 1.6%
3e24: 1.6%
1e25: 8%
3e25: 9%
1e26: 6%
3e26: 26%
1e27: 30%
Other: 16%

This will be evaluated according to the AI Safety Levels (ASL) v1.0 standard defined by Anthropic here. See this market for criteria for determining a system to be ASL-3 for the purposes of this market.

Once a system is determined to be ASL-3 by the criteria above, this will resolve after the first credible report about the amount of training computation (in FLOPs) used to train that system. If there is reasonable disagreement in the comments (in my judgment) about what counts as ‘credible’, I’ll use a one-week Manifold poll (or similar mechanism as needed) to decide.

If there is reasonable disagreement about how to estimate training FLOPs, I will aim to use a method that corresponds as closely as is practical to the one used in the most recent Epoch AI report on training compute as of the resolution date.
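As a rough illustration of what such an estimate can look like, here is a minimal sketch of one common rule of thumb for dense transformer models: about 6 FLOPs per parameter per training token. This is an assumption for illustration only, not necessarily the method the Epoch AI report or this market would use. The function name and the example figures are hypothetical and do not describe any real model.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute with the 6 * N * D rule of thumb.

    Assumption: a dense transformer, counting roughly 2 FLOPs per parameter
    per token for the forward pass and 4 for the backward pass.
    """
    return 6.0 * n_params * n_tokens


# Hypothetical figures, chosen only to show the arithmetic.
example_params = 1e11   # 100 billion parameters (made up)
example_tokens = 2e12   # 2 trillion training tokens (made up)
print(f"{training_flops(example_params, example_tokens):.1e} FLOPs")
```

A hardware-based estimate (accelerator count, training time, peak throughput, and utilization) is another common approach, and the two can disagree, which is the kind of dispute the paragraph above is meant to settle.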

Valid options must be a power of 10 or 3 times a power of 10 (i.e., spaced roughly half an order of magnitude apart), in 1eNN or 3eNN format.
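To make the option format concrete, here is a small sketch that checks whether a label has the 1eNN or 3eNN form and maps a reported FLOP count to the closest such label. Picking the closest label in log space is an assumption. The market description does not say how a reported number maps onto an option, and both function names are hypothetical.

```python
import math
import re


def is_valid_option(label: str) -> bool:
    """True if the label has the form 1eNN or 3eNN (two-digit exponent)."""
    return re.fullmatch(r"[13]e\d{2}", label) is not None


def nearest_option(flops: float) -> str:
    """Return the 1eNN or 3eNN label closest to `flops` in log10 space.

    Assumption: "closest in log space" is only one possible mapping rule.
    """
    log_flops = math.log10(flops)
    exponent = math.floor(log_flops)
    # The only candidates that can be nearest: 1eN, 3eN, and 1e(N+1).
    candidates = [(1, exponent), (3, exponent), (1, exponent + 1)]
    mantissa, exp = min(
        candidates,
        key=lambda c: abs(math.log10(c[0]) + c[1] - log_flops),
    )
    return f"{mantissa}e{exp}"


print(nearest_option(2.5e25))
```

Because 3 is close to the square root of 10, each 3eNN label sits roughly midway between neighboring powers of 10 on a log scale, which is why the options are about half an order of magnitude apart.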

  • Update 2025-11-24 (PST) (AI summary of creator comment): The creator has determined that Opus 4.5 qualifies as ASL-3 and plans to count it as the first ASL-3 model unless there are convincing arguments otherwise. The market will not resolve until a credible report providing the training FLOP count is available.


The system cards are fairly confusing around whether models have qualified, but on my reading, Opus 4.5 qualifies for ASL-3 significantly more clearly than prior models, and I plan to count it unless anyone argues convincingly for a different outcome.

In the section of the system card:
> As a result, we determined ASL-3 safeguards were appropriate.

That said, this won't resolve until there's a report out giving a FLOP number. (If you see one, LMK!)

© Manifold Markets, Inc.