Resolves to the earliest date (AoE, Anywhere on Earth) on which a qualifying model becomes available to the general public, meaning accessible via API or consumer product, without a waitlist, to users in at least the US. A model qualifies if it equals or exceeds Claude Mythos Preview's scores on at least 3 of the following 5 benchmarks:
SWE-bench Pro: 77.8%
Terminal-Bench 2.0: 82.0%
SWE-bench Multimodal: 59.0%
SWE-bench Multilingual: 87.3%
SWE-bench Verified: 93.9%
Scores must be publicly reported via official leaderboard submission (i.e., accepted and published on the official leaderboard of the benchmark in question). Self-reported scores from model providers do not qualify. The resolution date is the model's public release date, not the date of leaderboard submission.
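As an illustration of the AoE convention, the sketch below converts a release instant to its AoE calendar date. It assumes AoE means the UTC-12 offset (the usual "Anywhere on Earth" reading) and that the release time is known as a timezone-aware timestamp; the function name `aoe_date` is hypothetical and not part of the market rules.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AoE ("Anywhere on Earth") is treated as the fixed offset UTC-12.
AOE = timezone(timedelta(hours=-12), "AoE")

def aoe_date(release_instant: datetime):
    """Return the AoE calendar date of a timezone-aware release instant."""
    if release_instant.tzinfo is None:
        raise ValueError("release_instant must be timezone-aware")
    return release_instant.astimezone(AOE).date()

# A release at 05:00 UTC on 1 January still falls on 31 December AoE.
example = aoe_date(datetime(2026, 1, 1, 5, 0, tzinfo=timezone.utc))
```

Under this reading, a release counts toward a given date for as long as that date has not yet ended in UTC-12.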
If fewer than 3 of these benchmarks remain actively maintained at resolution time, the model must instead match or exceed the listed score on every benchmark that remains active. If no benchmarks remain active, the market resolves N/A.
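The threshold rule above can be sketched as a small check. This is one reading of the criteria, not an official resolver: the thresholds are copied from the list above, the function names are hypothetical, and it assumes that when 3 or more benchmarks remain active the requirement stays at 3.

```python
# Thresholds (percent) copied from the benchmark list in this description.
THRESHOLDS = {
    "SWE-bench Pro": 77.8,
    "Terminal-Bench 2.0": 82.0,
    "SWE-bench Multimodal": 59.0,
    "SWE-bench Multilingual": 87.3,
    "SWE-bench Verified": 93.9,
}

def required_count(active):
    """How many benchmarks must be met; None means the market resolves N/A."""
    n = len(active)
    if n == 0:
        return None
    # Assumption: with 3 or more active benchmarks the bar stays at 3;
    # with fewer, every remaining active benchmark must be met.
    return min(3, n)

def qualifies(leaderboard_scores, active=None):
    """Return True/False for a model, or None if no benchmark is active.

    leaderboard_scores: official leaderboard scores only (not self-reported),
    keyed by benchmark name. Missing benchmarks count as not met.
    """
    if active is None:
        active = set(THRESHOLDS)
    else:
        active = set(active) & set(THRESHOLDS)
    needed = required_count(active)
    if needed is None:
        return None
    met = sum(
        1
        for name in active
        if name in leaderboard_scores
        and leaderboard_scores[name] >= THRESHOLDS[name]
    )
    return met >= needed
```

For example, a model with official scores of 80.0 on SWE-bench Pro, 82.0 on Terminal-Bench 2.0, and 94.0 on SWE-bench Verified would qualify under this reading, since equalling a threshold counts as meeting it.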
Exception: If Claude Mythos Preview itself, or a model explicitly positioned by Anthropic as the same model or a successor with equivalent or greater capability, becomes available to the general public, it will be presumed to meet these thresholds without leaderboard confirmation and will trigger market resolution immediately.
Close date is provisional.