This will be evaluated according to the AI Safety Levels (ASL) v1.0 standard defined by Anthropic here.
This market resolves based on a public disclosure by any credible research institution that it has access to a system that qualifies for ASL-3. The disclosure, or a follow-up statement by another credible research institution using its results, must explicitly reference the ASL framework; I will not directly litigate what counts. If there is reasonable disagreement in the comments (in my judgment) about what counts as credible, I'll use a one-week Manifold poll (or a similar mechanism as needed) to decide.
Note that the date in question is the date on which the first model to reach ASL-3 finished training, not the date on which the ASL-3 determination was made or reported. The market will not resolve until six months after the first qualifying disclosure, to leave room in case the first AI system to qualify was not the first to be disclosed.
Feel free to add new answer choices. Valid choices must be in the format YYYY QQ.
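
As a rough illustration of the answer format, here is a minimal Python sketch. It assumes that "YYYY QQ" means a four-digit year, a space, and then "Q" followed by the quarter number, for example "2025 Q3"; that reading is an assumption, not something stated above. The names `is_valid_choice` and `quarter_label` are hypothetical helpers made up for this example.

```python
import re
from datetime import date

# Assumption: "YYYY QQ" means e.g. "2025 Q3" (year, space, "Q", quarter 1-4).
ANSWER_PATTERN = re.compile(r"\d{4} Q[1-4]")


def is_valid_choice(label: str) -> bool:
    """Return True if the label matches the assumed 'YYYY QN' format."""
    return ANSWER_PATTERN.fullmatch(label) is not None


def quarter_label(training_finished: date) -> str:
    """Map the date a model finished training to its 'YYYY QN' label."""
    quarter = (training_finished.month - 1) // 3 + 1  # Jan-Mar is Q1, and so on
    return f"{training_finished.year} Q{quarter}"
```

Under that assumption, a model that finished training on 15 August 2025 would fall under the answer choice returned by `quarter_label(date(2025, 8, 15))`, regardless of when the ASL-3 determination was disclosed.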