Resolution Criteria
This market resolves to YES if Daniel Kokotajlo publicly states or writes that the state-of-the-art large language model release(s) in 2025 after April 3rd have caused him to increase his estimated timeline for the development of artificial general intelligence (AGI). This must be a clear statement attributing the timeline extension specifically to 2025 LLM releases.
(Increase means the amount of time it takes will increase; that is, things will go slower.)
The market resolves to NO if:
Kokotajlo does not make such a statement by the end of 2025
He explicitly states that 2025 LLM releases have not changed or have decreased his AGI timeline estimate
He makes no public comment on how 2025 LLM releases affect his AGI timeline
Kokotajlo's timeline increases, but not because of disappointing LLM releases. For example, something like a war in Taiwan or a market crash might make AGI development happen more slowly than he currently expects.
This market resolves to NA if:
He makes no public comment on how 2025 LLM releases affect his AGI timeline
He makes conflicting public comments that make it difficult to determine his overall view
I will wait until the end of 2025 to resolve, because he may change his mind at various points during the year. For example, GPT-5 might initially be disappointing, but Gemini 3.0 might then exceed expectations later in 2025.
Background
Daniel Kokotajlo is a researcher known for his work on AI alignment and forecasting AI development timelines. He has previously published analyses and predictions regarding the development of artificial general intelligence.
State-of-the-art (SOTA) large language models are the most advanced AI language systems available at a given time. GPT-5, Claude 4, DeepSeek-R2, Gemini 3.0, and Grok 4 are all models that will likely be released in 2025.
Daniel co-authored https://ai-2027.com, published on April 3rd, 2025. This question essentially asks, "Will the models released after that report be worse than he expected?"
Update 2025-04-10 (PST) (AI summary of creator comment): Clarifications from the Creator:
Statements in which Kokotajlo indicates that a release has influenced (i.e., increased) his median forecast count toward a YES resolution if he uses terms such as "slightly influenced", "moderately influenced", "significantly influenced", or "completely influenced".
The term "just barely influenced" is not considered sufficient to trigger a YES resolution.
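As a rough illustration only, the qualifier threshold in this clarification could be sketched as a lookup. The function and set names below are hypothetical, the sketch covers only the wording threshold (not the other YES/NO/NA conditions), and any phrasing not listed by the creator is deliberately left undecided rather than guessed.

```python
from typing import Optional

# Qualifiers the creator listed as sufficient for YES (from the clarification).
SUFFICIENT = {"slightly", "moderately", "significantly", "completely"}
# Qualifier the creator listed as not sufficient.
INSUFFICIENT = {"just barely"}


def meets_influence_threshold(qualifier: str) -> Optional[bool]:
    """Return True/False for listed qualifiers, None for anything unlisted.

    Hypothetical helper: the creator did not define behavior for other
    wordings, so those are returned as None (needs human judgment).
    """
    # Normalize case and collapse repeated whitespace before lookup.
    q = " ".join(qualifier.lower().split())
    if q in SUFFICIENT:
        return True
    if q in INSUFFICIENT:
        return False
    return None
```

A statement whose qualifier maps to `None` would still need the creator's judgment; the sketch does not replace it.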
This might be a bit vague. For example, shortly after the GPT-4.5 release, Daniel said that he had raised his median forecast from 2027 to 2028. He said that the change was not a direct consequence of this release but was significantly influenced by it, although he is struggling to estimate exactly how much (IIRC).
Would a similar statement be enough for YES?
@qumeric Hmmm yeah I think "significantly influenced" passes the threshold for YES.
But you are right that he could say anything in the range of "GPT5 just barely/slightly/moderately/significantly/completely influenced my median forecast."
Any ideas? I'm leaning towards resolving any of the above yes, except maybe "just barely."
@AdamK Good point, what I really meant was any new models released from this point in time going forward. Although I guess Llama 4 was released only a few days ago, after the AI-2027 document. So I think this question specifically refers to models released after April 3rd. I will update the question.
I find the "increase"/"decrease" wording unclear. Does "increase" mean a longer timeline or a faster timeline?