Will AI be Recursively Self Improving by mid 2026?
2027 · 24% chance

Ben Pace and Nathan Helm-Burger recently made a bet about whether AI systems will have begun to meaningfully improve themselves by August 23rd, 2026.

https://www.lesswrong.com/posts/btzdPwkmmcZ2KAQwm/debate-is-it-ethical-to-work-at-ai-capabilities-companies?commentId=MSahjmFwWRyeJNuqo

Nathan: "I believe we are already in a substantial hardware and data overhang, and that within the next 24 months the threshold of capability of LLM agents will suffice to begin recursive self-improvement. This means it is likely that a leader in AI at that time will come into possession of strongly super-human AI (if they choose to engage their LLM agents in RSI)"

Ben: "I don't expect that a cutting edge model will start to train other models of a greater capability level than itself, or make direct edits to its own weights that have large effects (e.g. 20%+ improvements on a broad swath of tasks) on its performance, and the best model in the world will not be one whose training was primarily led by another model. "

This market resolves YES if Nathan wins the bet, and NO if Ben wins. The resolution date is one year later than the cutoff so that information about whether meaningful self-improvement started prior to August 2026 can be taken into account.


If this article happens to be true, does it resolve YES? https://www.theinformation.com/articles/openai-shows-strawberry-ai-to-the-feds-and-uses-it-to-develop-orion

This is using a SOTA model, augmented with something to advance reasoning, to improve a frontier model by providing its training data. Strawberry is probably MCTS and RL on top of a base LLM.

To me this is "self improvement" in that it's an automated process: it's bootstrapping by using one cutting-edge AI model to improve another. It is also recursive in that GPT-5 can be augmented with RL and MCTS to produce the training data for GPT-6.
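A toy sketch of this distill-from-search pattern (purely illustrative; nothing here reflects what's actually known about strawberry's internals): an expensive search routine stands in for the smart-but-slow solver, and the "student" literally just memorizes its outputs:

```python
import itertools

def expensive_search(target):
    """Stand-in for a slow, smart solver (think MCTS + RL on a base model):
    brute-force a pair (a, b) with a + b == target and a * b maximized."""
    best = None
    for a, b in itertools.product(range(target + 1), repeat=2):
        if a + b == target and (best is None or a * b > best[0] * best[1]):
            best = (a, b)
    return best

# "Training data" generated by the slow searcher.
dataset = {t: expensive_search(t) for t in range(20)}

# The "student" compresses the searcher's outputs; here compression is
# literally a dict lookup, which is cheap and fast at inference time.
def student(target):
    return dataset[target]

print(student(10))  # → (5, 5)
```

The point of the sketch is the division of labor: all the intelligence lives in the expensive search, and the cheap model only has to reproduce its answers.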

Others might disagree in that it's not one monolithic AI, it's several components, and it's not 100 percent automated. It would probably take hundreds of thousands of people to do this work manually, and instead it requires fewer than 1,000, but that's not full automation. People are ordering each run, checking results, fixing bugs, and trying again, for thousands of iterations over months. AI is likely even helping with this, but the most talented engineers in the world are driving the effort.

So, just to be clear, this market resolves based on the determination of the judge(s) picked by Ben and Nathan, and so my opinion as market creator isn't that relevant.

That said, I think this is not sufficient by itself. As discussed earlier, my opinion is that generating training data isn't usually what's meant by RSI, at least at the vibes level. What's more centrally meant is algorithmic (or direct) improvement, such that version n is designing version n+1. Still, if on the deadline models were vastly more powerful because of automated training-data production, I'd probably suggest it should resolve YES anyway, because that's in line with the vision of the future Nathan presents.

Note: the algorithm described is one where strawberry/Q* is a self-improving algorithm that can solve reasoning questions. It likely works similarly to https://www.multion.ai/blog/introducing-agent-q-research-breakthrough-for-the-next-generation-of-ai-agents-with-planning-and-self-healing-capabilities

However, the architecture was chosen entirely by humans. GPT-5 is then memorizing/compressing the correct answers to very large numbers of outputs.

So yes, it's not merely generating "training data": the smart-but-slow AI is strawberry, and GPT-5's network is being used to compress the techniques learned by strawberry into something cheaper and faster to run. It is doing exactly what you are saying it isn't, Max.

If this works, everyone will be doing it, and yes, we will see large advances over a short time period, probably to human level or above across most questions that are easy to check for correctness.

Possibly robotic problems are solvable with the same approach: bring in a simulator, use RL and MCTS to find a solution to each simulated task, then memorize the answer. If you go back and check, the Gato paper is essentially this.

My apologies for misunderstanding. The article you linked was paywalled and all of the strawberry stuff I've seen has basically been speculation. If you're right, I expect this to resolve YES.

It's a good discussion, Max, and I have my doubts this is RSI myself. Humans are picking the architecture ("let's make the numbers defining it bigger"), doing countless debugging steps the current tools are too stupid to help with, building the data centers, etc.

Imagine we had written some Manifold bets on early aircraft in 1895 and were now arguing about whether the Wright brothers' aircraft flies like a bird does.

@GeraldMonroe It is definitely in a murky grey zone, but my intuition is that synthetic data generation alone doesn't quite count. I explicitly said in the bet that murky grey-zone situations should resolve NO. The scenario I have in mind for a YES resolution is something more like the AI being used as part of an 'AI scientist' system which pursues algorithmic improvements in a highly parallel search process involving formulating hypotheses and running experiments. See Bogdan's comment:

https://manifold.markets/MaxHarms/will-ai-be-recursively-self-improvi#qugx17dviw8
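A minimal sketch of what such a hypothesize/experiment/select loop could look like, with stand-ins for everything (the `run_experiment` objective and the random-perturbation "hypotheses" are placeholders for real training runs and LLM-generated proposals):

```python
import random

def run_experiment(hypothesis):
    """Stand-in for an actual training run: score a candidate 'algorithm'.
    Here the hypothesis is just a scalar knob and the score is a made-up
    objective peaking at 0.3."""
    return -(hypothesis - 0.3) ** 2

def ai_scientist(generations=10, population=32, seed=0):
    """Hypothesize -> experiment -> select loop. In the scenario discussed,
    an LLM would formulate the hypotheses and the experiments would run in
    parallel; here hypotheses are random perturbations of the current best."""
    rng = random.Random(seed)
    best = rng.random()
    for _ in range(generations):
        # Formulate a batch of hypotheses (would be LLM proposals).
        candidates = [best + rng.gauss(0, 0.1) for _ in range(population)]
        # Run the experiments (would be parallel training jobs) and select.
        best = max([best] + candidates, key=run_experiment)
    return best

print(round(ai_scientist(), 2))  # converges near the optimum, 0.3
```

Even this crude random search illustrates the structure in dispute: the loop does the searching, while humans still chose the objective, the search operator, and the budget.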

@NathanHelmBurger So even if humans are then cherry-picking the best results of the experiments, and human engineers assisted by Cursor etc. implement new AI models based on the new algorithms, and human engineers then keep the enormous GPU clusters running, training the new models, fixing a steady stream of intermittent failures and maintaining the generators.

Finally, after training, human engineers test the improved models, send access keys to NIST, the government has human experts inspect it, and after a human emails approval at several levels, a human IT engineer launches a script, written with AI help, to give a subset of public human users access to the new model.

Closing the loop, a human makes a pull request on the AI scientist repo to have the AI system use the newly improved base model.

This is recursive self-improvement by your current understanding, and for the purposes of this bet?

Just checking. It seems to be missing the "self" part of it. The "AI scientist" as a script around a base model isn't really a player in itself. It's not doing research because it wants to be smarter; it's doing it because the JSON file of next tasks instructs it to, and because we did a bunch of math to find a set of functions that will do what we (humans) want from the AI scientist.

@GeraldMonroe Thanks for the clarifying question. Yes, that counts in my view. If the clear majority of the intellectual work is being done by AI, the human input/guidance doesn't have to be zero or negligible to count. If the work being done is something like 75% AI, 25% human, then that seems to me like it would qualify. So this bet is about recursive-self-improvement that is still human-in-the-loop, not all the way to human-out-of-the-loop fully independent improvement.

Of course, I must reiterate that I'm not the judge, and I'm only one of the two parties to the bet. I'm just sharing my understanding and intuitions. Ultimately it won't be on me to make the call.

@NathanHelmBurger If x.ai's employees are really a tiny crew and they all use Cursor or similar tools, would that count? Assuming it speeds up the easier part of the "intellectual work": turning ideas into code, turning results into a human-interpretable form, reviewing work for mistakes.

I know culturally LessWrong believes the only intellectual work that matters is the tiny bit of human inspiration that is pivotal, for example when Edward Teller and Stanislaw Ulam famously whiteboarded the radiation-lens concept.

But what about all the other engineers who would have worked for months to design every detail of the actual device? Does their work count as intellectual? What saves you more money: automating the bulk of the work, or the brilliance of the top-of-the-pyramid staff?

Smarter tools have been leading to smarter tools at least since the first transistors.
AGI won't fundamentally change this exponential curve. Looking 10 years into the future, the change is going to be exponentially bigger than what we've seen over the past ten years, but we've been experiencing that for the past decades. From moment to moment it always seems slow except for a few breakthroughs here or there. But when you look back 20 years...
I think the reason why there won't be a foom scenario is simply that there are physical restrictions on how much power you can dedicate to an AI system and how quickly you can scale that up.
Yes, there might be small bursts, e.g. when we or an A(G)I figures out how to wire up all devices on earth into one giant brain, or when robotics reaches the qualitative leap of being able to autonomously add new power sources (with the help of an AGI).
But if you zoom out, we will still sit on that same exact exponential.

Back to the original question: Bearish on direct weight changes and "just LLMs" being enough for RSI, but bullish on "a cutting edge model will start to train other models of a greater capability level than itself" and "the best model in the world will be one whose training was primarily led by another model".
I think 2026 is quite tight though... I would put a 90% probability on this happening by 2035, 10% by August 2026, but I'm voting YES because I want to believe...

Isn't expecting AI to improve itself essentially the same as reading 200 pieces of homework from third-graders and expecting to somehow be able to output sixth-grade homework as a result?

The idea here is that we might, for instance, create an AI model which is capable of understanding AI model weights, then such a model could generate updates to its own weights that might make itself smarter.

Or we could make a model that is taught to act as an AI researcher, then it might be able to find a way to build a smarter AI researcher and so on.

So the idea is that we create a third-grader that could teach other third-graders or hire a third-grade teacher?


Essentially. I think calling it a "third-grader" makes the idea sound more absurd than it really is.

I still think the "direct weight editing" version is unlikely because I think model weights tend to be inherently very hard to rationally understand for a series of reasons, but the "automated AI researcher" one is somewhat plausible - LLMs are currently limited to being as smart as the smartest humans, but if you could computationally scale the best human researchers, you'd be very likely to get something out of that.

@Creagle Seems like you could make the same argument that it's impossible for humans to learn things which they didn't already know.

Yes, you could make that argument if you were willing to ignore millennia of human history.

@the_coproduct You claim "LLMs are currently limited to being as smart as the smartest humans", but that's untrue. LLMs are limited to generating outputs that mimic the output of the smartest humans. If Einstein had had a parrot, it might have been able to repeat equations, but it wouldn't have been as smart as Einstein.


Yes, of course it's wrong, but why is the argument itself wrong? Originally, all humans had an education no better than a third-grader's. How was it possible for humans to become like sixth-graders without any external help?

@singer Because humans have been able to capture knowledge, make inferences and test hypotheses. An LLM is just a glorified predictive text machine spitting out something that resembles what it's seen before

If I told an LLM agent like open-interpreter "capture knowledge about the output of the program named mystery.py, make inferences about how it works, and then test your hypotheses", and it did those things, would that be convincing to you?

Why would, in your analogy, the third grader be limited to third grade homework?

RSI means AI systems can watch humans doing tasks the AI is not yet able to do, read and analyze what humans have written on the subject, etc. RSI works even before the AI system is better than humans in most domains.

It also means the AI systems could connect to real-world robotics labs, collecting new information with newly built equipment (designed and manufactured by AI). This new information could come from fields beyond current human capabilities.

For example, design and build thousands of HTS fusion experiments, including designs humans considered but never had the resources to try.

See "GPTs are Predictors, not Imitators". Predicting something is much harder than imitating it.


“and the best model in the world will not be one whose training was primarily led by another model.” I don't think this part has any chance of being true. Even if the best model uses recursive self-improvement, I think that kind of training will be relatively slow and will be dwarfed, in updates/tokens/whatever, by some sort of normal pretraining.

I think maybe you're confused here. I mean 'led by' as in, the foundational discoveries of algorithms / architectures used in the new model are ones where the bulk of the research to discover those developments was led by the predecessor AI. Not that the predecessor AI is going to be doing inference for each training step of the successor AI.

Ah, that seems more possible than what I had thought. How will this resolve if the predecessor is used in a sort of neural architecture search, at massive cost, that leads to a successful model, but it's questionable how much of the advantage came from the search / would have been found anyway?

Well, I can't say for sure, but here are some thoughts:
Maybe one frontier AI company will do this but another won't. If the company that did it has a model only slightly better than the company that didn't, I would count that as a NO, because it seems like the automated search resulted in only minor improvement. It needs to be clearly the new state of the art to count.

If all the companies attempt a very similar process, and all agree that it was either helpful or unhelpful, then it's harder to judge. You'd have to look at measures of capability and try to judge if the jump between model versions was higher than what would have been predicted from extrapolating previous trends.
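The kind of trend check described above could be sketched like this (all scores are hypothetical benchmark numbers, and the linear-trend fit is deliberately simplistic):

```python
def fits_trend(history, new_score, tolerance=1.5):
    """Fit a simple linear trend to past scores and ask whether the newest
    jump exceeds what extrapolation would predict (beyond `tolerance` times
    the average historical step)."""
    steps = [b - a for a, b in zip(history, history[1:])]
    avg_step = sum(steps) / len(steps)
    predicted = history[-1] + avg_step
    within_trend = new_score - history[-1] <= tolerance * avg_step
    return within_trend, predicted

# Hypothetical capability scores for successive model generations.
past = [40.0, 48.0, 55.0, 63.0]
print(fits_trend(past, 71.0))  # a +8 jump is in line with the ~7.7/gen trend
print(fits_trend(past, 85.0))  # a +22 jump is well above trend
```

In practice you would want benchmark scores that are comparable across generations and some uncertainty bands, but the judgment call is the same shape: was the jump predictable from the prior curve?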

If no company even makes the attempt, that counts as a No.

If a company attempts this, deploys the resulting model, and that model is clearly state of the art, but you suspect it was not worth the cost of the search they did... well, that's again a fuzzy situation. I think I'd still call that a YES, if the company at least claims they endorse the search they did (which they probably would externally, just to save face, even if internally they decided it was a bad move).

I do however have a different market with a more specific and limited prediction which does take the expense into account: https://manifold.markets/NathanHelmBurger/gpt5-plus-scaffolding-and-inference

Thanks! With that, I actually think there's a good chance this market is underpriced at 15%. I think it's likely they'll try it, with some changes coming from humans and some from the search, and especially if it's the largest model, it's unlikely there will be something better.