AI beats Minecraft (RSG) in under 10 minutes before 2028?

46% chance

RSG = Random Seed Glitchless

Must be the Java version.

CAN:

Can use scaffolds or similar.

CAN'T:

Cannot slow down game time or stuff like that.

Can't get human help.

Can't reverse engineer (by running scripts) which seed it is in from world features in order to know where structures will be before seeing them (e.g. blind travelling to the stronghold).

Traders may want to look at the official rulebook for Minecraft RSG:

https://www.minecraftspeedrunning.com/public-resources/rules

Important: For the purpose of this market, the run must generally follow the Minecraft RSG rules insofar as the rules are not related at all to the fact that the participant is human. For example:

If the rulebook says you can't write stuff down, and instead need to keep things to memory, but the AI keeps things to memory by writing information into its memory bank, that is allowed.
If the rulebook says snapshots can't be used to set Minecraft RSG runs, the AI must not use snapshots to resolve this market positively, as this rule is not related to the fact that it is an AI.

However, uninvolved persons may determine, at time of resolution, whether a rule ought to be allowed or ignored on a case-by-case basis, in case of ambiguity.

Harder version:

https://x.com/GoogleDeepMind/status/1988986218722291877?t=rDnrw5wHK3InFdpS_GQTVg&s=19

i think it's a matter of will there be an attempt, not will it be possible

@AshDorsey yeah, i think "is this going to happen, and if so why/how" has its mechanistic-causal-responsibility disproportionately weighted to "is anyone/any group gonna bother to do this one specific thing".

https://x.com/danijarh/status/1973072288351396320

man google is poised to do so well

@Bayesian I mean... isn't this just video generation?

@bens I only skimmed and will look more later but seemed like it's learning long term robotics behaviour bootstrapped off of video generation capabilities, which is the most bitter lesson pilled thing imaginable

@Bayesian jim's naive projections show that google vid-gen derived robots will have METR-style time-horizons of ~64 seconds by EOY 2026 which is quite impressive.

Although I expect we'll have even better robots than that as advances in LLMs will accelerate progress.

@jim I'd love for us to have some kind of METR-style robotics thing like that but yeah seems very dependent on what kind of tasks are picked (or maybe for a good enough version of the benchmark the idea is that tasks picked should not end up mattering? it should all average out to the underlying task difficulty? idk)

@jim https://x.com/IsaacKing314/status/1973124255048147091

Sora 2 TikTok prompted to simulate Minecraft gameplay footage:

https://x.com/btibor91/status/1973083492561912128

that's way worse than I thought modern video gen was

@MaxE I always assume that any demo product was heavily cherry-picked for the best examples.

Then again, people sharing terrible AI content on social media are also likely cherry-picking the worst examples.

@IsaacKing I didn't consider that the video might have been intentionally cherry-picked to be bad. It doesn't look like that is the case this time based on the context, but I'll keep an eye out for that

So some basic questions since finding the answer has been a bit difficult thanks to pollution from AI world generated minecraft (as opposed to an AI agent playing real minecraft) and me not really knowing where to look. Has AI already beaten minecraft (in any amount of time)? If so, how long did it take? If not, how far has it gotten? Are there any active projects attempting to use AI to beat minecraft currently?

https://github.com/PrismarineJS/mineflayer

https://github.com/mindcraft-bots/mindcraft?tab=readme-ov-file

How much help or external tools can the AI use? Could it use bot frameworks such as mineflayer or mindcraft (which seem to be an LLM telling an existing bot what to do, and not the AI really doing it all themselves)?

@lemon10 IDK but your comment reminded me of this classic paper:

https://arxiv.org/abs/2305.16291

@lemon10 probably worth checking https://www.youtube.com/watch?v=Wh4abvcUj8Q

@Bayesian oh OpenAI's 2022 thing is better than I expected (I'm sure I saw it at the time and probably since but I totally forgot about it): https://openai.com/index/vpt/

@lemon10 Some more info:

A computer has already beaten minecraft before: https://www.youtube.com/watch?v=q5OmcinQ2ck (the bot made in the video beat minecraft after the video was out)

This was with being able to see all blocks in render distance, including anything hidden. (not allowed for this market)
It took ~3hrs iirc (also not allowed)
I think the bot framework it uses can do things like place blocks in midair (also not allowed)
It took many attempts.
The steps it took were hardcoded/did not use machine learning. (which is allowed to my understanding, can you confirm @Bayesian?)

The video Bayesian linked seems to be an effort to make LLMs do the same thing, and uses mindcraft, which also can look through walls iiuc.

@calour hardcoded is allowed, so is not using ML, yeah

@Bayesian wait wait wait wait wait... what do you mean "not using ML"? This market's description very clearly says "AI beats Minecraft", not "any computer program beats Minecraft"!

@bens whether a BOT could beat minecraft is a different question than whether AI can beat minecraft.

Chess computers first beat top humans in 1997, but it took another 20 years before AI could beat top humans in chess! That's a COMPLETELY different question.

@bens I think that deep blue was often called AI; search artificial intelligence at https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)

Also, the "Can use scaffolds or similar" clause makes this a spectrum. If a system includes ML, but has many hardcoded elements, is that a bot or ML?

In practice "AI" seems to mean "any computer program we don't understand", or "any computer program at the frontier of intelligence". 20 years ago a pure tree-search chess-playing program would have been called AI, but nowadays it would not. So I am inclined to say that in a modern market the term "AI" does imply that at least some form of neural network is involved. That's certainly what I thought this was about.

@IsaacKing I think I'll concede that, but there are still a lot of ways we could draw the line. Example: If you have one neural network finding the blocks (though IIUC a mod that reports visible blocks is allowed?), then another neural network deciding how to route a structure, but A* doing the pathfinding, is that allowed?

I think that when people nowadays talk about "AI doing [thing]", they mean doing it on its own without human help past general scaffolding. If I say "AI ordered a chair for me from Amazon", it's ok if it used some general human-written tooling for interacting with a browser, but if what actually happened was an LLM output python buyitem.py into a terminal and then the python script that I wrote to make a request to Amazon's API orders me a chair, I think people would accuse me of misrepresentation.

i am surprised that this is controversial. Within AI there are many subfields including ML. Within ML there are many topics including artificial neural networks, which give rise to modern Deep Learning (which is what you seem to be referring to as AI?) and more.

Idk if you have seen people beat minecraft in 10 minutes, but it is very hard and requires a lot of fast decisions that involve remembering facts not currently on the screen and stuff like that. Ender pearl throws to gain speed. Battling mobs. It cannot be hardcoded by human programmers in a lifetime. Whatever algorithm leads to a sub10 run will surely be intelligent, whether or not it can be categorized as ML. It is almost certain that it would be categorized as such so I don’t really think this should be a major market uncertainty, but I don’t see why we would exclude non-learning methods; if program synthesis births AGI next month (it won’t), I don’t see why that program beating minecraft in sub10 minutes would not count

@Bayesian I’m totally open to this market allowing for many AI/ML approaches, but I think the examples given of previous Minecraft bots that have essentially hardcoded behavior are very much not “AI” as we understand it to mean, and it feels superficial to have a system that’s like 95% hard-coded and uses LLMs for like 3 live decisions be classified as such.

@bens although tbh the other market criteria, like not allowing scaffolds, would probably preclude these from counting anyway? I think?

Can use scaffolds or similar.

The AI system can use scaffolds that make use of information accessible to it (but it can’t xray or something like that) to e.g. navigate an environment.
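
To make the "information accessible to it" distinction concrete, here is a minimal sketch of the kind of A* pathfinding scaffold mentioned earlier in the thread, restricted to blocks the agent has already seen. Everything here is an assumption for illustration: the function name `astar_observed`, the 2D `(x, z)` grid, and the `observed` dictionary are hypothetical and are not taken from mineflayer, mindcraft, or the market rules.

```python
import heapq


def astar_observed(start, goal, observed):
    """A* over a 2D grid using only cells the agent has already seen.

    `observed` maps (x, z) -> True if walkable, False if blocked.
    Cells missing from `observed` are unseen and are never entered,
    so the planner cannot exploit hidden (x-ray) information.
    This is a hypothetical illustration, not any real bot framework's API.
    """
    def heuristic(p):
        # Manhattan distance: admissible for 4-directional unit-cost moves.
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}

    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal:
            # Walk the parent links back to the start.
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        if g > cost[cur]:
            continue  # stale queue entry, a cheaper route was found already
        x, z = cur
        for nxt in ((x + 1, z), (x - 1, z), (x, z + 1), (x, z - 1)):
            # Skip cells that are blocked or have not been observed.
            if not observed.get(nxt, False):
                continue
            new_g = g + 1
            if new_g < cost.get(nxt, float("inf")):
                cost[nxt] = new_g
                came_from[nxt] = cur
                heapq.heappush(frontier, (new_g + heuristic(nxt), new_g, nxt))
    return None  # goal not reachable through observed cells
```

In this sketch, a cell the agent has not seen is treated the same as a wall, so the returned path only ever crosses terrain the agent has actually observed. Whether a given real scaffold meets the market's criteria would still be a judgment call at resolution.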