Will any agent perform better on Minecraft (or a comparable open-world game) after being fine-tuned on a manual by 2027?
Jan 1, 2027
M$106 bet
To clarify: the experiment is that there are two copies of an agent that runs in Minecraft (or some other open-world game environment). The agent has the capacity to be fine-tuned on text. One copy is fine-tuned on a manual for the game (text, or text + images, but *not* video); the other runs without any fine-tuning. Will the former perform better than the latter (either better sample efficiency or better final reward)? The agent can't have been trained on that environment before, but it can be trained on other environments/data beforehand (e.g. it's okay if there's a pretrained LLM in the loop).
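A minimal sketch of the comparison being proposed, with everything hypothetical: the `Agent` class, its `rollout` method, and the reward curves are stand-ins invented for illustration, not a real Minecraft harness or a claim about what the result would be. It only shows the resolution logic — the fine-tuned copy wins if it has either better sample efficiency or better final reward.

```python
# Toy sketch of the proposed experiment. Agent, rewards, and the effect of
# the manual are all made up for illustration; no real game is involved.
from typing import List, Optional
from dataclasses import dataclass

@dataclass
class Agent:
    """Stand-in agent; `manual` is the fine-tuning text, or None."""
    manual: Optional[str] = None

    def rollout(self, episodes: int) -> List[float]:
        # Fake per-episode reward curve; the manual-conditioned copy
        # is assumed (purely for the sketch) to start ahead.
        boost = 0.2 if self.manual else 0.0
        return [min(1.0, 0.1 * e + boost) for e in range(episodes)]

baseline = Agent()
tuned = Agent(manual="Chapter 1: punch trees to collect wood...")

b, t = baseline.rollout(10), tuned.rollout(10)
better_final = t[-1] > b[-1]
better_sample_eff = sum(t) > sum(b)  # area under the reward curve

# Resolves YES if either criterion holds for the fine-tuned copy.
print(better_final or better_sample_eff)
```

A real attempt would swap the toy `rollout` for actual episodes in the game environment, but the two-copy, either-criterion comparison is the same.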
I'm not sure that human agents perform much better given a manual. Maybe instead give the agent access to the game's subreddit?
@MartinRandall I will accept essentially any text-based vaguely guide-like thing. The specific details of the text aren't what this question is getting at.
How does this resolve if no one attempts this experiment?
@April If nothing like this gets attempted I'll resolve it N/A. I'm not very interested in the probability the experiment is performed at all.
James Babcock bought M$33 of YES
Publication bias means that if someone tries it and it doesn't work, they're likely not to report the result, whereas if they try it and it does work, they certainly will.