Will an artificial agent ascend in NetHack before the end of 2024?
Resolved YES on Feb 8

Late in 2021, a competition was held at the NeurIPS conference to see if any team could build an agent capable of completing the well-known roguelike game NetHack. It also aimed to compare approaches using neural networks to those using symbolic logic. Ultimately, no submitted agent completed a run of the game, and symbolic algorithms significantly outperformed neural networks on the challenge's scoring criteria.

Machine learning, especially approaches based on neural networks, has made significant progress since then. Before the end of 2024, will anyone build an agent, neural network-based or otherwise, that successfully completes NetHack at least once?

Resolves to NO at midnight EST on December 31, 2024. Resolves to YES if, at any time before then, someone posts a link to evidence that this has been accomplished.

@SamuelRichardson Nice! Looks valid to me, so I'll be resolving this YES later today unless someone points out a reason to dispute this result.

Given that this prior work existed, I'm surprised that no one at that conference just ported it into the contest environment.

Since it looks like symbolic algorithms are way, way ahead of neural algorithms on this, I suppose the follow-up question would be, will anyone build a bot based on neural network techniques that ascends within the next couple of years? I might post a market for that one later, if I can figure out a good way to define the requirements.

@NLeseul For a little bit more context, the bot from 2015 made heavy use of a technique called "pudding farming," in which you spawn and sacrifice an infinite number of black puddings and amass a huge collection of ascension-critical items. Changes in NetHack 3.6.0 mean "pudding farming" is no longer a viable technique.

The full paper about the 2021 competition results did note the existence of that earlier bot (as well as another one called "SWAGGINZZZ" that used multiple NetHack instances over AWS to search for perfect RNG, and still needed a human to move it to a fountain at the beginning of the run). But it states that, to the authors' knowledge, no previous work had accomplished ascension on the latest version of NetHack (3.6.6, at that time).

I still intend to resolve this as YES shortly, since I did not stipulate what version this had to occur on in the original description. I probably should have, and I'll likely post a fresh market specific to NetHack 3.6.6 or later in a little while.

Is anyone working on that?

@jknowak DeepMind's general agents will be able to handle this level of complexity by then.