In 2028, will an AI be able to play randomly selected computer games at human level without getting to practice?

455

2.8kṀ370k

2028

60%

chance

ALL

Resolves positively if there is an AI which can succeed at a wide variety of computer games (eg shooters, strategy games, flight simulators). Its programmers can have a short amount of time (days, not months) to connect it to the game. It doesn't get a chance to practice, and has to play at least as well as an amateur human who also hasn't gotten a chance to practice (this might be very badly) and improve at a rate not too far off from the rate at which the amateur human improves (one OOM is fine, just not millions of times slower).

As long as it can do this over 50% of the time, it's okay if there are a few games it can't learn.

ACX

Scott Alexander's 5 year predictions

Get

1,000

to start trading!

People are also trading

In 2028, will an AI be able to play randomly-selected computer games at human level, given the chance to train via self-play?

76% chance

Will an AI be able to play a type of video game that it wasn't trained on before 2026?

26% chance

Will an AI be able to play 3-person Monopoly Deal or an equivalent card game at a superhuman level by the end of 2025?

76% chance

By 2029, will an AI be able to generate Video Games comparable to ~2023 'AA' Mid Market Games?

47% chance

Will computer-using AI agents play Balatro better than me by eoy 2025?

18% chance

By 2029, will AI be able to generate Video Games comparable to ~2023 Indie Games?

74% chance

By 2028, AI can play new levels of "Angry Birds" better than the best human players.

77% chance

Will an AI by OpenAI beat a super grandmaster playing chess by 2028?

57% chance

Will an AI capable of playing Kerbal Space Program (1 or 2) at a proficient human level exist by the end of 2028?

62% chance

Will general purpose AI models beat average score of human players in Diplomacy by 2028?

Sort by:

bought Ṁ50 NO

Why is this trading up, has some progress been made?

@benjaminIkuta

insider trading... hopefully?

Gemini beat Pokemon, but that should have been priced in since it was making steady progress for a while.

The fact this was trading below 50% for this one seems surprising, considering "play video games" is a concrete external reward (the kind reasoning models excel at) and multiple major labs are clearly focused on this. Also 50% of games and amateur human are highly achievable targets.

edit:
I didn't even notice the additional "Its programmers can have a short amount of time (days, not months) to connect it to the game" in which case the scaffolding for Gemini plays Pokemon might not even be "cheating"

@LoganZoellner if you're surprised it's below 50%, what solution do you expect to exist for real time games?

@ProjectVictory

A multimodal transformer trained with reinforcement learning on a few thousand video games. It would surprise me if Google and OpenAI weren't both already working on this internally.

@LoganZoellner this solution is currently about two orders of magnitude too slow for anything realtime. To play a first person shooter somewhat competently you need latency of about 300ms at the very minimum. Transformers like Claude and Gemini take tens of seconds to make a move when playing Pokemon, keep in mind that pokemon is on the easiest end in terms of how hard it is to parse visually, so you can't just throw a super lightweight model at the problem.

Is it currently possible for me to get an AI to play with me at all, let alone well?

opened a Ṁ1,000 YES at 60% order

@benjaminIkuta Wait a few months for better computer use.

opened a Ṁ100 YES at 40% order

@AdamK Have you seen Claude pays Pokemon? It's far worse than an amateur human, and the problem is planning, not interfacing with the game.

Also, the LLM approach is useless for realtime games where speed/reaction time is required, you can't exactly feed screenshots to it and wait for reply if it's a competitive shooter.

@ProjectVictory I think latency issues are one of the most plausible paths to AIs failing to meet the resolution criteria for certain classes of games. I'm not worried that AIs in 2028 will fail to plan well.

Above, I was mostly referring to the fact that plugging an AI up to a game requires custom scaffolding, so is not something people can easily do currently. Better general computer use in the next 3-6 months might get us to the point where everyday people can have it actually try to play games, however badly.

@AdamK six months is not a very long time. I'd bet I'm not playing Inflection Point with an AI by then.

@AdamK so once a few months passes, you'll update down?

@benjaminIkuta Yes, my short timelines largely depend on seeing impressive returns to scaling RL.

@ProjectVictory the LLM can code a simple bot that plays a first person shooter, and iterate on the code in the background.

@MartinRandall can it though? According to the resolution criteria "It doesn't get a chance to practice". And writing a bot that used screen capture to play a shooter is anything but simple.

@ProjectVictory it realizes it's a fps in the intro cutscene and deploys a standard fps bot. After that it just needs to adjust the program on what to shoot.

@MartinRandall what's "a standard fps bot"? Can you give me an example of such a program? Or do you expect an LLM to write one from scratch in the time it takes for a game to load?

@benjaminIkuta

> Is it currently possible for me to get an AI to play with me at all, let alone well?

The best model currently accessible is probably UI-Tars . Claude and Gemini have both been making progress playing Pokemon, but I would argue neither of them really count because Pokemon is a turn-based game and the AIs get significant scaffolding in order to be able to play.

@LoganZoellner it's really cool that this is possible at all with a local model, but it takes minutes to do simple tasks in browser so probably not a viable Minecraft buddy yet

https://www.reddit.com/r/LocalLLaMA/comments/1k665cg/anyone_try_uitars157b_new_model_from_bytedance/

@ProjectVictory Relevant market https://manifold.markets/AdamK/before-2027-will-i-enjoy-playing-mi?r=QWRhbUs

@AdamK "Wait a few months for better computer use." Okay, it's been a few months now.

@benjaminIkuta Speaks to the inefficiency of Manifold operationalizations that there weren’t better computer use questions for me to lose mana on

Obviously.

We're talking about an AI system that could be playing anything from Starcraft, to Baba is You, to Quake, to Infinifactory at human level in real time without any training... this is essentially the strong form of AGI.