
Resolves positively if there is an AI that can succeed at a wide variety of computer games (e.g., shooters, strategy games, flight simulators). Its programmers get a short amount of time (days, not months) to connect it to the game. It doesn't get a chance to practice, and it has to play at least as well as an amateur human who also hasn't practiced (which might be very badly), and to improve at a rate not too far off the rate at which the amateur human improves (one OOM slower is fine, just not millions of times slower).
As long as it can do this over 50% of the time, it's okay if there are a few games it can't learn.
@AdamK Have you seen Claude Plays Pokemon? It's far worse than an amateur human, and the problem is planning, not interfacing with the game.
Also, the LLM approach is useless for realtime games where speed/reaction time is required; you can't exactly feed it screenshots and wait for a reply if it's a competitive shooter.
@ProjectVictory I think latency issues are one of the most plausible paths to AIs failing to meet the resolution criteria for certain classes of games. I'm not worried that AIs in 2028 will fail to plan well.
Above, I was mostly referring to the fact that plugging an AI into a game requires custom scaffolding, so it's not something people can easily do currently. Better general computer use in the next 3-6 months might get us to the point where everyday people can have it actually try to play games, however badly.
@AdamK six months is not a very long time. I'd bet I'm not playing Inflection Point with an AI by then.
@SemioticRivalry How will you update in the coming months depending on how sample-efficient RFT is? That's a short-term crux for me.
improve at a rate not too far off from the rate at which the amateur human improves (one OOM is fine, just not millions of times slower).
Is this measured in wall clock time or "gameplay time"? For example, AlphaZero matched Stockfish in 4 hours, but that was equivalent to tens or hundreds of thousands of games of self-play. Say the AI improves at human level in wall clock time but accomplishes it by playing many instances of the game on thousands of computers, possibly sped up. Does that count?
@MaxMorehead I’d assume gameplay time over wallclock time within reasonable limits.
Generally when people talk about RL being extremely data inefficient, they’re making the claim in terms of the necessity of a large number of rollouts, not in reference to wallclock time. Doesn’t make sense to focus on wallclock time when running twice as many instances gets you ~twice the RL data. It’d be weird for sample efficiency to be a function of compute allocation over time.
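To make the parallelism point concrete, here is a toy sketch (numbers and function names are made up for illustration, not from the thread): the number of rollouts a learning algorithm needs is fixed, while the wallclock time to collect them scales inversely with how many game instances run in parallel. That's why rollout count, not wallclock, is the natural unit for sample efficiency.

```python
# Toy illustration: sample efficiency (rollouts needed) is a property of the
# learning algorithm; wallclock time depends on how many game instances run
# in parallel. All numbers below are hypothetical.

def wallclock_hours(rollouts_needed: int, minutes_per_rollout: float,
                    parallel_instances: int) -> float:
    """Wallclock time to collect the required number of rollouts."""
    total_game_minutes = rollouts_needed * minutes_per_rollout
    return total_game_minutes / parallel_instances / 60

ROLLOUTS = 100_000      # hypothetical games of self-play the algorithm needs
MINUTES_PER_GAME = 10   # hypothetical average game length

serial = wallclock_hours(ROLLOUTS, MINUTES_PER_GAME, parallel_instances=1)
parallel = wallclock_hours(ROLLOUTS, MINUTES_PER_GAME, parallel_instances=5000)

# Same 100k rollouts either way; doubling instances just halves wallclock.
print(f"serial: {serial:.0f} h, on 5000 instances: {parallel:.2f} h")
```

Under these made-up numbers, the "4 hours of wallclock" framing hides roughly 17,000 hours of gameplay, which is the AlphaZero-style ambiguity the question above is asking about.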
@AdamK This is what I think, considering the existence of Scott's other similar question. But I wanted some more confirmation before I upped my NO stake.
@MaxMorehead I'd guess Scott would use wallclock time, but not allow your parallel-playing scenario. I assume the question's objective is to find out whether an AI beats some random human on >50% of games when we just put two computers next to each other, one controlled by the AI, the other controlled by the human.
@ScottAlexander I'm assuming this AI needs to play in real time, considering it could be playing randomly selected multiplayer games?

welp, can't exactly liquidate this market for charity anytime soon.
@DanW do I understand right? We can donate to charity now, and then buy mana at 1/10 the cost after May 1?
I'm only holding 2% of what you are, but I put up a limit order in case you go for it. I might lower it once I read the fine print on the pivot.
If those are the terms you won't be the only one trying to liquidate. We just have a coordination problem.