Similar in spirit to /singer/ai-beats-me-in-pvp-factorio-during. What makes this a challenge for AI is that it must use the human keyboard/mouse interface and read the screen pixels directly.
Things not allowed:
Any sort of preprocessing of the screen, e.g. using tiles or position annotations.
Getting to read the internal game state.
Mods.
A predetermined strategy or sequence of moves.
A specialized AI built just for the game. The system needs to be able to talk like Claude/ChatGPT.
Things that are allowed:
Scaffolding, harnesses, etc.
Internet search.
Pausing the game at any time to think/interact with it (issuing commands while paused is one of the legit game mechanics of FTL)
Evaluation:
Difficulty is Easy. Advanced Edition enabled. Winning = defeating the final boss. The system needs to win at least once in 5 trials. 1280x720 resolution. Random seed. Needs to finish the game in under 24 hours. AI chooses its own ship (all unlocked).
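As a rough sketch of how the win criteria above combine, here is one way to express them in Python. The names `Trial`, `trial_counts`, and `resolves_yes` are made up for illustration, and treating the 24-hour limit as elapsed time per trial is an assumption.

```python
from dataclasses import dataclass

MAX_TRIALS = 5          # "at least once in 5 trials"
TIME_LIMIT_HOURS = 24   # "under 24 hours" (assumed to apply per trial)


@dataclass
class Trial:
    """Outcome of one attempt (hypothetical record, for illustration only)."""
    defeated_final_boss: bool
    hours_elapsed: float


def trial_counts(trial: Trial) -> bool:
    # A trial is a win only if the final boss fell strictly inside the time limit.
    return trial.defeated_final_boss and trial.hours_elapsed < TIME_LIMIT_HOURS


def resolves_yes(trials: list[Trial]) -> bool:
    # Only the first MAX_TRIALS attempts are considered; one win is enough.
    return any(trial_counts(t) for t in trials[:MAX_TRIALS])
```

For example, four losses followed by a 20-hour win would pass, while a win that took 25 hours would not.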
After the market closes I'll use the most promising systems I can find that don't require any customization (beyond feeding input/applying output) to make them work with the FTL game. I may do some minor scripting to feed them screenshots and issue their key-presses/mouse movements.
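To give a sense of what "minor scripting" could mean here, below is a minimal sketch of a screenshot-in, input-out loop. Everything in it is an assumption: `capture_screen`, `query_model`, and `apply_action` are hypothetical callables standing in for whatever screen-capture tool, model API, and input-injection tool ends up being used, and the convention that the model returns `None` when it is done is invented for the example.

```python
import time
from typing import Callable, Optional


def run_trial(
    capture_screen: Callable[[], bytes],              # hypothetical: returns raw 1280x720 screenshot bytes
    query_model: Callable[[bytes], Optional[dict]],   # hypothetical: returns an action, or None when done
    apply_action: Callable[[dict], None],             # hypothetical: sends the key-press/mouse move to the game
    time_limit_s: float = 24 * 3600,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Relay screenshots and actions until the model stops or time runs out.

    Returns True if the model stopped on its own, False on timeout.
    Whether the run was actually a win is judged separately.
    """
    start = clock()
    while clock() - start < time_limit_s:
        frame = capture_screen()        # unprocessed pixels only, no annotations
        action = query_model(frame)
        if action is None:
            return True
        apply_action(action)
    return False
```

The loop only passes pixels in and inputs out, so it stays within the "feeding input/applying output" limit and does not read game state or annotate the screen.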