Will a prompt for Tetris be found such that either GPT-4V or Gemini Pro Vision places >30 pieces on average?
@Sheya and I tested GPT-4V and Gemini Pro Vision on the task of playing a specific implementation of Tetris and found that, with our best prompts, GPT-4V achieved ~21.2 pieces placed on average, while Gemini Pro Vision achieved ~19.96 pieces placed on average (we're using pieces placed because the models play so poorly that "lines cleared" is an ineffectual metric).
We have also announced a sliding monetary bounty, from 60 USD to 200 USD, for anyone who does significantly better than us at prompting either of those models; to qualify, they should achieve 30.2 pieces placed on average with GPT-4V or 27.96 pieces with Gemini Pro Vision.

This question, almost equivalent to that bounty, asks whether a prompting setup will be found with which GPT-4V or Gemini Pro Vision achieves more than 30 pieces placed on average. Various common-sense restrictions apply; for example, offloading any part of the reasoning to software other than these models is not allowed. If you're unsure about whether something qualifies, please ask.
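For anyone checking their own runs against these numbers, here is a minimal sketch of the comparison. The function name, the model keys, and the sample game counts are hypothetical, and treating the bounty thresholds as "at least" rather than "strictly more than" is an assumption on my part; only the threshold values themselves come from the text above.

```python
from statistics import mean

# Thresholds quoted in the question and bounty text above.
QUESTION_THRESHOLD = 30.0  # question: strictly more than 30 on average
BOUNTY_THRESHOLDS = {      # bounty: assumed to mean "at least" this average
    "gpt-4v": 30.2,
    "gemini-pro-vision": 27.96,
}

def summarize(model, pieces_per_game):
    """Average the pieces placed per game and compare to both thresholds."""
    avg = mean(pieces_per_game)
    return {
        "model": model,
        "average": avg,
        "meets_question": avg > QUESTION_THRESHOLD,
        "meets_bounty": avg >= BOUNTY_THRESHOLDS[model],
    }

# Hypothetical per-game counts, not real results.
print(summarize("gpt-4v", [28, 31, 33]))
print(summarize("gemini-pro-vision", [27, 29, 28]))
```

Note that a Gemini Pro Vision run could clear the bounty threshold without clearing the >30 bar of this question, which is why the two are only almost equivalent.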

Resolves NO if no such prompting is found this year (2024).

Nice! Have you done some trials with random actions to see how far along the curve 30 (or 50) pieces would be? The videos aren't obviously better than that 😅

@Tomoffer See the table and the box plot in this section for some comparison with random actions.

