Will a prompt for Tetris be found such that either GPT-4V or Gemini Pro Vision places >50 pieces on average?
Dec 31

@Sheya and I tested GPT-4V and Gemini Pro Vision on the task of playing a specific implementation of Tetris and found that, with our best prompts, GPT-4V placed ~21.2 pieces on average, while Gemini Pro Vision placed ~19.96 pieces on average. (We use pieces placed because the models play so poorly that "lines cleared" is an ineffectual metric.)
For more details, see this Twitter thread.

We have also announced a sliding monetary bounty, from 60 USD to 200 USD, for anyone who does significantly better than us at prompting either of those models; to qualify, they should achieve 30.2 pieces placed on average with GPT-4V or 27.96 pieces with Gemini Pro Vision.

This question is a stronger operationalization than (the weak end of) the bounty: it asks whether a prompting setup will be found with which GPT-4V or Gemini Pro Vision achieves more than 50 pieces placed on average. Various common-sense restrictions apply; for example, no part of the reasoning may be offloaded to software other than these models. If you're unsure whether something qualifies, please ask.

Resolves NO if no such prompting is found this year (2024).
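
To make the thresholds concrete, here is a minimal Python sketch of how a set of runs might be scored against them. It is an illustration only, not our evaluation code: the function name `evaluate`, the per-game piece counts, and the choice to treat the bounty thresholds as "at least" are all assumptions. The only numbers taken from the text above are 30.2, 27.96, and 50.

```python
from statistics import mean

# Thresholds quoted in the question text.
BOUNTY_THRESHOLDS = {"GPT-4V": 30.2, "Gemini Pro Vision": 27.96}
QUESTION_THRESHOLD = 50  # the question asks for MORE than 50 on average


def evaluate(model, pieces_per_game):
    """Score a list of per-game piece counts (hypothetical helper).

    Assumption: reaching a bounty threshold exactly counts as qualifying,
    while the question threshold must be strictly exceeded.
    """
    avg = mean(pieces_per_game)
    return {
        "average": avg,
        "qualifies_for_bounty": avg >= BOUNTY_THRESHOLDS[model],
        "resolves_yes": avg > QUESTION_THRESHOLD,
    }


# Made-up piece counts, purely for illustration.
example = evaluate("GPT-4V", [20, 22, 21])
print(example)
```

With these made-up counts the average is 21, so neither the bounty threshold nor the question threshold is met.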


The version of this question for 30 pieces instead of 50:
