Will a prompt for Tetris be found, such that either GPT-4V or Gemini Pro Vision place >30 pieces on average?


resolved Jan 1

@Sheya and I tested GPT-4V and Gemini Pro Vision on the task of playing a specific implementation of Tetris and found that, with our best prompts, GPT-4V achieved ~21.2 pieces placed on average, while Gemini Pro Vision achieved ~19.96 pieces placed on average (we're using pieces placed because the models play so poorly that "lines cleared" would be an ineffectual metric).
For more details, see this Twitter thread.

We have also announced a sliding monetary bounty, from 60 USD to 200 USD, for anyone who does significantly better than us at prompting either of those models; to qualify, they should achieve 30.2 pieces placed on average with GPT-4V or 27.96 pieces with Gemini Pro Vision.

This question, almost equivalent to that bounty, asks whether a prompting setup will be found with which GPT-4V or Gemini Pro Vision achieves more than 30 pieces placed on average. Various common-sense restrictions apply; for example, no part of the reasoning may be offloaded to software other than these models. If you're unsure whether something qualifies, please ask.

Resolves NO if no such prompting is found this year (2024).
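
As a rough illustration of the criteria above, here is a minimal Python sketch of how the averages could be checked. The thresholds come from the description; the function names, the per-game input format, and treating the bounty thresholds as "at least" are assumptions made for this example, not part of the actual evaluation setup.

```python
from statistics import mean

# Thresholds quoted in the market description.
MARKET_THRESHOLD = 30.0  # market asks for strictly more than 30 pieces on average
BOUNTY_THRESHOLDS = {"GPT-4V": 30.2, "Gemini Pro Vision": 27.96}


def average_pieces(pieces_per_game):
    """Mean number of pieces placed across a list of games."""
    if not pieces_per_game:
        raise ValueError("need at least one game")
    return mean(pieces_per_game)


def market_resolves_yes(results):
    """results maps a model name to its list of pieces placed per game.

    Only the two models named in the question are considered
    (hypothetical helper; other models are ignored).
    """
    return any(
        average_pieces(games) > MARKET_THRESHOLD
        for model, games in results.items()
        if model in BOUNTY_THRESHOLDS
    )


def bounty_qualifies(model, pieces_per_game):
    """Assumes 'achieve X pieces on average' means an average of at least X."""
    return average_pieces(pieces_per_game) >= BOUNTY_THRESHOLDS[model]
```

Note that the two criteria differ slightly: under these assumptions, a Gemini Pro Vision run averaging 28 pieces would qualify for the bounty without resolving this market YES.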

LLMs

LLM prompts

Tetris




does GPT-4o count for this market?

Nope.

Nice! Have you done some trials with random actions to see how far along the curve 30 (or 50) pieces would be? The videos aren't obviously better than that 😅