GPT-4 with image recognition wins tic-tac-toe more than half the time against a child-level opponent?

MANIFOLD

Apr 1

17% chance

The opponent plays badly but is attempting to win, i.e. neither random play nor perfect play.
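One possible reading of "badly but attempting to win", as a rough sketch rather than the market's actual definition: the opponent takes an immediate winning move when one exists, never bothers to block, and otherwise picks any empty square. The board encoding and the names `winner` and `child_move` are assumptions made for illustration.

```python
import random

# Board: list of 9 cells, each None, "X" or "O", indexed row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that mark holds a full line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def child_move(board, mark, rng=random):
    """Hypothetical child-level policy: win now if possible, else play anywhere.

    It never blocks the other player, which is what makes it weak.
    """
    empty = [i for i, cell in enumerate(board) if cell is None]
    if not empty:
        raise ValueError("board is full")
    for i in empty:
        trial = board[:i] + [mark] + board[i + 1:]
        if winner(trial) == mark:
            return i  # attempting to win: take the finishing move
    return rng.choice(empty)  # otherwise no strategy at all
```

With this policy the opponent is beatable by anyone who blocks and builds a fork, but it still punishes a player who leaves an open line of two.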

If GPT makes an illegal move, will you try to elicit another one, or mark the game as lost? And same question, I suppose, for the child-level player.

How many times will you attempt it, so as to measure the success rate?

@firstuserhere What do you suggest?

@NathanpmYoung I'd recommend using chain-of-thought prompting (asking it to solve the problem step by step for each move, since current LLMs think out loud) along with other prompting techniques, and playing at least 5 different matches (winning 3 or more of them counts as a W for GPT-4V).

Here's a section from this great paper on GPT-4 Vision's applications (https://arxiv.org/pdf/2309.17421.pdf):

One observation about LLMs is that LLMs don’t want to succeed [9]. Rather, they want to imitate training sets with a spectrum of performance qualities. If the user wants to succeed in a task given to the model, the user should explicitly ask for it, which has proven useful in improving the performance of LLMs

I particularly recommend at least skimming through sections 3 and 4 and using similar wording.
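To make the suggestion above concrete, here is a rough sketch of the per-move prompt and the scoring rule. The prompt wording is hypothetical, not taken from the paper, and the function names are made up for illustration. Counting draws as non-wins is also an assumption, and how illegal moves get recorded is left open, as in the question earlier in the thread.

```python
def build_move_prompt(mark):
    """Hypothetical per-move prompt sent alongside the board image.

    It asks explicitly for a win and for step-by-step reasoning.
    """
    return (
        f"You are playing tic-tac-toe as {mark}. You want to win this game. "
        "Look at the board in the image and think step by step: "
        "list the empty squares, check whether you can win this turn, "
        "check whether you must block, then state your move as row and column."
    )

def gpt4v_gets_w(results, min_matches=5):
    """results: one of "win", "loss", "draw" per match, from GPT-4V's side.

    Returns True when wins are a strict majority of the matches played,
    which for 5 matches means 3 or more. Draws count as non-wins here.
    """
    if len(results) < min_matches:
        raise ValueError(f"need at least {min_matches} matches, got {len(results)}")
    wins = sum(1 for r in results if r == "win")
    return wins * 2 > len(results)
```

For example, `gpt4v_gets_w(["win", "win", "win", "loss", "draw"])` is True, while two wins, two draws and a loss would not count as a W under these assumptions.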