GPT-4 with image recognition wins tictactoe more than half the time against a child level opponent? | Manifold

13

270Ṁ533

Apr 1

17% chance

The opponent plays badly but is trying to win, i.e. neither random play nor perfect play.
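One way to make "plays badly but is trying to win" concrete is sketched below. This is only an illustration, not the market creator's definition: the function name `child_move`, the `miss_block` parameter, and the rule "always take a win, sometimes miss a block, otherwise move at random" are all assumptions introduced here.

```python
import random

# The eight winning lines on a 3x3 board, cells indexed 0..8 row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that mark fills a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def child_move(board, me, rng, miss_block=0.5):
    """Hypothetical child-level player: goal-directed but error-prone.

    board is a list of 9 cells, each 'X', 'O' or ' '.
    miss_block is the (assumed) probability of overlooking a needed block.
    """
    other = "O" if me == "X" else "X"
    empty = [i for i, cell in enumerate(board) if cell == " "]

    # 1. Always take an immediate win, so play is not random.
    for i in empty:
        trial = board[:]
        trial[i] = me
        if winner(trial) == me:
            return i

    # 2. Block the opponent only some of the time, so play is not perfect.
    if rng.random() >= miss_block:
        for i in empty:
            trial = board[:]
            trial[i] = other
            if winner(trial) == other:
                return i

    # 3. Otherwise pick any empty cell.
    return rng.choice(empty)
```

Setting `miss_block=0.0` gives a player that always blocks, and `miss_block=1.0` one that never does; where a real child sits between those is a judgment call for whoever resolves the market.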

LLM & AI Capabilities

People are also trading

Will GPT-5 be able to draw me in tic-tac-toe while playing as O at least 30% of the time?

Will the GPT4+code-interpreter+search score > 1350 on Lmsys Arena Leaderboard?

Will GPT-5 have Atari skills?

Will GPT-5 score higher than 1350 on the Lmsys Arena Leaderboard

Will GPT-5 not be terrible at the "Numbers Game"?

Will GPT-5 have a rating of at least 2000 in chess?

Will GPT-5 be able to solve A::B system puzzles consistently

Will GPT-4.5 score at least 100 in an IQ test?

Will GPT-5 be at least a tiny bit strategic at the "Numbers Game"?

Was GPT-4 trained in 4 months or less?

If GPT makes an illegal move, will you try to elicit another one, or mark the game as lost? And same question, I suppose, for the child-level player.

How many times will you attempt it, so as to measure the success rate?

@firstuserhere What do you suggest?

predicted YES

@NathanpmYoung I'd recommend chain-of-thought prompting (asking it to solve the problem step by step for each move, since current LLMs think out loud) and other prompting techniques, along with at least 5 different matches (winning 3 or more of the matches is a W for GPT-4V).

Here's a section from this great paper on GPT-4 Vision's applications (https://arxiv.org/pdf/2309.17421.pdf):

One observation about LLMs is that LLMs don’t want to succeed [9]. Rather, they want to imitate training sets with a spectrum of performance qualities. If the user wants to succeed in a task given to the model, the user should explicitly ask for it, which has proven useful in improving the performance of LLMs

I particularly recommend at least skimming through sections 3 and 4 and using similar wording.
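The open questions in this thread (what to do about illegal moves, and how many games to play) could be pinned down with a small harness like the one below. It is a sketch under stated assumptions, not the resolution procedure the creator committed to: `play_game`, `resolves_yes`, and `max_retries` are hypothetical names, GPT-4 is assumed to play X and move first, the same retry policy is applied to both players, and draws are counted as non-wins.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that mark fills a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play_game(gpt_move, opp_move, max_retries=0):
    """Play one game and return 'gpt', 'opp' or 'draw'.

    gpt_move and opp_move are callables taking a copy of the board
    (list of 9 cells) and returning a cell index 0..8.
    An illegal move is re-requested up to max_retries times; after
    that the offending player forfeits. max_retries=0 means
    "an illegal move loses the game".
    """
    board = [" "] * 9
    players = [("X", gpt_move, "gpt"), ("O", opp_move, "opp")]
    for turn in range(9):
        mark, choose, name = players[turn % 2]
        for _ in range(max_retries + 1):
            cell = choose(board[:])
            if isinstance(cell, int) and 0 <= cell < 9 and board[cell] == " ":
                break  # legal move found
        else:
            # Every attempt was illegal: the other player wins by forfeit.
            return "opp" if name == "gpt" else "gpt"
        board[cell] = mark
        if winner(board) == mark:
            return name
    return "draw"


def resolves_yes(results, min_games=5):
    """True if GPT won strictly more than half of the recorded games."""
    if len(results) < min_games:
        raise ValueError("not enough games played")
    return results.count("gpt") * 2 > len(results)
```

With five games, `resolves_yes` reduces to the "3 or more wins" rule suggested above. Choosing `max_retries=0` versus a small positive number is exactly the "elicit another move or mark the game as lost" decision, so it is worth fixing before any games are played.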

playing what

@NathanpmYoung this game? On a 3x3 grid?
