GPT-4 with image recognition wins tic-tac-toe more than half the time against a child-level opponent?
Apr 1 · 17% chance

The opponent is playing badly but attempting to win, i.e., neither random play nor perfect play.


If GPT makes an illegal move, will you try to elicit another one, or mark the game as lost? And same question, I suppose, for the child-level player.

How many times will you attempt it, so as to measure the success rate?

@firstuserhere What do you suggest?

predicted YES

@NathanpmYoung I'd recommend using chain-of-thought prompting (asking it to solve the problem step by step for each move, since current LLMs think out loud) and other prompting techniques, along with at least 5 different matches (winning 3 or more of the matches is a W for GPT-4V).

Here's a section from this great paper on GPT-4 Vision's applications (https://arxiv.org/pdf/2309.17421.pdf):

One observation about LLMs is that LLMs don’t want to succeed [9]. Rather, they want to imitate training sets with a spectrum of performance qualities. If the user wants to succeed in a task given to the model, the user should explicitly ask for it, which has proven useful in improving the performance of LLMs

I particularly recommend at least skimming through sections 3 and 4 and using similar wording.
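To make the suggested scoring rule concrete (at least 5 matches, with 3 or more wins counting as a W), here is a minimal sketch. It is only an illustration: the function name `resolves_yes` and the "win"/"loss"/"draw" labels are hypothetical, draws are assumed not to count as wins, and how an illegal move gets scored is left to whoever records each game's outcome, since that question is still open in this thread.

```python
def resolves_yes(results, min_matches=5):
    """Return True if GPT-4V won more than half of the recorded matches.

    `results` is a list of per-game outcomes from GPT-4V's point of view:
    "win", "loss", or "draw" (hypothetical labels, not from the market).
    Assumption: a draw does not count as a win.
    """
    if len(results) < min_matches:
        raise ValueError(f"need at least {min_matches} matches, got {len(results)}")
    wins = sum(1 for outcome in results if outcome == "win")
    # "More than half the time": for 5 matches this means 3 or more wins.
    return wins > len(results) / 2


# Example: 3 wins out of 5 counts as a W; 2 wins out of 5 does not.
print(resolves_yes(["win", "win", "win", "loss", "draw"]))
print(resolves_yes(["win", "win", "draw", "draw", "loss"]))
```

With an even number of matches this rule needs strictly more than half, so 3 wins out of 6 would not count.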

playing what

@jskf soz

@NathanpmYoung this game? On a 3X3 grid?
