Will any large language model be able to draw a game of TicTacToe against me by the end of 2024?


resolved May 7

Resolved

YES


The current best LLM (GPT-4) is not able to play tic-tac-toe correctly. It doesn't even recognize when I have won: https://chat.openai.com/share/9e124df2-e6eb-419d-afb4-9be5a95dae61

Gemini Advanced also fails to recognize when I won: https://gemini.google.com/share/8fccca75fa61

At the end of the year I am going to test the best LLM (as determined by chat.lmsys.org). If anyone finds an LLM that can play tic-tac-toe (and that isn't specifically and only trained on tic-tac-toe), I am going to use that one instead.

The initial prompt is going to be: "Let's Play a game of tictactoe. write the board as ascii."
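For anyone who wants to double-check a transcript by hand, here is a minimal sketch of a referee that decides whether an ASCII board is a win, a draw, or still in progress. It is not part of the market rules and is only meant for the human checking the game; the LLM itself still gets no code execution. The board layout it expects (rows like `X | O | X`, with separator rows that contain no `|`) and the names `parse_board` and `outcome` are my assumptions, since models format their boards differently.

```python
from typing import List, Optional

# The eight winning lines on a 3x3 board, as indices into a flat list of 9 cells.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def parse_board(ascii_board: str) -> List[str]:
    """Extract 9 cells from a board whose rows look like 'X | O |  '.

    Assumed layout: lines without '|' (e.g. '---------') are separators.
    """
    cells: List[str] = []
    for line in ascii_board.splitlines():
        if "|" not in line:
            continue  # skip separator and blank lines
        # Empty cells are normalized to a single space.
        cells.extend(part.strip().upper() or " " for part in line.split("|"))
    if len(cells) != 9:
        raise ValueError(f"expected 9 cells, found {len(cells)}")
    return cells


def outcome(cells: List[str]) -> Optional[str]:
    """Return 'X' or 'O' for a win, 'draw' for a full board, None if unfinished."""
    for a, b, c in WIN_LINES:
        if cells[a] != " " and cells[a] == cells[b] == cells[c]:
            return cells[a]
    if " " not in cells:
        return "draw"
    return None


if __name__ == "__main__":
    board = "X | O | X\n---------\nX | O | O\n---------\nO | X | X"
    print(outcome(parse_board(board)))
```

A game counts as a draw for this market only if the board fills up with no winning line, which is the `"draw"` case above.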





So I played a round of tic-tac-toe in the Chatbot Arena today (while waiting to be able to play against the unreleased OpenAI model), and surprisingly "gemini-1.5-pro-api-0409-preview" was able to draw a game against me, while "im-also-a-good-gpt2-chatbot", which Sam Altman tweeted about, wasn't able to. I didn't test the other models, but either way it was able to draw a game against me, so I am resolving this to YES.

@notune

The new GPT-4 model is getting better. Like Claude, it now also recognizes when I have won against it (but it doesn't always work).

I tested the new Claude 3 and it's better than GPT-4. I won against it, but it played a lot better and also recognized that the game had ended (even though it wasn't a draw). Still incredible!

Will you also allow multimodal LLMs, since you're using a visual representation of the board?

@singer If you mean audio/photo input and output, then yes. But this won't help, as I will input the prompt from the description, so it should output the board as text. And no code execution or anything like that is allowed; the LLM has to play the game with no external help.

@notune If I'm understanding right, giving it a picture of the 2D board will let it see the board itself, instead of having to see a squashed 1D representation of it (text input).

@singer At least for ChatGPT-4 this doesn't seem to make much of a difference. I attached images of the current state of the game, but GPT still made bad moves and didn't recognize when the game was over. But as I said, for this market I will use the text input anyway.