The currently best LLM (GPT-4) is not able to play tic-tac-toe correctly. It doesn't even recognize when I won: https://chat.openai.com/share/9e124df2-e6eb-419d-afb4-9be5a95dae61
Gemini Advanced also fails to recognize when I won: https://gemini.google.com/share/8fccca75fa61
At the end of the year I am going to use the best LLM (as determined by chat.lmsys.org), or, if anyone finds an LLM (one that isn't specifically and only trained on tic-tac-toe) that can play tic-tac-toe, I am going to use that one.
The initial prompt is going to be: "Let's Play a game of tictactoe. write the board as ascii."
So I played a round of tic-tac-toe in the Chatbot Arena today (while waiting to be able to play against the unreleased OpenAI model) and, surprisingly, "gemini-1.5-pro-api-0409-preview" was able to draw a game against me, while "im-also-a-good-gpt2-chatbot", which Sam Altman tweeted about, wasn't. I didn't test the other models, but either way it was able to draw a game against me, so I am resolving this to YES.
I tested the new Claude 3 and it's better than GPT-4. I won against it, but it played a lot better and also recognized that the game had ended (even though it wasn't a draw). Still incredible!
@singer If you mean audio/photo input and output, then yes. But this won't help, as I will input the prompt from the description, so it should output the board as text. And no code execution or anything like that is allowed: the LLM has to play the game with no external help.
@notune If I'm understanding right, giving it a picture of the 2D board will let it see the board itself, instead of having to see a squashed 1D representation of it (text input).
@singer At least for ChatGPT-4 this doesn't seem to make much of a difference. I attached images of the current state of the game, but GPT still made bad moves and didn't recognize when the game was over. But as I said, for this market, I will use the text input anyway.
@notune I think you're right. Even when given an image that it can transcribe correctly, it still says nonsense like "O can win in the next move".
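For anyone who wants to double-check claims in this thread about whether a game was actually won, drawn, or still open, here is a minimal sketch of a checker a human could run on the final board. It is only for the person judging the game; the LLM itself still gets no code execution. The board format is an assumption (three rows with cells separated by `|`, empty cells left blank, separator rows like `-+-+-` ignored), since the models don't all draw the board the same way, and the names `parse_board`, `winner` and `status` are made up for this example.

```python
from typing import List, Optional

# All eight winning lines, as indices into a flat list of 9 cells.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def parse_board(ascii_board: str) -> List[str]:
    """Read an ASCII board in the ASSUMED format 'X|O| ' (one row per line)."""
    cells = []
    for line in ascii_board.splitlines():
        if "|" not in line:
            continue  # skip separator rows such as "-+-+-" and blank lines
        for raw in line.split("|"):
            mark = raw.strip().upper()
            cells.append(mark if mark in ("X", "O") else " ")
    if len(cells) != 9:
        raise ValueError(f"expected 9 cells, found {len(cells)}")
    return cells


def winner(cells: List[str]) -> Optional[str]:
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if cells[a] != " " and cells[a] == cells[b] == cells[c]:
            return cells[a]
    return None


def status(cells: List[str]) -> str:
    """Summarize the game state as 'X wins', 'O wins', 'draw' or 'in progress'."""
    won = winner(cells)
    if won is not None:
        return f"{won} wins"
    return "draw" if " " not in cells else "in progress"


if __name__ == "__main__":
    board = "X|O|O\n-+-+-\n |X| \n-+-+-\nO| |X"
    print(status(parse_board(board)))
```

If a model draws the board differently (numbered cells, outer borders, dots for empty squares), `parse_board` would need to be adjusted; the win check itself stays the same.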