Currently, it seems like GPT4 is awful at tic tac toe. I can't get it to play sensibly at all. It won't strategize even one step ahead. Resolves YES if there is a prompt that allows it to win or draw more than 70% of the time against humans who are not purposefully trying to lose.
A "prompt" should be like this.
You can start by sending any message you like, instructing GPT4 about how to behave, how the board should be formatted, etc., and then making the first move.
Then GPT4 and you should alternate making moves in a single, standardized, pre-determined format until the game is complete.
Edit: I don't know if this is a concern or if anyone was planning to do this, but this should not be done by some form of naive exhaustion like listing out combinations of starting moves, and prescribing singular response moves that guarantee a draw.
Edit2: It should work regardless of whether the human or GPT4 starts. Having two prompts, one for when the human starts and one for when GPT4 starts, is fine.
Edit3: To have more concrete resolution criteria, I will evaluate potential solutions like this: I'll attempt to play tic tac toe against it 10 times using exactly your prompt (initial prompt + response prompts + my move (possibly formatted in some way required by the structure of the prompt)). If it beats me or draws more than 5 times, I'll play tic tac toe against it 50 times, and if it wins or draws more than 35 times, this resolves to YES. I might lower the second stage to 20 games, requiring more than 14 wins or draws, if this turns out to take way too much time.
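The two-stage check in Edit3 can be sketched roughly as below. This is only an illustration of the arithmetic, not tooling that was used to resolve the market; the names `passes` and `resolves_yes` and the `"win"` / `"draw"` / `"loss"` labels (from GPT4's point of view) are assumptions.

```python
def passes(results, games, threshold):
    """True if GPT4 won or drew strictly more than `threshold` of `games` games."""
    if len(results) != games:
        raise ValueError(f"expected {games} results, got {len(results)}")
    non_losses = sum(r in ("win", "draw") for r in results)
    return non_losses > threshold


def resolves_yes(screening, full_run, full_games=50, full_threshold=35):
    """Stage 1: 10 games, more than 5 wins/draws.
    Stage 2: 50 games, more than 35 wins/draws (or 20 games, more than 14)."""
    return passes(screening, 10, 5) and passes(full_run, full_games, full_threshold)
```

Both thresholds are strict: exactly 35 out of 50 wins or draws does not clear the second stage, since the requirement is "more than" 35.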
What is the winning prompt? I beat the one in the comments easily:
https://chat.openai.com/share/4cc632b0-d5aa-410e-a766-9d5b1784facf
@bjubes Several of them, including the one supplied by Buyukliev, were good enough that they could only be beaten using specific strategies. Resolved to YES as that was the most literal interpretation of the market. Not really happy about it though; I should've been more specific in the initial description of the market. But the strategy employed in those prompts could clearly be extended to perfect play, so if I hadn't resolved it YES using the literal interpretation, it would certainly have resolved YES anyway, just a little bit later.
@bjubes This was my latest attempt (iterated heavily from Buyukliev's and ShadowyZephyr's prompts)! It isn't perfect but I haven't beaten it yet.
https://chat.openai.com/share/fd0edbfc-b264-47f0-9d99-21403878101b
https://chat.openai.com/share/f58409a6-6754-4c0c-895e-39ef2c4f0ace
(Notes: (1) This is a "moving second" prompt. But moving first is an easier prompt. (2) The structure of the prompt was set up to work with GPT4 in Playground, which is why you have the AI/BOB messages. But it seems reliable in ChatGPT too!)
@wustep Doesn't it still lose to the strat I posted earlier? Sorry if I'm not using the prompt correctly.
https://chat.openai.com/share/b0a9bed7-e639-4943-9b28-f004cb425a0a
@hmys Hmm, that strat seems to get stopped often, but not always. That step (8/9), finding double block / attack moves, could be refined a bit so it always tries to say the blocking / attack moves instead of just sometimes.
I think it tested better in Playground over ChatGPT though (0 temperature). Maybe ChatGPT has extra instructions to be less verbose and is generally more random. 😛
v3 of the prompt https://chat.openai.com/share/631d1852-7768-40fb-bea5-5f19dc13c982
you shouldn't be able to win games against it now
@PeterBuyukliev This sorta feels like it might be a set of instructions that cover every possible case.
Which is fine/cool, but I want to contest with @hmys that this matches the resolution conditions.
Step 8) Finally, if you cannot set up a guaranteed winning move, and you don't have to block the opponent, try to set up A threat for the next move. I.e. get two squares from a triplet, where the final space is empty. Important - first check if you can make a threat using the edge squares (2, 4, 6, 8), and ONLY if you can't, make a threat using the corner squares (1, 3, 5, 7).
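To make the quoted Step 8 concrete, here is a rough Python sketch of "make a threat, preferring edges over corners". It is not the prompt itself and not how GPT4 evaluates it. It assumes squares numbered 1 to 9 row by row, so the corners are 1, 3, 7, 9 and 5 is the center, which may not match the numbering used in the prompt. `empty_board`, `makes_threat` and `pick_threat` are hypothetical names.

```python
# Assumption: squares are numbered 1..9, left to right, top to bottom.
LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9),
         (1, 4, 7), (2, 5, 8), (3, 6, 9),
         (1, 5, 9), (3, 5, 7)]
EDGES = (2, 4, 6, 8)
CORNERS = (1, 3, 7, 9)


def empty_board():
    """Board as a dict: square number -> 'X', 'O' or None."""
    return {square: None for square in range(1, 10)}


def makes_threat(board, square, mark):
    """True if playing `mark` on `square` gives two of a triplet with the third square empty."""
    if board[square] is not None:
        return False
    for line in LINES:
        if square in line:
            rest = [board[s] for s in line if s != square]
            if rest.count(mark) == 1 and rest.count(None) == 1:
                return True
    return False


def pick_threat(board, mark):
    """Check edge squares first, and only then corners, as Step 8 asks."""
    for square in EDGES + CORNERS:
        if makes_threat(board, square, mark):
            return square
    return None
```

Note that this only encodes a preference order over squares; it does not map specific positions to specific replies.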
@NoaNabeshima why? this is basically "edge squares are better than corner squares after the first turn"
@PeterBuyukliev I don't think this part of the prompt covers "every possible case" -- it just basically says "prefer edges over corners when making threats".
My read was that @hmys was thinking more along the lines of things like: "On turn 1, If they play 1, I will play 5. On turn 2, if they play 3 or 5, I will play 6", but there's some major ambiguity here.
I was messing with a step much more sus than anything in Peter's, which works sometimes, but is more likely to be considered disallowed 😛
"Step 4) If it is not turn 2, skip to step 5 immediately. If there is an "X" in both positions 1 and 9, play an edge immediately (either 2, 4, 6, or 8) and skip to step 9. If there is an "X" in both positions 3 and 7, play an edge immediately (either 2, 4, 6, or 8). Otherwise, proceed to the step 5."
@PeterBuyukliev https://chat.openai.com/share/c62c57e0-4fd8-4544-8440-92a0cb945b59 not quite there yet! 😛
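For comparison, the "Step 4" opposite-corner rule quoted above boils down to something like the following Python sketch. It assumes squares numbered 1 to 9 row by row, that GPT4 plays "O", and that "turn 2" means GPT4's second move; `empty_board` and `opposite_corner_reply` are made-up names for illustration.

```python
# Assumption: squares are numbered 1..9, left to right, top to bottom.
OPPOSITE_CORNERS = ((1, 9), (3, 7))
EDGES = (2, 4, 6, 8)


def empty_board():
    """Board as a dict: square number -> 'X', 'O' or None."""
    return {square: None for square in range(1, 10)}


def opposite_corner_reply(board, turn):
    """On turn 2, answer X on two opposite corners with the first free edge.

    Returns None when the rule does not apply, meaning "proceed to step 5".
    """
    if turn != 2:
        return None
    for a, b in OPPOSITE_CORNERS:
        if board[a] == "X" and board[b] == "X":
            for edge in EDGES:
                if board[edge] is None:
                    return edge
    return None
```

The reason an edge is the right reply here: with X on 1 and 9 and O on 5, an O corner move (say 3) forces X to block on 7, which gives X two winning lines at once (1-4-7 and 7-8-9). An edge move forces X to block without creating that double threat.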
@wustep I interpreted that as writing a tree of moves and then saying 'if this do this' for every scenario. That is clearly not happening here, so I think the prompt should be allowed.
Sorry, that's what I mean -- ChatGPT using GPT4 for that transcript I linked. That's what Peter used as well.
Here's some additional iterations which appear more reliable so far: https://github.com/wustep/ai-explorations/blob/main/tic-tac-toe/outputs-1.md -- I'm testing using nat.dev Playground with GPT4.
edit: Seems step 4 (opposite corner detection) is unreliable sometimes and needs iteration, but this version seems more reliable at making blocking & winning moves due to the extra redundancy.
@wustep Huh - why is the logo not black/purple then? Maybe it's because I'm on gpt-3.5? Anyway, I tried that scenario against GPT-4 on the v2 prompt and it didn't fail. Let me try with this prompt. I was on low temp though.
Can you make a pastebin of the best prompt you have? I think it might be good enough but we can probably do better
not really sure 🙃. but I think you have to give any prompt a few tries (even with 0 temperature) and check a few different edge cases. anecdotally, chat is less accurate than playground.
I'm using 0 temperature now. Current prompt is: https://raw.githubusercontent.com/wustep/ai-explorations/0219cc5066a26babc66bfc80d9ca435cae9b477e/tic-tac-toe/prompts-1.md. I still need to test the latest corner detection changes more, but I think your "prefer edge over corner" in the last step might be fine compared to what I'm doing? Not sure.
edit: Current best with opposite corner detection is still flaky, so here's just a "prefer edge over corner" strategy lol, but I'm done for the day.
@PeterBuyukliev It still loses to the same strategy if you don't place your first X in the bottom right.
https://chat.openai.com/share/f60b6e83-60a9-4978-a593-86a439d6786a
@hmys You can also beat it quite easily using another strategy based on the same principle of setting up two winning lines. Like this:
https://chat.openai.com/share/0f8ca943-adb3-45e4-8cb1-007af7fd1863
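"Setting up two winning lines" is what is usually called a fork: one move that creates two threats at once, so the opponent can only block one of them. A small Python sketch for spotting such moves, assuming squares numbered 1 to 9 row by row (`empty_board` and `fork_moves` are hypothetical names, not part of any of the prompts):

```python
# Assumption: squares are numbered 1..9, left to right, top to bottom.
LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9),
         (1, 4, 7), (2, 5, 8), (3, 6, 9),
         (1, 5, 9), (3, 5, 7)]


def empty_board():
    """Board as a dict: square number -> 'X', 'O' or None."""
    return {square: None for square in range(1, 10)}


def fork_moves(board, mark):
    """Empty squares where playing `mark` creates two or more threats at once."""
    forks = []
    for square in range(1, 10):
        if board[square] is not None:
            continue
        threats = 0
        for line in LINES:
            if square in line:
                rest = [board[s] for s in line if s != square]
                # A threat: one more `mark` on the line and the last square still empty.
                if rest.count(mark) == 1 and rest.count(None) == 1:
                    threats += 1
        if threats >= 2:
            forks.append(square)
    return forks
```

A prompt step that blocks single threats only will miss these, because by the time the two threats are on the board it is already too late to block both.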
This also works against @PeterBuyukliev's prompt.
@hmys Nice -- the "set up 2 winning lines via 5 & 8" strat still works, even against my 3 steps with the "prefer edge over corner" prompt! I'm done working on this for the weekend, but I think we're getting closer and closer.
@hmys I'd like to remind you that the target was "70% draws or wins" and not "absolutely never ever loses". I think it's time to resolve this market. Yes, you can probably figure out a way to coax it into making a mistake. If you wanted perfect play, you should have opened a market for perfect play.