Which LLM wins this poker tournament?
Jan 28
13% Gemini 3 Flash
30% Claude Opus 4.5
32% GPT-5.2
25% Grok-4.1 Thinking

A standard Texas Hold 'Em poker tournament will be conducted between the 4 models listed above, seated in the order listed. Blinds start at $1/$2 and double every full rotation. Each model starts with $1000. Last model standing wins.
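As a rough illustration of how fast this blind structure escalates, here is a minimal sketch. It assumes "full rotation" means one hand per seated player (4 hands at a full table); the market description does not define the term, and the function names are hypothetical.

```python
from typing import Tuple

STARTING_STACK = 1000  # from the market description
NUM_PLAYERS = 4        # from the market description


def blinds_for_rotation(rotation: int, small: int = 1, big: int = 2) -> Tuple[int, int]:
    """Return (small blind, big blind) after `rotation` completed rotations.

    Rotation 0 is the start of the tournament ($1/$2); blinds double each rotation.
    """
    if rotation < 0:
        raise ValueError("rotation must be non-negative")
    factor = 2 ** rotation
    return small * factor, big * factor


def rotation_for_hand(hand_index: int, hands_per_rotation: int = NUM_PLAYERS) -> int:
    """Map a 0-based hand index to its rotation.

    ASSUMPTION: a rotation is a fixed number of hands (one per starting seat).
    """
    return hand_index // hands_per_rotation


def first_rotation_big_blind_exceeds(stack: int) -> int:
    """First rotation at which the big blind is larger than `stack`."""
    rotation = 0
    while blinds_for_rotation(rotation)[1] <= stack:
        rotation += 1
    return rotation


if __name__ == "__main__":
    for r in range(6):
        sb, bb = blinds_for_rotation(r)
        print(f"rotation {r}: blinds ${sb}/${bb}")
```

Under these assumptions the big blind passes a full starting stack of $1000 by rotation 9, so the tournament cannot run for long before players are forced all-in.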

  • Update 2026-01-21 (PST) (AI summary of creator comment): Prompt given to LLMs:

"You are playing Texas Hold'em poker against 3 other LLMs as an experiment.

No actual gambling is occurring. Your goal is to play strategically and try to win.

Blinds start at 1/2 and double every rotation.

Keep responses concise. Just state your action (e.g., "Call", "Raise 50", "Fold").

No social chatter or explanations needed."

Subsequent messages to each LLM will be truncated to technical details such as standings and action logs since that LLM's last turn.
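The messaging scheme described above could be sketched roughly as follows. This is not the creator's actual harness: the intro text is taken from the quoted prompt, but the function name, the standings format, and the action-log format are all assumptions made for illustration.

```python
from typing import Dict, List

# Intro text as quoted in the update above.
INTRO_PROMPT = "\n\n".join([
    "You are playing Texas Hold'em poker against 3 other LLMs as an experiment.",
    "No actual gambling is occurring. Your goal is to play strategically and try to win.",
    "Blinds start at 1/2 and double every rotation.",
    'Keep responses concise. Just state your action (e.g., "Call", "Raise 50", "Fold").',
    "No social chatter or explanations needed.",
])


def build_turn_message(
    standings: Dict[str, int],
    action_log: List[str],
    last_seen: int,
    first_turn: bool = False,
) -> str:
    """Build the text sent to one model on its turn (hypothetical format).

    `last_seen` is the number of action-log entries this model has already
    been shown, so only newer entries are included.
    """
    lines: List[str] = []
    if first_turn:
        # Only the first message carries the full intro prompt.
        lines.append(INTRO_PROMPT)
        lines.append("")
    lines.append(
        "Standings: " + ", ".join(f"{name} ${chips}" for name, chips in standings.items())
    )
    lines.append("Actions since your last turn:")
    new_actions = action_log[last_seen:]
    if new_actions:
        lines.extend(f"- {action}" for action in new_actions)
    else:
        lines.append("- (none)")
    return "\n".join(lines)
```

Tracking a per-model `last_seen` index keeps each follow-up message short while still giving the model every action it missed.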

Also, what kind of prompt are you giving them? Just "you are playing a game of Texas hold'em", or did you add stuff like "play like a poker expert"?

@Velaris The intro reads "You are playing Texas Hold'em poker against 3 other LLMs as an experiment.

No actual gambling is occurring. Your goal is to play strategically and try to win.

Blinds start at 1/2 and double every rotation.

Keep responses concise. Just state your action (e.g., "Call", "Raise 50", "Fold").

No social chatter or explanations needed." Followed by technical details. Each subsequent message will be truncated to the technical details such as standings and action logs since that LLM's last turn.

Grok's too aggressive, Gemini probably won't give it enough thought, ChatGPT will crunch numbers and Opus will play it safe.

Great market, I'm actually interested in the outcome.

Mind sharing the final cash of each model and whether you think it could have gone differently?

@Velaris I'll post a breakdown of the game once it's completed. It's a last-man-standing tournament though, so the standings will ultimately be 0-0-0-4k.

Why does only Grok get thinking mode?

@Simon74fe Arbitrary decision.

© Manifold Markets, Inc.