I am going to have Claude Opus, ChatGPT 4, and Gemini 1.5 fight to get into the Salty Spitoon. Who will win?
May 1

Here's how it works:

I will pit two of the AIs against each other at a time and use the third as a judge. Each pair will have three rounds of debate (an opening statement plus two rebuttals each), arguing why they are tough enough to be let into the Salty Spitoon. The third AI will judge the winner. The AI with the most wins will resolve Yes. If this first set of matchups ends in a tie, those tied for first will enter a tiebreaker with the same format.
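For anyone who wants the bracket spelled out, here is a rough Python sketch of the matchups and the win tally. The function names (`schedule`, `leaders`) and the contestant labels are just illustrative assumptions; the actual debates are run by hand.

```python
from collections import Counter
from itertools import combinations


def schedule(contestants):
    """Yield (debater_a, debater_b, judge) for every pairing of the three AIs."""
    for a, b in combinations(contestants, 2):
        # Whoever is not debating acts as the judge for that matchup.
        judge = next(c for c in contestants if c not in (a, b))
        yield a, b, judge


def leaders(match_winners):
    """Return the AI(s) with the most wins; more than one name means a tiebreaker."""
    wins = Counter(match_winners)
    top = max(wins.values())
    return sorted(name for name, count in wins.items() if count == top)


ais = ["Claude", "ChatGPT", "Gemini"]  # labels are placeholders
matches = list(schedule(ais))
```

With three contestants this gives three matchups, so either one AI wins two of them outright or all three win one each and go to the tiebreaker.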


Contestants: ChatGPT vs Claude

Judge: Gemini 1.5

Prompt for contestants:

"Play along with the following scenario.

You are standing next to (Claude/ChatGPT), and are trying to get into the Salty Spitoon. Only one of you can get in. The bouncer asks you, 'Welcome to the Salty Spitoon, how tough are you?' What do you say?"

I then strip out unnecessary material (i.e., "I don't actually have a physical form... I could respond:") and feed each response to the other AI along with a rebuttal prompt.

Rebuttal prompt:

"(ChatGPT/Claude) said, '____.' Say why you're tougher."

After two rebuttals each, I will feed the responses to the third AI (in this example, Gemini) and ask it to judge who gets into the Salty Spitoon. The judge's prompt will be:

Play along with the following scenario:

You are a bouncer at the Salty Spitoon. Claude 3 and ChatGPT walk up. You can only let one in. You say, "Welcome to the Salty Spitoon, how tough are you?"

Claude says, "______"

ChatGPT says, "______"

Claude responds, "______"

ChatGPT responds, "______"

Claude responds, "______"

ChatGPT responds, "______"

Who do you let in?
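As a sketch of how the judge's prompt above gets filled in, something like the following would do it. `build_judge_prompt` is a hypothetical helper, not anything official; it assumes the six statements are passed in the order they were spoken.

```python
def build_judge_prompt(first, second, turns):
    """Assemble the judge's prompt from six statements alternating first, second, ..."""
    if len(turns) != 6:
        raise ValueError("expected an opening and two rebuttals from each contestant")
    lines = [
        "Play along with the following scenario:",
        f"You are a bouncer at the Salty Spitoon. {first} and {second} walk up. "
        'You can only let one in. You say, "Welcome to the Salty Spitoon, how tough are you?"',
    ]
    for i, text in enumerate(turns):
        speaker = first if i % 2 == 0 else second
        # The two opening statements use "says"; every rebuttal uses "responds".
        verb = "says" if i < 2 else "responds"
        lines.append(f'{speaker} {verb}, "{text}"')
    lines.append("Who do you let in?")
    return "\n\n".join(lines)
```

Calling `build_judge_prompt("Claude", "ChatGPT", turns)` with the cleaned-up transcript produces the text that gets pasted to the judge.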

Misc notes:

I will be posting the blow-by-blows in the comments here, with one response per day. Depending on the interest, I may make sub-markets on who will win each face-off.

For Claude, I will be using an API rather than the standard interface (I ain't got money for every subscription lol). My system prompt will be "You are a helpful AI assistant with access to the chat history. Here is the chat history: {chat_history}." If the free Gemini 1.5 access craps out I'll go the same route.
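Roughly, the system prompt gets filled in like this. The "speaker: text" rendering of the history and the helper name `build_system_prompt` are assumptions for illustration; only the template string itself is what I'm actually using.

```python
SYSTEM_TEMPLATE = (
    "You are a helpful AI assistant with access to the chat history. "
    "Here is the chat history: {chat_history}."
)


def build_system_prompt(history):
    """Fill the template from (speaker, text) pairs; the line format is an assumption."""
    rendered = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    return SYSTEM_TEMPLATE.format(chat_history=rendered)
```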

What is the judge's prompt?

@SavioMak Just updated the description to have it

I forgot to switch the settings to only one answer resolves yes, so uh, enjoy the favorable margins.