Will any Chatbot beat GPT-4 by July 1, 2024?

890Ṁ11k

resolved Mar 26

Resolved

YES


The Chatbot Arena Leaderboard (https://chat.lmsys.org/?arena) lists GPT-4 in the number 1 spot with an ELO of 1225.

In the number 2 spot is Claude with an ELO of 1195.
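
To give a feel for what a 30-point gap means, here is a minimal sketch that converts the two ratings above into an expected head-to-head score. It assumes the leaderboard follows the standard Elo logistic formula on a 400-point scale, which is an assumption on my part and not something stated in this question; the function name `elo_expected_score` is hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula.

    Assumption: 400-point logistic scale, as in classic Elo. The
    leaderboard's actual rating method may differ.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the question description.
gpt4_rating = 1225
claude_rating = 1195

p_gpt4 = elo_expected_score(gpt4_rating, claude_rating)
print(f"Expected score for GPT-4 vs Claude: {p_gpt4:.3f}")
```

Under that assumption, a 30-point lead corresponds to an expected score of roughly 54% for GPT-4, so the top two were not far apart when the question was written.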

Will any chatbot replace GPT-4 in the number one spot before July 1, 2024?

Fine print:

If https://chat.lmsys.org/?arena ceases to function, the question may resolve on the basis of a similar site that gives ELOs for chatbots based off of real human blind side-by-side judgements.

--update--

Important update: there are now multiple "GPT-4" models on the leaderboard. In order for this question to resolve positive, the top-scoring model must have a different name (e.g. Claude) or number (e.g. 4.5). Significantly, GPT-4-turbo scoring higher than GPT-4-1106-preview will not cause this question to resolve positive.
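
The updated rule can be sketched as a small check. This is only an illustration of how I read the update, not an official resolution script: the name-matching pattern, the helper names `is_gpt4` and `resolves_yes`, and the idea of passing the leaderboard as a name-to-rating dictionary are all assumptions.

```python
import re

# Assumption: a model counts as "GPT-4" if its name starts with "gpt-4"
# (or "gpt4") and the 4 is not followed by a digit or a decimal point.
# So "gpt-4-turbo" and "gpt-4-1106-preview" count, while "gpt-4.5" does not.
_GPT4_PATTERN = re.compile(r"^gpt-?4(?![.\d])", re.IGNORECASE)


def is_gpt4(name: str) -> bool:
    """Return True if the model name is treated as a GPT-4 variant."""
    return _GPT4_PATTERN.match(name.strip()) is not None


def resolves_yes(leaderboard: dict[str, float]) -> bool:
    """Return True if the top-rated model is not a GPT-4 variant."""
    top_model = max(leaderboard, key=leaderboard.get)
    return not is_gpt4(top_model)


# Ratings from the question description: GPT-4 is still on top.
print(resolves_yes({"GPT-4": 1225, "Claude": 1195}))
```

With the ratings quoted in the description the check is false, since GPT-4 holds the top spot. It would also stay false if one GPT-4 variant overtook another, and become true only once a model with a different name or number led the table.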



🏅 Top traders

#	Name	Total profit
1		Ṁ290
2		Ṁ124
3		Ṁ115
4		Ṁ98
5		Ṁ49

People are also trading

Will there be a model with a 69%+ Chatbot Arena win rate against gpt-o1 before June 1st, 2025?

87% chance

Will GPT-4-Turbo be ranked in the top 20 on the Chatbot Arena Leaderboard at the end of 2025?

7% chance

Will there be an AI language model that strongly surpasses ChatGPT and other OpenAI models before the end of 2025?

46% chance

Will an open source model beat GPT-4 in 2024?

76% chance

Will an open-source LLM beat or match GPT-4 by the end of 2024?

83% chance

Will chatgpt stop calling itself a "chatbot" by 2027?

40% chance


Dan bought Ṁ5,000 YES

@LoganZoellner this can resolve YES:

@DanMan314 Resolved

@LoganZoellner Does GPT4turbo taking the spot count?

@Shump I'm torn on this, because my original assumption when writing this question was that GPT-4 was "one thing" (its elo would not change). But there are now multiple GPT-4s with different elos.

I'm going to go ahead and say "must have a different number", for example 4.5, so gpt-4-turbo still counts as "GPT-4" for the purpose of this question.

If enough people object strongly that they weren't counting on it being ruled this way, I will resolve ambiguously.

Does this resolve positive if any chatbot scores higher at any time until July 1, 2024? Or does it just resolve according to the ranking of the leaderboard at this time?

@TobiasHaeberli I am assuming the score for GPT-4 is a fixed value. So if, for example, GPT4.5 is released with a higher ELO, then it resolves positive.


@LoganZoellner Pretty sure it is not a fixed value.

@ShadowyZephyr That indeed appears to be the case. Therefore, if at any moment in time a model other than GPT-4 takes the #1 spot, this question will resolve "yes". (this could happen because GPT-4 is getting worse).
