What organization will have the highest ELO score in the LMSYS Org Chatbot Arena Leaderboard at the end of March, 2024?

1.9kṀ45k

resolved Apr 1

Anthropic: 100% (99.4%)

Alphabet (Google): 0.1%

Meta (Facebook): 0.0%

OpenAI: 0.4%

Mistral AI: 0.0%

Other: 0.0%

I'm referring to this Chatbot Arena Leaderboard.

Next quarter: /HankyUSA/who-will-own-the-model-at-the-top-o-5527f82db47f

11 Comments

44 Holders

334 Trades

I'm going to call it now. Let me know if you think I incorrectly resolved this question. I'm looking at the Chatbot Arena Leaderboard (last updated: March 29, 2024). Anthropic is at the top with Claude 3 Opus having an Arena Elo of 1255.

How does this resolve on ties? https://twitter.com/lmsysorg/status/1767997086954573938/photo/1

@JacobPfau I thought about weaseling this in, but I think "on top" should definitely account for ELO rather than the shared first place.

@JacobPfau I'll break ranking ties with ELO scores. If there's still a tie then I'll resolve them 50:50.

Would a “Bing Chat” model powered by an OAI model but with addition features (internet) or fine tuning still count as OAI?

@WillSorenson I might count that as partially OpenAI and partially Microsoft, but leaning towards OpenAI. How likely do you think that is to be the model at the top of the leaderboard?

@HankyUSA Since Google has done it and it scores quite well, pretty good chance! Vanilla Pro scores worse than GPT-3.5, but Pro with Bard scores better than 2 of the 3 GPT-4s.

Looks like the "Bard" version of Gemini is doing a lot better in the arena?

Seems like Gemini kinda sucks on chatbot arena. Pro is supposed to be significantly better than GPT-3.5 from Google's internal benchmarks, but it's actually a little bit worse than GPT-3.5 on chatbot arena. I wouldn't expect Ultra to top the chart.