What organization will have the highest ELO score in the LMSYS Org Chatbot Arena Leaderboard at the end of June, 2024?
resolved Jul 1
Alphabet launches Gemini Pro 1.5.
Anthropic releases Claude 3.
OpenAI: 97%
Alphabet (Google): 0.4%
Anthropic: 2%
Mistral: 0.0%
Meta: 0.0%
Cohere: 0.0%
Apple: 0.1%
SenseTime: 0.1%
Other: 0.1%

I'm referring to this Chatbot Arena Leaderboard.

I'm planning on resolving based on who has the highest ELO score. This is because ELO scores were all that existed when I created this question.


🏅 Top traders

#  Total profit
1  Ṁ1,554
2  Ṁ300
3  Ṁ273
4  Ṁ244
5  Ṁ231

How does this resolve if there's an unknown chatbot at the top? (E.g., when gpt2-chatbot was of unknown origin.)

@wrhall I hadn't considered that. I think it would be best to wait for the organization to be revealed. If that doesn't happen in an acceptable amount of time, then I'd probably resolve to Other.

How does this resolve in the case of a tie (like currently) where the top two are within the leaderboard's margin of error and are both ranked "1"?

@benshindel Critical question that needs to be answered

@benshindel Same as last quarter, I'll break ranking ties with ELO scores. If there's still a tie then I'll resolve them 50:50.
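For anyone who wants the tie-break rule spelled out, here is a rough sketch of how I read it: take the highest ELO score, and if more than one organization sits at exactly that score, split the resolution evenly. The function name, the input dictionaries, and the extension from 50:50 to an even split among three or more organizations are my own assumptions for illustration, not anything the market creator or the leaderboard published.

```python
def resolve_market(elo_by_model, org_by_model):
    """Return {organization: resolution share} under the stated rule.

    elo_by_model: hypothetical mapping of model name -> ELO score.
    org_by_model: hypothetical mapping of model name -> organization.
    """
    top_score = max(elo_by_model.values())
    # Collect every organization with at least one model at the top score.
    top_orgs = sorted(
        {org_by_model[m] for m, score in elo_by_model.items() if score == top_score}
    )
    # One organization gets 100%; an exact tie is split evenly
    # (50:50 for two, which is the only case the rule states explicitly).
    share = 1 / len(top_orgs)
    return {org: share for org in top_orgs}


# Made-up scores, purely for illustration.
scores = {"model-a": 1250, "model-b": 1250, "model-c": 1200}
orgs = {"model-a": "Org X", "model-b": "Org Y", "model-c": "Org Z"}
print(resolve_market(scores, orgs))
```

Note that the leaderboard's rank column plays no part here: only the raw score is compared, which matches the "break ranking ties with ELO scores" answer above.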

This is because ELO scores were all that existed when I created this question.

@HankyUSA Right, but that still doesn’t answer the question. The Elo scores give a margin of error, which is why currently if you look to the left you’ll see the ranks for the top 3 models are all “1”.
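To illustrate how a margin of error can put several models at rank "1", here is a small sketch. It assumes a model's rank is one plus the number of models whose lower bound lies above that model's upper bound. That definition is my assumption about how the leaderboard derives its rank column, and the intervals below are invented numbers, not real leaderboard data.

```python
def rank_with_margin(intervals):
    """Rank models from confidence intervals.

    intervals: hypothetical mapping of model name -> (lower, upper) score bounds.
    A model is only pushed down by models that are clearly better, i.e. whose
    lower bound exceeds its upper bound (assumed rule, see lead-in).
    """
    ranks = {}
    for name, (_, upper) in intervals.items():
        clearly_better = sum(
            1
            for other, (other_lower, _) in intervals.items()
            if other != name and other_lower > upper
        )
        ranks[name] = 1 + clearly_better
    return ranks


# Invented intervals: the first three overlap, the fourth is clearly behind.
example = {
    "m1": (1280, 1294),
    "m2": (1265, 1281),
    "m3": (1262, 1282),
    "m4": (1200, 1210),
}
print(rank_with_margin(example))
```

Under that assumption the three overlapping models all come out as rank 1 even though their point scores differ, which is the situation being asked about.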

@benshindel I think I answered your question. I just didn't give you the answer you wanted. I'm sorry, but I feel like I should keep as close to the original meaning of the market question as I can. When I created the market question there were no ranking numbers, just ELO scores.

This is probably going to keep coming up, so I'll add some clarifications to my existing market questions. I'll also create new market questions that are explicitly about the ranking numbers. When I do, I'll let you know.

@benshindel Here's the rank # version.