Who will have the best LLM at the end of 2024 (as decided by ChatBot Arena)?
resolved Dec 31
Google: 98.4% (resolved 100%)
OpenAI: 1.3%
Anthropic: 0.0%
Mistral: 0.0%
Inflection: 0.0%
xAI: 0.1%
Meta: 0.0%
Apple: 0.0%
Cohere: 0.0%
Microsoft: 0.0%
Other: 0.0%

I was browsing Twitter and saw a post by Karpathy speaking positively about ChatBot Arena, a platform that ranks LLMs based on human ratings. As expected, OpenAI holds positions 1, 2, and 3. I wonder which company will be #1 at the end of 2024.


Screenshot of the rankings table taken on the 13th of December:



@traders Based on the comments below, I think it makes sense to resolve this question based on the Elo rating in case of a tie in "rank." When I created this question, a tie was not an option, so I doubt anyone even traded based on this assumption.

I created a similar question that only uses the rank. Feel free to trade on it.
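
For anyone who wants the tie-break spelled out, here is a minimal sketch of the rule as described above: take the lowest rank, and if several models share it, take the one with the highest rating. The `Entry` fields, the `resolve_winner` helper, and the sample rows are all hypothetical and invented for illustration; they are not ChatBot Arena's data format or the actual leaderboard values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    organization: str  # company behind the model (hypothetical field name)
    rank: int          # leaderboard rank, 1 is best
    rating: float      # Elo-style score, higher is better


def resolve_winner(entries):
    """Return the organization with the lowest rank, breaking ties by highest rating."""
    entries = list(entries)
    if not entries:
        raise ValueError("leaderboard is empty")
    # Sort key: rank ascending first, then rating descending as the tie-break.
    best = min(entries, key=lambda e: (e.rank, -e.rating))
    return best.organization


# Hypothetical rows, not real leaderboard values.
sample = [
    Entry("OrgA", 1, 1365.0),
    Entry("OrgB", 1, 1361.0),
    Entry("OrgC", 3, 1320.0),
]
winner = resolve_winner(sample)
```

With these made-up rows, OrgA and OrgB tie on rank, so the rating decides between them.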


@traders same market for 2025

😭

little fanfare for Google's great victory

buying other because it would be funny

https://x.com/deepseek_ai/status/1872242657348710721

@jim not quite, but so impressive

Big E just called, interesting

How come o1 isn't on the Chatbot Arena list?

Gemini flash 2.0 strawberry in the api

https://ai.google.dev/gemini-api/docs/thinking-mode

10k limit order @75% for anyone feeling brave

bought Ṁ500 YES

@WillSorenson it is slightly short of exp 1206. Are you assuming a thinking 1206 will be added?

@Usaar33 It seems more pleasant than o1 to me, which makes it unlikely that o1 will top the charts. The following all have to go right for OAI to win:

1. They have to release a new model today
2. It has to actually be better in the dimensions that chatbot arena evaluates
3. Chatbot arena has to update it in time.

Possible! Not more than a 20% chance.
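
As a rough illustration of why a chain of requirements like the three above ends up with a low combined chance: if the steps are treated as independent, their probabilities multiply. The per-step numbers below are hypothetical placeholders, not estimates taken from this thread, and independence is itself an assumption.

```python
from math import prod


def joint_probability(step_probabilities):
    """Probability that every step succeeds, assuming the steps are independent."""
    probabilities = list(step_probabilities)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(probabilities)


# Hypothetical per-step guesses, chosen only to show the arithmetic.
steps = {
    "new model released in time": 0.5,
    "better on the dimensions the arena evaluates": 0.6,
    "leaderboard updated in time": 0.6,
}
estimate = joint_probability(steps.values())
```

Even with each step at 50% or better, the product here comes out below 20%.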

@WillSorenson

betting against openai

and

betting against elon

Brave.

bought Ṁ50 YES

@jim I'm also the second biggest xAI YES holder! Until December I was very bearish on Google and thought Grok's relative lack of censorship would win out once chatbots were broadly good enough to answer most questions. I changed my mind when events turned against me.

Google DeepMind was and is severely underrated by this market. The odds are looking more reasonable now, though.

@AJama The rumor is that OpenAI will release GPT-4.5 soon.

@NeuralBets I would give it an 80-90% chance that OpenAI releases a new model as part of their 12 Days of Christmas, but I am not sure they will make it available to LMSYS before the end of the year - i am too deep at this point anyways so 🤷‍♂️

i am too deep at this point anyways so 🤷‍♂️

hah same 😅

opened a Ṁ10,000 YES at 40% order

@Bayesian right now this position represents ~80% of my mana net worth but i am doubling down and put a large limit order at 40% on openai

@JasonDavies @EliLifland FYI

@Soli it should be said that a new model doesn't mean it will become #1. Reason 1: Google may have fine-tuned to perform much better on LMSYS. Reason 2: Google may have another fine-tuned model ready to answer any score release from OAI. Maybe Google's CEO and PMs have their compensation tied to end-of-year performance on LMSYS.

@mathvc true, openai released the new preview model over the api yesterday (which is still not ranked in LMSYS) and I expect another major announcement sooon so we shalll seee how it goes

Gemini 1206 is now the top model in all categories by a small margin, yet people think OAI will be ahead at the end of the year (59% at the moment). Do people believe in a new release? GPT-4.5?

@mathvc I think it's more a question of how often the leader board is updated.

I agree with your stance, I just don't know if I want more exposure to this market with my novice level of understanding of the subject.

Is OpenAI planning another update before the end of the year? They used to ship one about every 2 weeks, but lately Google has picked up that schedule.

@NoahRich GPT-4.5 will be released this year

@Soli it is not announced anywhere officially

bought Ṁ50 YES

@mathvc I know, it's a prediction

© Manifold Markets, Inc.