What organization will have the top language model on LMSys overall leaderboards December 1st 2024?
resolved Dec 3
Google / Gemini: 50% (47%)
OpenAI / ChatGPT: 50% (50%)
Anthropic / Claude: 0.9%
Meta / Facebook / LLama: 0.3%
xAI / Grok / Elon empire: 0.9%
Other: 0.6%

As of mid-August, Google's Gemini leads LMSys pretty comfortably, although its score has dipped below 1300.

https://chat.lmsys.org/?leaderboard

Note two things.

First, LMSys has statistical ties. If multiple orgs tie for first, we will split the payouts, although this is quite unlikely.

Secondly, note that the update shows as "Last updated: 2024-08-06." even though it's already August 13th. We will go by the update date. So we are looking at the first snapshot on or after 2024-12-01.
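
For anyone who wants the two rules above spelled out, here is a rough sketch in Python. It is only an illustration under assumptions: the `resolve` function, the snapshot format, and the sample data are all made up for this example, and it takes the list of orgs tied for first as given rather than modeling how LMSys decides a statistical tie.

```python
from datetime import date

def resolve(snapshots, cutoff=date(2024, 12, 1)):
    """Return {org: payout share} from the first snapshot on or after cutoff.

    snapshots: list of (update_date, orgs_tied_for_first) tuples.
    This input format is hypothetical; it is not an LMSys API.
    """
    # Only snapshots whose "Last updated" date is on or after the cutoff count.
    eligible = [s for s in snapshots if s[0] >= cutoff]
    if not eligible:
        return None  # no qualifying update yet, so keep waiting

    # Go by the update date: take the earliest eligible snapshot.
    _, leaders = min(eligible, key=lambda s: s[0])

    # Split the payout equally among every org tied for first.
    leaders = sorted(set(leaders))
    share = 1 / len(leaders)
    return {org: share for org in leaders}

# Hypothetical snapshots, for illustration only.
example = [
    (date(2024, 11, 20), ["Google"]),
    (date(2024, 12, 1), ["OpenAI", "Google"]),
    (date(2024, 12, 10), ["OpenAI"]),
]
print(resolve(example))
```

Later snapshots are ignored once a qualifying one exists, so a leaderboard change after the first December update would not affect the split.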

Google seems like a substantial favorite. But maybe not, if GPT-5 / Strawberry ships.



@Moscow25 There was a new update on 2024-12-01. Highest arena score: OpenAI, tied for 1st with Google

@ChaosIsALadder thanks -- resolving as a tie

Remember this will likely end in a split pot between Google and OpenAI.

But we need to see the first update in December...

bought Ṁ222 YES

Google / Gemini enters number one...
https://x.com/lmarena_ai/status/1857110672565494098

With a pretty pedestrian 1344 ELO.

I don't know if returns to scale are diminishing... but it's looking less and less likely that we see a breakthrough model before Thanksgiving

bought Ṁ444 NO

Added xAI / Grok option.

They just released a preview for Grok 2.0
https://x.ai/blog/grok-2

Claiming "tied for 3rd / 4th" so far with large error bars.

Grok 2.0 won't win but they could be a contender with an improved model!

bought Ṁ444 YES

Interesting! The latest GPT-4o checkpoint... released 08/12 hits the number one spot.

By a big ELO margin over Gemini no less...

Lots of people online are grumbling... but it does go to show that, at the very least, OpenAI is pretty good at optimizing for the test.
