When will an LLM enter and maintain 1360 ELO on LMSys for 10,000 votes? OpenAI? Gemini? Anthropic?
13
Ṁ70k
Jan 16
By end of October: 1.2%
By end of November (after October): 90%
By end of December 2024 (after November): 6%
Other: 3%

LMSys pulled a fast one on us last time.

OpenAI's O1-preview entered the rankings at 1355 ELO, then got voted all the way down to 1335, where it sits now.

https://lmarena.ai/?leaderboard


This seems fishy perhaps, but those are the breaks.

Therefore, this time we will look for a model that enters and maintains a rating of 1360 through 10,000 votes. Instead of looking at the first public checkpoint as we did before, we will resolve this once a model is at 1360+ ELO and at 10,000+ votes.

One caveat: the date that counts is when the model enters the arena, in its first public posting. But we resolve only if that model reaches the ELO and vote requirements.

So, if a model enters the arena in October (say it first shows up on leaderboards on October 20th) but doesn't get 10,000 votes until November, that will still count as October.

Sorry it's confusing, but this is more intuitive. We don't want to bet on how long 10,000 votes take, but on whether a good model entered the arena and will eventually meet the requirements.
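For anyone who prefers to see the rule spelled out, here is a rough sketch of it in Python. It is only an illustration, not an official resolver: the `Snapshot` structure and the `resolution_bucket` function are made-up names, and it assumes the rating is checked at the first leaderboard update where the model has 10,000+ votes.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

ELO_THRESHOLD = 1360
VOTE_THRESHOLD = 10_000


@dataclass
class Snapshot:
    """One leaderboard update for a single model (hypothetical structure)."""
    published: date
    elo: int
    votes: int


def resolution_bucket(first_listed: date, snapshots: List[Snapshot]) -> Optional[str]:
    """Return the answer a model resolves to, or None if it does not qualify (yet)."""
    # Assumption: the rating is checked at the first update with 10,000+ votes.
    ordered = sorted(snapshots, key=lambda s: s.published)
    with_enough_votes = [s for s in ordered if s.votes >= VOTE_THRESHOLD]
    if not with_enough_votes or with_enough_votes[0].elo < ELO_THRESHOLD:
        return None

    # The bucket depends on when the model first appeared on the leaderboard,
    # not on when it reached 10,000 votes.
    if first_listed <= date(2024, 10, 31):
        return "By end of October"
    if first_listed <= date(2024, 11, 30):
        return "By end of November"
    if first_listed <= date(2024, 12, 31):
        return "By end of December 2024"
    return "Other"


# A model listed on October 20th that reaches 10,000 votes in November
# still counts as October.
example = [
    Snapshot(date(2024, 10, 25), 1365, 3_000),
    Snapshot(date(2024, 11, 10), 1361, 10_500),
]
print(resolution_bucket(date(2024, 10, 20), example))
```

The point of keying the bucket on `first_listed` is exactly the one above: the bet is about when a good model shows up, not about how fast votes accumulate.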

In other words, we are betting on when we will get a release that's noticeably better than today's ~1340 ELO models, according to the LMSys voters.

Sorry, we need more votes now, as the confidence intervals at 3,000 votes appear not to be reliable. That, or people tried to downvote O1-preview. Who knows.

sold Ṁ237 By end of November (... NO

Gemini is back on top now, and at 1365 this is very likely now.

@TimothyJohnson5c16 That’s right.

What a battle between the two AI model fighters 🥊

I think this is better than 50% to clear in November.. maybe even 60%+ but not 100% clear yet.

bought Ṁ555 By end of November (... YES

It's gonna be super close.
https://x.com/lmarena_ai/status/1859318401165930648

Latest GPT-4o is at 1361 but not over 10,000 votes yet.

To be clear: if it's at 1360 at the first update after it passes 10,000 votes, this will still count even if that update comes after Nov 30th -- which it won't anyway.

sold Ṁ527 By end of November (... YES

It's gonna be a close one! The latest 4o model has an arena score of 1361 with 8513 votes.

bought Ṁ555 Other NO

@ChrisPrichard Yes -- will be close

To be clear -- it's in the rules -- this model will count... when it gets to 10,000 votes. Even if they publish the 10,000 votes update after December 1st.

The model simply needs to appear in leaderboards by November. Which it has. This had to be done because of previous nonsense with O1.

But yes... one ELO point. Good lord.

bought Ṁ222 Other YES

Google's Gemini takes over top spot...
https://x.com/lmarena_ai/status/1857110672565494098

With a rather pedestrian 1344 ELO.

Sad.

Fingers crossed for new 3.5 sonnet on the board before Nov!

@ChrisPrichard Yep that’s the only way
