When will an LLM enter and maintain 1360 ELO on LMSys for 10,000 votes? OpenAI? Gemini? Anthropic?
resolved Nov 30
By end of November (after October): 100% (95%)
By end of October: 0.9%
By end of December 2024 (after November): 3%
Other: 1.5%

LMSys pulled a fast one on us last time.

OpenAI's O1-preview entered the rankings at 1355 ELO, then got voted all the way down to 1335, where it sits now.

https://lmarena.ai/?leaderboard


This seems fishy perhaps, but those are the breaks.

Therefore, this time we will look for a model that enters and maintains a rating of 1360 with 10,000 votes. Instead of looking at the first public checkpoint like we did before, we will resolve this once a model is at 1360+ ELO and at 10,000+ votes.

One caveat: the date that counts is when the model enters the arena, in its first public posting. But we resolve only if that model reaches the ELO and vote requirements.

So, if a model enters the arena (first shows up on the leaderboard) on October 20th but doesn't get 10,000 votes until November, that will still count as October.
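
For anyone who wants the rule spelled out, here is a minimal sketch of how I read it, not an official resolver. The `Snapshot` type, the `resolution_bucket` function, and the idea of checking leaderboard snapshots are all made up for illustration. It also assumes the year is 2024 throughout and that a single snapshot at 1360+ ELO with 10,000+ votes is enough to qualify.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional

ELO_THRESHOLD = 1360      # from the market description
VOTE_THRESHOLD = 10_000   # from the market description


@dataclass
class Snapshot:
    """One hypothetical leaderboard reading for a single model."""
    day: date
    elo: float
    votes: int


def resolution_bucket(entry_day: date, snapshots: list[Snapshot]) -> Optional[str]:
    """Return the answer a model would count toward, or None if it never qualifies.

    Assumption: the model qualifies if any snapshot shows both thresholds met.
    The bucket depends only on the entry date, not on when the votes arrive.
    """
    qualified = any(
        s.elo >= ELO_THRESHOLD and s.votes >= VOTE_THRESHOLD for s in snapshots
    )
    if not qualified:
        return None
    if entry_day <= date(2024, 10, 31):
        return "By end of October"
    if entry_day <= date(2024, 11, 30):
        return "By end of November (after October)"
    if entry_day <= date(2024, 12, 31):
        return "By end of December 2024 (after November)"
    return "Other"


# The October 20th example: enters in October, reaches 10,000 votes in November.
example = [
    Snapshot(date(2024, 10, 20), 1365, 3_000),
    Snapshot(date(2024, 11, 15), 1361, 10_500),
]
print(resolution_bucket(date(2024, 10, 20), example))
```

The ratings and vote counts in `example` are invented. The point is only that the entry date picks the bucket, while the 1360 and 10,000 thresholds decide whether the model counts at all.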

Sorry it's confusing, but this is more intuitive. We don't want to bet on how long 10,000 votes take, but on whether a good model entered the arena and will eventually meet the requirements.

In other words, we are betting on when we will get a release that's noticeably better than today's ~1340 ELO models, according to the LMSys voters.

Sorry, we need more votes now, as the confidence intervals at 3,000 votes appear not to be reliable. That, or people tried to downvote O1-preview. Who knows.


🏅 Top traders

#  Total profit
1  Ṁ2,437
2  Ṁ2,139
3  Ṁ1,921
4  Ṁ572
5  Ṁ219
© Manifold Markets, Inc.