What ELO will OpenAI's o1 / 🍓 model get on LMSys?
resolved Oct 2
1336 to 1355 (significantly ahead): 100% / 98%
1315 or below (at current best models): 0.4%
1316 to 1335 (slightly above current models): 1.4%
1356 to 1375 (far ahead): 0.5%
1376+ (very far ahead): 0.2%

The OpenAI o1 / Strawberry model finally launched, to much praise and fanfare.
https://deepnewz.com/ai/openai-unveils-o1-ai-model-advanced-reasoning-fact-checking-phd-level

LMSys has announced the model is live for ranking, and will show up in the leaderboards soon.
https://x.com/lmsysorg/status/1834397197754118236

Currently the top model on LMSys is GPT-4o with 1316 ELO. Second place is Gemini with 1300 ELO, and Grok is in a statistical tie with it at 1294 ELO.

https://lmarena.ai/?leaderboard

What will OpenAI's o1 ELO be -- on October 1st?

A few caveats, since LMSys markets are tricky on the details:

  • we will take the best OpenAI o1 model [or any other name from OpenAI]

  • we will adjudicate this on October 1st -- whatever checkpoint is live then [not the "last updated" date shown]

  • OR we will wait until the first checkpoint including the OpenAI o1 model if none is released by October 1st

  • if there are multiple updates on October 1st we will take the last one...

  • basically we will wait until the end of October 1st (early October 2nd) and judge based on whatever was posted by then -- unless there is no release yet in which case we will extend the market


Most likely the answer will be known before the Oct 1st release, although the ELO score can still change as more votes come in after the initial posting.

We will use whatever number is posted and ignore the ± error bars.
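
For anyone who wants to sanity-check where a posted score would land, here is a rough sketch of the resolution rules in Python. It is only an illustration, not the official adjudication: the function names, the idea of passing in the set of names that count as o1, and the example rows are all made up for this sketch, and it assumes the posted score is a whole number, with error bars left out.

```python
from typing import Iterable, Optional, Set, Tuple

# Answer ranges copied from the market options: (low, high, label).
# None means the range is open on that side.
BUCKETS: list[Tuple[Optional[int], Optional[int], str]] = [
    (None, 1315, "1315 or below (at current best models)"),
    (1316, 1335, "1316 to 1335 (slightly above current models)"),
    (1336, 1355, "1336 to 1355 (significantly ahead)"),
    (1356, 1375, "1356 to 1375 (far ahead)"),
    (1376, None, "1376+ (very far ahead)"),
]


def bucket_for(score: int) -> str:
    """Map a posted score to the answer whose range contains it."""
    for low, high, label in BUCKETS:
        if (low is None or score >= low) and (high is None or score <= high):
            return label
    raise ValueError(f"score {score} does not fall in any answer range")


def resolve(rows: Iterable[Tuple[str, int]], o1_names: Set[str]) -> str:
    """Pick the best-scoring o1 entry and return its answer range.

    rows: (model name, posted score) pairs from the checkpoint that is
          live at the end of October 1st (hypothetical input format).
    o1_names: the leaderboard names counted as "the o1 model"; which
          names qualify is a judgment call and an assumption here.
    """
    scores = [score for name, score in rows if name in o1_names]
    if not scores:
        # Per the caveats: no o1 entry yet means the market is extended.
        raise LookupError("no o1 model on the leaderboard yet")
    return bucket_for(max(scores))


if __name__ == "__main__":
    # Invented names and scores, purely to show the call.
    example_rows = [("o1-variant-a", 1340), ("o1-variant-b", 1360), ("other", 1400)]
    print(resolve(example_rows, {"o1-variant-a", "o1-variant-b"}))
```

Taking the maximum over the o1 entries mirrors the "best OpenAI o1 model" caveat, and raising when nothing matches mirrors the "extend the market" caveat.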

Now go vote on LMSys :-)


๐Ÿ… Top traders

#   Name   Total profit
1          Ṁ2,797
2          Ṁ2,699
3          Ṁ749
4          Ṁ744
5          Ṁ603
© Manifold Markets, Inc. • Terms • Privacy