
OpenAI released its o1 model to much fanfare.
https://deepnewz.com/ai/openai-unveils-o1-ai-model-advanced-reasoning-fact-checking-phd-level

LMSys has already announced that these models will be scored in its arena and will soon appear on the leaderboards.

The current LMSys leaderboard is headed by GPT-4o-08-08, followed by Gemini and Grok.
https://lmarena.ai/?leaderboard

Will OpenAI's o1 get to #1 on this leaderboard by October 1st?
Several caveats since LMSys is weird...
We will look at whatever is posted on October 1st
If an update happens that day, we will count it [so resolves October 2nd]
We use Eastern Time, not the "updated on" time shown on the LMSys site, which is often 7+ days behind.
We will use any OpenAI o1 style model and take the best result
This will probably be "o1-preview" but if they post a better model that will also count
If no o1 model is released by October 1st, we will wait until one is posted and extend the market.
As usual, statistical ties count! The market is "will o1 (or any best OpenAI model) be first or tied for first on LMSys?"
But in most scenarios we will resolve this on October 2nd.
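
The caveats above can be sketched as a small check. This is only an illustration, not the official resolution procedure: the `Entry` fields, the name-prefix test for "o1 style" models, and the definition of a statistical tie as overlapping confidence intervals are all assumptions, and the numbers in the example board are made up.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One leaderboard row. Field names are hypothetical."""
    model: str
    score: float    # arena score
    ci_low: float   # lower bound of the score's confidence interval
    ci_high: float  # upper bound of the score's confidence interval


def is_o1_family(name: str) -> bool:
    # Assumption: any model whose name starts with "o1" counts as o1 style.
    return name.lower().startswith("o1")


def resolves_yes(board: list[Entry]) -> bool:
    """True if the best o1 model is first or tied for first."""
    o1_models = [e for e in board if is_o1_family(e.model)]
    if not o1_models:
        # No o1 model posted yet: the market is extended, not resolved.
        raise ValueError("no o1 model on the leaderboard yet")

    # "Take the best result" among all o1 style models.
    best_o1 = max(o1_models, key=lambda e: e.score)

    others = [e for e in board if not is_o1_family(e.model)]
    if not others:
        return True
    leader = max(others, key=lambda e: e.score)

    # Assumption: a statistical tie means the intervals overlap, i.e. the
    # o1 upper bound reaches at least the leader's lower bound. This also
    # covers the case where o1 is simply ahead.
    return best_o1.ci_high >= leader.ci_low


# Made-up example board, for illustration only.
example_board = [
    Entry("model-a", 1316, 1311, 1321),
    Entry("o1-preview", 1310, 1298, 1322),
    Entry("o1-mini", 1290, 1280, 1300),
]
print(resolves_yes(example_board))
```

If no o1 model is listed, the function raises instead of returning NO, which mirrors the rule that the market is extended until one is posted.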
We also have a market betting on the model's Elo score.
https://manifold.markets/Moscow25/what-elo-will-openais-o1-model-get
Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ2,290 |
2 | | Ṁ1,540 |
3 | | Ṁ747 |
4 | | Ṁ396 |
5 | | Ṁ372 |