Where will Anthropic's Claude 3.5 Sonnet model rank on LMSys Chatbot Arena on July 7th? Ahead of GPT, Gemini?
Resolved Jul 8

Rank #2 - #5 overall: 98.7% (resolved 100%)
Rank #1 overall: 1.0%
Rank #7 - #12 overall: 0.1%
Below #12 but ranked: 0.1%
Not ranked / not submitted / disqualified: 0.1%

On June 20th, Anthropic's Claude 3.5 Sonnet hit the internet to great applause, and was immediately hailed as better and cheaper than Anthropic's Claude 3 Opus.

https://deepnewz.com/ai/anthropic-launches-claude-3-5-sonnet-2x-faster-80-cheaper-outperforming-gpt-4o

As of June 21st, the model is not yet listed on LMSys leaderboards, nor is it available for user voting. But presumably that will change shortly.

The model has crushed the leaderboards on LiveBench -- surpassing all models including GPT-4o by a large margin, as well as Anthropic's own Claude 3 Opus.

https://livebench.ai/

Here is the current leaderboard on LMSys Chatbot Arena -- overall standings.
https://chat.lmsys.org/?leaderboard

The top-ranking model on June 21st is GPT-4o, with three Gemini models closely behind, followed by older GPT-4 models and Claude 3 Opus. Other models, including those from 01.AI and Nvidia, round out the top eleven.

Note that LMSys declares a tie when models are within each other's error bars; hence the ties for second, fourth, sixth, and eleventh in the leaderboard linked above.
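
For intuition, here is a small Python sketch of how "ties within error bars" can produce shared ranks. It assumes each model carries a confidence interval around its Arena score and counts a model's rank as one plus the number of models that are statistically better (lower bound above its upper bound). The scores and intervals below are made up for illustration, and LMSys's exact method may differ.

from dataclasses import dataclass

@dataclass
class Entry:
    name: str
    lower: float  # lower bound of the model's score confidence interval
    upper: float  # upper bound of the model's score confidence interval

def rank_with_ties(entries):
    """Rank each model as 1 + the number of models that are statistically
    better, i.e. whose lower bound exceeds this model's upper bound.
    Models that cannot be separated end up sharing a rank (a tie)."""
    ranks = {}
    for e in entries:
        better = sum(1 for other in entries if other.lower > e.upper)
        ranks[e.name] = 1 + better
    return ranks

# Illustrative numbers only, not real leaderboard scores.
board = [
    Entry("GPT-4o", 1283, 1291),
    Entry("Gemini-1.5-Pro", 1263, 1271),
    Entry("Claude-3.5-Sonnet", 1261, 1270),
]
print(rank_with_ties(board))
# {'GPT-4o': 1, 'Gemini-1.5-Pro': 2, 'Claude-3.5-Sonnet': 2}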

The market will resolve according to the rank listed on LMSys on July 7th (the earliest snapshot for July 7th, per the LMSys website's time). Ties will count: if the model is tied for first place, that counts as Rank #1.

If the model is never posted or is not included by the LMSys operators by July 7th, the market will resolve to the "Not Ranked" option.
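
To make the resolution rule concrete, here is a small Python sketch (my own illustration, not part of the official criteria) that maps a final Arena rank, with ties counted at the shared rank, to the answer options listed above. A rank of None stands for the model not being listed at all.

def resolution_option(rank):
    """Map the model's rank on the July 7th snapshot to this market's options.
    `rank=None` means the model is not listed on the leaderboard at all."""
    if rank is None:
        return "Not ranked / not submitted / disqualified"
    if rank == 1:
        return "Rank #1 overall"  # a tie for first still counts as Rank #1
    if 2 <= rank <= 5:
        return "Rank #2 - #5 overall"
    if 7 <= rank <= 12:
        return "Rank #7 - #12 overall"
    if rank > 12:
        return "Below #12 but ranked"
    # Rank 6 is not covered by any option shown above, so it is left unmapped here.
    return None

print(resolution_option(2))  # "Rank #2 - #5 overall" -- the option this market resolved to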


🏅 Top traders

#  Name  Total profit
1        Ṁ2,358
2        Ṁ1,407
3        Ṁ683
4        Ṁ511
5        Ṁ453