Mistral Large 2 outperforms Llama 3.1 405b Instruct on Chatbot Arena on August 12th?


resolved Aug 13

Resolved


Mistral Large 2 reportedly outperforms on Arena Hard as well as MT Bench.




resolves no

Has anyone had good experiences with Mistral Large 2? I ask because it has posted other strong benchmark results too.

Mistral Large 2 is now on the leaderboard with 1248, 15 points short of 405B
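For a sense of scale, here is a rough sketch of what a 15-point gap means head to head. It assumes the leaderboard scores behave like Elo ratings on the usual base-10, 400-point scale; `expected_win_rate` is a hypothetical helper name, and the 1263 figure is only what "15 points short" implies, not a number quoted from the leaderboard.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Elo-style expected score of A against B (a tie counts as half a win)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# 1248 is the score quoted in the comment; 405B's score is inferred from the gap.
mistral_large_2 = 1248
llama_405b = mistral_large_2 + 15  # assumption: "15 points short" means 1263

p = expected_win_rate(mistral_large_2, llama_405b)
print(f"Expected Mistral Large 2 win rate vs 405B: {p:.1%}")
```

Under those assumptions the expected win rate comes out at roughly 48%, so the gap is small but would still count as "not outperforming" for this market.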

https://chat.lmsys.org/?leaderboard

Looks like this will likely resolve N/A. Strange how little people care about this model.

Nope! I haven't been following it, but I'm surprised by how low it is.

It catches up quite a bit on math and coding but still doesn't pass 405B.

If Mistral Large 2 is not on the leaderboard, does that resolve NO or N/A?

Good question! I'd say N/A. The point of the market is to aggregate information on whether Mistral's model will do well with human raters, not to speculate on whether it will be included on the leaderboard.

If lots of traders in this market disagree I’m very willing to change my mind.

I agree with N/A.

No, no, N/A is fair.

Does this measure the difference at a single point, once per day, or continuously? I.e. "ever outperform before" vs "outperform on".

I think the right choice for markets like this is "outperform on". Otherwise the scores move up and down with noise, which biases the market towards YES, when what you actually want to ask is "is it better?"

Good points. Let's say outperform on August 12th, midnight Pacific time.
