https://simple-bench.com/ Claude 3.5 Sonnet 10/22 achieves 41.4% whereas the best Gemini model scores 27.1%.

Update 2025-01-22 (PST): - Resolution Date: The market will now be resolved on February 1st, 2025 instead of the previously stated date. (AI summary of creator comment)

Resolved NO on Feb 5, 2025, on the Manifold Markets prediction market.


Before February 2025, will a Gemini model exceed Claude 3.5 Sonnet 10/22's Global Average score on Simple Bench?





🏅 Top traders

#	Trader	Total profit
1		Ṁ116
2		Ṁ46
3		Ṁ27


@JaundicedBaboon Time to resolve? It's already February everywhere.

bought Ṁ50 YES

Can we go ahead and resolve this one?

@rogs Won't resolve it until February 1st

bought Ṁ5 NO

Now I'm looking at my comment above and wondering what I was thinking. Did I think this was a post about OpenAI models vs Claude rather than about Gemini vs Claude? Why did I think it was resolvable already?
