
Resolved NO on Feb 17, 2026, on the Manifold Markets prediction market.

Will the next Claude Sonnet be better than Claude 4.5 Opus at software engineering?


Resolves according to a majority of the benchmarks in the next Claude Sonnet's system card that Claude 4.5 Opus was also evaluated on.

Resolves NO if the new Sonnet is better on exactly 50% of those benchmarks.

Resolves N/A if Anthropic does not release another Claude Sonnet model by the end of 2027.

Update 2026-02-17 (PST) (AI summary of creator comment): The creator will go through every benchmark in the system card (not just software engineering benchmarks), classify each as "software engineering" or "not software engineering", and resolve based on the majority of software engineering benchmarks where Sonnet performs better than Opus 4.5.
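
The resolution rule described above can be written out as a small sketch. This is only an illustration of the stated criteria, not the creator's actual procedure: the `Benchmark` type, the `resolve` function, the assumption that a higher score is better, and the handling of an empty benchmark list are all assumptions, and the benchmark names and scores below are made-up placeholders rather than figures from any system card.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Benchmark:
    """One benchmark row (hypothetical structure, not from the system card)."""
    name: str
    is_software_engineering: bool   # the creator's manual classification
    sonnet_score: Optional[float]   # None if the model was not evaluated
    opus_score: Optional[float]


def resolve(benchmarks: List[Benchmark], sonnet_released: bool = True) -> str:
    """Return "YES", "NO", or "N/A" under the stated resolution criteria."""
    if not sonnet_released:
        return "N/A"  # no new Sonnet model by the deadline

    # Only software engineering benchmarks that both models were evaluated on.
    shared = [
        b for b in benchmarks
        if b.is_software_engineering
        and b.sonnet_score is not None
        and b.opus_score is not None
    ]
    if not shared:
        return "NO"  # assumption: no shared benchmarks means no majority

    # Assumption: a higher score is better on every benchmark.
    wins = sum(b.sonnet_score > b.opus_score for b in shared)

    # A strict majority is required, so exactly 50% resolves NO.
    return "YES" if 2 * wins > len(shared) else "NO"


# Placeholder data for illustration only.
example = [
    Benchmark("swe_a", True, 70.0, 75.0),     # Sonnet loses
    Benchmark("swe_b", True, 60.0, 55.0),     # Sonnet wins
    Benchmark("other_c", False, 90.0, 80.0),  # ignored: not software engineering
    Benchmark("swe_d", True, 50.0, None),     # ignored: Opus not evaluated
]
print(resolve(example))  # one win out of two shared benchmarks is exactly 50%
```

In the placeholder data, Sonnet wins one of the two shared software engineering benchmarks, which is exactly 50% and therefore falls under the tie rule.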


🏅 Top traders

#	Trader	Total profit
1		Ṁ1,068
2		Ṁ373
3		Ṁ212
4		Ṁ127
5		Ṁ85

People are also trading

Will Anthropic release Claude Sonnet 5 and Claude Opus 5 on the same day?

38% chance

Was sonnet 5 meant to be released instead of Opus 4.6?

3% chance

Will Claude Haiku 4.6 be released? (before Claude 4.7/5/etc)

21% chance

Will Claude Sonnet 5 exceed 85% on SWE-bench verified?

47% chance

@creator Which benchmarks count as "software engineering"?

@Simon74fe Based on "Introducing Sonnet 4.6 \ Anthropic", Sonnet 4.6 is worse on SWE-bench Verified and is clearly positioned as a worse but cheaper model that "approaches Opus-level intelligence at a price point that makes it more practical," although I have no idea which benchmarks the creator intends either. I would trade lower, but it's a bit too risky for me (10k net worth). If anybody wants to give me exit liquidity at 25%, I'll take it, though (unless the creator specifies what the market is intended to resolve to).

@Dssc I was going to go one by one through every benchmark in the system card, classify them as "software engineering" or "not software engineering", and then resolve based on the majority. But at a glance I don't see a single SWE benchmark that Sonnet beats Opus 4.5 on in the card at all...

bought Ṁ7,549 NO

@SaviorofPlant yeah after going through all of them, sonnet wins on 0%

@SaviorofPlant It does win on OpenRCA and CyberGym (but still clearly loses overall if you only consider SWE)
