Claim: https://x.com/petergostev/status/2009616928763981963
Will DeepSeek V4 outperform OpenAI's and Anthropic's strongest contemporary models at the time of its release?
Relevant coding benchmarks:
SWE-bench Verified
HumanEval
TerminalBench
RE-Bench
LiveCodeBench
DeepSeek V4 must score higher than both OpenAI's and Anthropic's strongest released models on at least 3 of these 5 benchmarks (using official or independent benchmark results) to resolve YES. If V4 matches or underperforms either competitor on more than half of the benchmarks, the market resolves NO. If a benchmark is not reported within 1 month of V4's release, that benchmark counts as a loss for DeepSeek V4.
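As an illustration only, the resolution rule above could be sketched as follows (all scores are made-up placeholders; a tie counts as a loss, and an unreported benchmark counts as a loss):

```python
# Hypothetical sketch of the resolution rule; all numbers are made up.
# Each benchmark maps to (v4, best_openai, best_anthropic) scores,
# or None if it was not reported within 1 month of release.

BENCHMARKS = [
    "SWE-bench Verified",
    "HumanEval",
    "TerminalBench",
    "RE-Bench",
    "LiveCodeBench",
]

def resolves_yes(scores):
    """Return True if the market would resolve YES under the stated rule."""
    wins = 0
    for name in BENCHMARKS:
        result = scores.get(name)
        if result is None:
            continue  # unreported within 1 month -> counts as a loss
        v4, openai, anthropic = result
        # V4 must strictly beat BOTH competitors; matching is not a win.
        if v4 > openai and v4 > anthropic:
            wins += 1
    return wins >= 3  # needs strict wins on at least 3 of the 5 benchmarks

# Example with invented numbers: 3 wins, 1 loss, 1 unreported -> YES.
example = {
    "SWE-bench Verified": (75.0, 74.0, 73.5),
    "HumanEval": (99.0, 98.0, 98.5),
    "TerminalBench": (60.0, 58.0, 59.0),
    "RE-Bench": None,                      # unreported -> loss
    "LiveCodeBench": (70.0, 72.0, 71.0),   # loses to OpenAI's model
}
print(resolves_yes(example))  # True
```

The scores and the YES/NO outcome here are purely illustrative; actual resolution depends on the reported benchmark results at the time.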
Update 2026-04-25 (PST) (AI summary of creator comment): The creator intends to resolve this market NO, noting that RE-Bench and HumanEval are not consistently being reported for new frontier models, and that DeepSeek likely does not beat Opus 4.7 at coding.
Unfortunately, it looks like RE-Bench and HumanEval are not consistently being reported for new frontier models. Even giving DeepSeek the benefit of the doubt, it likely doesn't beat Opus 4.7 at coding.
I intend to resolve this market NO unless there are objections.
For future markets like this, I will resolve based on whichever benchmarks are popular at the resolution date.