Recently there has been debate about how many GPUs DeepSeek used to train its language models. The DeepSeek-v3 paper claims that only 2048 NVIDIA H800s were used [1], but others claim they may have had as many as 50,000 H100s [2] (note: the H100 is the standard data-center GPU; the H800 is a cut-down variant of the H100 built to comply with US export controls).
The market will resolve NO if either:
- DeepSeek-v3's performance is successfully replicated using no more than 2x the claimed compute budget, or
- At market close, there is insufficient evidence to conclude that DeepSeek misrepresented their compute usage (default NO)
The market will resolve YES if, at market close, there is widespread agreement in the AI community that DeepSeek used significantly more compute than claimed in their technical report.
I will not bet in this market.
[1] DeepSeek-V3 Technical Report
https://arxiv.org/abs/2412.19437
[2] CEO of Scale AI claiming DeepSeek has access to 50,000 H100s
https://youtu.be/x9Ekl9Izd38?si=yqstFkBxP9ICnxf_&t=170
Update 2025-01-27 (PST) (AI summary of creator comment): Clarification on "used":
- "Used" refers exclusively to the main training run of DeepSeek-v3
- It includes the number of concurrent GPUs employed during the main training process
I believe that DeepSeek's paper does not actually state the number of GPUs; instead it reports approximately how many H800 GPU-hours training required (and notes that this works out to about two months on 2048 of them). I think one month on twice as many GPUs (or on H100s, which have equivalent FLOPs) would be consistent with what they said. Would evidence indicating this is what they did result in a YES resolution?
@Fay42 I think that would result in a NO resolution. What I really care about is the compute budget, not the precise number of GPUs (see the sketch below). This was already implied by the resolution criteria, but I will reword the top-line question to make it clearer.
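For concreteness, here is a minimal sketch of the equivalence discussed above. The two-month, 2048-GPU baseline comes from the comment; the 4096-GPU cluster is a hypothetical alternative with the same GPU-hour budget:

```python
# The paper's budget is a GPU-hour figure, so the same budget fits
# many cluster shapes. Baseline (~2 months on 2048 GPUs) is from the
# comment above; 4096 GPUs is a hypothetical alternative.

baseline_gpus = 2048
baseline_days = 60                       # "about two months"
budget_gpu_hours = baseline_gpus * baseline_days * 24

alt_gpus = 4096                          # e.g. H100s with comparable FLOPs
alt_days = budget_gpu_hours / (alt_gpus * 24)

print(f"Budget: {budget_gpu_hours:,} GPU-hours")         # 2,949,120
print(f"{alt_gpus} GPUs finish in {alt_days:.0f} days")  # 30
```

Either configuration lands in the same ballpark as the ~2.79M GPU-hours implied by the cost figures quoted in the next comment, which is why resolution keys on the compute budget rather than the GPU count.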
What does "used" mean? As DeepSeek itself acknowledges, the compute cost of the final training run for V3 doesn't include the full cost of compute to run experiments and synthetic data, etc. Are we just talking about the main training run here?
"Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data."
@JoshYou Great question. My intention is that "used" refers to just the main training run here, i.e. the number of concurrent GPUs employed for that run.
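For reference, a quick back-calculation from the two figures in the quoted passage (both numbers are from the DeepSeek-v3 technical report; nothing else is assumed):

```python
# Deriving the implied GPU-hour budget from the report's cost figures.

total_cost_usd = 5.576e6     # official training run only (from the report)
price_per_gpu_hour = 2.0     # assumed H800 rental price (from the report)

gpu_hours = total_cost_usd / price_per_gpu_hour
print(f"{gpu_hours:,.0f} H800 GPU-hours")         # 2,788,000

days_on_2048 = gpu_hours / (2048 * 24)
print(f"~{days_on_2048:.0f} days on 2048 GPUs")   # ~57 days, about 2 months
```

This ~2.79M GPU-hour figure is the "claimed compute budget" against which the NO criterion's 2x threshold would naturally be measured.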