
There has recently been a debate about how many GPUs DeepSeek used to train its language models. The DeepSeek-V3 technical report claims that only 2048 NVIDIA H800s were used [1], while others claim the company may have had access to as many as 50,000 H100s [2]. (Note: the H100 is the standard GPU; the H800 is a deliberately throttled variant of the H100 built to comply with US export controls.)
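For a sense of scale, here is a minimal back-of-the-envelope sketch of what the claimed budget implies, assuming the roughly 2.788M total H800 GPU-hours stated in the technical report itself [1] (these are the paper's own claims, not independently verified figures):

```python
# Back-of-the-envelope check of the claimed compute budget.
# All figures are taken from the DeepSeek-V3 technical report [1];
# treat them as the paper's claims, not verified numbers.
claimed_gpu_hours = 2.788e6  # total H800 GPU-hours stated in [1]
cluster_size = 2048          # concurrent H800s stated in [1]

wall_clock_days = claimed_gpu_hours / cluster_size / 24
print(f"Implied wall-clock training time: {wall_clock_days:.0f} days")  # ~57 days

# The 2x replication threshold used in the resolution criteria below:
replication_budget = 2 * claimed_gpu_hours
print(f"Replication budget (2x): {replication_budget / 1e6:.2f}M GPU-hours")  # ~5.58M
```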
The market will resolve NO if either:
- DeepSeek-V3's performance is successfully replicated using no more than 2x the claimed compute budget, or
- at market close, there is insufficient evidence to conclude that DeepSeek misrepresented their compute usage (default NO).
The market will resolve YES if, at market close, there is widespread agreement in the AI community that DeepSeek used significantly more compute than claimed in their technical report.
I will not bet in this market.
[1] DeepSeek-V3 Technical Report
https://arxiv.org/abs/2412.19437
[2] CEO of Scale AI claiming DeepSeek has access to 50,000 H100s
https://youtu.be/x9Ekl9Izd38?si=yqstFkBxP9ICnxf_&t=170
Update 2025-01-27 (PST) (AI summary of creator comment): Clarification on "used":
"Used" refers exclusively to the main training run of DeepSeek-v3
It includes the number of concurrent GPUs employed during the main training process