
🏅 Top traders
# | Name | Total profit
---|---|---
1 | | Ṁ281
2 | | Ṁ69
3 | | Ṁ45
4 | | Ṁ15
5 | | Ṁ14
@traders I'm fairly convinced by the arguments below. If it were closer this might be a mess, but there's a good amount of margin on costs. If you think this is a mistake, please speak up.
@mods Resolves as YES (Gigacasting has been inactive for over a year). See the comment below for the detailed calculation. @Fynn raised the objection that the compute costs declared by OpenAI don't seem to include the cost of generating synthetic training data, but that doesn't hold up, because:
a) Synthetic data is reused for training a whole bunch of models and refreshes, so attributing its full cost to a single model does not reflect the actual amortized cost.
b) Inference is much cheaper than training.
Given a) and b), it should be very clear that the amortized, training-adjacent costs can't exceed the training costs themselves. Those direct costs are well below $500,000, so the total stays under $1M.
The details are included below, in my answer to @Fynn.
@Gigacasting Resolves as YES.
gpt-oss-20b fits the bill with just about 210,000 H100 GPU hours of training (page 5 of the model card https://cdn.openai.com/pdf/419b6906-9da6-406c-a19d-1bb078ac7637/oai_gpt-oss_model_card.pdf)

Vast.ai rents H100s for less than $2 per hour, meaning that OpenAI trained gpt-oss-20b for about $420,000.
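In numbers, a minimal sketch (the $2/hr Vast.ai rental rate is an upper bound on what the compute actually cost):

```python
h100_hours = 210_000    # gpt-oss-20b training compute (model card, p. 5)
usd_per_hour = 2.00     # Vast.ai H100 rental rate, used as an upper bound
print(f"${h100_hours * usd_per_hour:,.0f}")  # -> $420,000
```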

gpt-oss-20b destroys GPT-4 in a direct comparison. It's not even close. GPT-4 still occasionally struggled with primary-school math, while gpt-oss-20b aces competition math and programming and performs at PhD level on GPQA (page 10 of the model card; benchmark figures reproduced below):

AIME 2024 (no tools)
• 20b: 92.1%
• GPT-4 (orig): ~10–15% (proxy: GPT-4o reported 12%, and the 2023 GPT-4 wasn't better on AIME-style contests). Result: 20b crushes it.
(https://openai.com/index/learning-to-reason-with-llms)

AIME 2025 (no tools)
• 20b: 91.7%
• GPT-4 (orig): no reliable public number; based on AIME 2024 behavior, likely ≤20%. Result: 20b ≫ GPT-4.

GPQA Diamond (no tools)
• 20b: 71.5%
• GPT-4 (orig baseline): ~39%. Result: 20b ≫ GPT-4.
(https://arxiv.org/abs/2311.12022)

MMLU (5-shot)
• 20b: 85.3%
• GPT-4 (orig): 86.4%. Result: roughly parity (GPT-4 a hair higher).
(https://arxiv.org/pdf/2303.08774)

SWE-bench Verified
• 20b: 60.7%
• GPT-4 (orig): 3.4%. Result: 20b simply crushes GPT-4.
(https://openreview.net/pdf?id=VTF8yNQM66)

Codeforces Elo (no tools)
• 20b: 2230 (2516 with tools)
• GPT-4 (orig): no official Elo; GPT-4o scored ~808, so original GPT-4 is likely sub-1000 in this setup. Result: 20b ≫ GPT-4.
(https://arxiv.org/html/2502.06807v1)
@ChaosIsALadder Gigacasting is no longer active; you'd have to ping the mods. In the case of gpt-oss I disagree, because it was likely trained on lots of synthetic data, which costs compute to generate. Though I do think this question should resolve YES this year for sure.
@Fynn The problem with counting the synthetic data towards the training cost of this model is that the synthetic data isn't used to train only one model. It's reused over and over again, meaning the amortized cost is low. That's why no company attributes the full cost to any single particular model.
Aside from that, inference is much less costly than training (about 9x, see the calculation below), and the training cost is only $0.42 million. Even if the cost of generating the same synthetic data were attributed to every model repeatedly (which it shouldn't be), it's not plausible that the generation costs more than the training itself.
Costs for inference vs. training: inference is a forward pass only, which takes just two FLOPs per parameter per token (one multiply and one add). Training takes six FLOPs per parameter per token (two for the forward pass and at least four for backpropagation). That's 2 FLOPs for inference vs. 6 for training, but they're not equally priced FLOPs: inference nowadays typically runs in 4-bit, while training has to run at higher precision, typically a mixture of 16- and 32-bit, which is at least 3x slower per FLOP than 4-bit inference. So inference ends up at least 3 × 3 = 9x cheaper than training.
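The same back-of-envelope as a minimal sketch (the 2-vs-6 FLOP rule of thumb and the 3x precision factor are the assumptions above; charging the synthetic corpus a full training run's worth of forward passes is a deliberate worst case, not a known figure):

```python
# Rule-of-thumb FLOPs per parameter per token (assumptions from the comment above).
flops_inference = 2    # forward pass: one multiply + one add per weight
flops_training = 6     # forward (2) + backward pass (>= 4)
precision_speedup = 3  # 4-bit inference vs. mixed 16/32-bit training, >= 3x

ratio = (flops_training / flops_inference) * precision_speedup  # = 9.0
print(f"training >= {ratio:.0f}x the cost of inference")

# Worst case for the synthetic-data objection: charge the full generation cost
# to this one model AND assume the synthetic corpus took as many tokens to
# generate as the model later trained on (both deliberate over-attributions).
training_cost = 210_000 * 2.00                # $420,000, from the calc above
synthetic_worst_case = training_cost / ratio  # ~$46,700
print(f"total <= ${training_cost + synthetic_worst_case:,.0f} (< $1,000,000)")
```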
@firstuserhere So by Moore's-law gains alone you would expect a 16-fold reduction, bringing you down to $6 million (sketched below). There are also chip-architecture gains (not just from adding transistors), training-efficiency gains, and ways to filter the training data so compute isn't wasted on worthless, low-information examples (for example, trying to memorize hashes or public keys that happen to be in the training set).
Also, if the GPT-4 source is similar to the GPT-3 source, it's a tiny Python program of a few thousand lines. Open-source versions exist, and over the next 7 years many innovations will be found that weren't available to OpenAI.
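For what it's worth, here's the arithmetic the 16-fold figure implies, as a minimal sketch (the ~$96M implied GPT-4 baseline is worked backward from the comment's own numbers, not a reported figure):

```python
# Moore's-law arithmetic implied above: a 16-fold cost reduction is four
# halvings (2**4); working backward from the projected $6M gives the
# implied present-day baseline cost of a GPT-4-scale training run.
halvings = 4
reduction = 2 ** halvings                      # 16x
projected_cost = 6e6                           # "$6 million" from the comment
implied_baseline = projected_cost * reduction  # ~$96M, implied (not reported)
print(f"{reduction}x cheaper: ${implied_baseline:,.0f} -> ${projected_cost:,.0f}")
```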
@GeraldMonroe Oh wow, looks like my predictions were correct for the most part. There were many innovations, but the main one was synthetic data.