Will $10,000 worth of AI hardware be able to train a GPT-3 equivalent model in under 1 hour, by EOY 2027?

Resolves using the best estimates available at the time, counting only hardware that can actually be purchased.

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only[2] transformer, a deep neural network architecture that replaces recurrence- and convolution-based designs with a technique known as "attention".[3]

It uses a 2048-token context, float16 (16-bit) precision, and a hitherto-unprecedented 175 billion parameters, requiring 350 GB of storage space as each parameter takes 2 bytes, and has demonstrated strong "zero-shot" and "few-shot" learning abilities on many tasks.
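The storage figure above is just parameters times bytes per parameter. A minimal sketch of that arithmetic, assuming decimal gigabytes (1 GB = 1e9 bytes) and counting only the weights; the function name is made up for illustration:

```python
# Weight storage for a dense model: parameters x bytes per parameter.
N_PARAMS = 175e9        # 175 billion parameters, from the description
BYTES_PER_PARAM = 2     # float16 = 16 bits = 2 bytes


def weight_storage_gb(n_params: float, bytes_per_param: int) -> float:
    """Return weight storage in decimal gigabytes (ASSUMPTION: 1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


print(weight_storage_gb(N_PARAMS, BYTES_PER_PARAM))  # 350.0
```

This covers the weights only. Training needs additional memory for gradients, optimizer state, and activations, which this sketch does not estimate.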

"GPT-3 equivalent" means equivalent in terms of the floating-point operations needed to train GPT-3, as well as space requirements, energy requirements, etc. Algorithmic improvements that make a smaller model as good an LLM as GPT-3 would not count for the purpose of this question.
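For a rough sense of scale, here is a back-of-the-envelope sketch of the compute that definition implies. It is not part of the resolution criteria. It assumes the common rule of thumb of about 6 FLOPs per parameter per training token and a training set of about 300 billion tokens; neither number appears in this question, and the function names are made up for illustration.

```python
# Rough training-compute estimate and the sustained rate needed to finish in 1 hour.
N_PARAMS = 175e9            # parameters, from the description
N_TOKENS = 300e9            # ASSUMPTION: training tokens, not stated in this question
FLOPS_PER_PARAM_TOKEN = 6   # ASSUMPTION: rule of thumb for forward + backward pass
ONE_HOUR_SECONDS = 3600


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs as 6 * parameters * tokens."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def required_flops_per_second(total_flops: float, seconds: float) -> float:
    """Sustained throughput needed to spend total_flops within the given time."""
    return total_flops / seconds


total = training_flops(N_PARAMS, N_TOKENS)
rate = required_flops_per_second(total, ONE_HOUR_SECONDS)
print(f"total training compute: {total:.2e} FLOPs")
print(f"sustained rate for 1 hour: {rate:.2e} FLOP/s")
```

Under these assumptions the total is about 3.15e23 FLOPs, so a one-hour run needs roughly 8.75e19 FLOP/s sustained. Whether $10,000 of purchasable hardware reaches that by the deadline is the open question; the sketch assumes perfect utilization and says nothing about memory or energy.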

What if it's trained on a synthetic curriculum instead of raw Common Crawl?

obviously resolves NO lmao (do the math guys, it's not close)

bought Ṁ100 NO from 23% to 20%
sold Ṁ59 YES

Please define GPT-3 equivalent

Equivalent performance or number of parameters? If there are algorithmic breakthroughs that allow similar model performance with far fewer parameters, would this count?

@ChrisProsser clarified in the description, does that answer your question?

@Bayesian yes, that clarified it, thanks.
