Will it cost less than 100k USD to train and run a language model that outperforms GPT-3 175B on all benchmarks by the end of 2024?
78%
chance

The final model does not have to cost less than $100k in total. If a model outperforms GPT-3 before $100k has been spent on training, the market resolves YES, even if the model continues to be trained after that point.

Clarification: "all benchmarks" means all benchmarks in the original GPT-3 paper.

Rick (@meefburger) bought Ṁ0 of YES

Is it necessary for the new model to have been tested on all the benchmarks published for the 175B model in the original GPT-3 paper for this to resolve YES?

Vincent Luczkow

@meefburger Yes. I will make exceptions for any benchmarks that are/become unavailable, or are otherwise very difficult to access (e.g. very onerous licensing). I may consider making an exception for a model that completely blows GPT-3 out of the water but skips some minor benchmarks. But since the market only resolves YES in the case where the model is quite cheap to use, it seems likely that actually testing it against all the benchmarks will be feasible.

Tom Cohen

$100k nominal or inflation-adjusted? If the latter, adjusted from what starting point?

Vincent Luczkow

@TomCohen Nominal

Tom Cohen

@vluzko Sweet, thanks for clarifying!

Valery Cherepanov is predicting YES at 62%

I think it may already be possible. GPT-3 used 3e23 FLOPs. GPT-30B by MosaicML used 3 times less, 1e23 FLOPs, and cost $450k. Flan-T5-XXL used 3 times less again, 3.3e22 FLOPs, so naively it should cost around $150k, but probably <$100k because Google has access to cheaper hardware.
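As a rough sketch, here is that proportional-cost arithmetic (the FLOP counts and MosaicML's $450k price point come from the figures above; linear cost-per-FLOP scaling is an assumption):

```python
# Back-of-the-envelope cost estimates, assuming training cost scales
# linearly with FLOPs, anchored on MosaicML's ~$450k for ~1e23 FLOPs.
MOSAIC_FLOPS = 1e23       # MosaicML's 30B GPT run (figure from this comment)
MOSAIC_COST_USD = 450_000

def naive_cost(flops: float) -> float:
    """Linearly scale MosaicML's price point by FLOP count."""
    return MOSAIC_COST_USD * flops / MOSAIC_FLOPS

print(f"GPT-3       (3.0e23 FLOPs): ~${naive_cost(3.0e23):,.0f}")  # ~$1,350,000
print(f"Flan-T5-XXL (3.3e22 FLOPs): ~${naive_cost(3.3e22):,.0f}")  # ~$148,500
```

The ~$148,500 output is where the "around $150k" estimate comes from; the <$100k claim then rests entirely on Google's cheaper hardware, not on this scaling.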

Does Flan-T5-XXL outperform GPT-3 on all benchmarks? I don't know. "All benchmarks" isn't even a reasonable criterion in general: you can construct a benchmark that specifically prefers GPT-3 over all current models.

But it is significantly better on MMLU 5-shot (55 vs. 44), which is a strong signal that it might actually be generally better.

I would give 95% that a model reasonably better than GPT-3 will be trained for <$100k by 2024, and maybe 85% that this market resolves YES (the model may not be public, it may be hard to estimate the cost, and it may be hard to say it's clearly better than GPT-3).

Vincent Luczkow

@ValeryCherepanov It's specifically all benchmarks in the original GPT-3 paper, not "all benchmarks imaginable".

Valery Cherepanov bought Ṁ10 of YES

MosaicML trained a GPT-3-quality LLM for $450k about a month ago.

Gigacasting sold Ṁ5 of NO

Stable Diffusion trained for $600k, arguably ~$200k at aggressive spot pricing.

https://twitter.com/jackclarkSF/status/1563957173062758401

Gigacasting

Big labs continue to be terrible at training efficiency (e.g. one paper beat AlphaGo with ~50x less compute from better sampling and architecture). With stability.ai in play **and** their open-source approach, someone might pull this off.

Gigacasting bought Ṁ5 of NO

Cost ~$10m to train in 2020. Costs ~$1m to train 4.5 years later (cost halves every 18 months). That leaves ~10x improvement in approach to tie it, and much more to exceed it across the board. Note that costs are still ~$10m today to train PaLM/Megatron/Chinchilla, with no evidence of training (rather than inference) efficiency gains.
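A minimal sketch of that halving arithmetic (the ~$10m 2020 cost and the 18-month halving period are the comment's stated assumptions):

```python
# Projected GPT-3 training cost under "cost halves every 18 months",
# starting from ~$10m in 2020 (assumptions stated in the comment above).
INITIAL_COST_USD = 10_000_000
HALVING_PERIOD_YEARS = 1.5

def cost_after(years: float) -> float:
    """Projected cost after `years` of hardware/price improvements."""
    return INITIAL_COST_USD * 0.5 ** (years / HALVING_PERIOD_YEARS)

cost_end_2024 = cost_after(4.5)   # 3 halvings -> ~$1.25m
gap = cost_end_2024 / 100_000     # ~12.5x improvement still needed
print(f"~${cost_end_2024:,.0f} by end of 2024, ~{gap:.1f}x short of $100k")
```

Three halvings over 4.5 years gives ~$1.25m, so reaching $100k requires roughly another order-of-magnitude gain from better approaches, which is the ~10x gap the comment refers to.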