How many billion dollar AI training runs will occur in 2024?

This market resolves to the number of AI training runs that cost a total of $1 billion nominal dollars or more that are completed in 2024.

The majority of the training cost must occur in 2024 to count for the purposes of this market, but it is permissible for some of the training to have occurred earlier. The cost of the training run may include hardware, electricity, personnel, inference, pre-training, data acquisition, fine-tuning, etc., but only insofar as these are new expenditures for the purpose of training a new model (i.e., dataset acquisition associated with previous models will not count). Evals or other types of model testing, legal fees, licenses, and regulatory fines will not count toward the cost.

Caveat mercator: I foresee many edge-cases. Clarifications may be added to this market in the initial months after market creation to be in line with the spirit of the market.


GPT-3 was 2020, approx $5 million

GPT-4 was 2023, approx $60 million

Llama 2 was 2023, approx $10 million

A ~2 orders of magnitude jump across multiple models in one year? You'd have to be very confident in a model's abilities to sink $1B into a run, never mind having the money in the first place.
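As a back-of-the-envelope check on the jump sizes above (using the thread's informal cost figures, which are rough estimates, not audited numbers):

```python
import math

# Rough training-cost estimates quoted in this thread (nominal USD).
# These are informal figures, not audited numbers.
costs = {
    "GPT-3 (2020)": 5e6,
    "GPT-4 (2023)": 60e6,
    "Llama 2 (2023)": 10e6,
}

target = 1e9  # the $1 billion threshold this market asks about

for model, cost in costs.items():
    jump = math.log10(target / cost)
    print(f"{model}: ${cost:,.0f} -> $1B is a ~{jump:.1f} order-of-magnitude jump")
```

So from GPT-4's estimated cost, $1B is "only" about a 1.2 order-of-magnitude jump, while from GPT-3 or Llama 2 it's roughly 2.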

@Tomoffer do those numbers include engineering salaries, data acquisition, and RLHF salaries like this market does, or only compute? Also I'm not sure if this market includes the cost of R&D and test runs or only the final training run.

Sam Altman said GPT-4 cost over $100 million but didn't specify what that included.

Maybe the number of labs with expenditures >$1B in 2024 would be a better metric? Most public companies report annual expenditures, but they might not break out R&D expenses from the inference costs of Copilot/Bard.

@ahalekelly unfortunately it looks like there's not as much transparency in public companies as I thought, Google's financial report only splits their expenses into four broad categories. But with the Google Brain/DeepMind merger this year they shifted it from one category to another so you can get a rough idea from how much the categories changed. My guess is the Google DeepMind budget was around $0.7-0.9B for Q2 and $1.1-1.3B for Q3. Q4 earnings come out in 2 days but the total is probably $3-5B for 2023.

And we know from UK filings that DeepMind's budget, pre Google Brain merger, was $1.2B in 2021 and $731M in 2022.

If there are 3 models made that fulfill the criteria, would both the 1-2 option and the 3-5 option resolve YES, or just the 3-5?

@Rucker Only the 3-5 option

This seems like it will be pretty hard to resolve objectively. Huge clusters are being built out in various countries, and clear reporting on their costs may not be available even if we can tell that huge runs are happening. Even American labs are hesitant to describe their exact costs for runs. Government projects will be even more tight-lipped.

GPT-4 pretraining finished in 2022 but the public didn’t know this until 2023. This gap between training completion and deployment (if at all) will be an additional headache for resolution.

@AdamK - Indeed.
Furthermore, it is possible that some training runs might fail, and that might obscure their existence or costs.

And what if it is debatable whether a project is 1 run or 2?
Maybe an attempt at innovating for GPT-5 will cost $1 billion and fail, but then OpenAI might restart the project, fix the errors, and announce that sixth iteration as the product GPT-5. Is that 2 runs, or 1 doubled-up run?
How would we even get the information needed to delineate, let alone argue, such a point if an edge case like that occurred?