For this market, a large language model is a language model trained using an amount of compute within an order of magnitude of the compute used to train the largest language model.
It is not just based on parameter count.
I'll accept starting with a pretrained model and then doing additional training/finetuning, as long as the amount of compute for the latter component is large enough.
By "publicly" I just mean that it's well known that they trained the model. If, say, the Chinese government almost certainly has one but it's not definitively confirmed, then that doesn't count.

BLOOMChat is a new one that seems relevant (trained by SambaNova).

At this point I think it is likely that at market close the best estimates we have for at least some models (particularly GPT-4) will be based on scaling laws. Personally I am fine with this; I don't expect errors in those estimates to produce a difference that could shift the resolution. Here are some proposals (sketched in code below):
- If the best estimates of compute (based on scaling laws, architectural details, details of the organization, etc.) do not show the market is close (either well below 20 or well above), I will resolve NO or YES respectively.
- If the best estimates show the market is close (say 17-23 orgs), I will resolve N/A.
- If we end up not even having benchmark data for many models (so we can't do scaling-law estimates), I will either resolve N/A or propose new resolution criteria.
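For concreteness, here's a minimal encoding of that rule (the 17-23 band is just the number floated above):

```python
# Minimal sketch of the proposed resolution rule; band boundaries as floated above.
from typing import Optional

def resolve(estimated_orgs: Optional[int]) -> str:
    if estimated_orgs is None:        # no benchmark data, so no scaling-law estimates
        return "N/A or new criteria"
    if 17 <= estimated_orgs <= 23:    # too close to call
        return "N/A"
    return "YES" if estimated_orgs > 20 else "NO"
```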
@vluzko Can you give examples for the models on my list, at least some of them?

@ShadowyZephyr not doing the calculations myself is part of the goal of the market. Predictors who take the time and effort to do the calculations in advance get mana (in expectation) by having better predictions, and in return I get to just use their calculations instead of calculating everything myself.
This paper gives some scaling laws and some FLOPs-per-token formulas, if you want a general idea of the procedure.
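For a rough idea of what that procedure looks like, here's a sketch using the standard dense-transformer approximation C ≈ 6·N·D (the parameter and token counts below are made-up placeholders, not estimates of any real model):

```python
# Rough training-compute estimate via the common approximation C ≈ 6·N·D
# (a dense transformer does ~6 FLOPs per parameter per training token).
# All parameter/token figures below are made-up placeholders, not real estimates.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

largest = training_flops(params=1e12, tokens=10e12)     # hypothetical biggest run
candidate = training_flops(params=175e9, tokens=300e9)  # GPT-3-scale run

print(f"largest   ≈ {largest:.2e} FLOPs")
print(f"candidate ≈ {candidate:.2e} FLOPs")
print("within 1 OOM of largest:", candidate >= largest / 10)
```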
@vluzko Well, it doesn't say that in the description. Dumping that on us after we bet seems unfair. If I had known I'd have to come up with my own estimates of compute, and THEN convince you that they are more accurate than others', I would not have bet. And, again, this doesn't help at all with complete black-box models, which I expect will be more than just GPT-4 by the end of 2023.
I sold, and I wouldn’t recommend betting unless you want to get involved in the effort to estimate compute subjectively.
I expect LLM training compute expenditure to follow a power law, such that the largest LLM training expenditure at any time is 10x higher than the 20th largest.
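To make the 10x concrete: if spend falls off as a power law in rank, an exponent near 0.77 gives that ratio between rank 1 and rank 20 (exponent chosen purely for illustration):

```python
# If spend at rank r scales as r^(-alpha), the rank-1 / rank-20 ratio is 20^alpha.
alpha = 0.77                                           # illustrative, tuned to give ~10x
ratio = 20 ** alpha
print(f"rank-1 / rank-20 spend ratio ≈ {ratio:.1f}x")  # ≈ 10.0x
```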
It's not expenditure that's listed, it's compute. Which we can't actually figure out. (Not that we could figure out expenditure anyway, lmao.) If the market creator isn't willing to give a reasonable definition that can actually be agreed on, they ought to just resolve N/A.
@ShadowyZephyr I think dollar cost is a reasonable way to measure amount of compute. Other metrics are available. I think any measure will show a power law, though perhaps it will be a larger multiplier measured in ops.
@MartinRandall No, they aren't. GPT-4 is a complete black box, and estimates vary wildly. Estimates are not available for half of those. I really don't want to go back and check each company's papers for the 22 models I listed, but suffice it to say that even one of them not having compute listed technically makes the resolution impossible. If you think you can come up with fair figures, please tell me what they are.
And dollar cost spent on the model doesn't really measure compute; the GPUs used to train it are only a subset of the costs. There's R&D, paying people to sort out content, RLHF, people working on the front end, and a lot more. If you look at OpenAI's jobs page, they have tons of openings that don't have anything to do with training GPT-5 directly.
OpenAI (GPT-4)
Meta (OPT-175B, LLaMA-65B)
Anthropic (Claude-v1.3)
Eleuther (GPT-NeoX)
Adept (ACT-1)
Aleph Alpha (Luminous-supreme-control, Luminous-world)
Cohere (command-xlarge-nightly)
Baidu (Ernie 3.0 Titan)
Forefront (pythia-20b)
LAION (OpenAssistant)
Yandex (YaLM-100b)
Amazon (AlexaTM)
Huawei (PanGu-Σ)
Cerebras (CerebrasGPT)
Technology Innovation Institute (Falcon)
Microsoft (GPT-4 Prometheus, Megatron)
Nvidia (Megatron)
DeepMind (Gopher)
Google (PaLM 2, Gemini)
Bloomberg (BloombergGPT)
AI21 (Jurassic-1)
Alibaba (Tongyi Qianwen)
Even if you don't count Microsoft because they didn't use enough compute (which I doubt), and you count DeepMind and Google as the same because Google owns DeepMind, we have 20. The compute figures for some of these are debatable, but even if you reject a couple, the chance of a few new organizations jumping in this year is MUCH greater than 15%. Also, I may have missed a couple.
As for future developments, Apple is rumored to be training one currently, as is the British government, and IBM might be too. I wouldn't be surprised if AssemblyAI were as well.
@ShadowyZephyr A lot of these are not within 1 OOM of the compute used for GPT-4. Maybe only Gemini is in this range. So the count is 2, not 22.
Maybe 2-6 (instead of the 2 above), assuming GPT-4 training cost was >$100M.
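Rough arithmetic behind that guess (every figure here is an assumption for illustration, not a known number):

```python
# Back-of-envelope: converting a dollar training budget into FLOPs.
# All numbers are round assumptions for illustration only.
budget_usd = 100e6      # assumed >$100M spent on the training run itself
gpu_hour_usd = 2.0      # assumed price per A100-hour
peak_flops = 312e12     # A100 peak BF16 throughput, FLOP/s
utilization = 0.40      # assumed model FLOPs utilization

gpu_hours = budget_usd / gpu_hour_usd
total_flops = gpu_hours * 3600 * peak_flops * utilization
print(f"≈ {total_flops:.1e} FLOPs")  # ≈ 2.2e25 under these assumptions
```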
@MaximeRiche And you can prove this how, exactly?
Many companies don’t just say “we used this much compute.” But it is almost certainly more than 2.
Also, compute is not money. Costing $100M doesn't necessarily mean a certain amount of compute; they also had to pay workers to RLHF it and align it.
If we cannot find out the compute to determine which are within 1 OOM (difficult, because we know nothing about GPT-4), we should use the common-sense definition of LLM, which includes pretty much all of these.
Also, the market creator earlier included Eleuther, and Eleuther’s model is one of the smallest out of all of these.
If you're curious, here are the parameter counts for each company's best language model:
Tongyi Qianwen - 10 trillion
PanGu-Σ - 1.085 trillion
GPT-4 - Unknown, est. 1 trillion
Megatron - 1 trillion max
PaLM - 540 billion max
Gopher - 280 billion
Ernie 3.0 Titan - 260 billion
Luminous-world - 200 billion
Jurassic-1 - 178 billion
OPT - 175 billion
YaLM-100b - 100 billion
command-xlarge-nightly - 52 billion
Claude v1.3 - 52 billion
BloombergGPT - 50 billion
Falcon - 40 billion
GPT-NeoX - 20 billion
Pythia-20b - 20 billion
AlexaTM - 20 billion
CerebrasGPT - 13 billion
OpenAssistant - 12 billion
mpt-7b - 7 billion (excluded from my original list because they publicly said it only cost $200k)
ACT-1 - Unknown
Parameter count does not equal training cost, though. I doubt Tongyi Qianwen cost 10x more than all the other models. In fact, Alibaba's last 10-trillion-parameter model, M6, used more than an order of magnitude less compute than GPT-3.
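A quick sketch of why (the activation fraction for the sparse model is made up for illustration):

```python
# Raw parameter count overstates compute for sparse (MoE) models:
# only the parameters active per token enter the FLOPs estimate.
def training_flops(active_params: float, tokens: float) -> float:
    return 6 * active_params * tokens  # standard dense approximation C ≈ 6·N·D

dense = training_flops(active_params=175e9, tokens=300e9)          # GPT-3-scale dense run
sparse = training_flops(active_params=0.01 * 10e12, tokens=300e9)  # hypothetical MoE, ~1% active

print(f"dense  ≈ {dense:.1e} FLOPs")   # ≈ 3.2e23
print(f"sparse ≈ {sparse:.1e} FLOPs")  # ≈ 1.8e23, despite ~57x more total parameters
```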
@ShadowyZephyr I can't prove it and don't have inside information on the compute used for GPT-4.
I would be surprised if GPT-4 is not using >30x more compute than GPT-3. But no data is available on that.
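For reference, the GPT-3 paper reports roughly 3,640 petaflop/s-days of training compute, so a >30x multiplier (again, pure speculation) would put GPT-4 around 1e25 FLOPs:

```python
# GPT-3's reported training compute, converted to FLOPs; the 30x is the guess above.
gpt3_flops = 3640 * 1e15 * 86400   # petaflop/s-days -> FLOPs, ≈ 3.1e23
gpt4_guess = 30 * gpt3_flops       # ≈ 9.4e24 FLOPs under the >30x guess
print(f"GPT-3 ≈ {gpt3_flops:.1e}, GPT-4 guess ≈ {gpt4_guess:.1e} FLOPs")
```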
@MaximeRiche So where did you get 2-6 from, if you can't prove it? All these companies refer to their models as "Large Language Models".
@ShadowyZephyr The question is about how many orgs have done a training run within one OOM of the largest LM. If "largest" is about compute, and GPT-4 or Gemini counts, then a guess is 2-6 by end of year. If "largest" is about the number of parameters of the largest model ever trained, counting sparse models, then the count could be pretty large and above 20.
@MaximeRiche Your guess is just speculation; not all companies say the amount of compute they used. All the big players in the AI race probably won't, and I think it's fair to resolve based on whatever is referred to as a "Large Language Model", which all of these are.

@ShadowyZephyr I gave a definition of an LLM in the market description, and I am not going to modify it. Changing resolution criteria partway through is a bad idea, and I'm not interested in how many organizations declare themselves to have LLMs anyway. If you are interested in that question, you should make another market.

@vluzko Okay, well, how many of the ones I've listed do you think count? There's no way to make an objective resolution based on compute, because data isn't published for like half of these.
So you need to resolve N/A if you refuse to change the criteria.

@BenjaminCosman Certainly Brain, DeepMind, and OpenAI. I think also Meta, Anthropic, Eleuther, and Adept, though I'm less sure about exact compute costs for them. So let's say 10. And yes, I picked a very high number deliberately because I'm curious about a scenario where every large corp starts building/having their own LLMs.
Will more than 20 organizations publicly train large language models by 2024?