
@traders So it seems that such a model exists if you mash a bunch of junk together, but it is essentially useless. This happened prior to the market being created, and yet the market was made anyway, so it seems like a useless mass of parameters probably shouldn't count, even if it does break 100T params?
Thoughts?
cc from the New Year Resolutions call @Bayesian @Kearm @Kearm20
@Gen I think it makes sense if you consider an "AI model" to be something that, in essence, is expected to perform the functions of an AI model. A set of weights that performs no useful function can be arbitrarily big, but will it be fair to call that an AI model?
There's a comment below in this market's discussion which correctly points out that the number of parameters is set in stone as soon as they are initialized; technically, no further training is necessary if inflating the count is your only goal. But this clearly isn't what the market was intended to predict. My understanding is that it was intended to predict whether a model of such magnitude would exist in the global market for AI models.
@Gen Given that there is no description, I think most people would assume that the title meant after the market was created. Otherwise what's the point? This seems to be validated by the fact that the market never went above 50% even after someone dug up this old model in the comments
@Gen This is what I understood when I bet on this market and after reading the comments: "An AI LLM with 100 trillion parameters is released by any significant, although not necessarily major, AI lab or research institution, and is made available to the public, or with researcher-only access, or something similar. And finally, the model released is in some way competitive with current SOTA LLMs on trusted benchmarks".
Judging by the trades, I'd say I was not the only one with a similar understanding. I'm aware this is just one interpretation. On the other hand, one could claim that maybe the YouTube, Facebook, or Netflix algorithms have had more than 100T parameters for some years, and that they are AI-based.
@Juanz0c0 There is absolutely zero chance any of those algorithms had 100T parameters at any point. The sheer cache/die interconnect speed requirements for running something like that at anywhere near meaningful speeds are beyond even current hardware's capabilities, let alone what was available "some years" ago. GPT-4.5 is rumored to be somewhere in the 10–20T range, and even that was essentially unservable just under a year ago.
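Quick back-of-envelope on why, under my own assumptions about precision and token rate (none of these numbers are from an official source):

```python
# Rough serving requirements for a *dense* 100T-parameter model.
# Assumptions (mine): fp16 weights, every weight read once per generated
# token, no MoE sparsity or clever batching to amortize the reads.

params = 100e12            # 100 trillion parameters
bytes_per_param = 2        # fp16
weights_bytes = params * bytes_per_param
print(f"Weights alone: {weights_bytes / 1e12:.0f} TB")          # ~200 TB

tokens_per_second = 10     # a modest interactive generation speed
bandwidth = weights_bytes * tokens_per_second
print(f"Memory bandwidth needed: {bandwidth / 1e15:.0f} PB/s")  # ~2 PB/s

# An H100's HBM is on the order of ~3 TB/s, so that's hundreds of
# accelerators' worth of memory bandwidth, perfectly utilized, just to
# stream the weights, before any interconnect or compute overhead.
```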
@Sss19971997 yes, if it reaches 100 trillion (I don't remember llama2 param count, and yes 1400 of them would be a shit ton)
@WieDan Yes, but B100s are presumably a lot more expensive too, and companies will take a fair bit of time to set up their clusters, especially if they recently set up H100s. And then the training run for a 100-tril-param model takes a lot of time on top of that, so I don't think it'll happen.
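For a sense of the training time, here's a rough sketch under my own assumptions (Chinchilla-style ~20 tokens per param, the standard 6*N*D FLOPs estimate, and B100-class chips at ~2 PFLOP/s with 40% utilization; all of those numbers are guesses, not official specs):

```python
# Rough training-compute estimate for a dense 100T-param model.
N = 100e12                      # parameters
D = 20 * N                      # training tokens (Chinchilla-ish heuristic)
flops = 6 * N * D               # ~1.2e30 FLOPs
print(f"Training compute: {flops:.1e} FLOPs")

gpus = 100_000                  # a very large hypothetical cluster
flops_per_gpu = 2e15 * 0.4      # assumed peak * utilization
years = flops / (gpus * flops_per_gpu) / (86400 * 365)
print(f"~{years:.0f} years on {gpus:,} such GPUs")   # ~476 years
```

Even if my per-chip numbers are off by an order of magnitude, it's still decades of wall-clock time at Chinchilla-optimal token counts.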
@firstuserhere GPT-1 was 117M, GPT-2 was 1.5B, GPT-3 was 175B (the trend under the old scaling law).
GPT-4 was 1.8T with an MoE setup.
So historically, param count has 10x'd per generation.
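Taking those numbers at face value (and the 1.8T figure is itself only a rumor), the extrapolation to 100T is short:

```python
import math

start = 1.8e12                       # rumored GPT-4 param count
target = 100e12
generations = math.log10(target / start)
print(f"{generations:.2f} generations of 10x")   # ~1.74, i.e. roughly "GPT-6"
```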
https://arxiv.org/pdf/2202.01169.pdf
I'm not looking closely at this paper rn, and it predates Chinchilla, but conceptually it vaguely seems like the performance boost from adding experts saturates past GPT-4 levels, although I'm not sure whether this applies to inference cost/speed.
@firstuserhere You never said it had to be any good. Making a bad model with 100T parameters ought to be rather easy, as long as you have the space to store them (I do not, however)
@retr0id exactly. 100T might not be heaps, but it's enough to not bother with unless you think you're going to achieve something.
@firstuserhere The model exists before it's done training. It exists as soon as the parameters are initialized.
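A minimal sketch of that point, assuming PyTorch (the meta device means the 100T weights are never actually allocated):

```python
import torch

# The parameter count is fixed the moment the parameters are defined;
# no training, and with the meta device not even any memory, is needed.
layer = torch.nn.Linear(10_000_000, 10_000_000, bias=False, device="meta")
print(sum(p.numel() for p in layer.parameters()))   # 100_000_000_000_000
```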
@Supermaxman Resolves to the best estimates possible. I'll take a poll of AI researchers at top 3/5 AI labs in that case.
