Will the best LLM in 2027 have <1 trillion parameters?


Dec 31

11% chance





I would bet more, but betting NO at a low % is inefficient.

While I imagine most people and applications won't be using the "best LLMs", those best LLMs will be relying on at least that many parameters. We can assume we will be able to at least come close to Chinchilla-saturating, and probably fully saturate, 1.75 trillion parameters (GPT-4). I also have to imagine we'll be generating huge new swaths of synthetic data, several orders of magnitude more than was used for GPT-4, that we'll want to incorporate, and that means more parameters. Of course, we may get much more parameter-efficient, but more is always better, right? And if compute is also getting massively cheaper, which I'm sure it will be 4 years from now with NVIDIA's competition at full steam, then even if it's no big deal to run at fewer params, why not go big regardless if you're shooting for the best?
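For a rough sense of what "Chinchilla-saturating" 1.75 trillion parameters would take, here is a back-of-the-envelope sketch. It assumes the commonly quoted rule of thumb of about 20 training tokens per parameter, which is not a number from this thread, and the 1.75 T figure is the one assumed above, not a confirmed parameter count.

```python
# Rough Chinchilla-style data requirement.
# ASSUMPTION: ~20 training tokens per parameter (a commonly quoted rule of thumb).
TOKENS_PER_PARAM = 20

def chinchilla_tokens(params: float) -> float:
    """Approximate number of training tokens needed to 'saturate' a model."""
    return TOKENS_PER_PARAM * params

# 1.75e12 is the GPT-4 parameter count assumed in this thread, not a confirmed figure.
gpt4_params = 1.75e12
tokens = chinchilla_tokens(gpt4_params)
print(f"~{tokens / 1e12:.0f} trillion tokens")
```

Under those assumptions the data requirement lands in the tens of trillions of tokens, which is why large amounts of synthetic data matter to the argument.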

predictedNO

@TomPotter I guess I could imagine a scenario where compute is most efficiently used with smaller parameter matrices, where it really comes down to the ratio: a tradeoff between parameters and training cycles. Maybe that ratio gets pushed very far toward cycles over parameters because of whatever the techniques of the time demand. Still, 29% seems too high.

predictedNO

@TomPotter

I think there may be an answer described here:

https://www.youtube.com/watch?v=1CpCdolHdeA

at 8:00 in. He says that the amount of data and the size of the model [number of parameters] both scale as sqrt(compute).

In the few minutes before that, he describes the money going into compute scaling from $100M for GPT-4 to $1B to $10B, multiplied by the increase in compute per dollar from new GPU tech. So of course there will be a huge increase in the compute going toward the best 2027 model. Dario was only talking through 2025, so 2027 is another 2 years beyond that, and we're already talking ~100-300x GPT-4 compute in 2025 based on Dario's numbers.

So that means we should expect parameters to be ~ sqrt(300) * 1.75 T ≈ 30.3 T.
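Here is a quick sketch of that arithmetic across the whole 100-300x range, not just the top end. It takes the sqrt(compute) scaling and the 1.75 T GPT-4 figure as given from this thread; both are assumptions, and the function name is just for illustration.

```python
import math

# ASSUMPTION: GPT-4 has 1.75 trillion parameters (the figure used in this thread).
GPT4_PARAMS_T = 1.75

def scaled_params(compute_multiplier: float, base_params_t: float = GPT4_PARAMS_T) -> float:
    """Parameter count in trillions, assuming parameters scale as sqrt(compute)."""
    return base_params_t * math.sqrt(compute_multiplier)

# Low and high ends of the ~100-300x GPT-4 compute range mentioned above.
for mult in (100, 300):
    print(f"{mult}x compute -> ~{scaled_params(mult):.1f} T parameters")
```

Even the low end of the range (100x compute) comes out at about 17.5 T, well above the 1 T threshold in the question.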