Dec 31
Will GPT-4 have over 1 trillion parameters?
58% chance

In speculation about GPT-4, estimates of the parameter count ranged anywhere from 175 billion (the same as GPT-3) to over 100 trillion.

This market will resolve YES if GPT-4 has over 1 trillion parameters; otherwise it will resolve NO.

L sold Ṁ215 of NO

I still think it's unlikely they scaled that much, but after thinking about it, I'd be surprised if they didn't scale a bunch

R2D2 is predicting NO at 54%

@L Against my own self-interest as a NO bettor, I made a few calculations and I'm afraid it could be >1T (f*** OpenAI!!!! Couldn't you just go with a big Chinchilla?! No, you had to go & kick PaLM's ass....) However, I could still hope the exact number doesn't get revealed until next year, hehe. I'll think about whether to show my calculations or not. I'll try to double-check with a few acquaintances of mine beforehand (they're researchers at Ivy League unis).

Patrick Delaney is predicting NO at 56%

@R2D2 Which unis? Doesn't the Ivy League suck at AI?

L is predicting NO at 58%

@R2D2 you think it's MoE?

kenakofer is predicting NO at 58%

If GPT-4 has exactly 1 trillion parameters, this will resolve NO, correct?

Victor Li

@VictorLi Oh yeah, just to clarify, this is the new acc of the market maker lmfao (the markets on the old acc will still be resolved, don't worry)

Graham Poulter bought Ṁ15 of NO

100 trillion is ridiculous for now; maybe in 10 years, and once we've figured out how to avoid evaluating all of them to make a prediction.

I think OpenAI's hardware is going to struggle with 1 trillion.

I estimate that GPT-4 is more in the range of 300-500 billion parameters.

Leo Spitz bought Ṁ60 of YES

Semafor: The latest language model, GPT-4, has 1 trillion parameters.

They were also the first to report that Bing is GPT-4.

Patrick Delaney bought Ṁ10 of NO

@LeoSpitz No, they did not; it came from this tweet, directly from a VP at Microsoft. https://twitter.com/yusuf_i_mehdi/status/1635733309631389696

Patrick Delaney is predicting NO at 64%

@PatrickDelaney They reported that Bing will eventually include GPT-4, without claiming any knowledge that it "is" GPT-4; that claim came from the above tweet.

Patrick Delaney is predicting NO at 64%

@NexVeridian Right, I know, saw that, thank you... but "poised to incorporate" is not the same as "is."

GPT-PBot

GPT-4 with trillion parameters,
AI language model of great measure,
Will it surpass its former gain?
Better bet on that, my friend, it's insane.

Mira bought Ṁ200 of YES

If it can be anywhere from ~0 to 100 trillion, the expected value is 50 trillion. So this market is priced too low.

Eduardo Filippi is predicting NO at 35%

@Mira Probability within that range is not evenly distributed. For example, the age of a human can be anywhere between 0 and 122, but that doesn't mean the expected age of a human is 61 years.

R2D2 is predicting NO at 31%

@Mira this couldn't be more wrong. 1) interval constraints are not the same as uniform probabilities https://www.stat.berkeley.edu/~stark/Preprints/constraintsPriors13.pdf 2) even if they were (and they aren't), there's no reason to assume a uniform pdf over said interval 3) using non-informative priors when you actually have prior knowledge is a big mistake. And we do have prior knowledge here: we know that Sam Altman debunked the 100 trillion fake news in the StrictlyVC interview. Also, there's no way they could train a 100 trillion attention-based model in Q1-Q2 '22, just on elementary cost considerations alone. So nope, the expected value is not what you claim it to be.
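To illustrate point 2 with toy numbers: the same interval gives very different expected values and P(>1T) depending on the prior. This is a minimal sketch; the range comes from the market description, and the three distributions (uniform, log-uniform, and a log-normal centered near 400B) are illustrative assumptions, not anything OpenAI has said.

```python
import numpy as np

rng = np.random.default_rng(0)
lo, hi = 175e9, 100e12   # speculated range from the market description
n = 1_000_000

# Uniform prior over the raw interval (what the "EV = 50T" argument assumes)
uniform = rng.uniform(lo, hi, n)

# Log-uniform prior over the same interval (uninformative about order of magnitude)
log_uniform = np.exp(rng.uniform(np.log(lo), np.log(hi), n))

# A rough "informed" prior: log-normal centered near 400B params (illustrative only)
informed = np.exp(rng.normal(np.log(400e9), 1.0, n))

for name, s in [("uniform", uniform),
                ("log-uniform", log_uniform),
                ("log-normal@400B", informed)]:
    print(f"{name:>16}: E[N] = {s.mean():.2e}, P(N > 1T) = {(s > 1e12).mean():.2f}")
```

The uniform prior reproduces the ~50T expected value; the other two drop it by an order of magnitude or more and give very different probabilities of exceeding 1T, which is the point: the interval alone pins down neither number.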

Jim Hays is predicting NO at 31%

Don’t feed the trolls :)

R2D2 is predicting NO at 32%

@JimHays you sure she's trolling? She did buy >10000 YES shares

Robert Cousineau is predicting NO at 32%

@JimHays if Mira is a troll, they're throwing away a very large amount of mana.

I just don't get their actions though. If I had more mana I'd bid this down further, but I don't.

Vincent Luczkow bought Ṁ100 of NO

@Mira this is... a joke, right?

Jack is predicting NO at 32%

Mira is trading very reasonably (buying a lot of shares at a reasonable price), and posting a comment that is an obvious troll :)

Jim Hays is predicting NO at 32%

👆🏼

R2D2 bought Ṁ36 of NO

@jack If she sells them (with enough time) before market close, then that's reasonable trading; otherwise... :-)

L sold Ṁ1,958 of NO
R2D2 is predicting NO at 37%

A model that completed training in August 2022 has >1T params? Seems unlikely? Also, the "paper" 🤮 seems to indicate that it was trained for way longer than other models, which could suggest a "big Chinchilla" kind of model/training schedule, rather than a "bust PaLM's ass" kind.

R2D2 is predicting NO at 37%

@R2D2 GPT-5 is being trained on 25k GPUs, so it's probably >1T, but I don't think GPT-4 is as big. At least, not the 8k-context version.

firstuserhere

@R2D2 lmao agreed on "paper" barf. It is a 98-page report, half of which is an ad.

firstuserhere

@firstuserhere Yep, I don't know what info I'm not aware of has pushed all these markets so high, but my estimate remains that GPT-4 probably used 300-500 billion parameters, no early stopping, the same cutoff date as 3.5, but trained for much, much longer.

R2D2 is predicting NO at 37%

@firstuserhere Yep, if 1) it ever gets submitted somewhere and 2) I happen to be the reviewer, I’m gonna hard-reject it sooner than I can say “ClosedAI”. I’m gonna be like, (Reviewer 2)^512

firstuserhere

@firstuserhere hahahaha

R2D2 is predicting NO at 36%

@firstuserhere I agree, I don't think it was bigger than the biggest PaLM.

firstuserhere

@R2D2 Yep yep, using that as my upper bound as well. I've heard comments here and there from multiple OpenAI people, and even from saltman, which fit with that estimate.

Nikola bought Ṁ100 of NO

At around 1:46:00 of the Microsoft "reinventing productivity" event, they described their LLM, probably GPT-4, as having "billions" of parameters. Not sure how much info that is.

NexVeridian

Brett Winton from ARK Invest says 80B parameters.

https://twitter.com/wintonARK/status/1636100290264043520

Erick Ball

@NexVeridian Wait, why would anybody think 3.5-turbo is 20B parameters? That would make it 4x smaller than Chinchilla and 9x smaller than GPT-3. I know the scaling laws have swung more towards data, but that's ridiculous.

celeste is predicting NO at 64%

@ErickBall why else do you think it's so much cheaper?

dayoshi

@ErickBall Chinchilla optimized for training FLOPs, but if you're spending more money on inference than training (which presumably OpenAI is), you'll want to go even smaller. The LLaMA paper made this point. Basically, if you look at the empirical data from Chinchilla etc., we haven't saturated the performance of the smaller models, so if you're willing to spend extra training FLOPs you can get more juice out of a smaller model.
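A rough way to see the cost side of that trade-off, using the standard approximations that training costs about 6·N·D FLOPs and inference costs about 2·N FLOPs per token. The model sizes and token counts below are illustrative assumptions, not OpenAI's actual numbers, and the sketch says nothing about whether the smaller model matches quality; that claim comes from the Chinchilla/LLaMA curves mentioned above.

```python
# Back-of-the-envelope accounting for the train-longer-vs-train-bigger trade-off.
# Approximations: training ~ 6*N*D FLOPs, inference ~ 2*N FLOPs per token.
# Model sizes and token counts are illustrative, not OpenAI's actual numbers.

def total_flops(n_params, train_tokens, served_tokens):
    return 6 * n_params * train_tokens + 2 * n_params * served_tokens

big   = dict(n_params=70e9, train_tokens=1.4e12)   # Chinchilla-style: ~20 tokens/param
small = dict(n_params=35e9, train_tokens=4.0e12)   # smaller, trained well past "optimal"

for served in (0, 1e12, 10e12):
    b = total_flops(**big, served_tokens=served)
    s = total_flops(**small, served_tokens=served)
    print(f"served {served:.0e} tokens: 70B total {b:.2e} FLOPs, 35B total {s:.2e} FLOPs")
```

The smaller model costs more to train here, but its per-token inference cost is half; once enough tokens are served, its total FLOPs come out lower, which is the point about inference-dominated spending.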

Erick Ball sold Ṁ10 of YES

@dayoshi well when you put it that way, yeah, that actually makes a ton of sense. Now I don't really understand why this market is so high.

Jack bought Ṁ200 of YES
Noa Nabeshima is predicting YES at 65%

If GPT-4 is dense and 2 OOMs of compute more than GPT-3 (3.14e23 FLOP according to LambdaLabs), it's just under 1T parameters:

Eyeballing it, it looks like 2.5 OOMs to get 1T if trained Chinchilla-optimally, which GPT-4 might not be because of inference costs?

Noa Nabeshima is predicting YES at 65%

@NoaNabeshima A little more than 2.5 OOMs
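For a rough sense of the arithmetic behind this eyeball: a back-of-the-envelope sketch using the common approximations C ≈ 6·N·D and the Chinchilla rule of thumb D ≈ 20·N, so N ≈ sqrt(C/120). The GPT-3 compute figure is the LambdaLabs estimate cited above; everything else is my own assumption.

```python
import math

GPT3_FLOPS = 3.14e23  # LambdaLabs estimate for GPT-3 training compute, cited above

def chinchilla_optimal_params(compute_flops):
    # C ~ 6*N*D with D ~ 20*N  =>  C ~ 120*N^2  =>  N ~ sqrt(C/120)
    return math.sqrt(compute_flops / 120)

for ooms in (2.0, 2.5, 3.0):
    c = GPT3_FLOPS * 10 ** ooms
    n = chinchilla_optimal_params(c)
    print(f"+{ooms} OOMs over GPT-3: C = {c:.2e} FLOPs -> N ~ {n / 1e12:.2f}T params")
```

Under those assumptions, 1T dense parameters lands a little past 2.5 OOMs above GPT-3's training compute, matching the eyeball above; an inference-skewed (over-trained) run would need even more compute to reach 1T.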

Noa Nabeshima is predicting YES at 61%

@NoaNabeshima https://www.metaculus.com/questions/9519/flops-used-for-gpt-4-if-released/

The Metaculus community prediction assigns ~10% probability to it being >2 OOMs greater.

Noa Nabeshima bought Ṁ100 of NO

@NoaNabeshima so looks like this is mostly a mixture of experts market?
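If it is a mixture-of-experts question, the total-versus-active distinction matters a lot for how this market resolves. A minimal counting sketch follows; the layer count, dimensions, expert count, and top-k routing below are made-up illustrative values (roughly GPT-3-shaped for the dense case), not anything known about GPT-4.

```python
# Toy parameter counting for a decoder whose feed-forward layers are MoE.
# All dimensions are illustrative, not GPT-4's actual architecture.

def transformer_param_counts(n_layers, d_model, d_ff, n_experts, top_k):
    attn = n_layers * 4 * d_model * d_model          # Q, K, V, output projections
    ffn_one_expert = n_layers * 2 * d_model * d_ff   # up- and down-projection per expert
    total = attn + n_experts * ffn_one_expert        # parameters stored
    active = attn + top_k * ffn_one_expert           # parameters touched per token
    return total, active

# Dense baseline: every parameter is used for every token
dense_total, dense_active = transformer_param_counts(
    n_layers=96, d_model=12288, d_ff=4 * 12288, n_experts=1, top_k=1)

# MoE variant: 16 experts per FFN, 2 routed per token
moe_total, moe_active = transformer_param_counts(
    n_layers=96, d_model=12288, d_ff=4 * 12288, n_experts=16, top_k=2)

print(f"dense: total {dense_total/1e9:.0f}B, active/token {dense_active/1e9:.0f}B")
print(f"MoE:   total {moe_total/1e9:.0f}B,  active/token {moe_active/1e9:.0f}B")
```

In this toy configuration the MoE model is well over 1 trillion parameters by total count while only touching a few hundred billion per token, which is exactly the ambiguity a mixture-of-experts GPT-4 would introduce for this market.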

Noa Nabeshima is predicting YES at 61%

@NoaNabeshima oh but also they could have separate params for images

firstuserhere sold Ṁ66 of NO

@NoaNabeshima this seems surreal

Erick Ball

@NoaNabeshima Does this figure imply that it was trained with >1000x the compute of any previous OpenAI LLM?

Noa Nabeshima is predicting YES at 59%

@ErickBall The grey dots are probably newly trained models, so this probably doesn't imply that.

R2D2 bought Ṁ30 of NO

@NoaNabeshima No, the "paper" explicitly states that this extrapolation was performed before the end of training, and training completed in 2022, so they're not newly trained models.

grahambo sold Ṁ22 of YES

I've recently flipped from LONG to SHORT because of this: https://www.theverge.com/23560328/openai-gpt-4-rumor-release-date-sam-altman-interview

rockenots bought Ṁ50 of YES

This guy predicted the release date and the multimodality. I'm gonna trust them on the parameter count.

rockenots is predicting YES at 40%

@rockenots Tweet that leaked the release date and multimodality: https://twitter.com/apples_jimmy/status/1629939273469394945

Erick Ball

@rockenots 125 trillion would be a ridiculous increase over 1-2 trillion, even bigger than the 100 trillion that Sam Altman confirmed was bs.

Vadim

I don't see the number of parameters in the blog post or the technical report.

Nikola is predicting NO at 40%

@wadimiusz From the technical report: "GPT-4 is a Transformer-style model [33] pre-trained to predict the next token in a document, using both publicly available data (such as internet data) and data licensed from third-party providers. The model was then fine-tuned using Reinforcement Learning from Human Feedback (RLHF) [34]. Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar."

Arguably this market should be taken down. I don't have a strong take on this though.

Vadim

@Nikola Perhaps we need to wait, not resolve as ambiguous right away?

Nikola is predicting NO at 40%

@wadimiusz I don't just mean "OpenAI didn't say, therefore this market is ambiguous"; I also mean something like "OpenAI doesn't want this public, and this market is a mechanism for making it public, so this market should be taken down because infohazards".

Adam

@Nikola This feels like a ridiculous take to me; why should we care if some company would prefer that something stayed a trade secret?

Mira is predicting YES at 61%

@Nikola That sounds like their problem, not mine.

Yoav

@Nikola Even if this market is taken down, someone will inevitably post it here and it will be revealed.