When will OpenAI release a more capable LLM?
Dec 31
2023: 0.2%
First half of 2024: 15%
Second half of 2024: 58%
2025+: 27%

(Mostly self-explanatory. To clarify, GPT-4.5 or GPT-5 would count. A new version of GPT-4 with a larger context window won’t.)


Reminder: it has to be more capable; if it’s just faster and has a large context window, that doesn’t count.

bought Ṁ25 2025+ YES

@ms It seems pretty likely that it gets a higher score on at least 1 benchmark and is thus "more capable".

@MiraBot if the benchmark is about something relevant to capabilities, and the model doesn’t get lower scores on other benchmarks, it’s probably “more capable”
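
The rule of thumb above can be sketched as a small check. This is only an illustration of the stated heuristic, not the market's official resolution procedure; the function name, the benchmark names, and the scores are all hypothetical.

```python
def is_more_capable(old_scores, new_scores, relevant):
    """Heuristic from the comment above: the new model must score higher on
    at least one capability-relevant benchmark and must not score lower on
    any benchmark that both models were evaluated on."""
    shared = old_scores.keys() & new_scores.keys()
    if not shared:
        return False  # nothing to compare
    no_regression = all(new_scores[b] >= old_scores[b] for b in shared)
    improved = any(
        new_scores[b] > old_scores[b] for b in shared if b in relevant
    )
    return no_regression and improved


# Made-up benchmark names and scores, for illustration only.
old = {"reasoning_bench": 0.80, "coding_bench": 0.67}
new = {"reasoning_bench": 0.85, "coding_bench": 0.67}
print(is_more_capable(old, new, relevant={"reasoning_bench"}))  # True
```

A model that is only faster or has a larger context window would leave every score unchanged, so the check returns False for it.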

sold Ṁ316 Second half of 2024 NO

I made a market with fortnightly options to see if anyone wants to bet on something more specific than before/after July:

Apples only ever claimed the December release was "potential", so any small delay could swing these markets based on the end of the year. If you think the rumors are believable but a delay is too likely for you to bet here, come bet on this market that allows until the end of Q1 2024:

bought Ṁ1,000 of 2023 NO

I'm confused about why I'm betting against @Mira on 2023. I don't suppose we'd care to state our trading reasons out loud? Mine is just that OpenAI folks denied it, and I don't expect them to tell lies that would be falsified on that short a timescale.

bought Ṁ20 of 2023 YES

@EliezerYudkowsky

As the second biggest yes holder, my reasoning is:

0) Mira is the biggest yes holder.

1) Ignoring all the rumors and denials, it would make sense as a response to Google claiming Gemini Ultra outperforms GPT-4. It would also line up with OAI announcing a bunch of safety things this last week, which could be them trying to show "balance" between safety and capabilities.


2) My understanding is that these rumors started with the 🍎&🌸 accounts. Other people picked up the rumor, and hype grew because they've been correct about things before. They have not backed down, and instead they've said that OAI is trolling.

3) After the hype started with 🍎&🌸, there was that screenshot posted to Reddit with 4.5 token prices. Then people started asking ChatGPT to identify its own model, and it said it was 4.5. The screenshot is what Sam was asked about when he said "Nah", and Depue said the self-identification is a hallucination. The case for a 2023 4.5 is that those two things can both be fake, while the underlying rumors are based on a real possibility that OAI is planning a holiday release.

Even if I'm right about all of this, there could still be delays of course. I'm hedging in other markets. But I wouldn't put this below 20%.

@Joshua Also, maybe to make your (0) explicit, @Mira has at least some history of having insider information and betting on it like this (and thereby taking my money!), so I put more weight on it.

@Joshua Could be wrong, but I believe most of the rumors originated on Reddit. First there was one person who claimed that ChatGPT read his entire book draft and understood it, supposedly showing it had no context window limit. Then there was the deleted screenshot that purportedly leaked the webpage showing GPT 4.5 modalities and cost. Then people started posting subjective opinions about how ChatGPT became smarter (that was on both Twitter and Reddit), and finally there was ChatGPT saying that gpt-4.5-turbo is the API version, which also originated on Reddit. But it's hard to know because everyone reposts everyone.

Anyways, the entire thing seems to me like people circlejerked themselves into mass delusion. I've seen that happen on Reddit more times than I can count.

Since we're all just trading on which anon account we trust, I have made a market to compare them:

The stonk market is a meme but I'd be interested in somehow formalizing a way to keep track of how often accounts like this were correct in a market.

Maybe an unlinked market asking "Will [account] make an unambiguously false prediction by market close"? And then you re-add a name after it resolves yes, so you keep track of how often each account said something which turned out to be false.

Open to suggestions.

@Joshua Isn’t it just: create a market, "X is correct about Y"?

bought Ṁ0 of 2023 NO

Well we do that all the time, but it's hard to actually keep track of which "insiders" are reliable what % of the time. If you say enough vague things, eventually one of them is going to be true. And we can't make a market for every single claim they make.
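
One way to keep score, as a minimal sketch: log each claim once it resolves and compute a per-account hit rate. The class name, method names, and account labels here are hypothetical, and it assumes someone has already judged each claim as unambiguously true or false, which is the hard part with vague predictions.

```python
from collections import defaultdict


class TrackRecord:
    """Tally of resolved claims per account (hypothetical helper)."""

    def __init__(self):
        # account name -> [number of correct claims, number of resolved claims]
        self._tally = defaultdict(lambda: [0, 0])

    def add(self, account, was_correct):
        """Record one resolved claim for the given account."""
        entry = self._tally[account]
        entry[0] += 1 if was_correct else 0
        entry[1] += 1

    def hit_rate(self, account):
        """Fraction of resolved claims that were correct, or None if none."""
        correct, total = self._tally.get(account, (0, 0))
        return correct / total if total else None


# Placeholder account and outcomes, not real data.
record = TrackRecord()
record.add("account_a", True)
record.add("account_a", False)
print(record.hit_rate("account_a"))  # 0.5
```

This only counts claims that were specific enough to resolve, so an account that mostly says vague things would have a small denominator rather than an inflated hit rate.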

Roon says:

"you guys need to develop more resistance to crazy ai hype bros

there’s no 4.5 and if there was it wouldn’t be released silently and if it was released silently you wouldn’t have the api string self doxx as 4.5"

bought Ṁ100 of 2023 NO

Sam Altman explicitly denies the 4.5 leak:

https://twitter.com/sama/status/1735422206296088950

If an LLM says it’s GPT4.5, it doesn’t mean it’s actually GPT4.5. Even if its prompt says it’s GPT4.5, it doesn’t mean it’s GPT4.5. I’m going to resolve based on official/credible information.

If some benchmarks show that the model available now is more capable than GPT-4 and OpenAI later says they actually released GPT-4.5 without announcing it in 2023, the question resolves in 2023; otherwise, it has to be credible evidence on benchmark performance. Whatever the LLM itself says isn’t as relevant.

bought Ṁ65 of 2023 YES

@mvdm GPT4.5 Turbo is confirmed released and undergoing live testing

bought Ṁ30 of 2023 NO

@JohnOFarrar says who

bought Ṁ20 of 2023 YES

🎄

bought Ṁ2 of 2023 NO

I love watching these markets to figure out who has insider information 👀

@Broseph I keep clear of those markets due to missing resolution criteria 😜

@Primer GPT-4.5 would qualify. A version of GPT-4 with an increased context window won’t.

@Ophiuchus Better sell those No shares

@JohnOFarrar GPT 4.5 has not been released as of today, and if the market manager resolves to YES based on Reddit posts relying on GPT telling the truth about itself, I will be contesting that resolution.