Will Google mostly catch up to OpenAI in LLM quality and neutralize ChatGPT's lead by the end of 2024?

[This is Casey's medium-confidence prediction from the 12/22/23 episode of Hard Fork]

This market will resolve to yes if Google's Gemini Ultra or other state-of-the-art LLM is roughly equivalent to OpenAI's best publicly available LLM on December 31, 2024, and if Google's AI products have cut into ChatGPT's share of the consumer LLM chatbot market. Otherwise, it will resolve to no.

bought Ṁ100 NO

Why is this so high? Manifold has GPT-5 at 80% for a 2024 release.

At this point, OpenAI's best model on the LMSYS leaderboard beats Google's best model 50.66% of the time; pretty close to a coinflip. Along with Gemini traffic at ~25% of ChatGPT's traffic, it seems like if this resolved today, it would resolve YES.

Obviously, the market is actually about the state of things on December 31st, so that's not dispositive, but I wonder if 40% is the right place for this to be sitting right now?
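For a sense of how small that gap is, here is a rough sketch that converts a head-to-head win rate into an Elo-style rating difference. It assumes the standard Elo logistic model and that ties are excluded from the win rate; the comment above states neither, and the function name `elo_gap` is made up for illustration.

```python
import math


def elo_gap(win_rate: float) -> float:
    """Rating difference implied by a head-to-head win rate.

    Assumes the standard Elo logistic model (400-point scale) and a win
    rate computed with ties excluded. Both are assumptions, not details
    taken from the leaderboard itself.
    """
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    # Elo model: win_rate = 1 / (1 + 10 ** (-gap / 400)), solved for gap.
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


if __name__ == "__main__":
    # 50.66% is the win rate quoted in the comment above.
    print(f"{elo_gap(0.5066):.1f} Elo points")
```

Under those assumptions, a 50.66% win rate works out to a gap of roughly 4.6 Elo points, which fits the "close to a coinflip" reading.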

@ChrisPrichard OpenAI has not released a major model (GPT-4.5 or GPT-5) in 2024, and they say that they will. Google has caught up to GPT-4, but can they beat whatever the new one is? My guess would be no.

@dominic Yeah - that does seem to be the major uncertainty here!


Gemini's traffic is surprisingly larger than I'd have guessed, though I'm not sure what the bar for "cut into ChatGPT's share" is.

@KevinRoose18ac Can you clarify whether this market would resolve YES if it ended today? Currently users prefer Gemini Pro in the LMSYS arena about 43% of the time over the best GPT, so users still clearly prefer GPT, but not by so much that an individual user could easily tell which LLM they were talking to. Does this count as being "roughly equivalent"?

Additionally, by "cut into ChatGPT's share of the consumer LLM chatbot market" do you include integrated chatbot features such as email generation and spreadsheet formulation, or are you only counting direct Q&A-style chats?

There are so many concepts I want to bet on, but the resolution criteria are so vague.

These are some super vague terms

How will you measure the AI market share for each?

To get some more clarity on this, since it seems vague right now: if Google's LLMs are similar in quality, and maybe even better for some multimodal applications, but they cut into ChatGPT's market share by only 15%, how would this resolve?

What if Gemini is allegedly better but those versions are not released broadly yet, while OpenAI's versions are?

What does this mean exactly? Does Google need 20% of ChatGPT's users? 80%? 100%?

bought Ṁ120 of YES

@jacksonpolack It seems a foregone conclusion that Bard "mostly" catches up in quality and "cuts into" GPT's market share given Google's wide reach.
