Will Google mostly catch up to OpenAI in LLM quality and neutralize ChatGPT's lead by the end of 2024?
709
Ṁ140k
Jan 1
31% chance

[This is Casey's medium-confidence prediction from the 12/22/23 episode of Hard Fork]

This market will resolve to yes if Google's Gemini Ultra or other state-of-the-art LLM is roughly equivalent to OpenAI's best publicly available LLM on December 31, 2024, and if Google's AI products have cut into ChatGPT's share of the consumer LLM chatbot market. Otherwise, it will resolve to no.


getting serious out there

idk but NotebookLM is pretty good imo

sold Ṁ28 NO

Selling off my position, as I have no idea how this will resolve. It seems like they are very close on capabilities but quite a bit behind on popularity, and the criteria are too vague for my liking.

Gemini market share is falling, this is a sure no

https://firstpagesage.com/reports/top-generative-ai-chatbots/

It seems Gemini is "roughly equivalent" right now according to Chatbot Arena, but I'm not sure where to find numbers on ChatGPT's share of the chatbot market.

@JasonDavies Hard to tell how to compare, given that Google doesn't have something like o1?

I mean, obviously that's not the top of the raw leaderboard here, but it does seem to be a potentially significant lead for OpenAI in an approach that might lead to future scaling?

ChatGPT still has vastly more market share, but criteria are very vague here, which makes the bets less illuminating.

I think their models are clearly basically as good and have been for a while, but Google is bad at getting consumers to use them. The resolution criteria are really vague, though, so I am not going to bet much on this.

REMINDER: This cannot resolve until January 1; per the question wording, it depends on the situation at year's end.

bought Ṁ2,000 YES

Seems like Gemini Pro is #1 right now. Resolve to YES?

Does this resolution go by Hard Fork's own decision?

If instead judged by @KevinRoose18ac, is this strictly about chatbots, or LLMs in general? For example, you could see Gemini being ~natively present in lots of Android phones, putting them at roughly equal footing with OpenAI's iOS play. But when it comes to chatbots, as in visiting chatGPT.com, I think OpenAI wins by a landslide.

Part of it will depend on what was meant by "cut into ChatGPT's share", I think. If that earlier report was right and Gemini's web interface is at 25% of ChatGPT, does that count? Or does it have to reach 51% to count?
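For what it's worth, "25% of ChatGPT's traffic" and "25% share" are different quantities. Here is a minimal sketch of the conversion, under the loud assumption that Gemini and ChatGPT are the only two products in the market (every other chatbot is ignored, which the real market does not justify); the function names are made up for illustration.

```python
def share_from_relative_traffic(ratio: float) -> float:
    """Challenger's share of a two-product market, given its traffic
    as a fraction of the leader's traffic (assumption: no other products)."""
    return ratio / (1.0 + ratio)


def relative_traffic_for_share(share: float) -> float:
    """Inverse: traffic relative to the leader needed to hold a given share."""
    return share / (1.0 - share)


# Traffic at 25% of the leader's is a 20% share of the two-product total.
gemini_share = share_from_relative_traffic(0.25)

# A 51% share would require slightly more traffic than the leader has.
ratio_needed = relative_traffic_for_share(0.51)
```

Under that simplification, traffic at 25% of ChatGPT's works out to a 20% share, and a 51% share would mean out-drawing ChatGPT outright, so which bar the resolver has in mind matters a lot.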

bought Ṁ100 NO

Why is this so high? Manifold has GPT-5 at 80% for a 2024 release

Gemini 2.0

At this point, OpenAI's best model on the LMSYS leaderboard beats Google's best model 50.66% of the time; pretty close to a coinflip. Along with Gemini traffic at ~25% of ChatGPT's traffic, it seems like if this resolved today, it would resolve YES.

Obviously, the market is actually about the state of things on December 31st, so that's not dispositive, but I wonder if 40% is the right place for this to be sitting right now?
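To put the 50.66% figure in leaderboard terms, here is a rough sketch that converts between a head-to-head win rate and a rating gap. It assumes the standard Elo logistic curve with a 400-point scale and ignores ties; the leaderboard's actual fitting procedure may differ, and the function names are only illustrative.

```python
import math


def elo_win_probability(rating_gap: float) -> float:
    """Expected win rate for a model that leads by `rating_gap` points
    (assumption: standard Elo logistic curve, 400-point scale, no ties)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def rating_gap_from_win_rate(win_rate: float) -> float:
    """Inverse: rating lead implied by a head-to-head win rate in (0, 1)."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# Rating lead implied by winning 50.66% of head-to-head comparisons.
gap = rating_gap_from_win_rate(0.5066)
```

Under those assumptions, a 50.66% win rate corresponds to a lead of only about 4.6 rating points, which is consistent with calling it close to a coinflip.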

@ChrisPrichard OpenAI has not released a major model (GPT-4.5 or GPT-5) in 2024, and they say that they will. Google has caught up to GPT-4, but can they beat whatever the new one is? My guess would be no.

@dominic Yeah - that does seem to be the major uncertainty here!

https://twitter.com/natfriedman/status/1777739863678386268?t=Acv_z3u7bB2q6F0kNvozrQ&s=19

Surprisingly, more traffic for Gemini than I'd have guessed, though I'm not sure what the bar for "cut into ChatGPT's share" is.

@KevinRoose18ac Can you clarify whether this market would resolve YES if it ended today? Currently users prefer Gemini Pro in the LMSYS arena about 43% of the time over the best GPT, so users still clearly prefer GPT, but not by so much that an individual user could easily tell which LLM they were talking to. Does this count as being "roughly equivalent"?

Additionally, by "cut into ChatGPT's share of the consumer LLM chatbot market" do you include integrated chatbot features such as email generation and spreadsheet formulation, or are you only counting direct Q&A-style chats?

There are so many concepts I want to bet on, but the resolution criteria are so vague

These are some super vague terms

How will you measure the AI market share for each?

To get some more clarity on this, since it seems vague right now: if Google's LLMs are similar in quality, and maybe even better for some multimodal applications, but they cut into ChatGPT's market share by only 15%, how would this resolve?
