Will Google mostly catch up to OpenAI in LLM quality and neutralize ChatGPT's lead by the end of 2024?
749
Ṁ160k
Jan 1
38%
chance

[This is Casey's medium-confidence prediction from the 12/22/23 episode of Hard Fork]

This market will resolve to yes if Google's Gemini Ultra or other state-of-the-art LLM is roughly equivalent to OpenAI's best publicly available LLM on December 31, 2024, and if Google's AI products have cut into ChatGPT's share of the consumer LLM chatbot market. Otherwise, it will resolve to no.


The resolution cases are way too subjective

bought Ṁ750 NO

and the market creator seems gone. hmMmMmMm

oh he's on the hard fork thing! much more likely it'll resolve in that case

@KevinRoose how would you resolve this today if you had to and why?

If OpenAI does not submit o1 pro to leaderboards themselves, we'll have to compare by published benchmark performance like on MATH etc., right? Because o1 and o1 pro won't have API access by Jan 1

getting serious out there

@figo performance is even but "and if Google's AI products have cut into ChatGPT's share of the consumer LLM chatbot market." could go either way

@IsaacLiu I know, I've divested a lot for this reason but it's fun to watch

opened a Ṁ5,000 NO at 45% order

@IsaacLiu how could that go either way? Gemini has dropped from 16 to 13 yoy, seems pretty hard to imagine a recovery now

idk but NotebookLM is pretty good imo

sold Ṁ28 NO

Selling off my position, as I have no idea how this will resolve. It seems like they are very close on capabilities but behind quite a bit on popularity, and the criteria are too vague for my liking.

Gemini market share is falling, this is a sure no

https://firstpagesage.com/reports/top-generative-ai-chatbots/

It seems Gemini is "roughly equivalent" right now according to Chatbot Arena, but I'm not sure where to find numbers on ChatGPT's share of the chatbot market.

@JasonDavies Hard to tell how to compare, given Google doesn't have something like o1?

I mean, obviously that's not the top of the raw leaderboard here, but it does seem to be a potentially significant lead for OpenAI in an approach that might lead to future scaling?

ChatGPT still has vastly more market share, but criteria are very vague here, which makes the bets less illuminating.

I think their models are clearly basically as good and have been for a while, but Google is bad at getting consumers to use them. The resolution criteria are really vague, though, so I am not going to bet much on this.

REMINDER: This cannot resolve until January 1, per question wording, it depends on the situation at year's end.

bought Ṁ2,000 YES

Seems like Gemini Pro is #1 right now. Resolve to YES?

Does this resolution go by Hard Fork's own decision?

If instead judged by @KevinRoose18ac, is this strictly about chatbots, or LLMs in general? For example, you could see Gemini being ~natively present in lots of Android phones, putting them at roughly equal footing with OpenAI's iOS play. But when it comes to chatbots, as in visiting chatGPT.com, I think OpenAI wins by a landslide.

Part of it will depend on what was meant by "cut into ChatGPT's share", I think. If that earlier report was right and Gemini's web interface is at 25% of ChatGPT, does that count? Or does it have to reach 51% to count?

bought Ṁ100 NO

Why is this so high? Manifold has GPT-5 at 80% for a 2024 release

Gemini 2.0

At this point, OpenAI's best model on the LMSYS leaderboard beats Google's best model 50.66% of the time; pretty close to a coinflip. Along with Gemini traffic at ~25% of ChatGPT's traffic, it seems like if this resolved today, it would resolve YES.

Obviously, the market is actually about the state of things on December 31st, so that's not dispositive, but I wonder if 40% is the right place for this to be sitting right now?
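For a rough sense of how small that 50.66% edge is, here is a minimal sketch that converts a head-to-head win rate into a rating gap. It assumes the standard Elo logistic formula (base 10, scale 400); whether the leaderboard's published scores use exactly this scale is an assumption, and the function names `elo_gap` and `win_prob` are made up for illustration.

```python
import math


def elo_gap(win_rate: float) -> float:
    """Rating gap implied by a head-to-head win rate.

    Assumes the standard Elo logistic model (base 10, scale 400);
    the leaderboard's actual scoring may differ.
    """
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def win_prob(gap: float) -> float:
    """Inverse of elo_gap: expected win rate for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


if __name__ == "__main__":
    gap = elo_gap(0.5066)  # the 50.66% figure quoted above
    print(f"implied rating gap: {gap:.1f} points")
```

Under that assumption, a 50.66% win rate works out to a gap of under 5 rating points, which fits the "pretty close to a coinflip" reading.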

@ChrisPrichard OpenAI has not released a major model (GPT-4.5 or GPT-5) in 2024, and they say that they will. Google has caught up to GPT-4, but can they beat whatever the new one is? My guess would be no.

@dominic Yeah - that does seem to be the major uncertainty here!
