Resolves if the model could reasonably be described as being released in 2023. Feel free to posit grey-area release scenarios and we can reach a consensus.
List of events that qualify to resolve this market YES:
Released to the public
Open Beta
Paid but otherwise open service
Leaked weights that have been packaged to be usable by laymen
Renamed to something else (in name only) but released
Closed Beta with proof of >=10,000 users
List of events that do not qualify to resolve this market YES:
Leaked weights that are useless
Closed Beta with no proof of user numbers
Closed Beta with <10,000 users
Limited researcher only release
List of events that immediately resolve this market NO:
Project renamed and takes vastly different direction
Project cancelled
Not released by 2024
Model explicitly permanently not released due to risks
Also:
When coming up with all the possible outcomes, the scenario of the model being split into a "family" with only some members released this year didn't come up. So I'll default to using the very first sentence of the description, which I take as the "spirit" of the question: "Resolves if the model could reasonably be described as being released in 2023."
I don't think it's unreasonable to say that Gemini has been released in 2023, even though there are now some caveats.
Also, "Renamed to something else (in name only) but released" was also a YES criteria, which seems close (though different) to this scenario.
Reading through the paper, it seems to me that the family of models is closely related in architecture and training techniques, only differing in scale, and 2/3 of them coming out now makes a CANCEL or NO resolution seem incorrect.
Lastly, the market seems to have coalesced around YES, which is, after all, what this website is all about. I'm sure this will make some people angry, but so would any other resolution.
@brubsby Just to comment on your last sentence because that seems particularly important, I think resolving markets based on how people have voted or "coalesced" is very bad for the epistemics of a website like Manifold. It incentivizes all sorts of bad behavior, like people bandwagoning just to persuade the market creator to resolve how they want.
@brubsby On this specifically:
Also, "Renamed to something else (in name only) but released" was also a YES criteria, which seems close (though different) to this scenario.
Hm, I would have thought the logic pointed in the opposite direction. The resolution criterion you wrote was establishing that it is the thing that matters, not the name of the thing. So when we learn of a thing that shares the name but otherwise wasn't what we expected, I'd have guessed you might not have been inclined to consider it the thing this market was about.
Hmm, I personally was reading the question as "Google's most advanced model that aims to beat GPT-4", and was predicting NO based on thinking that the safety checking would take longer—which is indeed the reason why Ultra isn't released now. It seems that Ultra is probably the thing that everyone has been waiting for, not Pro (though I haven't experimented with it a lot yet).
I wonder how others read the question?
The issue is that this is only a partial release. According to Google, Gemini comprises Gemini Pro, Gemini Ultra, and Gemini Nano. Only Gemini Pro was released today; the others aren't expected until 2024. So the resolution of this market is ambiguous because "release" doesn't specify whether that means a full or partial release.
@Jacy On the flip side, it's hard to argue that "Google didn't release a model with the name 'Gemini' in 2023" if Bard is now using a model called 'Gemini Pro'.
That there are other versions of Gemini coming in the future doesn't change this.
(I agree that it's ambiguous as to what OP's intent was with the question. But that's hard to argue & just depends on OP.)
@JonathanMannhart If the market said, "Will Google release a model with Gemini in its name in 2023?" I think it would be unambiguously YES. If the market just said, "Will Google release a model named Gemini in 2023?" I think it'd still be ambiguous because it's an ambiguity of "partial name" or "complete name," but perhaps there would be a more intuitive pull towards YES than the actual wording.
Starting today, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding and more. This is the biggest upgrade to Bard since it launched.
@JoelBecker Gemini Ultra is not yet available, which is lame:
>For Gemini Ultra, we’re currently completing extensive trust and safety checks, including red-teaming by trusted external parties, and further refining the model using fine-tuning and reinforcement learning from human feedback (RLHF) before making it broadly available.
@JoelBecker Ugh. With all of the many possible resolution criteria mentioned in the description, "Released, but only a partial/nerfed version, but still named 'Gemini'" is not among them. I expect there will be disagreements about the resolution no matter how this gets resolved. Do better, google.
@DanielParker Right, I'm not sure why this is so high. The question is worded as if Gemini is a single model, but Google's phrasing is that "Gemini" just includes "Gemini Pro," and only "Gemini Pro" is coming out this month. I think there's a tenable case for YES, NO, NA, and 50%.
@Jacy @DanielParker huh? It seems unambiguous to me that a model originating from the Gemini pre-trained model is Gemini for the purposes of this question, so YES is the appropriate resolution. Would you not have counted it if Gemini were only one model but was pruned or used a post-pre-training procedure you didn't anticipate?
@JoelBecker that hypothetical doesn't seem analogous to me, but I may be misunderstanding. Common usage of "model" in AI implies a specific input-output function (usually a single neural network, but it could be a mixture of experts). Pruning and fine-tuning can be part of model creation, right?
E.g., BERT and DistilBERT have for years been consistently referred to as different models.
@Jacy Llama 2 is referred to as a model, but even the weakest definition would include 3 input-output functions.
@JoelBecker Sure, that's why I said "common." There are exceptions to common usage; I'm sure you could find many papers referring to all sorts of things as models.
But I'm still not grasping your argument, if you were intending to make one. Are you saying that the question is not "worded as if Gemini is a single model" because believing that would imply believing it is also "worded as if Gemini is trained exactly how we anticipated"? That doesn't seem like a plausible implication to me. In other words, I'm not seeing how your "It seems unambiguous..." claim is justified by the question you pose.
They will 100% be fully released during Google I/O!
I'm willing to bet the house on this
Problem is, there is proof that there is sort of a closed beta going around, so it's possible that more than 10k people will be able to access it.
The other problem is that you may not have any way to figure out how many people tried the closed beta by the time this closes on 1 Jan 2024.
So what happens if you don't have figures on how many people had access to the closed beta? Does this resolve NO since we don't know, or does this still resolve YES?
@THEWINNER you think they'll wait until May to release a product they originally promised this year and have already had in beta for weeks?
@ErickBall the source is an interview where Hassabis is asked "Likely delayed to 2024 then?" and he kind of grins and says "We'll see."
Still pretty ambiguous.