Which new AI models will be released in February 2025?
🔮
Crystal
147
Ṁ670k
Mar 1
39% Open AI o3
11% OpenAI video generation
21% OpenAI image generation
33% Anthropic flagship language model**
36% Anthropic reasoning language model***
27% Anthropic (other)****
29% Midjourney
45% Microsoft
22% Amazon language model
87% XAI language model
22% XAI image or video generation
30% Deepseek language model
26% Mistral language model

Released = available to some portion of the public (including a subset of subscribers or a limited number of API developers from members of the public). Released only for safety testing does not count.

New model = Either announced by the company as a new model, is clear from numbering/naming it is a distinct model, or able to be selected from some sort of menu as a distinct model. Something like "o1 extra mini" would count: while it is part of o1, it can be considered a distinct model in this market.

Must be publicly released for the first time between February 1st 12:00am PST and February 28th 11:59pm PST. If it is announced but not yet released to any members of the public, it will not count.
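
As a rough illustration of the release window above, here is a minimal Python sketch of the timestamp check. It is not part of the market's official rules. It assumes "PST" means a fixed UTC-8 offset, that the year is 2025, and that "11:59pm" covers that whole minute (so the window ends just before March 1st 12:00am PST). The name `in_release_window` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "PST" is read as a fixed UTC-8 offset for the whole month.
PST = timezone(timedelta(hours=-8), "PST")

# Window start is inclusive; the end is exclusive so that all of
# 11:59pm on February 28th is still inside the window (an assumption).
WINDOW_START = datetime(2025, 2, 1, 0, 0, tzinfo=PST)
WINDOW_END_EXCLUSIVE = datetime(2025, 3, 1, 0, 0, tzinfo=PST)


def in_release_window(first_public_release: datetime) -> bool:
    """Return True if a timezone-aware first-release time is in the window."""
    if first_public_release.tzinfo is None:
        raise ValueError("release time must be timezone-aware")
    return WINDOW_START <= first_public_release < WINDOW_END_EXCLUSIVE


# Example: a release on February 5th at 17:00 UTC is inside the window,
# while one at 08:00 UTC on March 1st (12:00am PST) is not.
print(in_release_window(datetime(2025, 2, 5, 17, 0, tzinfo=timezone.utc)))
print(in_release_window(datetime(2025, 3, 1, 8, 0, tzinfo=timezone.utc)))
```

The check covers only timing. Whether something counts as "released" or as a "new model" still depends on the definitions in the description.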

For answers where no specific model type is specified alongside the company, then any type of generative AI model will cause it to resolve yes.

*OpenAI (other) refers to any model that is not their new flagship model (e.g., GPT 5), o3, a video generator, or an image generator. It could be a derivative of another language model or some other type of model, such as a voice generator.

**Anthropic flagship language model refers to a model comparable to Claude 3.5 or GPT-4o that should outperform Claude 3.5 Sonnet on a majority of performance benchmarks. This should not be a reasoning model.

***Anthropic reasoning model refers to a model that is not considered their everyday task model and is akin to what OpenAI's o1 is to GPT-4o.

****Anthropic (any other) refers to any model that is not a reasoning model nor their new flagship model. For example, it could be a derivative of an existing language model or a different type of AI model entirely.


Why would XAI release? Their models are DS

Can you add the company Cohere?

@ikoukas Can't add more options to this existing market but will consider it for the March one!

I've tentatively resolved the Meta answer to 'yes' (can't close individual answers or I would have done that).

But I might unresolve after further research, since from what I can tell the new Meta AI models are not generative models, as the description requires.

@Manifold @MingCat @bagelfan @summer_of_bliss thoughts as the biggest Yes holders?

@Manifold I think you should ask the relatively biggest Yes holders: Those who have the highest % of their balance invested.

@Manifold I'm not certain. But am I mistaken, or does the description not actually specify a generative model?
EDIT: Never mind, I see it now

@Manifold oh i read it as satisfying the “Either announced by the company as a new model, is clear from numbering/naming it is a distinct model, or able to be selected from some sort of menu as a distinct model.” part of the criteria.

the later part, “For answers where no specific model type is specified alongside the company, then any type of generative AI model will cause it to resolve yes.”, to me read as a sufficient condition for resolution, not a necessary condition

@Manifold I was under the impression that the meta release would count as a generative model for the purposes of the description

@summer_of_bliss I don't think that reading of the description makes much sense to me? The description starts with rules for options where the model is already specified (e.g. "reasoning language model", "O3 mini", etc). Then, for cases where the model isn’t specified, it explains what types of models count: “For answers where no specific model type is specified alongside the company, then any type of generative AI model will cause it to resolve yes.” The alternative would be that (1) that quoted line is meaningless (cut it from the description, & nothing changes), & (2) the question never gives any indication of what models count? (which isn't really how David has been writing these).

(If this model is generative AI then it should definitely resolve YES! I have no idea what it does, & I haven't looked; lots of non-generative output uses gen AI under the hood. But I don't think it makes any sense to exclude that line from the description. And if the intent is to allow any "AI model" to count, with no definition provided, then the description should be rewritten.)

@Ziddletwix Thoughtful response, going point by point:

You say “The description starts with the rules for options where the model is already specified”. It doesn’t? The description just defines the term “New model”. Which would apply to every option in this market, surely? Specifically it says:

"New model = Either announced by the company as a new model, is clear from numbering/naming it is a distinct model, or able to be selected from some sort of menu as a distinct model. Something like "o1 extra mini" would count as while it is part of o1 it can be considered a distinct model in this market.”

Can you point out where in this definition it restricts this definition only to options where the specific model/name is already specified? I just don’t see where, I read this as clearly applying to every option in the market.

Then about the “For answers where no specific model […]” paragraph you say that under my interpretation either 1) that quoted line is meaningless/superfluous AND 2) the question never gives any indication of which models count. I disagree on both. The question DOES give an indication of which models count, in the earlier paragraph you just quoted - “Either announced by the company as a new model […]”. Surely you agree these count as criteria?

Now on the first point - suppose I make a market on ‘Will RFK be confirmed by the senate’. I say ‘resolves YES if RFK is confirmed. Resolves no if RFK is not confirmed, dies, or withdraws before being confirmed’. In some sense, that second line is “meaningless” in the sense you’ve just used it. You could cut it out and the criteria are the same, sure. But it acts as clarification. This is incredibly common in prediction markets and was how I read the “generative models would count” line in these criteria.

I would agree that this market should not resolve yes if the description were phrased as “for answers where no specific model type is specified, then ONLY generative models would cause it to resolve yes”. But that’s not what it says. It’s phrased as a sufficient, not necessary, condition, and so should resolve yes.

(This is exactly why I bet M1000 yes when the market was at like 33% - the criteria for these questions where no model is specified, just the company, is incredibly broad as written and thus very likely to resolve yes.)

opened a Ṁ5,000 YES at 49% order

Today Meta released the audiobox-aesthetics model - "Unified automatic quality assessment for speech, music, and sound".

https://github.com/facebookresearch/audiobox-aesthetics

https://ai.meta.com/blog/machine-intelligence-research-new-models/

sold Ṁ164 NO

Unsure if this involves any new models or not:

@MingCat It doesn’t, was released in jan

@Bayesian ah, thanks!

@Manifold The sweepstakes version could use more subsidy

@bagelfan i dunno i wouldn't say this is a great candidate market for subsidy, given that for any YES resolution all of it will eventually be taken by some news trader. (whereas many sweeps markets can be closed early or etc to use it efficiently)

@Ziddletwix agree that this isn't the most efficient use of subsidy, but 50 sweepcash shared among so many answers is definitely insufficient for what we want to be one of our biggest markets.

@Manifold oh fair i don't have sweeps enabled so i didn't see how tiny it was (i figured per-option it was more like the nba markets, definitely the current amount is way too little)

https://blog.google/technology/google-deepmind/gemini-model-updates-february-2025/

Gemini 2.0 Pro Experimental and 2.0 Flash-Lite released.

filled a Ṁ1,000 YES at 99.0% order

February 5th artifacts are up on Vertex

Re: the GDM market - it's unclear whether it should resolve, given that the model being added to the Gemini consumer application seems to be only the December 6th preview

Dylan Patel repeats claim about Anthropic having a better reasoning model than o3: https://x.com/mark_k/status/1886769660344877073

We won't be resolving o3 from the release of Deep Research.

It is an agent that uses a fine-tuned version of o3 (thus fulfilling the OpenAI (other) criteria of being a derivative of another language model). However, the model known as o3 still isn't directly usable or released to the public.

@Manifold Deep Research doesn't fulfill your requirements to resolve as "other". Other models must be totally distinct models or be based on an expansion of an already existing and released model. You wrote "Something like "o1 extra mini" would count as while it is part of o1 it can be considered a distinct model in this market." This would make deep research resolve as "other" if and only if another version of o3 had already been released, which was not the case. Deep Research is not an expansion of a released model, it is the only model that we have right now for o3. When OpenAI releases another version of o3, only then the market can resolve as "other". You cannot have another model if you don't have the base model before.
