Will DALL-E 3+ consistently generate accurate images from the prompt "yellow ocean, blue sand" before 2025?

resolved by

1kṀ13k

resolved Jun 13

Resolved

N/A


This market will resolve as 'Yes' if, before 2025, either I or, in case of my inactivity, a trustworthish individual, confirms that DALL-E generates 'yellow ocean, blue sand' images accurately in 9 out of 10 images.

Fine-Print Details

This applies regardless of the model's access method (chat, UI, API, etc.), as long as it is provided by OpenAI. This means it is accessible either through an app released by them or under the openai.com domain. The resolution process adapts to the interface used for testing. It is sufficient if any available access method resolves to 'Yes.'
The evaluation standard is based on a clear, casual observation of predominantly yellow ocean and blue sand, with minor color imperfections allowed (less than 5%).
Before betting on this market, please check my previous market for this prompt. That market resolved as 'No' after I gained access to Dall-E 3.
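For readers who want the rule spelled out, here is a minimal sketch of the resolution check described above. It is only an illustration, not the procedure actually used to resolve the market: the names `ImageJudgment`, `image_passes`, and `batch_resolves_yes` are hypothetical, the judgments are assumed to be entered by a human after looking at each image, "9 out of 10" is read as "at least 9 of a batch of 10", and the description does not say how the 5% imperfection allowance would be measured, so it is treated here as an observer's rough estimate.

```python
from dataclasses import dataclass

# Thresholds taken from the market description.
BATCH_SIZE = 10          # images generated per test
REQUIRED_PASSES = 9      # "9 out of 10 images" (assumed to mean at least 9)
MAX_OFF_COLOR = 0.05     # "minor color imperfections allowed (less than 5%)"


@dataclass
class ImageJudgment:
    """One human judgment of one generated image (hypothetical structure)."""
    ocean_yellow: bool          # ocean looks predominantly yellow at a casual glance
    sand_blue: bool             # sand looks predominantly blue at a casual glance
    off_color_fraction: float   # observer's rough estimate of wrongly colored area


def image_passes(judgment: ImageJudgment) -> bool:
    """An image passes if both colors are right and imperfections stay under 5%."""
    return (
        judgment.ocean_yellow
        and judgment.sand_blue
        and judgment.off_color_fraction < MAX_OFF_COLOR
    )


def batch_resolves_yes(judgments: list[ImageJudgment]) -> bool:
    """A batch of 10 judgments supports 'Yes' if at least 9 images pass."""
    if len(judgments) != BATCH_SIZE:
        raise ValueError(f"expected {BATCH_SIZE} judgments, got {len(judgments)}")
    passes = sum(image_passes(j) for j in judgments)
    return passes >= REQUIRED_PASSES
```

Under this reading, a batch with nine clean images and one where the ocean came out blue would still support 'Yes', while a batch with two failures would not.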

OpenAI

ChatGPT

AI Image Generation Testing

AI Image Generation

DALLE3


@Soli, do GPT-4o generated images count (not DALL-E)?

@Hazel yes

bought Ṁ2,000 YES 1y

@Soli Perfect :)

@Soli can you clarify this? The market is very explicitly about DALL-E, not other models.

Soli already clarified.

Dall E is the tech behind 4o image generation, right?

I still don't get the recent buy up. I don't think I've gotten a single image to pass the criteria here.

@Hazel I'm asking for clarification because I think Soli was confused about something. For example, the standard way to get DALL-E images right now is to ask ChatGPT to send a prompt word-for-word to DALL-E. I think Soli probably just meant to clarify that you can do it that way rather than through direct DALL-E access.

opened a Ṁ500 NO at 94% order 1y

@Mactuary GPT-4o as accessed through the ChatGPT interface currently calls DALL-E to generate images. OpenAI has stated that GPT-4o can generate images, but that has not been rolled out yet.

@Jacy it is true that GPT-4o's native image capability hasn't been rolled out yet, but I believe it should count once it is rolled out. The title of the question said "Dall-E 3+" and the description states:

This applies regardless of the model's access method (chat, UI, API, etc.), as long as it is provided by OpenAI. This means it is accessible either through an app released by them or under the openai.com domain. The resolution process adapts to the interface used for testing. It is sufficient if any available access method resolves to 'Yes.'

@Soli I don't see how the quoted text supports the highly counterintuitive notion that "System A 3+" should include "System B". The quoted text just clarifies that any access method for System A through OpenAI is sufficient. I'm not a disinterested mod here, but if I were, I would definitely consider a YES resolution based on System B (i.e., GPT-4o's native image capability) to be a misresolution. Both the question text and the resolution text specify that this is about System A (i.e., DALL-E).

@Jacy i closed the question for now, will respond later today

@Jacy i admit that i could have done a better job writing the resolution criteria for this question but i specifically added the symbol + after Dall-E 3 to make it clear that whatever follows Dall-E also counts as long as it comes from openai. the fact that the system succeeding dall-e is not also called dall-e should not influence the resolution of this market imo since we are interested in exploring ai text-to-image capabilities and not the naming of the models

@chrisjbillington what do you think? (i always enjoy reading your opinion and you tend to make very good points)

@Soli It's a stretch to call GPT-4o the "successor" to DALL-E. The GPT and DALL-E models are fundamentally quite different (e.g., multimodal vs image; as far as is public knowledge, transformers vs diffusion) with very different strengths and weaknesses. My predictions, and presumably others', were about the DALL-E models in particular. I think if it were something like "PALL-E," a system based on "probabilistic" diffusion (not a real thing) but otherwise filling a similar role to DALL-E (e.g., as an API called by other OpenAI models to generate images) and displacing DALL-E 4 in the expected version history, this sort of case for the interpretation of the "+" sign would be tenable.

And my bets are all based on @Soli‘s clarification. I didn’t bet until I received clarification because I was unsure whether “+” was referring to other successors of DALL-E like GPT-4o.

Now we know, and bets have been made.

You had every chance to ask for clarification as well before making your bets.

@Jacy i get your point, let’s just n/a this thing then @mods can someone please n/a? thank you!

@Hazel @Jacy i don't think we should fight over this one. i made a mistake when i created the market. let's just n/a please.

apologies for wasting ur time @Jacy and @Hazel

@Soli I don't see this as a waste of time. N/A is a disappointing resolution because I had a lot invested here and some loans on shares that I think were very valuable because of the high likelihood of a NO resolution. And I think it's very unlikely that this GPT-4o vs. DALL-E distinction would have ended up mattering. But I guess that's your prerogative.

Thanks for the discussion.

@Hazel everyone on Manifold is betting on their own interpretations. That doesn't usually mean people are protected from the losses from bets based on mistaken interpretations—though sometimes in cases like this, an N/A will happen, which effectively is that sort of protection.

I didn’t bet on my own interpretation… I bet on grounded data based on the question I asked.

reposted, predicted YES 1y

For reference, a very similar market resolved YES.

you've heard of 'Piss Christ', now it's time for...

predicted YES 1y

@strutheo whaaaaat?

The model is now able to generate images matching the prompt “blue grass, green sky” so this should just be a matter of time

