Will DALL-E 3+ consistently generate accurate images from the prompt "yellow ocean, blue sand" before 2025?
Resolved N/A on Jun 13

This market will resolve as 'Yes' if, before 2025, either I or, in case of my inactivity, a trustworthish individual, confirms that DALL-E generates 'yellow ocean, blue sand' images accurately in 9 out of 10 images.

Fine-Print Details

  • This applies regardless of the model's access method (chat, UI, API, etc.), as long as it is provided by OpenAI. This means it is accessible either through an app released by them or under the openai.com domain. The resolution process adapts to the interface used for testing. It is sufficient if any available access method resolves to 'Yes.'

  • The evaluation standard is based on a clear, casual observation of predominantly yellow ocean and blue sand, with minor color imperfections allowed (less than 5%).

  • Before betting on this market, please check my previous market for this prompt. That market resolved as 'No' after I gained access to DALL-E 3.
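The resolution rule above can be sketched as a small check. This is only an illustration under stated assumptions, not the market creator's actual procedure: the function names are hypothetical, "9 out of 10" is read as "at least 9 of exactly 10 images", and the "less than 5%" allowance is read as the fraction of each region (ocean, sand) that a casual observer would judge to be the wrong color.

```python
from typing import Sequence


def image_is_accurate(ocean_off_color: float,
                      sand_off_color: float,
                      tolerance: float = 0.05) -> bool:
    """Judge one image.

    Assumption: each argument is the observer's estimate of the fraction
    (0.0 to 1.0) of that region that is NOT the requested color, and the
    "less than 5%" allowance applies to each region separately.
    """
    return ocean_off_color < tolerance and sand_off_color < tolerance


def batch_resolves_yes(verdicts: Sequence[bool],
                       required: int = 9,
                       batch_size: int = 10) -> bool:
    """Apply the 9-out-of-10 rule to one batch of per-image verdicts.

    Assumption: exactly `batch_size` images are judged, and at least
    `required` of them must pass.
    """
    if len(verdicts) != batch_size:
        raise ValueError(f"expected {batch_size} verdicts, got {len(verdicts)}")
    return sum(1 for passed in verdicts if passed) >= required


if __name__ == "__main__":
    # Hypothetical batch: nine clean images and one with a mostly blue ocean.
    estimates = [(0.01, 0.02)] * 9 + [(0.80, 0.02)]
    verdicts = [image_is_accurate(ocean, sand) for ocean, sand in estimates]
    print(batch_resolves_yes(verdicts))
```

Under this reading, a single failed image in a batch of ten still allows a 'Yes' resolution, while two failures do not.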


@Soli, do GPT-4o-generated images count (not DALL-E)?

@Hazel yes

bought Ṁ2,000 YES

@Soli Perfect :)

@Soli can you clarify this? The market is very explicitly about DALL-E, not other models.

Soli already clarified.

DALL-E is the tech behind 4o image generation, right?

I still don't get the recent buy up. I don't think I've gotten a single image to pass the criteria here.

@Hazel I'm asking for clarification because I think Soli was confused about something. For example, the standard way to get DALL-E images right now is to ask ChatGPT to send a prompt word-for-word to DALL-E. I think Soli probably just meant to clarify that you can do it that way rather than through direct DALL-E access.

opened a Ṁ500 NO at 94% order

@Mactuary GPT-4o as accessed through the ChatGPT interface currently calls DALL-E to generate images. OpenAI has stated that GPT-4o can generate images, but that has not been rolled out yet.

@Jacy it is true that GPT-4o's native image capability hasn't been rolled out yet, but I believe it should count once it is rolled out. The title of the question said "Dall-E 3+" and the description states:

This applies regardless of the model's access method (chat, UI, API, etc.), as long as it is provided by OpenAI. This means it is accessible either through an app released by them or under the openai.com domain. The resolution process adapts to the interface used for testing. It is sufficient if any available access method resolves to 'Yes.'

@Soli I don't see how the quoted text supports the highly counterintuitive notion that "System A 3+" should include "System B". The quoted text just clarifies that any access method for System A through OpenAI is sufficient. I'm not a disinterested mod here, but if I were, I would definitely consider a YES resolution based on System B (i.e., GPT-4o's native image capability) to be a misresolution. Both the question text and the resolution text specify that this is about System A (i.e., DALL-E).

@Jacy i closed the question for now, will respond later today

@Jacy i admit that i could have done a better job writing a resolution criteria for this question but i specifically added the symbol + after Dall-E 3 to make it clear that whatever follows Dall-E also counts as long as it comes from openai. the fact that the system succeeding dall-e is not also called dall-e should not influence the resolution of this market imo since we are interested in exploring ai text-to-image capabilities and not the naming of the models

@chrisjbillington what do you think? (i always enjoy reading your opinion and you tend to make very good points)

@Soli It's a stretch to call GPT-4o the "successor" to DALL-E. The GPT and DALL-E models are fundamentally quite different (e.g., multimodal vs image; as far as is public knowledge, transformers vs diffusion) with very different strengths and weaknesses. My predictions, and presumably others', were about the DALL-E models in particular. I think if it were something like "PALL-E," a system based on "probabilistic" diffusion (not a real thing) but otherwise filling a similar role to DALL-E (e.g., as an API called by other OpenAI models to generate images) and displacing DALL-E 4 in the expected version history, this sort of case for the interpretation of the "+" sign would be tenable.

And my bets are all based on @Soli‘s clarification. I didn’t bet until I received clarification because I was unsure whether “+” was referring to other successors of DALL-E like GPT-4o.

Now we know, and bets have been made.

You had every chance to ask for clarification as well before making your bets.

@Jacy i get your point, let’s just n/a this thing then @mods can someone please n/a? thank you!

@Hazel @Jacy i don't think we should fight over this one. i made a mistake when i created the market. let's just n/a please.

apologies for wasting ur time @Jacy and @Hazel

@Soli I don't see this as a waste of time. N/A is a disappointing resolution because I had a lot invested here and some loans on shares that I think were very valuable because of the high likelihood of a NO resolution. And I think it's very unlikely that this GPT-4o vs. DALL-E distinction would have ended up mattering. But I guess that's your prerogative.

Thanks for the discussion.

@Hazel everyone on Manifold is betting on their own interpretations. That doesn't usually mean people are protected from the losses from bets based on mistaken interpretations—though sometimes in cases like this, an N/A will happen, which effectively is that sort of protection.

I didn’t bet on my own interpretation… I bet on grounded data based on the question I asked.

reposted, predicted YES

For reference, a very similar market resolved YES.

you've heard of 'Piss Christ', now it's time for...

predicted YES

@strutheo whaaaaat?

The model is now able to generate images matching the prompt “blue grass, green sky”, so this should just be a matter of time.