Will Dall-E 3 reliably produce images that resemble MrUgleh's 4x4 grid image when given a single prompt?

Ṁ510Ṁ5.4k

resolved Oct 27

Resolved

ALL

I've seen an image at the bottom of a Twitter post that I couldn't replicate using MidJourney and a text-based prompt.

I'm curious if Dall-E 3 can reliably produce images resembling MrUgleh's 4x4 grid when given a single prompt.

By 'reliably,' I mean that in over half of the generations, at least one image should match the pattern or style shown.

By 'resembling,' I mean the generated image should feature the same 4x4 grid, as seamlessly integrated as in MrUgleh's example.

Anyone voting 'yes' can provide prompts. I'll run 5 generations for each. If any prompt meets the criteria, the market resolves to 'yes.'

Market context

OpenAI

dall-e

DALLE3

Get

1,000

to start trading!

🏅 Top traders

#	Trader	Total profit
1		Ṁ179
2		Ṁ25
3		Ṁ20
4		Ṁ17
5		Ṁ16

Sort by:

predictedNO

Resolves NO presumably @Soli

predictedNO

@chrisjbillington I will wait 24 hours for "Yes" holders to prove that is it possible. If no one submits anything that works then we can close as "No". Does this make sense for you?

predictedNO

@Soli sounds great!

predictedNO

@Soli @Aleph @HV0502 @Voidvamp

predictedYES

@Soli Not possible as of today

predictedNO

@Frogswap this is the best I was able to do haha which is still very bad but I think 2x2 could be possible.

I do now suspect this is possible with more refinement. This is as close as I've gotten so far (13 attempts): https://www.bing.com/images/create/medieval-village-scene-with-busy-streets-and-castl/651961d908d648ffb2a31c886a266858?id=vHabNLKDZwQiYfifxdJRLw%3d%3d&view=detailv2&idpp=genimg&FORM=GCRIDP&mode=overlay

Bing

Find high-quality images, photos, and animated GIFS with Bing Images

predictedNO

@Frogswap I'm impressed! That's already much further than I thought one would be able to get.

predictedYES

@Frogswap Some other examples from this and a previous attempt. These prompts seem sufficient to reliably prevent it from drawing a literal grid, which was the big problem with the first 11 versions. I'm a little stumped for how to pin down the difference between this and the goal, so I'll probably take a break.

edit: Ah, these link to the full set. Here's one link to the previous prompt instead.

https://www.bing.com/images/create/medieval-village-scene-with-busy-streets-and-castl/651960c984ea4d66b20f18e9e0c8af8a?id=n5CH%2bM5R%2bfFn4DW8Us76vg%3d%3d&view=detailv2&idpp=genimg&FORM=GCRIDP&mode=overlay

Bing

Find high-quality images, photos, and animated GIFS with Bing Images

predictedYES

@chrisjbillington You'll never guess who suggested the phrase "natural bisection" (it was GPT-4)

predictedYES

@Frogswap I don't know if anyone else is working on this, but I have a somewhat interesting result. It's still nowhere near the goal (and likely unreliable), but it is using the scene to draw two lines with a shadow, which is, in my experience, a big leap forward.

If anyone is updating on my slow progress, note that the Bing interface is miserable trash on mobile/firefox, and breaks for 20+ minutes every few generations. I've run about 50 generations since my last update.

Prompt:

"Scene: A medieval village, from a street-view perspective which creates an illusion of several linear, bidirectional bisections or spatial divisions (including in the landscape and sky), aligned as though on a head-on, flat, two-dimensional grid. Composition: The edges, shapes, and colors of the objects in the scene intrinsically constitute the overall composition. Natural bisections, both vertical and horizontal. Realism, masterpiece. Optical illusion art"

https://www.bing.com/images/create/scene3a-a-medieval-village2c-from-a-street-view-pers/651b0201f67d4ca2aa83dbbf444c4168?id=13NfpHXiO2HGDQC%2buYxSFA%3d%3d&view=detailv2&idpp=genimg&FORM=GCRIDP&ajaxhist=0&ajaxserp=0

Bing

Find high-quality images, photos, and animated GIFS with Bing Images

@Frogswap Alright, I give up 🙁

I would be surprised if you could do it with prompts alone. This sort of result tends to come from using something like ControlNet where you provide the high-level pattern to the AI as an image.

predictedNO

@Multicore Yes, I know that most of these images (🌀,🟦, etc.) are being generated using ControlNet. I don't see a reason why, at some point in the future, you wouldn't be able to generate the same images using StableDiffusion-based models with only text input. Also, theoretically speaking, there's nothing stopping OpenAI from utilizing multiple specialized image models and allowing ChatGPT to call the right sequence of models to generate the image.

predictedYES

Getting the obvious one out of the way, here's the prompt MrUgleh claims to have used (though I doubt this is the best approach):

Medieval village scene with busy streets and castle in the distance (masterpiece:1.4), (best quality), (detailed),

I won't submit any more until it's out and I can play with it a bit.

predictedNO

@Frogswap will try this one once the model is out although I think we both know it most probably won't work :D

You would accept any prompt given in the comments that consistently generates these images? I ask because I'm not sure about the pricing and allowed generations per time, and the natural exploit is to post as many prompts as possible to up the odds- you probably don't want to commit to that.

@Frogswap Yes. I updated the description now making it more precise. What do you think?

@Soli Looks good to me! I'm guessing you care more about actually getting the result than being stuck running prompts for a few days, which seems reasonable

🏅 Top traders

Related questions