Will Dall-E 3 reliably produce images that resemble MrUgleh's 4x4 grid image when given a single prompt?
Mini
28
5.4k
resolved Oct 27
Resolved
NO

I've seen an image at the bottom of a Twitter post that I couldn't replicate using MidJourney and a text-based prompt.


I'm curious if Dall-E 3 can reliably produce images resembling MrUgleh's 4x4 grid when given a single prompt.

By 'reliably,' I mean that in over half of the generations, at least one image should match the pattern or style shown.

By 'resembling,' I mean the generated image should feature the same 4x4 grid, as seamlessly integrated as in MrUgleh's example.


Anyone voting 'yes' can provide prompts. I'll run 5 generations for each. If any prompt meets the criteria, the market resolves to 'yes.'

Get Ṁ1,000 play money

🏅 Top traders

#NameTotal profit
1Ṁ179
2Ṁ25
3Ṁ20
4Ṁ17
5Ṁ16
Sort by:
predicted NO

Resolves NO presumably @Soli

predicted NO

@chrisjbillington I will wait 24 hours for "Yes" holders to prove that is it possible. If no one submits anything that works then we can close as "No". Does this make sense for you?

predicted NO

@Soli sounds great!

predicted NO
predicted YES

@Soli Not possible as of today

predicted NO


@Frogswap this is the best I was able to do haha which is still very bad but I think 2x2 could be possible.

I do now suspect this is possible with more refinement. This is as close as I've gotten so far (13 attempts): https://www.bing.com/images/create/medieval-village-scene-with-busy-streets-and-castl/651961d908d648ffb2a31c886a266858?id=vHabNLKDZwQiYfifxdJRLw%3d%3d&view=detailv2&idpp=genimg&FORM=GCRIDP&mode=overlay

predicted NO

@Frogswap I'm impressed! That's already much further than I thought one would be able to get.

predicted YES

@Frogswap Some other examples from this and a previous attempt. These prompts seem sufficient to reliably prevent it from drawing a literal grid, which was the big problem with the first 11 versions. I'm a little stumped for how to pin down the difference between this and the goal, so I'll probably take a break.

edit: Ah, these link to the full set. Here's one link to the previous prompt instead.

https://www.bing.com/images/create/medieval-village-scene-with-busy-streets-and-castl/651960c984ea4d66b20f18e9e0c8af8a?id=n5CH%2bM5R%2bfFn4DW8Us76vg%3d%3d&view=detailv2&idpp=genimg&FORM=GCRIDP&mode=overlay

predicted YES

@chrisjbillington You'll never guess who suggested the phrase "natural bisection" (it was GPT-4)

predicted YES

@Frogswap I don't know if anyone else is working on this, but I have a somewhat interesting result. It's still nowhere near the goal (and likely unreliable), but it is using the scene to draw two lines with a shadow, which is, in my experience, a big leap forward.

If anyone is updating on my slow progress, note that the Bing interface is miserable trash on mobile/firefox, and breaks for 20+ minutes every few generations. I've run about 50 generations since my last update.

Prompt:

"Scene: A medieval village, from a street-view perspective which creates an illusion of several linear, bidirectional bisections or spatial divisions (including in the landscape and sky), aligned as though on a head-on, flat, two-dimensional grid. Composition: The edges, shapes, and colors of the objects in the scene intrinsically constitute the overall composition. Natural bisections, both vertical and horizontal. Realism, masterpiece. Optical illusion art"

https://www.bing.com/images/create/scene3a-a-medieval-village2c-from-a-street-view-pers/651b0201f67d4ca2aa83dbbf444c4168?id=13NfpHXiO2HGDQC%2buYxSFA%3d%3d&view=detailv2&idpp=genimg&FORM=GCRIDP&ajaxhist=0&ajaxserp=0

@Frogswap Alright, I give up 🙁

I would be surprised if you could do it with prompts alone. This sort of result tends to come from using something like ControlNet where you provide the high-level pattern to the AI as an image.

predicted NO

@Multicore Yes, I know that most of these images (🌀,🟦, etc.) are being generated using ControlNet. I don't see a reason why, at some point in the future, you wouldn't be able to generate the same images using StableDiffusion-based models with only text input. Also, theoretically speaking, there's nothing stopping OpenAI from utilizing multiple specialized image models and allowing ChatGPT to call the right sequence of models to generate the image.

predicted YES

Getting the obvious one out of the way, here's the prompt MrUgleh claims to have used (though I doubt this is the best approach):

Medieval village scene with busy streets and castle in the distance (masterpiece:1.4), (best quality), (detailed),

I won't submit any more until it's out and I can play with it a bit.

predicted NO

@Frogswap will try this one once the model is out although I think we both know it most probably won't work :D

You would accept any prompt given in the comments that consistently generates these images? I ask because I'm not sure about the pricing and allowed generations per time, and the natural exploit is to post as many prompts as possible to up the odds- you probably don't want to commit to that.

@Frogswap Yes. I updated the description now making it more precise. What do you think?

@Soli Looks good to me! I'm guessing you care more about actually getting the result than being stuck running prompts for a few days, which seems reasonable