Will an image generator be capable of asking clarifying questions about an ambiguous prompt by the end of 2024?

In my opinion, one of the major weaknesses of current LLM-based technology is that it doesn't ask the user clarifying questions when a prompt is ambiguous or otherwise confusing to the model.

Interactive text generators like ChatGPT could probably do this more often if they were trained to, but I'm more concerned with models that perform a specific task like "generate an image" or "generate some music."

For example, if I ask a current image generator like DALL-E 2 or Stable Diffusion to generate an image of "A woman rescuing a drowning man with a robot arm" right now, it will give me four images with random permutations of women, men, and robots in the vicinity of some water. Compositionality problems aside, this prompt is actually linguistically ambiguous, and a competent artist would want to ask "Is it the woman or the man who has the robot arm?" before producing any artwork.

So, this market will resolve YES if, before the close date, there is a publicly-available image generator that asks the user for additional clarification in some way before generating the final images when prompted with "A woman rescuing a drowning man with a robot arm" (or a similarly ambiguous prompt, if that specific prompt doesn't work for some reason). Resolves NO otherwise. I will not be betting in this market.

DALL-E 3's ChatGPT interface can do this, if you beat it over the head with it:

@HastingsGreer as you might guess from the 4/7, I had to cherry-pick the hell out of its responses to get it to "understand" the problem.

Happened to see this tool mentioned today: https://github.com/AntonOsika/gpt-engineer

Looks like basically an AutoGPT-like script, but it does include a phase that identifies unclear points in the specification and generates a list of clarifying questions. Image generators work very differently from GPT, of course, so this strategy wouldn't be directly adaptable. (But maybe someone could write an AutoGPT script that rewrites image prompts and then passes them along to an image generator?)
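To make that idea concrete, here's a rough sketch of what such a wrapper might look like. Everything in it is hypothetical: `find_ambiguity` is a toy regex standing in for whatever LLM call would actually spot unclear points, and `ask` and `generate` are placeholder callbacks for the user interaction and the image generator. It isn't based on how gpt-engineer or any real image generator works.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Clarification:
    question: str
    options: List[str]  # each option is a disambiguated rewrite of the prompt


# Toy pattern for "<subject> <verb>ing <object> with <thing>".
# A real system would presumably use a language model here (assumption).
_PATTERN = re.compile(
    r"(?P<subj>.+?) (?P<verb>\w+ing) (?P<obj>.+?) with (?P<thing>.+)"
)


def find_ambiguity(prompt: str) -> Optional[Clarification]:
    """Hypothetical detector: flag prompts where 'with X' could attach
    to either the subject or the object."""
    m = _PATTERN.fullmatch(prompt)
    if m is None:
        return None
    subj, verb, obj, thing = m.group("subj", "verb", "obj", "thing")
    return Clarification(
        question=f"Who has {thing}: {subj} or {obj}?",
        options=[
            f"{subj}, who has {thing}, {verb} {obj}",
            f"{subj} {verb} {obj}, who has {thing}",
        ],
    )


def clarify_then_generate(
    prompt: str,
    ask: Callable[[Clarification], int],
    generate: Callable[[str], object],
):
    """Ask the user to pick a reading if the prompt looks ambiguous,
    then pass the (possibly rewritten) prompt to the image generator."""
    clarification = find_ambiguity(prompt)
    if clarification is not None:
        choice = ask(clarification)  # index into clarification.options
        prompt = clarification.options[choice]
    return generate(prompt)
```

In this sketch the image generator never sees the ambiguous prompt; it only gets whichever rewrite the user picked. Whether current generators would actually respect the rewritten prompt is a separate compositionality question.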