Current image models are terrible at this. (That was tested on DALL-E 2, but DALL-E 3 is no better.)
The image model must get the correct number of sides on at least 95% of tries per prompt; other details do not have to be correct. Any reasonable prompt that the average mathematically literate human would easily understand as straightforwardly asking it to draw a pentagon must succeed. I will exclude prompts that are specifically trying to confuse a neural network in ways a human would see through. Anything like "draw a pentagon", "draw a 5-sided shape", or "draw a 5-gon" must work. Basically, I want it to be clear that the AI "understands" what a pentagon looks like, in the same sense that I can say DALL-E understands what a chair looks like: it can correctly draw a chair in many different contexts and styles, even if it misunderstands related instructions like "draw a cow sitting in the chair".
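To make the threshold concrete, here's a minimal sketch of how the per-prompt tally could work. The prompts are the ones above, but the success counts are invented purely for illustration:

```python
# Hypothetical per-prompt tallies: (successes, attempts). These counts
# are made up solely to illustrate the 95%-per-prompt rule.
results = {
    "draw a pentagon": (39, 40),       # 97.5% -> clears the bar
    "draw a 5-sided shape": (40, 40),  # 100%  -> clears the bar
    "draw a 5-gon": (36, 40),          # 90%   -> fails the bar
}

for prompt, (ok, n) in results.items():
    rate = ok / n
    verdict = "pass" if rate >= 0.95 else "fail"
    print(f"{prompt!r}: {ok}/{n} = {rate:.1%} -> {verdict}")

# Every reasonable prompt must clear 95%; a single failing prompt
# is enough to keep this at NO.
meets_bar = all(ok / n >= 0.95 for ok, n in results.values())
print("criteria met:", meets_bar)
```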
If the input is fed through an LLM or some other system before reaching the image model, I will bypass this pre-processing if I easily can; otherwise I will not. If the image model is not publicly available, I must be confident that its outputs are not being cherry-picked.
Pretty much any neural network counts, even if it's multimodal and can output things other than images. A video model also counts, since video is just a sequence of images. I will ignore any special-purpose image model, like one trained only to generate simple polygons. The model must draw the image itself, not find it online or write code to generate it. File formats that are effectively code, like SVG, don't count either; it has to be "drawing the pixels" itself.
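To illustrate why SVG is excluded: a pentagon is trivial to emit as vector markup with a few lines of code, so generated markup proves nothing about whether the model can draw. A rough sketch, with arbitrary coordinates and styling:

```python
import math

def pentagon_points(cx=50.0, cy=50.0, r=40.0):
    # Five vertices spaced 72 degrees apart, starting at the top.
    return [
        (cx + r * math.sin(2 * math.pi * k / 5),
         cy - r * math.cos(2 * math.pi * k / 5))
        for k in range(5)
    ]

points = " ".join(f"{x:.1f},{y:.1f}" for x, y in pentagon_points())
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    f'<polygon points="{points}" fill="none" stroke="black"/>'
    "</svg>"
)
print(svg)  # a valid, perfectly regular pentagon, no "understanding" required
```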
Gemini Advanced now has access to Imagen3, which created this for me. Here's the entire conversation:
Me: Do you have access to Imagen3?
G: Yes, I do! I can generate images with Imagen 3 now. Would you like me to create an image for you? I can generate images from text descriptions, or even make variations on an image you provide.
Me: Yes, please create an image of the geometrical shape called a pentagon.
G: Sure, here is an image of a pentagon:
Me: Can you recreate that without any color?
G: Of course! Here you go!
Please let me know if there's anything else you need to resolve this market.
@WilliamGunn Read the description and comments again and you’ll see why this is insufficient to resolve
@JimHays I wouldn't want to accuse you of a failure of reading comprehension (such rudeness can be left to social media interactions), but since you didn't mention it and I'm quite sure it's important: did you explicitly invoke Imagen3? I had to do that; it didn't happen by default.
@JimHays Try the following prompt: "Can you use imagen3 to create an image of the geometrical shape called a pentagon?" This is what I got, which makes it 3/3 for me. I don't feel like I should be the one doing all the prompting, though. @IsaacKing are you planning to do some more testing before resolving this?
@WilliamGunn I've had good success with that prompt, but not with other prompts that would need to succeed as well for a YES resolution here
@chrisjbillington I regret ever engaging in this market. The question should be renamed to "Is it possible to find a prompt that will cause a model capable of drawing pentagons to fail to do so?"
@WilliamGunn I agree that the question does not match the description, which is why I was recommending that you further investigate the description and the clarifications in the comments below. I would maybe have recommended something like “Before 2025, will there be an image model that very reliably draws pentagons for all reasonable prompts?”
@ProjectVictory I used https://aitestkitchen.withgoogle.com/tools/image-fx
It should be available via Gemini, but anyway, the question just said any image model. Fine if you don't want to accept the embellished pentagon, but we're clearly pretty close!
@WilliamGunn I agree that seems to satisfy the criteria. The results are ornate, but it's consistently giving me a pentagon for "pentagon shape" (otherwise it gets confused with The Pentagon, which is fair). It seems to know how to draw a pentagon.
@WilliamGunn I guess the issue is that 95% is a very high threshold. I tried it and got like 17 pentagons out of 22 images, which I think is pretty good but isn’t enough for this market.
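For what it's worth, a quick exact binomial test (sketch below, assuming scipy is available) suggests that 17/22 isn't just numerically short of 95%; it's statistical evidence against a true 95% rate for that prompt:

```python
from scipy.stats import binomtest

successes, trials = 17, 22  # the informal count above
print(f"observed rate: {successes / trials:.1%}")  # about 77%

# Exact one-sided binomial test of H0: true success rate is 95%,
# against the alternative that it is lower. A small p-value means
# 17/22 would be very unlikely from a model that really succeeds
# 95% of the time.
result = binomtest(successes, trials, p=0.95, alternative="less")
print(f"p-value: {result.pvalue:.4f}")
```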
@jbca The bar is also pretty high compared to a lot of other "will AI do blah" markets in that the model must respond correctly to a broad range of prompts, including e.g. "draw a 5-gon" as mentioned in the criteria. Definitely progress for a model to be approaching the required accuracy rate for a specific prompt, though.
I clicked the link and wasn't able to get pentagons after a few tries. Does one have to do anything to select a specific model or something?
@chrisjbillington My first attempt, something like "draw a five-sided polygon", got only 1 of 4 that might have been considered a pentagon. The others were a hexagon, a 5-pointed star, and a harder-to-describe 3D shape.
@jbca In addition to the 95% threshold, there are very difficult-for-AI prompts that fall under the category, "Any reasonable prompt that the average mathematically-literate human would easily understand as straightforwardly asking it to draw a pentagon..."
E.g.,
- "a street sign with a red pentagon on top of it"
- "a red, upside-down five-sided figure"
- "a pentagon next to a hexagon"
- "two yellow pentagons filled with honey, next to a bee"
- "a polygon with two fewer than seven sides"
(And I would argue that even much harder stuff than that should be included.)
@Jacy From this last bit “…it can correctly draw a chair in many different contexts and styles, even if it misunderstands related instructions like "draw a cow sitting in the chair"”
I'm not sure whether we should read that as saying "cow sitting in the chair"-style prompts don't have to work at all, or as saying they should at least produce a pentagon, regardless of whether any of the other details are correct.
@Jacy I think "a street sign with a red pentagon on top of it" is anything but straightforward. I can easily imagine a human absentmindedly drawing a stop sign with that prompt.
Same with "two fewer than seven sides". I would have thought we're testing knowledge of what a pentagon looks like, not multi-stage reasoning
@JimHays Yeah, I was assuming the latter, but it would be nice to have clarification. However, my sense is that current systems can easily understand a cow and a chair, but the sitting relationship is challenging. That's a different problem from something like "a pentagon next to a hexagon," where neither pentagons, hexagons, nor the "next to" relationship is challenging on its own; the only hard part is not crossing one's wires, which is exactly what all current systems do with such prompts.
@aashiq I would bet a huge amount at short odds that the average person (e.g., Prolific survey participant) would have no trouble at all with such prompts, and I'd be quite surprised if anyone would bet against that at, say, even odds. So I still think "straightforward" is a very reasonable description. shrug
@cadca Might be worth trying again, now that Imagen3 is part of Gemini. Try prompting with "Can you use imagen3 to create an image of the geometrical shape called a pentagon?"