DALL-E 3 via OpenAI's website can directly annotate images by mid-2024


resolved Jul 1


Submit an image, say of a cat.
Say something like "Can you annotate this image with your notes on what parts of the cat you see? Put 5 of them in bold black text."
DALL-E 3 doesn't just "reproduce" the image by going image => text => DALL-E 3 generation again (which looks very little like the original).
Instead, it draws on top of the original image,
so the original is basically present, but with modifications overlaid.
It has to be done through the normal UI.

Interestingly, you can get it to output JSON blocks explaining what it sees in each region and then use Python to look at them. But the result is kind of messy, and that type of output doesn't seem to capture what it sees very well. I wonder if you can get the full embedding?
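For anyone who wants to try the JSON route, here is a rough sketch of the Python side. It is only an illustration: the reply text, the `label`/`box` schema, and the idea that boxes are fractions of the image width and height are all assumptions made up for this example, not anything the model is known to emit. The helper names `extract_regions` and `to_pixels` are hypothetical too.

```python
import json

# Hypothetical reply text. The schema (a "label" plus a "box" given as
# fractions of image width/height) is an assumption for this sketch.
reply = """Here is what I see:
[
  {"label": "left ear", "box": [0.30, 0.05, 0.42, 0.20]},
  {"label": "nose", "box": [0.45, 0.40, 0.55, 0.48]}
]
Let me know if you want more detail."""


def extract_regions(text):
    """Return the first JSON array embedded in text, or [] if none parses."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever prose follows it.
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue
        if isinstance(value, list):
            return value
    return []


def to_pixels(box, width, height):
    """Convert a fractional (x0, y0, x1, y1) box to integer pixel coordinates."""
    x0, y0, x1, y1 = box
    return (round(x0 * width), round(y0 * height),
            round(x1 * width), round(y1 * height))


regions = extract_regions(reply)
for region in regions:
    # 1024x1024 is just a placeholder image size.
    print(region["label"], to_pixels(region["box"], 1024, 1024))
```

From there the pixel boxes could be drawn onto the original with an image library, but that is annotation done outside the normal UI, so it would not count for this market.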

Today: fail