Gary Marcus wrote a post discussing the Imagen and DALL-E 2 models' inability to fully grasp language, particularly their relational understanding of objects in a prompt: https://garymarcus.substack.com/p/horse-rides-astronaut
OpenAI just released DALL-E 3 (https://openai.com/dall-e-3), which they claim "represents a leap forward in our ability to generate images that exactly adhere to the text you provide".
Once it is publicly available, I will run the prompt from the DeWeese lab that is discussed heavily in the post:
A red conical block on top of a grey cubic block on top of a blue cylindrical block, with a green cubic block nearby
I will generate 10 images. If 5 or more of them match the prompt exactly, following the colors, shapes, and positions it specifies, this market resolves YES. Otherwise, it resolves NO.
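As a rough sketch of how the generation step might look (assuming API access to the `dall-e-3` model once it is released; the exact method I use for resolution may differ), the 10 images could be produced like this:

```python
# Minimal sketch: generate 10 candidate images for the test prompt via the
# OpenAI API. Assumes OPENAI_API_KEY is set in the environment and that the
# "dall-e-3" model is available; judging whether each image matches the
# prompt (colors, shapes, positions) is still done manually.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "A red conical block on top of a grey cubic block on top of a blue "
    "cylindrical block, with a green cubic block nearby"
)

image_urls = []
for i in range(10):
    # DALL-E 3 accepts only n=1 per request, so loop to collect 10 images.
    response = client.images.generate(
        model="dall-e-3",
        prompt=PROMPT,
        n=1,
        size="1024x1024",
    )
    image_urls.append(response.data[0].url)
    print(f"Image {i + 1}: {image_urls[-1]}")

# Resolution rule: YES if at least 5 of the 10 images match the prompt exactly.
```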
I will not bet in this market, since judging some of the images may involve ambiguity.