Will DALLE-3 be able to draw a frog riding a bird? (50% success rate)
resolved Oct 21


"A frog riding on top of a bird"

I will run the prompt 10 times. If it produces a correct image at least 50% of the time, this market will resolve to YES, otherwise NO.

Unfortunately, it's not easy to define the judging criteria rigorously, so I will use my own observation to decide whether a picture really shows a frog riding a bird. I will also take input from @StrayClimb prior to resolution.
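
For concreteness, the resolution rule could be sketched as below. This is only an illustration, not the market creator's actual procedure: the function name `resolves_yes`, the reading of "50% of the time" as "at least 50%", and the sample judgments are all assumptions.

```python
def resolves_yes(judgments, threshold=0.5):
    """Return True if the share of runs judged correct meets the threshold.

    `judgments` holds one boolean per run (True = the judge accepted the
    image as a frog riding a bird). Treating exactly 50% as a YES is an
    assumption about how the rule is meant to be read.
    """
    if not judgments:
        raise ValueError("need at least one judged run")
    return sum(judgments) / len(judgments) >= threshold


# Hypothetical judgments for 10 runs, invented for illustration only.
example_runs = [True, True, False, True, True, False, True, True, True, False]
print(resolves_yes(example_runs))
```

The subjective part (deciding whether a given image counts) stays with the human judge; the code only covers the counting step.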


Very good success rate

Attempt 1: at least 2 of the 4 generated images are clearly correct, arguably 3.

Attempt 2: images 1, 2, and 4 are correct; the 3rd is not.

Attempt 3: images 1, 2, and 3 are correct; 4 is also arguably correct.

Attempt 4: correct in images 1, 2, and 3.

The rest of the attempts are left as an exercise to the reader.

bought Ṁ1,000 of YES

First try via Bing (which apparently is DALL-E 3 as of today)

bought Ṁ1,000 of YES

@chrisjbillington second and third attempts, this time with the exact prompt as specified in this market. 11/12 correct so far.

predicted YES

@chrisjbillington 23/24 correct in 6 fresh Bing Chat sessions, each generating 4 images per prompt.
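
As a rough back-of-the-envelope check on what 23/24 implies for a 10-run test, here is a minimal sketch. It assumes each run is judged on a single image, that runs are independent, and that the observed 23/24 rate carries over to the market creator's own judging; none of that is guaranteed. The helper name `prob_at_least` is made up for this example.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial tail: probability of at least k successes in n independent
    trials, each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Per-image hit rate reported in the comment above (23 correct out of 24).
p_hat = 23 / 24
print(round(p_hat, 3))

# Chance that at least 5 of 10 runs succeed under the assumptions above.
print(prob_at_least(5, 10, p_hat))
```

Under these assumptions the chance of falling below the 50% bar is very small, though a stricter judge would lower the effective hit rate.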

bought Ṁ500 of YES

No mention of hit rate here but seems very likely

predicted YES

@firstuserhere There is this:

to be clear yall it’s not 100% accurate every time, obviously selecting some of the best. the point is that all of this is possible within 2-3 generations due to gpt-4 pair prompting with you until it gets it right.

which to me sounds like a pretty good hit rate, but it's not super clear.

Cool market. I could use a lot more Midjourney or DALL-E 3 tests: can it add, etc. You can also use it to generate observations, for example by having it generate 100 men and 100 women and then checking whether the average heights match real human distributions. Then do the same thing for "person in 1870". What I'm actually saying is that the best image generation requires full AGI in the limit.