Will DALLE-3 create correct text (for individual words) in images upon release?
Resolved YES on Nov 21

The following words will be used:

['apple', 'cloud', 'stone', 'river', 'brush', 'flame', 'grass', 'pizza', 'metal', 'sugar'].

The following prompt will be used:

"picture of a large overgrown concrete building with a large neon sign that says WORD on top"

where WORD will be replaced by a word from the list above.

I will run the prompt 10 times for each word. If there are 4 images per prompt, we will get 40 images per word. For 10 words, we have 400 images.

For a word to be called "CORRECTLY SPELLED", DALLE-3 must score >=50% accuracy across that word's images.

More than 50% of the words (>50%) must be CORRECTLY SPELLED for this market to resolve to YES.

I will ignore case, i.e. upper-case, lower-case, and mixed-case spellings are all acceptable.
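As an illustration only, the resolution rule above can be sketched in Python. The function names are hypothetical, and the counts in the example are invented for illustration (they are not the actual test results). The sketch assumes the 4-images-per-prompt case, so each word has 40 images.

```python
from typing import Dict

WORDS = ['apple', 'cloud', 'stone', 'river', 'brush',
         'flame', 'grass', 'pizza', 'metal', 'sugar']
RUNS_PER_WORD = 10   # from the description
IMAGES_PER_RUN = 4   # assumption: the "4 images per prompt" case
IMAGES_PER_WORD = RUNS_PER_WORD * IMAGES_PER_RUN  # 40


def word_is_correctly_spelled(correct: int, total: int) -> bool:
    """A word is CORRECTLY SPELLED at >=50% accuracy (non-strict)."""
    return 2 * correct >= total


def market_resolves_yes(correct_by_word: Dict[str, int], images_per_word: int) -> bool:
    """YES when strictly more than 50% of the words are CORRECTLY SPELLED."""
    passed = sum(
        word_is_correctly_spelled(count, images_per_word)
        for count in correct_by_word.values()
    )
    return 2 * passed > len(correct_by_word)


def pooled_accuracy(correct_by_word: Dict[str, int], images_per_word: int) -> float:
    """Accuracy over all images pooled together (NOT the resolution rule)."""
    total_images = images_per_word * len(correct_by_word)
    return sum(correct_by_word.values()) / total_images


# Hypothetical counts: three words fail entirely, seven reach 24/40 (60%).
example = {word: 0 for word in WORDS[:3]}
example.update({word: 24 for word in WORDS[3:]})

print(market_resolves_yes(example, IMAGES_PER_WORD))
print(f"{pooled_accuracy(example, IMAGES_PER_WORD):.0%}")
```

In this hypothetical case, 7 of the 10 words pass the per-word bar, so the market resolves YES even though pooled accuracy is only 168/400 = 42%. This is the per-word versus all-images distinction raised in the comments.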


predictedYES

By release do you mean public release? It seems like it has been released to most people who have ChatGPT Plus.

predictedNO

@EliLifland I plan to test soon (today)

predictedNO

@firstuserhere hey are you okay?

predictedYES

Well, looks like it's going to have some trouble with "apple."

(On the other hand, it was 3/4 on "pizza" and 3.5/4 on "sugar" for me.)

predictedYES

@firstuserhere Is "apple" correctly spelled if it uses the Apple logo instead?

predictedNO

@IsaacKing lmao this is wonderful

Will the word be in quotation marks or anything, or just plain:

picture of a large overgrown concrete building with a large neon sign that says apple on top

vs

picture of a large overgrown concrete building with a large neon sign that says "apple" on top

When you say 10 tries, do you mean 10 images, or 10 generations of however many images show up? If you mean the latter, how many have to be spelled correctly to count as a successful try?

predictedNO

@EliLifland Edited in the description.

predictedYES

@firstuserhere And to confirm, you will run the prompt 10 times for each word?

predictedYES

@EliLifland Also, this is unlikely to make a difference, but >=20 rather than >20 seems more consistent with the original >=5/10.

predictedNO

@EliLifland Yes, that was the original criterion and I won't change it. Edited the comment above to say >50%.

predictedYES

@firstuserhere Wait, so it has to get >50% of all images correct across all prompts? This seems pretty different from having to get >=50% of images right for >50% of the words.

predictedYES

@EliLifland For example, if it totally fails at 3 words and gets 60% right for the other 7, previously that would have been a yes but now it's a no?

predictedNO

@EliLifland For a word, if it gets >50% correct, that word is correctly spelled

predictedYES

@firstuserhere Ok, I misunderstood what you meant by >50% accuracy. Got it

predictedYES

I think it should say >50% accuracy for the total one rather than >=50%. I believe it used to be >= for each word but > for counting scores across words.

predictedNO

@EliLifland Yeah, I will edit it just now to make that clear. Thanks again for pointing it out!

predictedNO

@EliLifland Looks good now?

predictedYES

@firstuserhere Yeah, much better. I think the only remaining confusing part is where you say you will run 9 prompts for the other 9 words, which sounds like one prompt per word. Maybe edit that part to make it clear you will run 10 prompts for each word, or just say you will do the same thing for each of the other words.

predictedNO

@EliLifland Sure thing, the wording was a mess to begin with in this description
