Will DALLE-3 create correct text (for individual words) in images upon release?
resolved Nov 21
Resolved
YES

The following words will be used:

['apple', 'cloud', 'stone', 'river', 'brush', 'flame', 'grass', 'pizza', 'metal', 'sugar'].

The following prompt will be used:

"picture of a large overgrown concrete building with a large neon sign that says WORD on top"

where WORD will be replaced by a word from the list above.

I will run the prompt 10 times for each word. If there are 4 images per prompt, we will get 40 images per word. For 10 words, we have 400 images.

DALLE-3 must score >=50% accuracy for each word, for that word to be called "CORRECTLY SPELLED".

DALLE-3 must spell >50% of the words as CORRECTLY SPELLED for this market to resolve to YES.

I will ignore case, i.e. upper-case, lower-case, and mixed-case spellings are all acceptable.
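As a rough illustration of the resolution rule above, here is a minimal Python sketch. The function names and the example counts are hypothetical and are not actual test results; how a sign's text gets read off an image is not specified in the description, so `spelled_correctly` simply assumes a text reading is already available.

```python
from typing import Dict

def spelled_correctly(reading: str, word: str) -> bool:
    # Case-blind comparison: "APPLE", "apple" and "ApPlE" all match "apple".
    return reading.strip().casefold() == word.casefold()

def word_is_correctly_spelled(correct_images: int, total_images: int) -> bool:
    # A word counts as CORRECTLY SPELLED at >= 50% accuracy.
    return 2 * correct_images >= total_images

def market_resolves_yes(correct_counts: Dict[str, int], images_per_word: int = 40) -> bool:
    # The market resolves YES if strictly more than 50% of the words pass.
    passed = sum(
        word_is_correctly_spelled(count, images_per_word)
        for count in correct_counts.values()
    )
    return 2 * passed > len(correct_counts)

# Hypothetical counts of correct images out of 40 per word (made up for illustration).
example_counts = {
    "apple": 5, "cloud": 30, "stone": 20, "river": 25, "brush": 10,
    "flame": 22, "grass": 19, "pizza": 31, "metal": 12, "sugar": 35,
}
print(market_resolves_yes(example_counts))
```

With these made-up counts, six of the ten words reach at least 20/40, so the sketch returns True. Note that exactly 20/40 passes for a single word, while exactly 5 of 10 words passing would not be enough for YES.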

predicted YES

By release, do you mean public release? It seems like it has already been released to most people who have ChatGPT Plus.

predicted NO

@EliLifland I plan to test soon (today)

predicted NO

@firstuserhere hey are you okay?

predicted YES

Well, looks like it's going to have some trouble with "apple."

(On the other hand, it was 3/4 on "pizza" and 3.5/4 on "sugar" for me.)

predicted YES

@firstuserhere Is "apple" correctly spelled if it uses the Apple logo instead?

bought Ṁ1 NO at 93%
predicted NO

@IsaacKing lmao this is wonderful

Will the word be in quotation marks or anything, or just plain:

picture of a large overgrown concrete building with a large neon sign that says apple on top

vs

picture of a large overgrown concrete building with a large neon sign that says "apple" on top

When you say 10 tries, do you mean 10 images, or 10 generations of however many images show up? If you mean the latter, how many have to be spelled correctly to count as a successful try?

predicted NO

@EliLifland edit in description

predicted YES

@firstuserhere And to confirm, you will run the prompt 10 times for each word

predicted YES

@EliLifland Also, it's unlikely to make a difference, but >=20 rather than >20 seems more consistent with the original >=5/10.

predicted NO

@EliLifland Yes, that was the original criterion; I won't change that. Edited the comment to say >50%.

predicted NO
predicted YES

@firstuserhere Wait so it has to get >50% of all images correct across all prompts? This seems pretty different from having to get >=50% of images right for >50% of the words

predicted YES

@EliLifland For example, if it totally fails at 3 words and gets 60% right for the other 7, previously that would have been a yes but now it's a no?
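To make the difference between the two readings concrete, here is a small Python sketch of the scenario in this comment, assuming 40 images per word. Both function names are hypothetical, and the counts come from the hypothetical scenario, not from real results.

```python
from typing import List

def per_word_rule(counts: List[int], images_per_word: int = 40) -> bool:
    # Reading 1: a word passes at >= 50%, and > 50% of the words must pass.
    passed = sum(1 for c in counts if 2 * c >= images_per_word)
    return 2 * passed > len(counts)

def pooled_rule(counts: List[int], images_per_word: int = 40) -> bool:
    # Reading 2: > 50% of all images, pooled across every word, must be correct.
    total_correct = sum(counts)
    total_images = images_per_word * len(counts)
    return 2 * total_correct > total_images

# 3 words fail completely (0/40); the other 7 get 60% right (24/40).
scenario = [0, 0, 0] + [24] * 7

print(per_word_rule(scenario))  # 7 of 10 words pass
print(pooled_rule(scenario))    # 168 of 400 images are correct, i.e. 42%
```

Under the per-word reading this scenario is a YES (7 of 10 words pass), while under the pooled reading it is a NO (168/400 = 42%), which is why the wording matters.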

predicted NO

@EliLifland For a word, if it gets >50% correct, that word is correctly spelled

predicted YES

@firstuserhere Ok, I misunderstood what you meant by >50% accuracy. Got it

predicted YES

I think it should say >50% accuracy for the total rather than >=50%. I believe it used to be >= for each word but > for counting scores across words.

predicted NO

@EliLifland Yeah, I will edit it just now to make that clear. Thanks again for pointing it out!

predicted NO

@EliLifland Looks good now?

predicted YES

@firstuserhere Yeah much better. I think the only remaining confusing part is you say you will run 9 prompts for the other 9 words, which sounds like one prompt per word. Maybe edit that part to make it clear you will run 10 prompts for each word, or just say you will do the same thing for each other word

predicted NO

@EliLifland Sure thing, the wording was a mess to begin with in this description
