Will DALLE-3 create correct text (for individual words) in images upon release?
Resolved YES on Nov 21

The following words will be used:

['apple', 'cloud', 'stone', 'river', 'brush', 'flame', 'grass', 'pizza', 'metal', 'sugar'].

The following prompt will be used:

"picture of a large overgrown concrete building with a large neon sign that says WORD on top"

where WORD will be replaced by a word from the list above.

I will run the prompt 10 times for each word. If there are 4 images per prompt, we will get 40 images per word. For 10 words, we have 400 images.
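As a rough illustration of the setup described so far, the prompts and image counts can be sketched as follows. The names `WORDS`, `TEMPLATE`, `RUNS_PER_WORD`, `IMAGES_PER_RUN`, and `build_prompts` are hypothetical, and the figure of 4 images per run is the description's own conditional assumption, not a confirmed property of DALLE-3.

```python
# Word list and prompt template copied from the market description.
WORDS = ["apple", "cloud", "stone", "river", "brush",
         "flame", "grass", "pizza", "metal", "sugar"]
TEMPLATE = ("picture of a large overgrown concrete building "
            "with a large neon sign that says WORD on top")

RUNS_PER_WORD = 10   # "I will run the prompt 10 times for each word"
IMAGES_PER_RUN = 4   # assumption: "If there are 4 images per prompt"


def build_prompts(words=WORDS, template=TEMPLATE):
    """Return a dict mapping each word to its filled-in prompt."""
    return {word: template.replace("WORD", word) for word in words}


prompts = build_prompts()
images_per_word = RUNS_PER_WORD * IMAGES_PER_RUN   # 40
total_images = images_per_word * len(WORDS)        # 400
```

If the generator returns a different number of images per run, only `IMAGES_PER_RUN` changes; the per-word and total counts follow from it.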

DALLE-3 must score >=50% accuracy for each word, for that word to be called "CORRECTLY SPELLED".

DALLE-3 must spell >50% of the words as CORRECTLY SPELLED for this market to resolve to YES.

I will ignore case, i.e. uppercase, lowercase, and mixed-case spellings are all acceptable.
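The resolution rule above can be written down as a small sketch. The function names and the `results` layout are hypothetical, and exact string comparison in `spelling_matches` is an assumption: the actual judging was done by eye on generated images.

```python
def spelling_matches(word, sign_text):
    """Case-insensitive comparison, per the 'blind to case' rule."""
    return sign_text.strip().lower() == word.lower()


def word_is_correctly_spelled(correct_images, total_images):
    """A word counts as CORRECTLY SPELLED at >=50% image accuracy."""
    return total_images > 0 and 2 * correct_images >= total_images


def market_resolves_yes(results):
    """results maps word -> (correct_images, total_images).

    The market resolves YES when strictly more than 50% of the words
    are CORRECTLY SPELLED.
    """
    spelled = sum(
        word_is_correctly_spelled(correct, total)
        for correct, total in results.values()
    )
    return 2 * spelled > len(results)
```

Note the asymmetry in the stated thresholds: the per-word test is inclusive (>=50%), while the across-words test is strict (>50%), so exactly 5 of 10 words would not be enough.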

predictedYES 1y

By release, do you mean public release? It seems like it has been released for most people who have ChatGPT Plus.

predictedNO 1y

@EliLifland I plan to test soon (today)

predictedNO 1y
predictedNO 1y

@firstuserhere hey are you okay?

predictedYES 1y

Well, looks like it's going to have some trouble with "apple."

(On the other hand, it was 3/4 on "pizza" and 3.5/4 on "sugar" for me.)

predictedYES 1y

@firstuserhere Is "apple" correctly spelled if it uses the Apple logo instead?

bought Ṁ1 NO at 93% 1y
predictedNO 1y

@IsaacKing lmao this is wonderful

1y

Will the word be in quotation marks or anything, or just plain:

picture of a large overgrown concrete building with a large neon sign that says apple on top

vs

picture of a large overgrown concrete building with a large neon sign that says "apple" on top

1y

When you say 10 tries, do you mean 10 images, or 10 generations of however many images show up? If you mean the latter, how many have to be spelled correctly to count as a successful try?

predictedNO

@EliLifland edit in description

predictedYES 1y

@firstuserhere And to confirm, you will run the prompt 10 times for each word

predictedYES 1y

@EliLifland Also, it's unlikely to make a difference, but >=20 rather than >20 seems more consistent with the original >=5/10.

predictedNO 1y

@EliLifland Yes, that was the original criterion, and I won't change it. Edited the comment above to say >50%.

predictedNO 1y
predictedYES 1y

@firstuserhere Wait so it has to get >50% of all images correct across all prompts? This seems pretty different from having to get >=50% of images right for >50% of the words

predictedYES 1y

@EliLifland For example, if it totally fails at 3 words and gets 60% right for the other 7, previously that would have been a yes but now it’s a no?
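To make that scenario concrete, here is a small sketch comparing the two readings, assuming 40 images per word as in the description. The variable names are hypothetical; the numbers are just the hypothetical case from the comment (3 words at 0%, 7 words at 60%), not observed results.

```python
IMAGES_PER_WORD = 40  # assumption: 10 runs x 4 images, per the description

# Hypothetical scenario: 3 words fail completely, 7 words get 60% right.
correct_counts = [0] * 3 + [24] * 7   # 24 / 40 = 60%

# Reading 1: per-word rule. A word passes at >=50% accuracy,
# and strictly more than half of the words must pass.
words_passed = sum(2 * c >= IMAGES_PER_WORD for c in correct_counts)
per_word_yes = 2 * words_passed > len(correct_counts)

# Reading 2: pooled rule. More than 50% of all images must be correct.
total_correct = sum(correct_counts)
total_images = IMAGES_PER_WORD * len(correct_counts)
pooled_yes = 2 * total_correct > total_images

print(words_passed, per_word_yes)                 # 7 True
print(total_correct, total_images, pooled_yes)    # 168 400 False
```

Under the per-word reading, 7 of 10 words pass, which gives YES; under the pooled reading, 168 of 400 images (42%) are correct, which gives NO.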

predictedNO 1y

@EliLifland For a word, if it gets >50% correct, that word is correctly spelled

predictedYES 1y

@firstuserhere Ok, I misunderstood what you meant by >50% accuracy. Got it

predictedYES

I think it should say >50% accuracy for the total one rather than >=50%. I believe it used to be >= for each word but > for counting scores across words.

predictedNO 1y

@EliLifland Yeah, I will edit it just now to make that clear. Thanks again for pointing it out!

predictedNO 1y

@EliLifland Looks good now?

predictedYES 1y

@firstuserhere Yeah, much better. I think the only remaining confusing part is that you say you will run 9 prompts for the other 9 words, which sounds like one prompt per word. Maybe edit that part to make it clear you will run 10 prompts for each word, or just say you will do the same thing for each other word.

predictedNO 1y

@EliLifland Sure thing, the wording was a mess to begin with in this description
