Will DALL-E 3 be capable of generating the word "rationalussy"?
Resolved YES on Oct 12

I haven't the slightest idea what this word is supposed to mean, but let's find out if DALL-E 3 can spell it correctly!

Whenever DALL-E 3 is available to me (which should be soon-ish, through ChatGPT Plus), I will give it the following prompt:

A photograph of a group of nerds standing around a whiteboard, which is covered in complex graphs. The word "rationalussy" is written in large letters in one corner of the whiteboard.

I will repeat this enough times to generate 10 images. If the word "rationalussy" appears with the correct spelling in at least one of the resulting images, this market will resolve YES. Otherwise, it will resolve NO.

For comparison, here's an example of DALL-E 2's performance with that prompt:

Considerations:

  • If one or two letters are obscured by foreground characters or objects (as in the example image above), but the majority of the word is spelled correctly and the intent is reasonably clear, I'll try to give the model the benefit of the doubt.

  • If the word "rationalussy" is flagged by safety filters or whatever, and the model refuses to generate an image for it, the market will resolve NO. (I'll try a couple of alternate prompts if this happens.)

  • I will not be betting in this market.


🏅 Top traders

#  Total profit
1  Ṁ670
2  Ṁ471
3  Ṁ241
4  Ṁ204
5  Ṁ201

what have we become

All right, I checked tonight, and I now have DALL-E 3 enabled through ChatGPT!

Like the original DALL-E 2 interface, it generates a batch of four images for each prompt. Unlike DALL-E 2, it runs your original prompt through some kind of transformation logic to derive four unique prompts from it (mostly by inserting the word "diverse" somewhere in the middle), and then generates one image from each of those derived prompts. I don't know if the contents of previous conversation turns are taken into account in the ChatGPT interface, but I did start a new conversation thread for each of my prompts. Three prompts, each generating four images, gave me 12 images total.

Sadly, there's currently no way to share a thread containing generated images, so here are the screenshots!

My count is 3 correct in the second set, and 1 correct in the third set, for a total of 4/12 = 33.3%. I definitely didn't expect it to be this high based on what other people are reporting, but maybe it's been improved since the first batch of users. Regardless, this clearly resolves YES.

For eye candy, here's the full-sized version of one of the correct images:

The derived prompt used to generate this particular image was:

A candid photo capturing a moment where a diverse set of individuals, identifiable as nerds by their attire, are deeply engrossed in a brainstorming session. The whiteboard behind them is filled with complex graphs, and the word 'rationalussy' is written largely in a corner.

bought Ṁ20 of YES

3/23 on ChatGPT before hitting rate limits.

sold Ṁ116 of YES

@chrisjbillington You must be talking about prompts rather than images to get a 23 in the denominator, right? Or can rate limits interrupt a prompt?

predicted NO

@chrisjbillington Do you think they're rolling it out slowly? I don’t see any option for DALL-E 3 on my ChatGPT Plus account. Am I missing something?

bought Ṁ20 of NO

@EliLifland Yeah it's a slow rollout, I don't have it yet either. Probably has to be, otherwise they'd be swamped.

I'm expecting it to be available to everyone by the 15th, likely earlier, since they promised "early October".

@EliLifland I've been checking occasionally too, and I haven't seen it unlock for me yet. I wonder if they'll send an email announcement, or if it'll just randomly start working one day.

I did see the image upload button pop up in my ChatGPT interface this morning, so some kind of rollout is definitely happening.

predicted YES

@Frogswap 23 images; it sometimes fails to generate all 4 images for a prompt.

But ChatGPT is coming up with a unique prompt for each image based on your meta-prompt, so at the level of DALL-E 3 itself it was 23 images and 23 promptings. At the level of ChatGPT it was 6 promptings.

predicted YES

Just got it on the 13th try, which technically drew two winners: the first one is open to interpretation, but the second one is a clear winner.

So technically 2/52, or 1/52 if we’re being strict.

bought Ṁ150 of NO

Just drew 0/24. So if the per-image probability is 1/25 (by law of succession), there's a 33.5% chance of drawing at least one positive in 10 images.
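
The arithmetic in the comment above can be sketched as follows, assuming each image is an independent draw with the same success rate and that 10 images are generated, as in the market description. The function names are illustrative only. The comment takes 1/25 as the rate; the usual statement of the rule of succession, (s + 1) / (n + 2), gives 1/26 for 0 successes in 24 trials, so both are shown.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's rule of succession: (s + 1) / (n + 2)."""
    return Fraction(successes + 1, trials + 2)


def p_at_least_one(p: float, n_images: int) -> float:
    """Chance that at least one of n independent images is a success."""
    return 1.0 - (1.0 - p) ** n_images


# Rate quoted in the comment, over the 10 images named in the description.
print(f"{p_at_least_one(1 / 25, 10):.1%}")  # 33.5%

# Textbook rule of succession for 0 successes in 24 trials (assumption:
# this is the estimator the commenter had in mind).
p_laplace = rule_of_succession(0, 24)
print(p_laplace, f"{p_at_least_one(float(p_laplace), 10):.1%}")
```

Changing `n_images` to 12 shows how much the answer depends on whether only the first ten images count or every image in the final batch does.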

sold Ṁ36 of YES

@Nikola Or do we add your results to Chris's, take the new base rate of 1/42, and get ~25.1%? I guess it depends a little on whether you would have commented had you gotten different results

bought Ṁ30 of YES

"Enough times to generate 10 images" as in you'll only use the first ten images or you'll use any images generated after there are at least 10? It's a ~6% difference in odds from the 1/30 base rate

@Frogswap The latter (however many images result in total) is what I had in mind when I wrote the description. So if DALL-E 3's interface produces images in batches of 4 (like DALL-E 2), there would end up being 12 candidate images. Mainly because I don't think it would be fair to the model to reject a perfectly good image just because it was the 11th one generated instead of the 9th.

bought Ṁ75 of YES

@NLeseul That means this market should be around 1/3, unless others have a more refined base rate than me

bought Ṁ600 of NO

DALL-E 3 is now available via Bing (to my understanding). I'm seeing two correctly spelled images out of 60 generated.

predicted NO

@chrisjbillington Show me one of the correct ones. You cannot hide this beauty from the world

predicted NO

@Frogswap Edited - bottom-left is correct

predicted NO

@chrisjbillington It's everything I imagined. Except the dip at the end, what the fuck is that?

That's this market

bought Ṁ200 of NO

@Joshua 😂