I haven't the slightest idea what this word is supposed to mean, but let's find out if DALL-E 3 can spell it correctly!
Whenever DALL-E 3 is available to me (which should be soon-ish, through ChatGPT Plus), I will give it the following prompt:
A photograph of a group of nerds standing around a whiteboard, which is covered in complex graphs. The word "rationalussy" is written in large letters in one corner of the whiteboard.
I will repeat this enough times to generate 10 images. If the word "rationalussy" appears with the correct spelling in at least one of the resulting images, this market will resolve YES. Otherwise, it will resolve NO.
For comparison, here's an example of DALL-E 2's performance with that prompt:
Considerations:
If one or two letters are obscured by foreground characters or objects (as in the example image above), but the majority of the word is spelled correctly and the intent is reasonably clear, I'll try to give the model the benefit of the doubt.
If the word "rationalussy" is flagged by safety filters or whatever, and the model refuses to generate an image for it, the market will resolve NO. (I'll try a couple of alternate prompts if this happens.)
I will not be betting in this market.
what have we become
All right, I checked tonight, and I now have DALL-E 3 enabled through ChatGPT!
Like the original DALL-E 2 interface, it generates a batch of four images for each prompt. Unlike DALL-E 2, it runs your original prompt through some kind of transformation logic to derive four unique prompts from it (mostly by inserting the word "diverse" somewhere in the middle), and then generates one image from each of those derived prompts. I don't know if the contents of previous conversation turns are taken into account in the ChatGPT interface, but I did start a new conversation thread for each of my prompts. Three prompts, each generating four images, gave me 12 images total.
Sadly, there's currently no way to share a thread containing generated images, so here are the screenshots!
My count is 3 correct in the second set, and 1 correct in the third set, for a total of 4/12 = 33.3%. I definitely didn't expect it to be this high based on what other people are reporting, but maybe it's been improved since the first batch of users. Regardless, this clearly resolves YES.
For eye candy, here's the full-sized version of one of the correct images:
The derived prompt used to generate this particular image was:
A candid photo capturing a moment where a diverse set of individuals, identifiable as nerds by their attire, are deeply engrossed in a brainstorming session. The whiteboard behind them is filled with complex graphs, and the word 'rationalussy' is written largely in a corner.
@chrisjbillington You must be talking about prompts rather than images to get a 23 in the denominator, right? Or can rate limits interrupt a prompt?
@chrisjbillington Are they rolling it out slowly, do you think? I don't see any option for DALL-E 3 on my ChatGPT Plus. Am I missing something?
@EliLifland Yeah it's a slow rollout, I don't have it yet either. Probably has to be, otherwise they'd be swamped.
I'm expecting it to be available to everyone by the 15th, likely earlier, since they promised "early October".
@EliLifland I've been checking occasionally too, and I haven't seen it unlock for me yet. I wonder if they'll send an email announcement, or if it'll just randomly start working one day.
I did see the image upload button pop up in my ChatGPT interface this morning, so some kind of rollout is definitely happening.
@Frogswap 23 images. It sometimes fails to generate 4 images per prompt.
But ChatGPT is coming up with a unique prompt for each image based on your meta-prompt, so at the level of DALL-E 3 itself it was 23 images and 23 promptings. At the level of ChatGPT, it was 6 promptings.
Just got it on the 13th try. That try actually drew two winners: the first one is open to interpretation, but the second one is a clear winner.
So technically 2/52, or 1/52 if we're being strict.
@Nikola Or do we add your results to Chris's, take the new base rate of 1/42, and get ~25.1%? I guess it depends a little on whether you would have commented had you gotten different results
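The ~25.1% figure is consistent with treating each image as an independent trial with a fixed per-image success rate, then asking for the chance of at least one success. Here is a minimal sketch of that arithmetic. It assumes 12 images in the resolution run (three batches of four) and independence between images, neither of which the thread guarantees. The function name `p_at_least_one` is made up for illustration.

```python
def p_at_least_one(per_image_rate: float, n_images: int) -> float:
    """Chance that at least one of n_images is spelled correctly.

    Assumes every image is an independent trial with the same
    per-image success rate (a simplification, not a known fact
    about how DALL-E 3 behaves).
    """
    if not 0.0 <= per_image_rate <= 1.0:
        raise ValueError("per_image_rate must be between 0 and 1")
    if n_images < 0:
        raise ValueError("n_images must be non-negative")
    # P(at least one success) = 1 - P(every image fails)
    return 1.0 - (1.0 - per_image_rate) ** n_images


# Base rate of 1/42 per image, 12 images in the resolution run (assumed).
print(f"{p_at_least_one(1 / 42, 12):.1%}")  # 25.1%
```

Under the same assumptions, a higher observed base rate pushes the estimate up quickly, so the answer depends heavily on whose sample you trust.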
@Frogswap The latter (however many images result in total) is what I had in mind when I wrote the description. So if DALL-E 3's interface produces images in batches of 4 (like DALL-E 2), there would end up being 12 candidate images. Mainly because I don't think it would be fair to the model to reject a perfectly good image just because it was the 11th one generated instead of the 9th.
@NLeseul That means this market should be around 1/3, unless others have a more refined base rate than me
DALL-E 3 is now available via Bing (to my understanding). I'm seeing two correctly spelled images out of 60 generated.
@chrisjbillington Show me one of the correct ones. You cannot hide this beauty from the world
@chrisjbillington It's everything I imagined. Except the dip at the end, what the fuck is that?