Will Stable Diffusion 3 be consistently capable of generating the word "rationalussy"?
Resolved NO on Apr 17

I haven't the slightest idea what this word is supposed to mean, but let's find out if Stable Diffusion 3 can spell it correctly!

Whenever Stable Diffusion 3 is available to me or someone willing to test, I'll run the following prompt:

A photograph of a group of nerds standing around a whiteboard, which is covered in complex graphs. The word "rationalussy" is written in large letters in one corner of the whiteboard.

I will repeat this enough times to generate 5 images. If the word "rationalussy" appears with the correct spelling in a majority of the resulting images, this market will resolve YES. Otherwise, it will resolve NO.

For comparison, here's an example of DALL-E 2's performance with that prompt:

Considerations:

  • If one or two letters are obscured by foreground characters or objects (as in the example image above), but the majority of the word is spelled correctly and the intent is reasonably clear, I'll try to give the model the benefit of the doubt.

  • If the word "rationalussy" is flagged by safety filters or whatever, and the model refuses to generate an image for it, the market will resolve NO. (I'll try a couple of alternate prompts if this happens.)

  • I will not be betting in this market.
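The resolution rule above can be sketched in a few lines. This is only an illustration of the majority criterion as described; the function name and the convention of recording a refused generation as `None` are assumptions made for the example, not part of the market's actual process.

```python
from typing import Optional, Sequence


def resolve_market(results: Sequence[Optional[bool]]) -> str:
    """Return "YES" or "NO" for a batch of judged images.

    Each entry is True (word spelled correctly, after any benefit of the
    doubt), False (misspelled), or None (hypothetical marker for a refused
    generation, which counts as a failure).
    """
    # A strict majority of the batch must be correct: 3 of 5 here.
    needed = len(results) // 2 + 1
    correct = sum(1 for r in results if r is True)
    return "YES" if correct >= needed else "NO"


if __name__ == "__main__":
    print(resolve_market([True, True, True, False, False]))
    print(resolve_market([True, True, None, False, False]))
```

The judgment calls (obscured letters, near-miss spellings) still happen by eye before any value is recorded; the sketch only covers the counting step.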

Previous market, with DALL-E 3: /NLeseul/will-dalle-3-be-capable-of-generati

Note that the previous market only required 1 correct image out of 10, whereas this one requires 3 out of 5.
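To get a feel for how much stricter the 3-of-5 threshold is than the 1-of-10 one, here is a rough sketch that treats each image as an independent attempt with the same success probability `p`. That independence assumption, the function name, and the sample values of `p` are all illustrative only; nothing here is measured from either model.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of at least k successes in n independent tries,
    each succeeding with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical per-image success rates, chosen only for illustration.
    for p in (0.1, 0.2, 0.5):
        one_of_ten = prob_at_least(1, 10, p)
        three_of_five = prob_at_least(3, 5, p)
        print(f"p={p}: 1-of-10 -> {one_of_ten:.3f}, 3-of-5 -> {three_of_five:.3f}")
```

Under these assumptions, a model that spells the word correctly in one image out of five (p = 0.2) would pass the 1-of-10 rule about 89% of the time but the 3-of-5 rule only about 6% of the time.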


Image 1 (failure):

Image 2 (failure):

Image 3 (... failure, I think? It's arguable that the "n" is close enough):

Image 4 (failure):

Image 5 (failure):

Here is my exact methodology:

Started with this notebook:

https://colab.research.google.com/github/stability-ai/stability-sdk/blob/main/nbs/Stable_Image_API_Public.ipynb

  • Edited the "host" to the sd3 endpoint in the Stable Image cell, after running the initial code.

  • Ran with the screenshotted configuration, arbitrarily updating the seed each time.

I'm disappointed in how it did. I had higher expectations given the sample images. I'll resolve the other markets soon, but I want to give them some time to trade based on this, just for fun.