This post shows an SVG Xbox controller which is quite well-drawn, supposedly created by a "mystery model." Based on the OP's comments in the replies, it is very likely GPT-4.5:
https://x.com/NotBrain4brain/status/1894285365969584303

Whenever I have access to GPT-4.5, I will ask it "Write SVG code that draws an Xbox controller." (I may also ask a volunteer who has the relevant OpenAI subscription to do this.)
I will render the SVG at https://www.svgviewer.dev/, making any slight modifications needed to get it to run.
If the result looks reasonably close in quality to the "mystery model" image above, I will resolve YES. If it's unclear, I may run the experiment multiple times and choose the best option, since the image above was probably somewhat cherry-picked. If I'm still not sure, I will consider putting it to a poll. Essentially, I aim for the market to resolve YES if it's reasonable to believe that GPT-4.5 generated the top SVG image above.
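A rough way to pre-check a model response before pasting it into the viewer is sketched below. It is only an illustration of the "get it to run" step, not part of the resolution criteria: the helper names `extract_svg` and `is_renderable_svg` are hypothetical, and the check only confirms that the output parses as XML with an `<svg>` root. It says nothing about how good the drawing looks.

```python
import re
import xml.etree.ElementTree as ET

# Namespace prefix ElementTree adds to tags when the SVG declares xmlns.
SVG_NS = "{http://www.w3.org/2000/svg}"


def extract_svg(response_text):
    """Return the first <svg>...</svg> span in a model response, or None.

    Assumption: the response contains at most one top-level SVG and no
    nested <svg> elements (the non-greedy match stops at the first </svg>).
    """
    match = re.search(r"<svg\b.*?</svg>", response_text, re.DOTALL | re.IGNORECASE)
    return match.group(0) if match else None


def is_renderable_svg(svg_text):
    """Rough check: the text parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False
    return root.tag in ("svg", SVG_NS + "svg")
```

If `is_renderable_svg` returns `False`, the output probably needs one of the "slight modifications" mentioned above (for example, closing an unclosed tag) before it will render.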
Update 2025-03-05 (PST) (AI summary of creator comment): After a few days:
If no generated SVG output is produced that is of comparable quality to the mystery model's image within a few days of experimentation, the market will be resolved NO.
This update sets a clear timeframe for concluding the experiment in favor of a NO resolution.
Update 2025-03-06 (PST) (AI summary of creator comment): Prompt Engineering Exclusion:
Only outputs generated using the original, one-line prompt (without additional prompt engineering) will be considered for a YES resolution.
Even if prompt engineering produces an SVG closer in quality to the reference, those results will be disregarded.
@joanna Interesting that that changes the result so much, though it's still pretty far from the example image in the description. Also, the description is very clear about the prompt, IMO, so I don't think any prompt engineering results should count.
@MingCat Yeah, according to the resolution criteria in the question it should probably resolve NO, but that doesn't tell us whether this image was actually from GPT-4.5 with some prompt engineering, and we don't know whether the original author prompt-engineered it. It seems plausible that this was from GPT-4.5, albeit not producible with a one-line prompt.
@joanna Yeah, it's probably fairest to resolve to NO without considering attempts at prompt engineering. I'm curious what the story is with the Twitter thread - were they using a better prompt, exaggerating significantly, or just lying? But it seems clear that the market should resolve to NO.
https://chatgpt.com/canvas/shared/67c69f952e088191a36180e710c8a8c3
https://chatgpt.com/canvas/shared/67c69fb6ec4881919393567df7eace87
https://chatgpt.com/share/67c6a2c3-0f24-8011-b30c-200ab0e20aea
I tried several times; none of the results are of comparable quality to the mystery model's controller.
@Sketchy Yeah, I will definitely resolve NO if nobody can make it produce images better than that after a few days. I asked the OP of the Twitter thread if they have anything to say about it.
I'm curious what the probability would be if I hadn't put a ton of liquidity into this market. Probably a lot higher, considering every person but one bought YES. It seems counterintuitive that putting more mana into the market makes it less accurate and less interesting... maybe there should be a way to drip-feed liquidity into a market or only add it as needed.
@CDBiddulph Personally, I suspect 1k liquidity might be the optimum, because if it's much higher than that, the rule of "don't bet more than you're willing to lose" pretty much always vastly dominates over "don't overcorrect the odds".