When will I mistake an AI-generated short story for a story from the New Yorker, in at least 1 of 5 tries?

2029

2023

2024

32%

2025

26%

2026

11%

2027

24%

Other

Each year, sometime in December, I will ask a trusted person (called the "curator") to pick 5 short stories from the New Yorker. For each story, they will generate a short story with the same opening using an AI text generator. I will read each pair of stories and guess which one is AI-generated. If I guess wrong at least once, the market will resolve to the current year.
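As a rough illustration of what "at least 1 of 5" implies, the sketch below assumes each pair fools the reader independently and with the same probability. That independence assumption and the names `p_fooled` and `p_at_least_one_miss` are illustrative only and are not part of the market rules. Under that assumption, the chance of at least one wrong guess is 1 - (1 - p)^5.

```python
def p_at_least_one_miss(p_fooled: float, n_pairs: int = 5) -> float:
    """Chance of at least one wrong guess across n_pairs pairs.

    Assumes every pair is an independent trial with the same
    probability p_fooled of a wrong guess (an illustrative assumption).
    """
    if not 0.0 <= p_fooled <= 1.0:
        raise ValueError("p_fooled must be between 0 and 1")
    # Complement of "every one of the n_pairs guesses is right".
    return 1.0 - (1.0 - p_fooled) ** n_pairs


if __name__ == "__main__":
    for p in (0.05, 0.10, 0.25, 0.50):
        print(f"p_fooled={p:.2f} -> P(at least one miss)={p_at_least_one_miss(p):.3f}")
```

Even a modest per-pair chance compounds over five pairs. At the extreme, a reader guessing at random (p = 0.5) would miss at least once about 97% of the time.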

When to run the experiment:

Between December 1 and 31 of the year in question, I will run a test version of the experiment by myself with a few short stories.
If I judge the stories generated by the AI to be very obviously AI-written, I may skip the experiment.
If I have already run the experiment or decided not to run it, and then an improved AI text generator comes out during the month of December, I may re-run the experiment, but will not be obligated to.
If I am unable to run the experiment, I will find a trusted person without a stake in this market to do it instead. They will pick their own curator.

Process of the experiment:

I will pick what I judge to be the best available AI text generator that can generate 5 stories for less than $50, requiring less than two hours to set up and generate all 5 stories.
The curator will look at the New Yorker's most recent fiction (https://newyorker.com/magazine/fiction/). If the fiction is no longer available or we run out of eligible stories, we will choose a suitable alternative.
For each story, starting with the most recent, we will follow this process until we have found 5 eligible stories:
- If the story is fewer than 4,000 words or more than 8,000 words, was generated by an AI, has a visual component such as pictures that are crucial to the story, or has anything else that would make it easy to tell that it comes from the New Yorker, we will skip the story. Otherwise, we will use it.
For each story, the curator will do the following to generate another story:
- The prompt will start as follows:
  - Continue the story based on the opening below. The story should be good enough to be published in the New Yorker and should be <WORD COUNT> words long. Repeat the opening first, then continue from there.
  - <WORD COUNT> is the number of words in the human-written story.
- The curator will also append the first paragraph(s) of the original story, just enough to make 100 words or more.
- If necessary, the curator may retry, change the prompt, or cut parts of the responses in order to get a satisfactory answer that is more than 4,000 words and less than 8,000 words. For instance, if the AI is accessed via a chat interface with limited text per response, they might need to add a prompt like "At the end of each response, tell me CONTINUE if you want to continue, or STOP if the story is done."
- I will not see the prompt until after the experiment.
The curator will paste both stories into a Google Doc, using a computer to randomly choose their order.
I will read the two stories and guess which one was AI-generated.
If I am wrong about any of the 5 pairs of stories, the market resolves to the current year.
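The mechanical parts of the process above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the curator's actual tooling: all function names are hypothetical, words are counted by splitting on whitespace, and paragraphs are assumed to be separated by blank lines. Only the length rule is automated here; the other skip criteria need human judgment.

```python
import random
from typing import List, Tuple

MIN_WORDS = 4_000
MAX_WORDS = 8_000
MIN_OPENING_WORDS = 100

PROMPT_TEMPLATE = (
    "Continue the story based on the opening below. The story should be good "
    "enough to be published in the New Yorker and should be {word_count} words "
    "long. Repeat the opening first, then continue from there."
)


def word_count(text: str) -> int:
    # Assumption: a "word" is any whitespace-separated token.
    return len(text.split())


def is_eligible_length(story: str) -> bool:
    # Skip stories under 4,000 or over 8,000 words.
    return MIN_WORDS <= word_count(story) <= MAX_WORDS


def opening(story: str) -> str:
    """Take whole paragraphs from the start until there are 100+ words."""
    picked: List[str] = []
    for para in story.split("\n\n"):
        if not para.strip():
            continue
        picked.append(para.strip())
        if word_count(" ".join(picked)) >= MIN_OPENING_WORDS:
            break
    return "\n\n".join(picked)


def build_prompt(story: str) -> str:
    # <WORD COUNT> is the length of the human-written story.
    header = PROMPT_TEMPLATE.format(word_count=word_count(story))
    return header + "\n\n" + opening(story)


def shuffled_pair(human: str, ai: str, rng: random.Random) -> Tuple[List[str], int]:
    """Return both stories in random order, plus the index of the AI story."""
    pair = [("human", human), ("ai", ai)]
    rng.shuffle(pair)
    ai_index = [label for label, _ in pair].index("ai")
    return [text for _, text in pair], ai_index


def fooled_at_least_once(guesses: List[int], ai_indices: List[int]) -> bool:
    # The market resolves to the current year if any guess is wrong.
    return any(g != a for g, a in zip(guesses, ai_indices))
```

Keeping `ai_index` separate from the shuffled texts mirrors the requirement that the reader sees only the two stories, while the curator keeps the answer key.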

I will not trade in this market, nor will any curator I choose.

This market is based on the format of https://manifold.markets/CalebBiddulph/when-will-i-mistake-an-aigenerated.

@CDBiddulph it's time for your 2024 attempt?

@Odoacre Oof, it is. I didn't get around to having a curator put together the story pairs for me in December, and I guess it's not going to happen at this point.

I did generate two of the stories myself with Claude and am pretty convinced I wouldn't be fooled by either of them, although they're not as obviously bad as last year's. Reading 10 stories is more of a time investment than I thought it would be, and even though I don't really think I would be fooled, I don't really have time to formally attempt it this year. I also feel bad asking someone to do the tedious work of copying and pasting the New Yorker stories, removing all the formatting artifacts, handling the CONTINUE and STOP keywords from the LLM, etc. In 2025, there will be less friction if I can reliably get an AI like Operator to do that work for me, and the stories are probably going to be a lot better next year, so I feel that it's pretty likely I'll actually do the work this December. Not the greatest excuses, but there you go. I'm going to resolve 2024 to NO.

As a test run, I prompted GPT-4 to complete 3 example stories from the New Yorker. Here are their closing paragraphs, which I consider to all be very obviously AI-generated. The thing about today's LLMs writing fiction is that they are VERY heavy-handed with symbolism and conclude in a very pat, cheesy way. If nothing miraculous happens by December 31, I will consider 2023 to resolve NO.

"The opal ring now lives with us, not locked away in a box, but worn, loved, and shared. We wear it on special occasions, but also on ordinary days – on days when we want to feel close to our parents, on days when we want to remember the strength of their love for each other and for us. It’s a constant reminder that love transcends life and death, that stories can keep the essence of the departed alive, and that even in loss, we can find treasures that can last a lifetime. And in the end, isn't it the story that makes the treasure?"

"And though the world outside her windows and mirrors might never understand, for Angela, it's enough. It's her ritual, her redemption, her reflection. And in the end, isn't that all we are? A reflection of our choices, our experiences, our memories. For now, Angela chooses to live in her reflections, in the dance of light and shadow on the glass, in the dust motes swirling in the sunbeam, in the scent of jasmine wafting in from her neighbor's garden. And perhaps, in time, she will choose to step out, to leave the confines of her mirrors and windows. But for now, she is content. She is seen. She is here."

"Before that year, I knew nothing about Colombia—nothing real. But by the end of it, I had learned a great deal. I learned about its history, its people, and their struggles. But more than that, I learned about the power of stories and the complexities of life. That year shaped me. It made me who I am today—a curious observer, a relentless questioner, and a storyteller at heart. And for that, I am forever grateful."

Have you tried this experiment before? If so, what were the results?

How much do you read the New Yorker currently (e.g. how familiar are you with their editorial choices/writing style)?

@RobertCousineau I haven't gotten to try it with GPT-4 yet; I've tried a few times with ChatGPT 3.5 or Bard. Some individual paragraphs would be fairly passable on their own, but I would always get some paragraphs like this, which feel obviously written by an LLM. It's as if a marketing copywriter wrote fiction:

"As the evening unfolds, Angela revels in the connection with friends old and new. She finds herself laughing, sharing, and embracing the present moment. She is surrounded by a community that she has built, a testament to the power of art to connect, to heal, and to create beauty from even the most broken parts of our lives."

Probably the clearest sign that these stories were written by LLMs is that the overall plot doesn't really have anything interesting to say - everything occurs very predictably. I'll try it with GPT-4 in December, but I'm not optimistic that I'll even run the experiment this year.

I've never really read the New Yorker before; I just chose it as the most well-known place to find short stories.