Will this Yudkowsky tweet hold up?
533
Ṁ510k
2027
81% chance

On August 26th, Eliezer tweeted (https://twitter.com/ESYudkowsky/status/1563282607315382273):

In 2-4 years, if we're still alive, anytime you see a video this beautiful, your first thought will be to wonder whether it's real or if the AI's prompt was "beautiful video of 15 different moth species flapping their wings, professional photography, 8k, trending on Twitter".

Will this tweet hold up? (The part about AI video generation, not about whether we'll all be dead in 2-4 years.) Giving max date range to be generous.

This market resolves YES if at close (end of 2026) my subjective perception is that this was a good take--e.g., AI-generated video really is that good--and NO if it seems like Eliezer was importantly wrong about something, e.g., AI-generated video still sucks, or still couldn't be the cause for serious doubt about whether some random moth footage was made with a camera or not.

I reserve the right to resolve to an early YES if it turns out Eliezer was obviously correct before the close date. I won't dock points if he ends up having been too conservative, e.g., a new model comes out in 6 months with perfect video generation capabilities.

I guess this market resolves N/A if we all die, but, well, y'know.

Betting policy: I will not bet in this market (any more than I already have, and I've long sold all my shares).


Not sure I see how people are so confident on this.

From @journcy

To me, this tweet will have stood the test of time if it is in some sense embedded in the zeitgeist that a level of skepticism is warranted for essentially any video you see online. (With obvious caveats about things you know existed prior to the 2020s, etc.)

I don't think Sora or others are close to generating many types of common video with any degree of realism. Take this example: "a highlight of an NFL play where an interception is returned for a touchdown". I don't have access to Sora, but I really doubt it can get the motions of all 22 players, the physics of the ball being thrown, the quick changes in direction, etc. right at any level of detail. It's possible it will get there, but it seems to me that would be part of the resolution criteria, and we're not particularly close...

bought Ṁ100 YES

I'm already at the point where I double-check to make sure short political videos aren't AI.

(How do I include my bet as part of my comment?)

AI-generated video gets >11k upvotes on Reddit: https://www.reddit.com/r/interestingasfuck/comments/1gh3is9/mothers_love_is_universal/

We're pretty much there already.

bought Ṁ20 YES

I have no more liquidity so I'll just comment here that this market is insanely low. It's funny how bad you guys are. It should be like 97%, 3% for nuclear war

bought Ṁ100 NO

Maybe this is nitpicking, but for me the difficulty comes in at the point where you're supposed to be able to tell the AI to make 15 of something, and it remembers what 15 is long enough to generate the first 14.

Would you resolve no if people stop caring about whether or not something is AI generated?

This market resolves YES if at close (end of 2026) my subjective perception is that this was a good take--e.g., AI-generated video really is that good

Based on this, my understanding is that the tweet holds up if AI video is so good that it is difficult to tell apart from real footage, regardless of how people feel about it.

Here's someone generating a butterfly with Sora:

https://twitter.com/Aibot_App/status/1763076397810299017

What do you think? Close but not quite there yet, IMO.

bought Ṁ125 NO

As I understand it, this market only resolves YES if there's realistic video generation AND no mechanism in place to reliably tell apart AI-generated from real video on Twitter.

I think that conditioning on realistic video, there's like at least a 20% chance civilization is forced to find a way to mark videos on Twitter as either real or AI-generated, so this question should be trading at most around 75%.

bought Ṁ80 NO

@Nikola This question is not directly measuring AI video generation. Instead, it's measuring how much doubt people will feel when they look at a video on twitter. And there are ways to make the amount of doubt low!

Well, he suggested there was any kind of real chance we wouldn't be alive by 2026, so no.

He does not count Sora yet.

Not resolving yet, because Sora is very new, but I'll be straightforward and say that at the moment I'm thinking I'll be pretty surprised if it takes until the end of 2026 for this market to resolve.

@journcy Great market! Can you clarify the resolution criteria? To condense some of the questions raised in this and related markets:

  1. Eliezer's tweet refers to a single prompt describing "15 different moth species," so I take this as requiring the AI to be able to take a single prompt and produce the video in the tweet. (Of course, the AI could internally generate clips and put them together, but a human can't be the one selecting and assembling.) Is that correct, or are you just requiring the system to be able to generate an individual clip of comparable quality to the individual clips in the moth video?

  2. Who needs to be tricked, and for how long? Eliezer's tweet refers to "anytime you see a video this beautiful." I take "you" to mean something like Eliezer's average Twitter follower, but you could instead mean the average Manifold user, the average human, etc. You could mean someone's first reaction (e.g., first quarter-second), first tangible thoughts (e.g., first two seconds), enough time to make a brief Twitter reply (e.g., first ten seconds), etc. More time means more opportunity to spot issues like those in DALL-E 3 or Midjourney 6, though people may still have an 'eerie' suspicion within the first second or two, even without scaling up the video.

  3. How often do they need to be tricked? I take "anytime" to be a pretty strong statement. Does this require such AI videos be pretty easy to produce, or will just one or two cherry-picked examples shared by a company in a showcase like the Sora announcement count?

  4. Does YES resolution require people to be actually questioning videos they see out in the real world? E.g., if the current Sora videos counted, because there are only a limited number, presumably people like Eliezer's followers will not be tricked for very long because they will have seen them. So do people need to actually be questioning the beautiful videos they see in their feeds?

I could ask more, but I'll leave it there—thanks!

@Jacy I'm not planning on laying out a very detailed or precise set of criteria--I think I was pretty explicit in the description that this market is about my own personal take on the original tweet.

That being said, I can elaborate a bit on how I'm thinking about this, with the rider that I don't consider any of these notes to be binding.

To me, this tweet will have stood the test of time if it is in some sense embedded in the zeitgeist that a level of skepticism is warranted for essentially any video you see online. (With obvious caveats about things you know existed prior to the 2020s, etc.)

The central claim Eliezer makes IMO is not so much about the quality of videos being generated as it is the mindset people will be in given what capabilities are available and in use. Where, yeah, "people" is kinda vague; probably something closer to "person following EY on Twitter" than "literally any human alive" or even "any English-speaking person," but also if it's literally just LessWrongers or whatever being paranoid that might be kind of on the fence for me.

To perhaps illustrate, I still haven't even watched the whole moth video--I saw this tweet back in August 2022, maybe watched a couple seconds of the video, said "oh my god no I/we/they won't be thinking that," and came over to Manifold in a huff to make a market so I could hold EY accountable for what to me seemed like a bad take. (cf. the several hundred mana I've lost buying and holding NO before I decided against trading here.) And I guess good thing I did, since it's starting to look pretty clear my skepticism was wildly overconfident.

bought Ṁ50 NO at 88%
bought Ṁ50 NO

@journcy thanks, that helps!

sold Ṁ21 NO

@journcy How does this resolve if the videos themselves are super realistic but we have watermarking that's good enough that most people aren't fooled by AI videos? I'm imagining a scenario where, without watermarks/disclaimers/AI labels/warnings, the videos would pass as realistic to someone in the year 2022, but the watermarks themselves reveal that it's AI?

It's unclear to me whether it's in the spirit of the question to measure AI video generation progress, or the actual claim that "your first thought will be to wonder whether it's real". No need to have that thought if there's a big watermark telling you it's not real.

@Nikola I think if watermarking is established comprehensively enough that you don't have to wonder if something was AI-generated, then the tweet doesn't hold up. My understanding is EY thinks watermarking is a good idea but doesn't expect it to actually happen due to coordination failure; if we somehow pull it off, 2022!Eliezer is probably surprised and his prediction is probably falsified.

@journcy @Nikola it's also possible that, as you can do today, you can just read the comments section on popular social media videos and almost always see a consensus on whether it's AI-generated. There are also platform-specific labels, such as Twitter/X's community notes.

bought Ṁ500 YES

Resolves YES now that Sora is out and already generating videos like this: https://x.com/duborges/status/1758196706733068326?s=20 (Edited to say: resolves soon, because we're well on our way.)

@shankypanky that is nowhere near the difficulty of the moth video.

bought Ṁ10 NO

@Jacy agreed; do any of these look real to you for even a second?

@Adam actually, I think there are a lot of one-second clips you could make from the showcased Sora videos that Eliezer Yudkowsky's average Twitter follower would be quite uncertain whether they're real or AI—at least in the first second or two—but the moth video is vastly harder to generate than those clips.

@Jacy Agreed - this increases my probability a lot, but it's not at the point that the original Tweet describes. At the very least, give it a few days to see that it can give good results across a range of "videos this beautiful".

@Jacy yeah; these do seem like they may be good enough to fool yud and his crowd.

@Jacy seems similar to me. What difference are you referring to? @Adam I guess I'm in that crowd and am already fooled 🥴

@AndrewMcKnight I made a comment in the 2024 version of the market pointing out some, but the list could go on, and there are a lot of different ways to taxonomize.
