On August 26th, Eliezer tweeted
(https://twitter.com/ESYudkowsky/status/1563282607315382273):
In 2-4 years, if we're still alive, anytime you see a video this beautiful, your first thought will be to wonder whether it's real or if the AI's prompt was "beautiful video of 15 different moth species flapping their wings, professional photography, 8k, trending on Twitter".
Will this tweet hold up? (The part about AI video generation, not about whether we'll all be dead in 2-4 years.) Giving max date range to be generous.
This market resolves YES if at close (end of 2026) my subjective perception is that this was a good take--e.g., AI-generated video really is that good--and NO if it seems like Eliezer was importantly wrong about something, e.g., AI-generated video still sucks, or still couldn't be the cause for serious doubt about whether some random moth footage was made with a camera or not.
I reserve the right to resolve to an early YES if it turns out Eliezer was obviously correct before the close date. I won't dock points if he ends up having been too conservative, e.g., a new model comes out in 6 months with perfect video generation capabilities.
I guess this market resolves N/A if we all die, but, well, y'know.
Betting policy: I will not bet in this market (any more than I already have, and I've long sold all my shares).
I'm already at the point where I double-check to make sure short political videos aren't AI.
(How do I include my bet as part of my comment?)
AI-generated video gets >11k upvotes on Reddit: https://www.reddit.com/r/interestingasfuck/comments/1gh3is9/mothers_love_is_universal/
We're pretty much there already.
This market resolves YES if at close (end of 2026) my subjective perception is that this was a good take--e.g., AI-generated video really is that good
Based on this, my understanding is that the tweet holds up if AI video is good enough to be difficult to tell apart from real footage, regardless of how people feel about it.
Here's someone generating a butterfly with Sora:
https://twitter.com/Aibot_App/status/1763076397810299017
What do you think? Close but not quite there yet, IMO.
As I understand it, this market only resolves YES if there's realistic video generation AND no mechanism in place to reliably tell apart AI-generated from real video on Twitter.
I think that conditioning on realistic video, there's like at least a 20% chance civilization is forced to find a way to mark videos on Twitter as either real or AI-generated, so this question should be trading at most around 75%.
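To make the arithmetic behind that ceiling explicit, here is a rough sketch. It assumes YES requires both realistic video and no reliable labeling, so P(YES) = P(realistic) × (1 − P(labeling | realistic)). The function names are hypothetical and the inputs are just the numbers from the comment above, not independent estimates.

```python
def implied_yes_ceiling(p_realistic: float, p_labeling_given_realistic: float) -> float:
    """P(YES) under the assumption that YES needs realistic video AND no reliable labeling."""
    for p in (p_realistic, p_labeling_given_realistic):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_realistic * (1.0 - p_labeling_given_realistic)


def implied_p_realistic(p_yes: float, p_labeling_given_realistic: float) -> float:
    """Back out the P(realistic) that a given market price would imply."""
    if not 0.0 <= p_yes <= 1.0:
        raise ValueError("p_yes must lie in [0, 1]")
    if not 0.0 <= p_labeling_given_realistic < 1.0:
        raise ValueError("labeling probability must lie in [0, 1)")
    return p_yes / (1.0 - p_labeling_given_realistic)


if __name__ == "__main__":
    # Even if realistic video were certain, a 20% labeling chance caps YES at 80%.
    print(round(implied_yes_ceiling(1.0, 0.2), 4))
    # A 75% price with a 20% labeling chance implies this much belief in realistic video.
    print(round(implied_p_realistic(0.75, 0.2), 4))
```

Under these assumptions, a 20% labeling chance caps the market at 80% even if realistic video is certain, and a 75% price corresponds to roughly a 94% chance of realistic video.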
@Nikola This question is not directly measuring AI video generation. Instead, it's measuring how much doubt people will feel when they look at a video on twitter. And there are ways to make the amount of doubt low!
@journcy Great market! Can you clarify the resolution criteria? To condense some of the questions raised in this and related markets:
Eliezer's tweet refers to a single prompt describing "15 different moth species," so I take this as requiring the AI to be able to take a single prompt and produce the video in the tweet. (Of course, the AI could internally generate clips and put them together, but a human can't be the one selecting and assembling.) Is that correct, or are you just requiring the system to be able to generate an individual clip of comparable quality to the individual clips in the moth video?
Who needs to be tricked, and for how long? Eliezer's tweet refers to "anytime you see a video this beautiful." I take "you" to mean something like Eliezer's average Twitter follower, but you could instead mean the average Manifold user, the average human, etc. You could mean someone's first reaction (e.g., first quarter-second), first tangible thoughts (e.g., first two seconds), enough time to make a brief Twitter reply (e.g., first ten seconds), etc. More time means more opportunity to spot issues like those in DALL-E 3 or Midjourney 6, though people may still have an 'eerie' suspicion within the first second or two, even without scaling up the video.
How often do they need to be tricked? I take "anytime" to be a pretty strong statement. Does this require such AI videos be pretty easy to produce, or will just one or two cherry-picked examples shared by a company in a showcase like the Sora announcement count?
Does YES resolution require people to be actually questioning videos they see out in the real world? E.g., if the current Sora videos counted, because there are only a limited number, presumably people like Eliezer's followers will not be tricked for very long because they will have seen them. So do people need to actually be questioning the beautiful videos they see in their feeds?
I could ask more, but I'll leave it there—thanks!
@Jacy I'm not planning on laying out a very detailed or precise set of criteria--I think I was pretty explicit in the description that this market is about my own personal take on the original tweet.
That being said, I can elaborate a bit on how I'm thinking about this, with the rider that I don't consider any of these notes to be binding.
To me, this tweet will have stood the test of time if it is in some sense embedded in the zeitgeist that a level of skepticism is warranted for essentially any video you see online. (With obvious caveats about things you know existed prior to the 2020s, etc.)
The central claim Eliezer makes IMO is not so much about the quality of videos being generated as it is the mindset people will be in given what capabilities are available and in use. Where, yeah, "people" is kinda vague; probably something closer to "person following EY on Twitter" than "literally any human alive" or even "any English-speaking person," but also if it's literally just LessWrongers or whatever being paranoid that might be kind of on the fence for me.
To perhaps illustrate, I still haven't even watched the whole moth video--I saw this tweet back in August 2022, maybe watched a couple seconds of the video, said "oh my god no I/we/they won't be thinking that," and came over to Manifold in a huff to make a market so I could hold EY accountable for what to me seemed like a bad take. (cf. the several hundred mana I've lost buying and holding NO before I decided against trading here.) And I guess good thing I did, since it's starting to look pretty clear my skepticism was wildly overconfident.
@journcy How does this resolve if the videos themselves are super realistic but we have watermarking that's good enough that most people aren't fooled by AI videos? I'm imagining a scenario where, without watermarks/disclaimers/AI labels/warnings, the videos would pass as realistic to someone in the year 2022, but the watermarks themselves reveal that it's AI?
It's unclear to me whether it's in the spirit of the question to measure AI video generation progress, or the actual claim that "your first thought will be to wonder whether it's real". No need to have that thought if there's a big watermark telling you it's not real.
@Nikola I think if watermarking is established comprehensively enough that you don't have to wonder if something was AI-generated, then the tweet doesn't hold up. My understanding is EY thinks watermarking is a good idea but doesn't expect it to actually happen due to coordination failure; if we somehow pull it off, 2022!Eliezer is probably surprised and his prediction is probably falsified.
Resolves YES now that Sora is out and already generating videos like this: https://x.com/duborges/status/1758196706733068326?s=20 (Edited to say: resolves soon, because we're well on our way.)
@Adam actually, I think there are a lot of one-second clips you could make from the showcased Sora videos that Eliezer Yudkowsky's average Twitter follower would be quite uncertain whether they're real or AI—at least in the first second or two—but the moth video is vastly harder to generate than those clips.
@Jacy Agreed - this increases my probability a lot, but it's not at the point that the original Tweet describes. At the very least, give it a few days to see that it can give good results across a range of "videos this beautiful".
@Jacy seems similar to me. What difference are you referring to? @Adam I guess I'm in that crowd and am already fooled 🥴
@AndrewMcKnight I made a comment in the 2024 version of the market pointing out some, but the list could go on, and there are a lot of different ways to taxonomize.