Resolves YES if there is a model that receives a natural language description (e.g. "Give me a video of a puppy playing with a kitten") and outputs a realistic-looking video matching the description.
It does *not* have to be *undetectable* as AI generated, merely "realistic enough".
It must be able to consistently generate realistic videos >=30 seconds long to count.
DALL-E 2 (https://cdn.openai.com/papers/dall-e-2.pdf) counts as "realistic enough" *image* generation from natural language descriptions. (I am writing this before the model is fully available; if it turns out that all the samples are heavily cherry-picked, DALL-E 2 does not count, but a hypothetical model as good as the cherry-picked examples would.)
Duplicate of https://manifold.markets/vluzko/will-there-be-realistic-ai-generate
Dammit, seems I missed by a few months and lost tons of mana because of that.
Should have also bought a lot of YES on the 2025 market, dunno why I didn't do that.
My predictions for this kind of thing keep being wrong in the same direction by a few months; I should take that into account in the future, I guess.
Duplicate of this market for 2025: https://manifold.markets/vluzko/will-there-be-realistic-ai-generate-476acd1cbfa5
What’s quite easily possible right now: https://www.instagram.com/p/C0eDxUhtO8a/
If a script were made that asked an LLM to create image descriptions based on an initial prompt, then used them to generate scenes, which would then be joined into a video of at least 30 seconds, should it count?
@MrLuke255 This would count as long as it met the other criteria. As far as I can tell this does not actually exist outside of demo land.
@jacksonpolack yeah I was saying they're trying to show off a variety of video styles in a short time so no long clips
@derikk @jacksonpolack and, just to be clear, we have no way of knowing how those promotional clips were made, and that sort of capability jump is implausible. The content users are producing on their Discord seems way worse.
The gap between DALL-E 3 stills and those short clips is big, but not insurmountable, and I entirely believe those clips are AI. I've also seen better clips than those. It makes sense that AI video will master short clips where not much changes other than camera angle a year before it masters long scenes where people act purposefully, interact, etc.
@jacksonpolack I'd agree that predictions a year in advance are tough. It's these one-month predictions for which I don't see a tenable argument.
@vluzko So, you haven't answered yet whether first generating an image using, for example, Stable Diffusion and then applying Stable Video Diffusion a couple of times would be enough to resolve YES. I assumed the answer was yes.
@MrLuke255 Are you asking if doing that will resolve the market, or could resolve the market? I would not reject that procedure (I am fine with, say, a composite model that first generates major frames and then interpolates between them). However I seriously doubt that the procedure you described would actually work.
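For concreteness, the composite pipeline discussed above (LLM writes scene descriptions, an image model renders a keyframe per scene, an image-to-video model animates each keyframe, and the clips are concatenated) could be orchestrated roughly like this. This is a minimal sketch: all three model calls are stubbed out with placeholders, and the ~4-second clip length per image-to-video pass is an assumption, not a property of any specific model.

```python
import math

# Assumptions (not real API behavior): each image-to-video pass yields
# about 4 seconds of footage, and the market threshold is 30 seconds.
CLIP_SECONDS = 4.0
TARGET_SECONDS = 30.0


def llm_scene_descriptions(prompt: str, n: int) -> list[str]:
    # Placeholder for an LLM call that expands the prompt into n scene
    # descriptions. A real pipeline would call a chat/completion API here.
    return [f"{prompt} (scene {i + 1} of {n})" for i in range(n)]


def generate_keyframe(description: str) -> str:
    # Placeholder for a text-to-image call (e.g. Stable Diffusion).
    return f"keyframe[{description}]"


def animate_keyframe(keyframe: str) -> float:
    # Placeholder for an image-to-video call (e.g. Stable Video Diffusion).
    # Returns the duration of the generated clip in seconds.
    return CLIP_SECONDS


def build_video(prompt: str) -> tuple[int, float]:
    # How many clips are needed to clear the 30-second bar.
    n_clips = math.ceil(TARGET_SECONDS / CLIP_SECONDS)
    scenes = llm_scene_descriptions(prompt, n_clips)
    # In a real pipeline the clips would be concatenated in scene order;
    # here we just sum their durations.
    durations = [animate_keyframe(generate_keyframe(s)) for s in scenes]
    return n_clips, sum(durations)


n, total = build_video("a puppy playing with a kitten")
```

With these assumptions the pipeline needs 8 clips for 32 seconds of total footage; whether the clips would look coherent when stitched together is exactly the open question in this thread.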
@vluzko If you gave me until the end of the week, I could try to assemble such a pipeline.