Will there be realistic AI generated video from natural language descriptions by the start of 2024?

resolved Jan 9

Resolves yes if there is a model that receives a natural language description (e.g. "Give me a video of a puppy playing with a kitten") and outputs a realistic-looking video matching the description.

It does *not* have to be *undetectable* as AI generated, merely "realistic enough".

It must be able to consistently generate realistic videos >=30 seconds long to count.

DALL-E 2 (https://cdn.openai.com/papers/dall-e-2.pdf) counts as "realistic enough" *image* generation from natural language descriptions. (I am writing this before the model is fully available. If it turns out that all the samples are heavily cherry-picked, DALL-E 2 does not count, but a hypothetical model as good as the cherry-picked examples would.)

Duplicate of https://manifold.markets/vluzko/will-there-be-realistic-ai-generate

Technical AI Timelines

Resolution Pending

New Year's Resolutions 2024

Dammit, it seems I missed by a few months and lost tons of mana because of that.

I should also have bought a lot of yes on the 2025 market, dunno why I didn't do that.

My predictions for this kind of stuff keep being wrong in the same direction by a few months. I should take that into account in the future, I guess.

For the record, I sold all my remaining yes because I think nothing fits the criteria. Most available models don't do coherent 30+ second videos and aren't really realistic.

Duplicate of this market for 2025: https://manifold.markets/vluzko/will-there-be-realistic-ai-generate-476acd1cbfa5

Due to the size of this market I'm going to leave it unresolved for another few days. I am not going to accept or even review single videos or promotional material.

Stable Video Diffusion also doesn't produce long enough videos.

I cannot find any examples from Runway that are longer than a few seconds, so it also does not meet the length bar.

I've reviewed Pika - it does not meet the length criteria (also almost all of the examples I can find are of it editing existing video).

"Will there be realistic AI generated video from natural language descriptions by the start of 2024?"

That kinda already exists. It depends on your standards. There's one that can last an entire minute but it's not really super-coherent.

predictedYES

Does it have to be a single model or would you accept a solution which combines multiple tools, but from your perspective is a black box: receives a text prompt as input and outputs a 30 second video containing roughly what was described (a single concept or scene)?

predictedYES

What’s quite easily possible right now: https://www.instagram.com/p/C0eDxUhtO8a/

Suppose someone wrote a script that asks an LLM to create image descriptions based on the initial prompt, uses them to generate scenes, and then joins those scenes into a video of at least 30 seconds. Should that count?

@MrLuke255 This would count as long as it met the other criteria. As far as I can tell this does not actually exist outside of demo land.
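A minimal sketch of the black-box pipeline proposed in this thread, only to make its shape concrete. Everything here is hypothetical: `describe_scenes` and `generate_clip` are placeholder stubs standing in for an LLM call and a text-to-video model, and the 4-second clip length is an assumed value, not a measurement of any real tool. The only number taken from the market is the 30-second length bar.

```python
import math
from dataclasses import dataclass
from typing import List

MIN_SECONDS = 30.0  # length bar from the market description


@dataclass
class Clip:
    description: str
    seconds: float


def describe_scenes(prompt: str, n_scenes: int) -> List[str]:
    """Hypothetical stand-in for an LLM that expands a prompt into scene descriptions."""
    return [f"{prompt} (scene {i + 1} of {n_scenes})" for i in range(n_scenes)]


def generate_clip(description: str, seconds: float) -> Clip:
    """Hypothetical stand-in for a text-to-video model; returns metadata only."""
    return Clip(description=description, seconds=seconds)


def text_to_video(prompt: str, clip_seconds: float = 4.0) -> List[Clip]:
    """Prompt in, a list of clips out whose joined length reaches MIN_SECONDS."""
    n_scenes = math.ceil(MIN_SECONDS / clip_seconds)
    return [generate_clip(d, clip_seconds) for d in describe_scenes(prompt, n_scenes)]


def total_seconds(clips: List[Clip]) -> float:
    """Length of the joined video, assuming clips are concatenated without overlap."""
    return sum(c.seconds for c in clips)


clips = text_to_video("a puppy playing with a kitten")
print(len(clips), total_seconds(clips))
```

Reaching 30 seconds this way only addresses the length bar. Whether the joined scenes are coherent and realistic is a separate question that this sketch does not attempt to model.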

I won’t lock up my mana for 10%, so I won’t bet, but I am 99.99% certain this will resolve as No.

Pika and Runway are nowhere near satisfying the criteria.

how do u define realistic enough?

@HanchiSun there is information in the description and more information in the comment threads

@vluzko So would a video generator better than DALL-E 2 suffice?

https://twitter.com/pika_labs/status/1729510078959497562 impressively good

predictedNO

if it could do clips that lasted more than 3 seconds with a few simple motions, they'd show them lol

@jacksonpolack it's a 54 second trailer

which has 15 different 3 second clips in it??

@jacksonpolack yeah I was saying they're trying to show off a variety of video styles in a short time so no long clips

@derikk @jacksonpolack and, just to be clear, we have no way of knowing how those promotional clips were made, and that sort of capabilities jump is implausible. The content users are producing with their discord seems way worse.

predictedNO

The gap between DALL-E 3 stills and those short clips is big, but not insurmountable, and I entirely believe those clips are AI. I've also seen better clips than those. It makes sense that AI video will master short clips where not much changes other than camera angle a year before it masters long scenes where people act purposefully, interact, etc.

@jacksonpolack I'd agree that predictions a year in advance are tough. It's these one-month predictions for which I don't see a tenable argument.

predictedYES

https://stability.ai/news/stable-video-diffusion-open-ai-video-model

@cherrvak cool but it looks like they only generate a few frames?

predictedYES

@vluzko well do that a couple times in succession and you’ll get there

@vluzko i would probably accept this quality of video

predictedYES

@vluzko So, you haven’t yet answered whether first generating an image with, for example, Stable Diffusion and then running Stable Video Diffusion a couple of times would be enough to resolve YES. I assumed it would.

@MrLuke255 Are you asking if doing that will resolve the market, or could resolve the market? I would not reject that procedure (I am fine with, say, a composite model that first generates major frames and then interpolates between them). However I seriously doubt that the procedure you described would actually work.
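For a rough sense of what "a couple of times" means in the chaining idea above, here is a back-of-the-envelope sketch. It assumes each call to an image-to-video model returns a fixed number of frames and that each later call is conditioned on the last frame of the previous one, so one frame overlaps. The function name and the specific frame counts and frame rates are illustrative assumptions, not figures from the thread.

```python
import math


def calls_needed(target_seconds: float, fps: int, frames_per_call: int,
                 overlap_frames: int = 1) -> int:
    """Number of chained generation calls needed to cover target_seconds of video."""
    if not 0 <= overlap_frames < frames_per_call:
        raise ValueError("overlap_frames must be in [0, frames_per_call)")
    total_frames = math.ceil(target_seconds * fps)
    if total_frames <= frames_per_call:
        return 1
    # Every call after the first contributes only its non-overlapping frames.
    new_frames_per_call = frames_per_call - overlap_frames
    remaining = total_frames - frames_per_call
    return 1 + math.ceil(remaining / new_frames_per_call)


# Assumed settings: 25 frames per call, 30-second target.
print(calls_needed(30, fps=24, frames_per_call=25))
print(calls_needed(30, fps=6, frames_per_call=25))
```

Under these assumptions, a 30-second video takes 30 chained calls at 24 fps and 8 at 6 fps. The count says nothing about whether quality holds up, since each call starts from the output of the previous one and errors can accumulate, which is consistent with the doubt expressed in the reply above.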