Will there be realistic AI generated video from natural language descriptions by the start of 2024?
113
closes Jan 1
25% chance

Resolves yes if there is a model that receives a natural language description (e.g. "Give me a video of a puppy playing with a kitten") and outputs a realistic-looking video matching the description.

It does *not* have to be *undetectable* as AI generated, merely "realistic enough".

It must be able to consistently generate realistic videos >=30 seconds long to count.

DALL-E 2 (https://cdn.openai.com/papers/dall-e-2.pdf) counts as "realistic enough" *image* generation from natural language descriptions. (I am writing this before the model is fully available; if it turns out that all the samples are heavily cherry-picked, DALL-E 2 does not count, but a hypothetical model as good as the cherry-picked examples would.)

Duplicate of https://manifold.markets/vluzko/will-there-be-realistic-ai-generate

Sophus Corry predicts NO

Seriously what do people know that I don't? 3 months?

Harry

@Harry_ obviously not long enough

Yoav Tzfati predicts NO

@Harry_ I wouldn't consider this close to realistic. I think there's still a lot of work to be done

Julian predicts NO

@Harry_ These prompts are also total softballs; even if they did look realistic, it would still be insufficient, since it would constitute cherry-picking the easiest cases.

Ace bought Ṁ100 of NO

In addition to softball prompts and lacking realism, I think people are taking these "panning" and "zooming" videos way too seriously. Having multiple shots, complex actions (e.g., a puppy playing with a kitten), etc. is just clearly so much harder than the cropping or zooming of an image.

Julian predicts NO

@JacyAnthis We've had machine-generated panning and zooming that look a lot better than this for a while now, too.

Same goes for effects like fire, water splashes, glowing, etc. where there doesn't really need to be a whole lot of coordination of fine details over time.

Yoav Tzfati predicts NO

@RahulShah it says the tweet has been deleted

Justified use of Fallibilism bought Ṁ18 of YES

This convinced me AI video is coming: "Pepperoni Hug Spot - AI Made TV Commerical" (YouTube)

Yoav Tzfati predicts NO

@TimothyCurrie That is definitely the most creepy thing I've seen this week 😆

Yoav Tzfati predicts NO

More concretely: for me, this is definitely NEGATIVE evidence for this market. The video is terrible, and I'm sure each prompt resulted in just a couple of seconds of video that the creator then had to stitch together.

Julian predicts NO

@TimothyCurrie I seriously doubt that script was machine-generated. It has that same affectation as a bunch of fake "AI generated comedy" where it has the big picture correct but screws up a lot of little details, which is exactly the opposite of the mistakes that a text generator will make. GPT doesn't say things like "are you ready for best pizza of life" because those small-scale errors are the easiest to avoid. When GPT makes a mistake it's a larger-scale one, where fixing it would require looking at a much larger section of text; for example, when it repeats itself over the course of a few paragraphs.

This can be pretty easily verified by just going to ChatGPT, asking it for a pizza commercial script, and seeing that it looks nothing like this.

Jon Simon

@Julian 100%, the text is undoubtedly human-written to "sound" computer-generated. But then again, that's not really the point; the point is the rapidly improving quality of the videos.

Julian bought Ṁ51 of NO

Clarification: if the model fails in a large majority of cases but succeeds in some subset of particularly easy cases (probably ones that don't have a whole lot of movement outside of the camera panning) this should still count as NO, correct? Since that would be equivalent to the "cherry-picking" you describe for DALL-E 2?

i.e. the model must succeed across a wide variety of prompts

firstuserhere bought Ṁ20 of YES

Q1 of 2023 coming to a close and we had this today:

Hugging Face:

https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis/tree/main

## Example clips:
Star Wars clip using the text-to-video model:
https://twitter.com/victormustar/status/1637461621541949441

Is this a llama clip xD? https://twitter.com/justincmeans/status/1637517337426550785

Shark skiing across the desert! https://twitter.com/hanyingcl/status/1637424841950392321

Astronaut riding a horse, surfing Spider-Man: https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis

Gabrielle predicts NO

@firstuserhere It's hard to say what "realistic enough" means, but to me those are all pretty far from the bar still. There's a ton of blur and shifting motion that seems like it should be theoretically easy to fix (maybe an extra processing step?), but right now it's definitely not there.

firstuserhere predicts YES

@Gabrielle Oh yeah, no way it's realistic enough, but it's only March :) March of Stable Diffusion's year felt quite similar to me