Resolution Criteria:
A "high quality" movie is defined as a movie that a general audience would have a hard time distinguishing from a movie produced by a competent human today. I don't need it to be the best movie of the year, or even in the top 100. But it should be something you could imagine a popular streaming service buying and putting on their platform. A "rough script" is, for example, what someone would write as a Wikipedia summary of a movie. The model doesn't have to develop the main plot points or character arcs, but it does have to fill in the details and, most importantly, create all of the visuals. A small amount of human editing and directing is allowed (e.g., in the form of advice about what was confusing or what didn't work), but the AI system must be responsible for the majority of the creative work and humans must never directly edit the visuals or audio.
Motivation and Context:
Ten years ago, state-of-the-art image generators could barely produce pictures of people. Five years ago, we'd figured out pictures of people's faces for the most part, but struggled with complex scenes. Today we've basically figured out how to generate any image we want. On video, we're maybe where we were five years ago with images: we can generate some reasonable-looking scenes, but only for short clips, and even then the details are often off, with objects appearing and disappearing out of thin air. I want to know if a future AI system is capable of producing a high-quality movie, which would necessitate generating coherent video over a long period of time.
Question copied from: https://nicholas.carlini.com/writing/2024/forecasting-ai-future.html