Will real-time text-to-video generation be viable by 2030?

This market resolves YES if, by January 1, 2030, I can prompt a text-to-video generator to create a 1-minute video and get a result meeting all of the following criteria:

  • I never have to wait more than one second for the video to load, whether at the start or at any point mid-playback

  • At least 1080p resolution

  • At least 24 frames per second

  • Subjectively, at least as realistic and visually appealing as the "stylish woman walks down a Tokyo street" example from Sora

  • If I wanted to, I could generate 100 such videos on the same day for less than $20 total (this accommodates various pricing schemes, e.g. subscription-based or pay-per-video)

It's fine if the entire video isn't fully generated by the time it starts playing; it can "stream" to my device. I will make sure I have a good Wi-Fi connection and will try up to 5 times if necessary.

I will use the same prompt as the original Sora video:

A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.

Similar market for 2027:

