Will I have a convincing video call with a fully synthetic AI avatar before July 2025?

Will I have a conversational video call of more than 5 minutes with a fully synthetic AI character or digital avatar, with synced audio and video, that is somewhat lifelike (even if not perfect) and has conversational latency roughly comparable to human conversation, before July 2025?

Other considerations:

  • The video should be at 720p resolution or better. The audio should sound lifelike and natural, without many artifacts.

  • The video and audio must be fully synthetic. They cannot be deepfakes or alterations of existing footage, though it's alright if they're prompted by pictures or short audio snippets.

  • There should be decent lip syncing. Even if I can see artifacts or flaws, the quality should be high enough that the median American wouldn't notice if they weren't paying much attention.

  • The latency can be slightly higher than in an average human conversation, but it should be at least roughly as good as a normal video call with a spotty connection.

  • The conversational quality of the model should be equivalent to GPT-4o or better.

  • The product, if it exists, must be available to American citizens, although it's fine if there are KYC requirements or a paywall under $100 to use.

