By 2026, the SOTA in image generation will be using a voice chat to control the generation.
49% chance

The inputs from the voice chat must be about controlling the content generated in the image.

Voice chat must be the main input mechanism for controlling image generation in some SOTA tool (like text prompting was the dominant way in 2022).

Not sure about this one

Just out of curiosity, what do you see as the purpose of this kind of architecture? Do you believe that voice input will enhance the quality of generation?

@2eb7 e.g. accessibility to novice users, speed of interaction with the system, using all information in the chat history, having the model ask for clarification