Will Anthropic, Google, or Facebook release a capable LLM that can process realtime audio like GPT-4o within 6 months?
Resolved N/A on May 14

Whether a model is "capable" is a subjective call, but something like Llama-7B would not count, whereas something closer to GPT-3.5 would. I will ask a mod to resolve N/A if one of these companies already has a model that does this and I am simply not aware of it.


Is being able to output similarly natural speech to GPT-4o also required?


If not, then Gemini 1.5 Pro seems to qualify already, as @LeeWoods pointed out.

Google Gemini 1.5 Pro has had the ability to process real-time audio since April 9, 2024.

https://www.theverge.com/2024/4/9/24124741/google-gemini-pro-imagen-updates-vertex

@LeeWoods Thanks, I had a feeling something like this was out there. Should have done more research.

@SirSalty Can you help resolve this market N/A?