Will there be a text-to-audio tool that produces sounds of a quality comparable to real recordings, by 2024? (Musical-, voice- and synthesizer imitations excluded)

resolved Jan 13

Resolved

YES

This question will resolve "yes" if there is a text-2-audio tool that:

generates imitations of real recordings, i.e. it may, but must not solely, generate
- musical sounds, and/or
- voice sounds, and/or
- synth-like sounds

and is comparable in quality to current state-of-the-art text-2-image generators.

🏅 Top traders

#	Name	Total profit
1		Ṁ46
2		Ṁ38
3		Ṁ36
4		Ṁ17
5		Ṁ16

6 Comments

37 Holders

121 Trades

⚠AFK Creator ; Definitive Proof

📢Resolved to YES

predicted YES

Can this be resolved @MarlonK ?

bought Ṁ50 of YES

https://google-research.github.io/seanet/musiclm/examples/ this is it folks

I rendered this song using Synthesizer V Pro, hear for yourself:
https://www.youtube.com/watch?v=nDuk47gW-_E&feature=youtu.be

This is a tool that generates a human voice that sings like a human voice. You put in the notes and the phonemes, it does the rest. This isn't just synthesizer "oohs" and "aaahs" either, it's full lyrical pronunciation.

predicted YES

It's getting closer:

https://twitter.com/FelixKreuk/status/1575846953333579776?s=20&t=kdJwocVEAtAnnyQWUwjv0A
@FelixKreuk

We present “AudioGen: Textually Guided Audio Generation”! AudioGen is an autoregressive transformer LM that synthesizes general audio conditioned on text (Text-to-Audio).

Paper: https://tinyurl.com/audiogen-text2audio/paper.pdf…

Samples: https://tinyurl.com/audiogen-text2audio…

Code & models - soon!
----
https://twitter.com/_akhaliq/status/1582825597059104769
Mubert-Text-to-Music
Colab notebooks demonstrating prompt-based music generation via Mubert API GitHub: https://github.com/MubertAI/Mubert-Text-to-Music

bought Ṁ40 of YES

Yes. The datasets to enable this already exist (AudioSet is a good start, and web scraping will also do plenty) and the tech is clearly there. See for example this 2022 "audio captioning" contest: https://dcase.community/challenge2022/task-automatic-audio-captioning-and-language-based-audio-retrieval (the tasks are not the same as generation, but they demonstrate related ideas)