Will there exist an AI music generator at least as good as DALL-E 3 by the end of 2024?

DALL-E 3 successfully makes images that look photorealistic, or in a certain style. It doesn't always understand the prompt, meaning it may draw something different than what the creator intended. But given a certain image, it could easily pass for a photo or drawing made by a human.

Same criteria for music. It must be able to make songs that sound like they could have been popular human-created songs. Instrumental only is fine, it doesn't need to be able to do lyrics. But it does need to be able to handle an entire song, not just a clip a few seconds long.

Resolves according to my judgement, I won't bet. I like a lot of instrumental music, so if there exists an AI music generator that I like enough to replace my current human playlists, that would likely be good enough. If there isn't one, and the reason isn't because it's just too expensive for me, that means this'll probably resolve NO.


Isn't it suno?

These generators are getting pretty good; while there are still some issues around exact style and following more detailed instructions, DALL-E 3 also has similar issues. (It can't even draw a pentagon.) So I'm leaning towards resolving YES.

(I think Jacy was expecting me to have more refined musical tastes than I usually do; my current style of listening is to pick some random instrumental Youtube compilation and let it autoplay, and the stuff I can compose on Suno is good enough that I think I'd find it equally pleasant as background music.)

Pushing me away from a YES resolution is the fact that it seems these AI songs are still very short (~2 minutes), and don't have a proper ending, they just cut off. They also tend to be pretty consistent in their pacing, without changes in cadence or style. I think the best AI music I've heard yet is from the Fooming Shoggoths, which would definitely be enough to qualify, but that had a lot more human involvement.

How about we do a blind test like Jacy suggested? Here's a proposal, I'm open to modification:

Henri (or another volunteer for the YES team) creates 5 prompts to give to any combination of AI music generators of their choice. For each prompt they're allowed to generate up to 3 songs and pick the best one. They're also allowed to use any simple options that come built-in to the generator, like setting the style or clicking an "extend" button. They're not allowed to do anything more complicated like regenerate a chosen part of the song or edit it using an external program.

Jacy (or another volunteer for the NO team) finds 5 songs that are entirely human-created, with no AI assistance. They can't be any longer than 4 minutes each, and I'll disallow anything super weird or creative that's trying to stand out from what an AI would do, like 4'33" or a song that uses only video game sound effects. It must be "normal music". It also can't be anything I've heard before and would recognize.

All music from both teams must be entirely instrumental with no voices, and all 5 should be in significantly different styles. I'm given a file with all 10 songs in a random order and no clues to which is which. I get to listen to them as many times as I want before making my final guess for each. If I get at least 9 of them right, this stays open, otherwise it resolves YES.
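For a sense of how strict the "at least 9 of 10" threshold is, here is a rough sketch of the arithmetic. It assumes each guess is independent with a fixed per-song accuracy, which is a simplification: a judge who knows there are exactly 5 songs of each kind would not guess independently. The function name `p_at_least` and the 80% accuracy figure are illustrative assumptions, not part of the proposal.

```python
from math import comb


def p_at_least(k_min: int, n: int = 10, p: float = 0.5) -> float:
    """Probability of at least k_min correct out of n independent guesses,
    each correct with probability p (upper tail of a binomial distribution)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Judge who cannot tell human from AI at all (coin-flip guessing):
# chance of still scoring 9 or 10 out of 10, which would keep the market open.
chance_pass = p_at_least(9, n=10, p=0.5)

# Hypothetical judge who is right 80% of the time on each song (assumed figure).
skilled_pass = p_at_least(9, n=10, p=0.8)

print(f"coin-flip judge scores >= 9: {chance_pass:.4f}")
print(f"80%-accurate judge scores >= 9: {skilled_pass:.4f}")
```

Under these assumptions a judge guessing at random reaches 9 of 10 only about 1% of the time, and even a judge who is right 80% of the time per song falls short of the threshold more often than not, so the proposed test leans towards a YES resolution unless the AI songs are quite easy to spot.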

@IsaacKing That's a clever resolution mechanism to operationalize this!

I do wonder, though, if Dalle-3 would actually fail this if you tried the same thing? Some of the cues you could use to detect that the music is AI-generated come from limits that Dalle-3 has analogues of. For instance, it creates images at a certain resolution (and typically squares, though the API might be able to do otherwise?). And maybe there would be key things to look for that would let you pick out the AI images, especially if the human-created images were chosen specifically from cases where Dalle is known to be weaker?

I don't have much of a stake in this question, so I'm not pushing back very hard here! But that's what I was thinking about when you described it.

@ChrisPrichard Yeah, that's a fair point. The aspect ratio is easily changed by cropping the picture, but resolution is a real limitation. I aimed to be fair to Suno by limiting the songs to 4 minutes, which I think is similar to a resolution limit. But yeah, I think the current ones are definitely very close.

@IsaacKing I like the blind test, but I'd also venture that if a YES resolution is warranted, then presumably you should replace your current human playlists. My guess is that if you do that with May 2024 music generators and start listening to them as often as you listen to those current human playlists, you will quickly tire of it.

On the test, could you first share a playlist of instrumental music you like so both 'sides' could approximately match that? There are many different types of instrumental music, and few people enjoy all of them. In that case, I'd be willing to put in the time to gather 5 such songs.

@Jacy The reason I haven't started listening to AI music is primarily logistic; I'm not aware of any service that will constantly play hours of new music for me for free like Youtube does. If I were able to test it out it's possible I'd discover I liked it less than Youtube, not sure. I can only tell so much from short clips.

Do my personal preferred genres matter? I think the test is equally fair as long as both parties are aiming for the same thing. If the intention is that you think I'd be better at distinguishing human from AI if it's a genre I'm familiar with, that might be true, but seems a little unfair, since I'm not an artist or art connoisseur and don't have that advantage when looking at DALL-E's outputs. (Though I did play some "human or AI" art games with my partner who is a professional artist, and she wasn't significantly better at them than me.) I also think it would be less time-intensive for both parties if they're able to draw from the largest pool of possible music rather than trying to match something more narrow.

@HenriThunberg, are you interested as well?

@Jacy I'll put time into responding with suggestions on test modifications. Want to get back constructively when I do. But yes, interested!

But wanted to quickly write that I think one of the things I am most skeptical about is the fairness of picking a genre that Isaac is most familiar with. My experience on AI music is that it's easier there than elsewhere, and doesn't seem like it should be part of the test.

Will get back with more thoughts.

@IsaacKing I am quite confident that your complaints below have been resolved by v3 of Suno (which I don't think you've yet commented on for this post) and/or Udio.

  • Song length: Suno makes 2-minute clips by default, and you can extend them if you want. Udio clips are 30 seconds each and made to be stitched together into longer songs, which is very user-friendly to do.

  • Consistently instrumental: Both apps have options to remove lyrics, and both seem very reliable to me.

  • Guitar solo example: All my 2x2 first generations on Udio/Suno gave me decent examples of this on my first try.

  • Nonsensical lyrics: I don't think they're great, but they're not nonsensical. I don't know how much this improved between Suno v2 and v3. Regardless, both services offer the option of adding your own lyrics, which to a very large degree solves this.


Since this all relies on your opinion, I think this market would really benefit from new concrete goalposts, in the style of your comments below. Alternatively, a YES resolution now if you don't have such complaints. Otherwise, this risks becoming much more of a "What will Isaac think" market rather than a "What are AI music generator capabilities by EOY 2024" prediction market.

bought Ṁ10 NO at 72%
bought Ṁ10 NO

@HenriThunberg I think you're underestimating the bar that @IsaacKing has for a YES resolution. This is fairly concrete: "I like a lot of instrumental music, so if there exists an AI music generator that I like enough to replace my current human playlists, that would likely be good enough. If there isn't one, and the reason isn't because it's just too expensive for me, that means this'll probably resolve NO."

Despite the impressive capabilities of 2024 models, I really doubt any of them are doing well enough to replace Isaac's current human playlists. Do you think that's even plausible? Personally, I also really don't think they "sound like they could have been popular human-created songs." I'm confident blinded tests right now would have no trouble distinguishing human- and AI-created songs.

@Jacy I don't think we're necessarily in disagreement about where things stand, your comment largely makes sense to me. But yes, I think there would already be two feasible ways for AI playlists to replace current ones, at least for focus music or background dinner party elevator jazz:
A) a human could already curate a playlist of purely AI-generated songs that would be good enough.
B) tracks on the platform could be automatically ranked by popularity within a certain genre.

To this I'd like to add that I think the current generators are more impressive than I would have expected in 2024. E.g. inpainting (select a portion of the clip that you want to make new generations for, with new instructions) which was just launched by Udio seems to be a type of DALL-E 3 functionality not considered in this market. I wonder whether that helps resolution at all.

Anyway, what most concerns me is that the goalposts we're aiming for are quite vague and getting clearer tests (like Isaac's original complaints above) would help with that.

bought Ṁ150 YES
bought Ṁ10 NO at 71%

@IsaacKing any chance we could get a YES resolution on this already? Would be great to liberate some mana for charity donations this week 🙌

Of course also happy to hear a NO opinion on Suno and Udio not being good enough for your criteria. I personally think they are (and have obviously bet accordingly).

bought Ṁ10 NO at 70%

A new text-to-song app has been released, though their servers got overloaded with new sign-ups.

Keen to hear your thoughts @IsaacKing

https://www.udio.com/

Soul Synthesis

Instrumental, "high-energy, lively, infectious brass, sax, and drum music that blends modern jazz, funk, dance, electronic and hip-hop"

Generated entirely with Suno v3

bought Ṁ10 NO from 72% to 71%
bought Ṁ20 NO
bought Ṁ1,000 YES

Alright, I’m doubling down.

Same criteria for music. It must be able to make songs that sound like they could have been popular human-created songs. Instrumental only is fine

Suno's newest version easily makes songs that are convincing and straight up slap now.

2 traders bought Ṁ25 NO
bought Ṁ20 YES at 74%
bought Ṁ50 YES

I don't like betting huge amounts on subjective markets, but this is clearly massively undervalued post Suno v3

bought Ṁ15 YES

@IsaacKing Can I suggest you check out Suno V3 and let us know your thoughts?

https://youtu.be/k89DKrBtRaQ?feature=shared

@OneGuy @IsaacKing link to Suno for you to experiment and share your feedback with this market. Not sure if V3 is free or paid though - https://app.suno.ai/create/

predicts YES

@IsaacKing I'd love to get your thoughts on suno.ai (as others have mentioned) and why it does or does not qualify.

@JRR Well I just asked it for an instrumental song, and it gave me one with lyrics. DALL-E does make blatant errors from time to time, but not on things that are as fundamental to most images as "instrumental vs. lyrical" is to songs.

e.g. if I ask it for a watercolor image, it gives me one.

Hmm, it did successfully give me a nice Ragtime piece.

Eh, but now I've tried multiple ways to get it to give me an electronic guitar solo and it just keeps giving me pop/rock songs again, no guitar solo in earshot.

I'd say this is about on par with DALL-E 2, maybe a little worse. That's very subjective of course.

bought Ṁ10 NO from 74% to 73%
predicts YES

@IsaacKing there are certain ways you need to prompt it to do things with "metatags", like a guitar solo. If the question is "is it as user friendly" or "does it minimize errors" like the much more polished (and financed) dall-e, definitely not. If it is "can it make a convincing song of music that would make most people think it was created by humans" then the answer would be yes, imho.

predicts YES

@IsaacKing if you go to the "explore" tab, you can hear lots of examples of songs people created

@JRR I would say it fails on most songs due to the nonsensical lyrics, and I can't reliably get it to stick to instrumental ones. It can also only generate ~1 minute clips, not full songs; the ones on the explore page were edited into longer clips by humans if I understand correctly?

bought Ṁ10 NO at 73%
predicts YES

@IsaacKing try using [instrumental] as a tag. Not sure if it will do a whole song that way, probably depends on style of music. No, to get a whole song, you just select "continue" from the clip and it will generate the next portion. It does it this way for economic reasons (so you use more credits).

I think lyrics were pretty good in my experience. Here is an example of a song I did and lyrics made a lot of sense and worked well. https://m.soundcloud.com/dj-frag-210072481/dj-frag-summer-nights?si=112a180aae1d4326ae9bb309ea0b57e1&utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing

predicts YES

@IsaacKing also, after you "continue" it a few times and have a whole song, you just select "get whole song" and it puts it together for you (no credits required to put it all together)