Resolved yes if it beats human music at least 50% of the time in a general audience or expert poll across at least 10 samples of different prompts (can be instrumental, no limit on genres).
I just created a similar market with very concrete resolution criteria, with multiple options for the year that AI-generated and human-created music become indistinguishable. It resolves when, given 5 similar, full-length human-AI pairs of songs with lyrics, I wrongly guess which is which at least once. https://manifold.markets/CalebBiddulph/when-will-i-mistake-an-aigenerated
suno.ai has a pretty good model (though its output can still be told apart, at least when it has lyrics)

@firstuserhere after listening to some voice samples, I am not confident that a poll of random people will be able to tell them apart easily.
@firstuserhere P.S. This is not to say that a panel of experts will not be able to judge it. I have much lower confidence in a poll of random people than that, though. So, essentially, this reduces to whoever sets up the poll: their design, their methodology, and their selection of the best music models released in 2023 to present to these people in 2024 or beyond.

@firstuserhere that's great, the ability to hum etc., but I wouldn't bet on it until I've tried it. I try to confuse every music AI I try, and they all suck unless they are given a common style to generate. If the ten prompts are wacky memecore fusions, an AI will fail while the humans can figure the performance out. But if it's "Ibiza trance on the beach 2000's sunrise 130 BPM", it's just a coin toss and doesn't prove anything about the AI being generally convincing.


@Nostradamnedus, this market has 450 traders by now. Would you mind writing a more precise description of what your planned polling/sampling/test would look like? I think a lot of the volatility on this question comes down to different expectations/interpretations of the question description.
I.e.:
– General audience OR expert poll: these seem like two quite different questions
– What does "Beats human music" mean? That people deem it better, or indistinguishable?
– Who makes the prompts?
– ...


@HenriThunberg It can be a general audience one. "Beats" = can't tell apart in a Pepsi test. Prompts would be made by whoever makes the poll, I guess.

@Nostradamnedus Are you going to be the one to make the poll, or are you going to delegate it to someone else? If the latter, who?




What's the best song so far that was actually generated from a prompt, not just style conversion like the Drake songs?


@Gigacasting Both of these are only the voice, and aren't generated from a text prompt. It's impressive, but I don't think things are moving fast enough for a YES on this market.

Resolved yes if it beats human music at least 50% of the time in a general audience or expert poll across at least 10 samples of different prompts (can be instrumental, no limit on genres).
Was this the original market description, or did it get changed?

@HenriThunberg helpfully confirms below that this was a change. I have some concerns about this change.
The title just asks whether the AI will be indistinguishable from humans. But the new description requires that the AI is somehow better at sounding human than actual humans are. This is a rather ridiculous bar to meet; even a human musician could only be expected to meet it 50% of the time.
As a result, this market's probability has effectively been cut in half. Even if the AI can perfectly emulate human music, it's still only going to get picked around 50% of the time. This market cannot rationally go above 50%, regardless of traders' beliefs about how good AI will become at generating music.
The new description is completely different from the title. An AI that goes up against humans and beats them 49% of the time, where the theoretical maximum is 50%, clearly satisfies the criterion of "most of the time cannot be told apart from human musicians". Yet according to this new description, it'll fail and the market will resolve NO.
This seems extremely unfair to all of the YES traders who had been betting based on the title.

@IsaacKing The title "Will a text prompt based AI music generator that most of the time cannot be told apart from human musicians be publicly available by the end of 2023?" also suggests a general ability rather than a single poll. If it's widely accessible, then the fact that the music generator only has to be >50% in a single 10-sample poll is a huge bias towards YES relative to the title. The claim that the market can't go above 50% assumes two things: that this is a single one-off poll (maybe one that @Nostradamnedus would run themselves at the end of 2023), which seems unlikely to be what they meant, and that AIs aren't able to seem more human than humans, which I think is just very wrong in general.
While this market's theme is interesting, the resolution criteria have always seemed woefully underdefined to me.

@JacyAnthis Why would a sample poll be biased in favor of YES? You haven't explained how the distribution differs.

@IsaacKing "single" was the keyword. It's not that polls are from a different distribution (though that's an issue worth considering), but that the title suggests a general capacity of indistinguishability while the resolution criterion is just indistinguishability in one little sample, potentially among many samples that find distinguishability. Let me know if that's still unclear (curse of knowledge, etc.).

@JacyAnthis That doesn't answer my question, yeah. If the sample is drawing from the same distribution as the general music industry (a strong assumption, to be sure), then a 50% success rate in the sample should translate to a 50% success rate in general. (With added variance because 10 is not a large number.)
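
To put rough numbers on the variance point, here is a minimal sketch. It assumes each of the 10 samples is an independent trial that the AI "wins" with probability p, and that the poll passes when the AI wins at least a fixed number of samples. Both are assumptions; the market description does not say how the poll would be scored. The names `pass_probability` and `min_wins` are hypothetical.

```python
from math import comb


def pass_probability(p: float, n: int = 10, min_wins: int = 5) -> float:
    """Chance of at least `min_wins` wins in `n` independent samples,
    each won by the AI with probability `p` (binomial upper tail)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(min_wins, n + 1)
    )


if __name__ == "__main__":
    # p = 0.5 models an AI that is truly indistinguishable (a pure coin toss).
    print(pass_probability(0.5))              # 638/1024, about 0.623
    print(pass_probability(0.5, min_wins=6))  # 386/1024, about 0.377
```

Under these assumptions a coin-toss AI clears an "at least 5 of 10" bar about 62% of the time and a strict "more than 5 of 10" bar about 38% of the time, so the exact wording of the threshold matters a lot at this sample size.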

@IsaacKing I'm making a claim that boils down to, "It's much easier for a music generator to be >50% in at least one 10-sample poll, potentially out of many, than it is for it to be one that 'cannot be told apart,' which implies something like 50% across all polls done." It seems like you're critiquing some other false claim that I'm not making, like, "A 10-sample poll is a biased estimator of the population proportion."
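
The "at least one poll, potentially out of many" point can be sketched as follows. This assumes the polls are independent and that each one passes with the same probability q; the function name `at_least_one_pass` and the value q = 0.2 are purely illustrative and do not come from the market.

```python
def at_least_one_pass(q: float, polls: int) -> float:
    """Chance that at least one of `polls` independent polls passes,
    given that each passes with probability `q`."""
    return 1 - (1 - q) ** polls


if __name__ == "__main__":
    # q = 0.2 is a made-up per-poll pass rate for a clearly distinguishable AI.
    for polls in (1, 3, 10):
        print(polls, round(at_least_one_pass(0.2, polls), 3))
```

The result grows with the number of polls, which is the sense in which resolving on any single passing poll is easier than general indistinguishability.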
