Will human-level automated (AI or otherwise) music transcription exist in 2024?

A trained human can listen to a music recording and transcribe it into written music more or less accurately, assuming the recording is of good quality and the notes are actually discernible. Even if they are not perfectly discernible, good guesses can be made from context and knowledge of the genre.

Will an automated system, AI or otherwise, be able to do this, to roughly human-level accuracy, in 2024?

To resolve this market YES, the system should be something that I or someone I trust can access in order to use it. It's OK if it's not fully public or if it costs money, as long as verification can be done and we're not just trusting results from the developers. It must be a general-purpose music transcription system or more general AI model or whatnot, not something created specifically for this question.

Unless there is good reason not to, I will test using the following djent song, which is reasonably complex, but not ridiculously so, and for which there doesn't appear to be written music or guitar tabs available online:

If I have reason to believe this is not a representative example (e.g. because the written music does appear online and it's plausible the transcribing system was trained on it), I may choose a different song of similar complexity/fidelity.

It is a bit subjective, but we're looking for trained-human-level performance. I believe a good musician familiar with the genre would be able to transcribe the above well enough that they'd be able to perform or record a cover of it without it being noticeably different (in terms of the notes, ignoring tone/timbre). There are some quiet background notes that I have a hard time hearing. It's not important that these are 100% correct, as long as what is output is reasonable and performs the same function in the song, much as a human would have to interpolate a bit when not all notes are discernible. They should be in the right key, have the same feel, etc., even if they're not identical (not that we have ground truth to compare to in any case).

As judgement may be a bit subjective, I won't bet in this market.

I have no idea if anyone is actually working on something like this. Are they?

@WieDan Not that I know of. Though it seems reasonable to me (definitely a non-expert) that it's a similar problem to voice-to-text, so I'd guess if anyone tried to implement a music transcription AI model they'd quickly have some reasonable amount of success given where voice-to-text is.

Maybe not much of a use case for it though.

@chrisjbillington Yes, very niche use case; I don't know if there is a profit in it. I think this is within reach of the current state of AI, but someone would still have to actually build it. So the bet isn't just about the technology here.

Based on the song selected, should I assume this is just about transcribing music and not also lyrics? On a separate note, I like the song you used as an example. I haven't been listening to that much prog metal recently, but maybe I should get back to it.

@TimDuffy Ah, I hadn't thought about that but yes, just music not lyrics.

Yeah this song is rad!

