Will OpenAI be forced to license some portion of the copyrighted works used as training data?
Resolved YES on Dec 14

OpenAI has taken the position that training a large language model (LLM) on copyrighted works falls within the "fair use" doctrine. Those who own the copyrights on these works disagree, and several court cases are in progress.

This question will resolve to "yes" if any US jurisdiction rules against OpenAI, requiring them to license some amount of their training data. It will also resolve to "yes" if OpenAI reaches an agreement, of their own accord, wherein they agree to pay for some amount of their training data.

An out-of-court settlement with an intellectual property owner is a precedent of a sort, but calling it "licensing" clearly stretches the meaning of the word; it will not count as a "yes".

predicted NO

Damn, I should've read the description more carefully. The title and description at the time of my bet made me think the data would have to be licensed in the context of the US legal system. Why have that as a criterion if any international licensing will resolve yes?

predicted YES

Yep, this counts. 🙂 Thank you to @e_gle for finding these articles!

sold Ṁ1,100 of YES

@cmiles74

📢Can you chime in, I see you were active in discussion on Dec 9th in this market.

📝Can this Resolve?

bought Ṁ400 of YES

Resolves YES @cmiles74 , see below discussion. OpenAI has reached an agreement with Axel Springer which (among other things) includes payment for use of data in training.

bought Ṁ100 of YES

https://openai.com/blog/axel-springer-partnership

This is confirmed with the new agreement between OpenAI and German publishing house Axel Springer

bought Ṁ120 of NO

@e_gle oooh, let's look further to see if the licensing deal included anything about the articles being used in training data, but from what they're including in the article, this isn't about training data. It sounds like run-time access to the data.

predicted YES

@chrisjbillington https://on.ft.com/3GHVzEU

This article provides additional detail including this quote indicating past data will be used for training.

"Axel Springer will receive a one-off payment for its historical content that will be used to train the AI technology for the first time, but the larger fee will be paid under an annual licence agreement that will allow OpenAI to access more up-to-date information."

sold Ṁ333 of NO

@e_gle Excellent. Sounds like that counts then!

predicted NO

@e_gle @cmiles74 fwiw I don't think this should have counted, because the question was whether or not OpenAI would be forced to license training data, not whether it would voluntarily do so. AFAIK this wasn't a settlement from a lawsuit or anything, just a voluntary agreement, and given that OpenAI is (according to the FT) paying more for up-to-date data than historical data, it sounds like the purpose of this agreement is more to get up-to-date data than to license data they've already trained on.

bought Ṁ100 of YES

@PlasmaPower the details of the market state it will resolve yes if OpenAI signs an agreement "of its own accord"

predicted NO

@e_gle Ah I missed that, I thought it was just if they lost or settled a lawsuit. Makes sense then, I'll read markets more carefully in the future :)

@cmiles74 would it count if such a thing happened before market creation? Or only things from after Oct 8th when you created this market?

predicted YES

@chrisjbillington We should withhold judgement until the suit brought by the Authors Guild wraps up. It's a big case and I think everyone is hoping it brings clarity to this issue.

@cmiles74 that doesn't really answer my question - if e.g. OpenAI signed a licensing agreement (unrelated to this suit) with someone prior to when you created this market, would that count?

predicted YES

@chrisjbillington Do you have an article or something about this agreement? This sounds really interesting!

@cmiles74 Can you tell me whether such a thing would count for a YES resolution, if it existed, before I tell you whether I think such a thing exists or before I go looking for one?

e.g: "OpenAI signs licensing deal with NYT, will pay undisclosed sum for use of copyrighted NYT articles".

Made-up example. Would that count?

predicted YES

@chrisjbillington My knowledge is limited, I am not an expert in this field. I did some research before creating this market. In my opinion, if there was such a licensing agreement it was likely too narrow, hence the lawsuits from the Authors Guild and the block of nonfiction authors.

No, an earlier licensing agreement will not count.

@cmiles74 Great, thanks!

This is the one I found, it was from before market creation:

https://apnews.com/article/openai-chatgpt-associated-press-ap-f86f84c5bcc2f3b98074b38521f5f75a

ChatGPT-maker OpenAI signs deal with AP to license news stories

ChatGPT-maker OpenAI and The Associated Press said Thursday that they’ve made a deal for the artificial intelligence company to license AP’s archive of news stories.

“The arrangement sees OpenAI licensing part of AP’s text archive, while AP will leverage OpenAI’s technology and product expertise,” the two organizations said in a joint statement.

Financial terms of the deal were not disclosed.

predicted YES

@chrisjbillington Thank you, Chris! This is a really interesting case, particularly where they imply a two-way deal: OpenAI gets access to the AP news archive and the AP appears to get some level of access to OpenAI products.

predicted YES

My thinking is that we'll resolve this question to "no" if we see a couple of lawsuits decided in OpenAI's favor.

Is the close date of the market fixed, or does it extend as long as there are ongoing lawsuits? I'd be inclined to bet NO for this year, but YES on a longer horizon.

bought Ṁ10 of YES

Added $100 mana subsidy to start.

@firstuserhere what does this mean? I have been seeing MANA figures next to questions and I am not sure I understand what this means. Can you explain please?

Does “forced to license” mean a court decision ruling that OpenAI’s activities amount to copyright infringement? And does this question apply to any and all jurisdictions or only to the US?

predicted YES

@andyou Good question! This question only applies to the US jurisdiction and refers to any licensing agreement. That is, if OpenAI comes to any agreement with Hachette Book Group where they pay Hachette money for licensing any amount of Hachette's intellectual property then this would count as being "forced to license".

@cmiles74 What if OpenAI financially settles in some case without admitting fault?

predicted YES

@osmarks If OpenAI settles with a holder of intellectual property through a financial settlement, that will suffice as a "yes" for this question. I hope it doesn't turn out this way!

In my opinion the real aim is to figure out whether training data needs to be licensed or whether it falls under the fair use doctrine. If OpenAI settles with an IP holder, that will be unfortunate, as it will not provide a clear precedent for other companies in this space. Even so, I believe that if OpenAI settles with one or more intellectual property holders, that will indicate to the market that some payment needs to be provided to IP holders in order to train on their data. IP holders will expect some kind of payment.

What do you think, @osmarks?

@cmiles74 I haven't yet determined what I think will actually happen, though I would be somewhat annoyed if it did require licensing.

predicted YES

@osmarks I was thinking more along the lines of what's your opinion on how this question should resolve if they settle with an IP holder out of court. 😉

@cmiles74 Probably not. That would not produce a precedent.

predicted YES

@osmarks I'm going to leave out-of-court settlement as a "yes" for now and think about it some; I'll update the question description this afternoon. I suspect that if Hachette manages to squeeze dollars out of OpenAI, they will be banging on Google's door next. But it's somewhat muddy and not as clear as a court decision.

predicted YES

@cmiles74 Agreed, a settlement is not a clear precedent and is definitely not "licensing" anything.