
Does training LLMs or LTGMs on copyrighted material violate copyright?
Never closes
Yes
No
Specifically, do you think it violates US copyright law?
Companies like OpenAI and Stability AI (the developer of Stable Diffusion) claim that their use of copyrighted material falls under Fair Use because it is transformative, provides many benefits to society, and would not be possible with only public domain training content.
Opponents claim that the scraping process does not fall under Fair Use because the purpose is commercial, the source works are creative, and the use harms the market for the original works. The training process may likewise be considered a copyright violation when the network weights end up containing the training data in a way that can be output verbatim.
This question is managed and resolved by Manifold.
Related questions
Will the legality of AI training on copyrighted works be settled by, and in favor of, the American Copyright Lobby, before 2026?
25% chance
By 2025 end, will it be generally agreed upon that LLM produced text/code > human text/code for training LLMs?
11% chance
By 2027, will it be generally agreed upon that LLM produced text > human text for training LLMs?
62% chance
By 2029 end, will it be generally agreed upon that LLM produced text/code > human text/code for training LLMs?
77% chance
Will an LLM do a task that the user hadn't requested in a notable way before 2026?
91% chance
Illegal Agent-like LLM which automatically serves up links to copyrighted texts available by mid 2026
29% chance
Will Generative AI trained on crawled art be illegal in 2027 because of copyright?
18% chance
Will any widely used LLM be pre-trained with abstract synthetic data before 2030?
72% chance