Anthropic Secures Partial Victory in AI Training Data Copyright Case

2025-06-26

Key Legal Ruling on AI Copyright Training

  • A U.S. federal judge rules that Anthropic's AI training on copyrighted books constitutes "transformative" fair use under copyright law
  • The judge explicitly warns that permanently storing millions of pirated books is a copyright violation
  • OpenAI and Meta face similar lawsuits from authors over unauthorized use of copyrighted works for AI model training

Anthropic secured a pivotal legal victory in the ongoing copyright dispute regarding AI training methods using protected materials, though the broader legal battle remains unresolved.

U.S. District Judge William Alsup ruled in Monday's decision that Anthropic's use of copyrighted books to train its Claude AI chatbot qualifies as fair use under U.S. copyright law. The judge emphasized the transformative nature of AI training processes.

"Just as any aspiring writer builds upon existing works, Anthropic's LLM training aims to create entirely new outputs rather than replicate existing content," stated Judge Alsup in the ruling.

However, the judge also rebuked the Amazon- and Google-backed company for maintaining a vast collection of pirated books, calling this aspect of its operations a clear copyright violation.

No Copyright Exceptions for AI

The case was filed in August 2024 by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who alleged that Anthropic built its Claude AI using over 7 million pirated books downloaded from notorious sources such as Library Genesis and Pirate Library Mirror.

The plaintiffs seek compensatory damages and permanent injunctions, claiming Anthropic created a multibillion-dollar business by "stealing hundreds of thousands of copyrighted books" to train its AI models.

Alsup acknowledged the transformative potential of AI training while emphasizing that maintaining permanent digital libraries of pirated works could "devastate academic publishing markets." Internal documents revealed Anthropic employees' intention to create a "digital archive of all books" for perpetual preservation.

Monday's ruling marks the first major U.S. federal court analysis of fair use principles specifically applied to generative AI model training with copyrighted materials. The court distinguished between training copies (deemed fair use) and retained pirated copies (subject to further legal proceedings including potential damages).

Broader AI Copyright Litigation

Multiple high-profile cases against OpenAI and Meta remain in early stages, with motions to dismiss pending or discovery under way, and similar copyright disputes are intensifying:

  • The New York Times sued OpenAI and Microsoft in 2023 over the unauthorized use of millions of articles
  • Reddit filed a lawsuit against Anthropic, alleging that its bots accessed the platform more than 100,000 times after the company said it had stopped
  • Author groups have launched coordinated litigation campaigns against both OpenAI and Meta for unauthorized training on copyrighted works

The Alsup ruling establishes important legal precedents while leaving critical questions unresolved about the boundaries of AI copyright compliance. Judge Alsup's decision explicitly cautioned against creating systemic exceptions for AI companies in copyright law.