
Authors Grady Hendrix and Jennifer Roberson filed a proposed class-action lawsuit against Apple on Friday, alleging the tech giant illegally used their copyrighted books to train its artificial intelligence models without permission, credit, or compensation. The lawsuit, filed in federal court in Northern California, claims Apple used a dataset of pirated books to develop its OpenELM large language models.
The suit was filed the same day that AI company Anthropic disclosed it would pay $1.5 billion to settle a similar class-action case brought by authors who accused the company of using pirated books to train its Claude chatbot. Legal experts are calling Anthropic’s settlement the largest copyright recovery in U.S. history.
Apple’s Alleged Use of Pirated Content
The lawsuit specifically targets Apple’s OpenELM AI models, which power features within the company’s Apple Intelligence suite. According to the complaint, Apple built its AI capabilities using the Books3 dataset, which contains over 196,000 pirated books sourced from shadow libraries like Bibliotik. The authors claim their works were included in this unauthorized collection that Apple used for training purposes.
“Apple has not attempted to pay these authors for their contributions to this potentially lucrative venture,” the lawsuit states. The plaintiffs argue that Apple copied protected works without consent and failed to provide any form of attribution or compensation to the original creators.
Hendrix, a New York-based author known for works like “My Best Friend’s Exorcism,” and Roberson, an Arizona-based writer whose books include “Sword-Bound,” are seeking class-action status for their case, potentially representing thousands of other affected authors.
Growing Legal Pressure on Tech Giants
The Apple lawsuit represents the latest front in an escalating battle between content creators and technology companies over AI training practices. Multiple major tech firms now face similar litigation, including Microsoft, which was sued in June over its Megatron AI model, and Meta Platforms, which continues to defend against copyright infringement allegations.
The courts have so far produced mixed results. While Anthropic agreed to its record settlement, Meta recently secured a favorable ruling when a federal judge determined that its use of copyrighted books to train AI models constituted fair use under copyright law. The judge found no meaningful evidence of market dilution from Meta’s use of copyrighted materials.
However, the key distinction in many of these cases centers on how companies obtained the training data. Courts have shown more tolerance for companies that legally purchased books and then used them for AI training, while taking a harder stance against those that downloaded pirated copies.
Industry Implications
For Apple, which has positioned itself as a privacy-first, user-centric technology provider, the lawsuit presents both financial and reputational risks. Analysts suggest that a court finding that Apple’s AI models were trained on stolen data could hurt the company’s image more than any financial penalty would.
The Anthropic settlement, which provides approximately $3,000 per work for roughly 500,000 works, may serve as a benchmark for future cases. As one legal expert noted, the settlement sends a strong message to the AI industry about the serious consequences of using pirated works to train AI systems.
With dozens of similar lawsuits pending across the industry, the way courts and companies resolve these copyright disputes will likely shape the future of AI development. At stake is whether tech giants must secure licensing agreements before using copyrighted materials in their training datasets.