Apple Sued for Using Pirated Books to Train AI

Summary
– Apple is being sued by authors Grady Hendrix and Jennifer Roberson for allegedly using their copyrighted books without permission to train its AI models.
– The lawsuit claims Apple used the pirated Books3 dataset, which contains over 196,000 books, to train its OpenELM and possibly other language models.
– Books3 was taken down in 2023 after a DMCA request, but it had been used by several AI companies, including Meta, for training purposes.
– The plaintiffs seek class action status, an injunction to stop further infringement, and monetary damages for the alleged copyright violations.
– Similar lawsuits have targeted other AI companies: Anthropic settled for $1.5 billion, and Perplexity and OpenAI also face copyright infringement claims.
Apple faces a new legal challenge as authors accuse the company of training its artificial intelligence systems on illegally obtained books. The lawsuit, filed in a Northern California federal court, alleges that Apple used the controversial Books3 dataset without permission, raising significant questions about copyright and AI development practices.
Authors Grady Hendrix and Jennifer Roberson claim Apple relied on their copyrighted material to train its OpenELM language models, pointing to details in the company’s own research publications. According to the complaint, Apple incorporated Books3, a collection of more than 196,000 pirated books, into its training pipeline and likely used the same data for its Foundation Language Models as well. The plaintiffs argue this was done without their consent and without compensation.
Books3, which included works from numerous authors, became a widely used resource among AI developers before being removed in 2023, when a Danish anti-piracy organization successfully petitioned for its takedown with a DMCA request. By then, several major tech firms, including Meta, had already used the dataset for AI training.
Hendrix and Roberson are seeking class-action status for their lawsuit, which would allow other affected authors to join the case. They aim to prevent Apple from continuing to use pirated materials and are pursuing financial damages for the alleged infringement.
This case is part of a broader trend of legal actions targeting AI companies over their training data sources. Earlier this year, Anthropic, the maker of Claude, agreed to a landmark $1.5 billion settlement with authors whose works were used without authorization. That case also involved Books3, with claims that the copyrights in more than half a million books were infringed. Other firms, including Perplexity and OpenAI, face similar allegations.
The outcome of this lawsuit could influence how AI developers source training data in the future and may establish important legal precedents regarding intellectual property in the age of machine learning.
(Source: PCMag)





