
Anthropic Used Millions of Print Books to Train Its AI Models

Summary

– Anthropic spent millions cutting and scanning print books to train its AI assistant Claude, discarding the originals after digitization.
– The company hired Tom Turvey, a former Google Books executive, to replicate Google’s legally successful book-scanning strategy.
– Anthropic’s destructive scanning was unusual due to its large scale, prioritizing speed and cost over preserving physical books.
– Judge William Alsup ruled the scanning fair use because Anthropic legally bought the books, destroyed them after scanning, and kept the digital files internal.
– The judge deemed the process transformative, but Anthropic’s earlier piracy weakened its legal standing for a precedent-setting AI fair use case.

Artificial intelligence company Anthropic invested heavily in physical book scanning to train its Claude AI system, according to newly uncovered legal documents. The process involved purchasing millions of print books, removing their bindings for efficient scanning, then discarding the original copies, a controversial method that recently received judicial approval under specific conditions.

Court filings show the company made a strategic hire in early 2024, bringing on Tom Turvey, who previously led Google’s book scanning partnerships. His assignment was ambitious: secure access to virtually every published book available. This move mirrored Google’s own large-scale digitization efforts, which had previously withstood legal challenges and helped shape copyright law regarding fair use.

What set Anthropic’s operation apart was both its enormous scope and its irreversible approach to digitization. While destroying books after scanning isn’t unprecedented for smaller projects, the systematic elimination of millions of physical copies raised eyebrows. The company prioritized speed and cost-efficiency over preservation, judging the trade-off worthwhile for advancing its AI training objectives.

The ruling from Judge William Alsup established important boundaries. He determined the scanning qualified as fair use, but only because Anthropic met three critical conditions: it purchased the books legally, destroyed each print copy after scanning it in a one-to-one format replacement, and restricted the digital files to internal research use. The decision likened the process to converting a legally owned book into a new format to save space. However, the judge noted that the company's earlier piracy weakened what could have been a landmark fair use case for AI development.

(Source: Ars Technica)


