RSS Co-Creator Unveils New AI Data Licensing Protocol

September 10, 2025 (Last updated: September 10, 2025)

Summary

– The AI industry faces up to 40 pending copyright lawsuits over unlicensed training data, including a case against Midjourney for creating Superman images.
– Really Simple Licensing (RSL) has been launched by technologists and web publishers to enable scalable data licensing, backed by major sites like Reddit and Yahoo.
– RSL includes both technical protocols for machine-readable licensing terms in robots.txt files and a legal collective for negotiating royalties and terms.
– The system allows publishers to set custom or Creative Commons terms and provides a collective option for smaller publishers unable to negotiate individual deals.
– A key challenge is tracking when specific data is used in AI training, but RSL creators believe companies can develop adequate reporting systems to facilitate payments.

The artificial intelligence sector faces a mounting legal challenge concerning the data used to train its models. Following a landmark $1.5 billion copyright settlement by Anthropic, the industry is under pressure to address how it sources and compensates for training materials. With dozens of lawsuits pending, including one targeting Midjourney for generating unlicensed Superman imagery, the absence of a clear licensing framework threatens to stifle innovation through protracted legal battles.

A new initiative aims to provide that framework. Really Simple Licensing (RSL), developed by a coalition of technologists and publishers, offers a scalable system for data licensing that could help AI companies and content creators reach mutually beneficial agreements. Already supported by major platforms like Reddit, Quora, and Yahoo, RSL introduces both technical and legal mechanisms to streamline permissions and payments across the web.

Eckart Walther, a co-creator of both RSS and RSL, emphasizes the need for machine-readable licensing agreements online. “That’s really what RSL solves,” he explains. The protocol allows publishers to specify licensing terms within their robots.txt files, clarifying whether AI firms need custom agreements or can rely on existing structures like Creative Commons.
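To make the robots.txt mechanism concrete, here is a minimal Python sketch of how a crawler might look for a pointer to licensing terms before using a site's content. The article does not show the actual syntax, so the `License` directive name, the `find_license_urls` helper, and the example.com URL are all assumptions for illustration, not the published RSL specification.

```python
from typing import List


def find_license_urls(robots_txt: str, directive: str = "license") -> List[str]:
    """Return the value of every `<directive>:` line in a robots.txt body.

    Assumption: licensing terms are referenced by a line such as
    `License: <url>`. The directive name is illustrative only.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so the URL's own colons survive.
        key, value = line.split(":", 1)
        if key.strip().lower() == directive and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt content; the URL is a placeholder.
SAMPLE = """\
User-agent: *
Allow: /
# Pointer to machine-readable licensing terms (assumed syntax)
License: https://example.com/license.xml
"""

print(find_license_urls(SAMPLE))
```

A crawler that finds no such line would fall back to whatever the rest of robots.txt permits. One that finds a URL could fetch the referenced terms and decide whether a custom agreement or an existing structure such as Creative Commons applies.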

On the legal front, the newly formed RSL Collective functions as a centralized body for negotiation and royalty distribution, drawing inspiration from collective rights organizations in music and film. This approach simplifies the process for both licensees and rights holders, especially smaller publishers who lack the leverage to negotiate individual deals.

Several prominent publishers have already joined the collective, including Yahoo, Medium, O’Reilly Media, Ziff Davis, Internet Brands, People Inc., and The Daily Beast. Others, such as Fastly and Adweek, endorse the standard without formal membership. Notably, Reddit, which already earns an estimated $60 million annually from Google for data licensing, participates in the system while maintaining its existing agreements.

A significant challenge lies in tracking usage. Unlike music royalties, which are logged per play, AI training data is often absorbed without clear attribution. Some licenses even propose per-inference payments, adding complexity to an already opaque process. Still, RSL co-founder Doug Leeds remains optimistic, noting that some AI firms already possess the tracking capabilities required for compliance. “It doesn’t have to be perfect,” he says. “It just has to be good enough to get people paid.”

The real test will be whether AI companies adopt the system. While firms like Scale AI and Mercor demonstrate a willingness to pay for quality data, many labs still rely on free resources like Common Crawl. Distinguishing between legitimate scraping and machine-enhanced browsing remains difficult, as recent disputes between Cloudflare and Perplexity illustrate.

Yet Leeds points to public statements from AI leaders, including Google’s Sundar Pichai, advocating for standardized licensing. “They have said outwardly to everyone, something like this needs to exist,” he notes. With RSL now operational, the industry may finally have the system it claims to need.

(Source: TechCrunch)
