AI Companies Now Face a New Web Payment System

Summary
– Major brands like Reddit, Yahoo, and Medium support the Really Simple Licensing (RSL) standard to let publishers set terms for AI scraping and training data use.
– RSL builds on robots.txt by allowing websites to add licensing and royalty terms for AI bots, not just access permissions.
– The RSL Collective, led by industry veterans, aims to create a scalable business model for web content licensing and compensation.
– RSL supports various payment models, including subscriptions, pay-per-crawl, and pay-per-inference fees for AI use of content.
– Success depends on AI companies adopting the standard, with enforcement relying on collective legal action and partnerships like Fastly for access control.
A new licensing framework is emerging to help web publishers define and enforce compensation terms when their content gets used for training artificial intelligence systems. Major platforms including Reddit, Yahoo, Medium, Quora, and People Inc. have thrown their support behind the Really Simple Licensing (RSL) standard, an open content licensing initiative designed to let publishers specify how AI developers should pay for scraping their data. This collective effort aims to give content creators greater leverage in negotiations with AI firms.
Building on the familiar robots.txt protocol, which for years has allowed site owners to control which parts of their site web crawlers can access, RSL adds a new layer of financial and legal specificity. Websites can now embed licensing and royalty requirements directly into their robots.txt files or within digital books, videos, and datasets. This lets publishers go beyond simple access permissions and set out structured compensation terms.
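To make the robots.txt idea concrete, here is a minimal sketch of how a crawler might look for licensing terms before fetching a site. The article does not give the actual syntax, so the `License` directive name, the example URL, and the `find_license_urls` helper are all assumptions made for illustration, not a statement of what the standard requires.

```python
def find_license_urls(robots_txt: str, directive: str = "License") -> list[str]:
    """Return the value of every `<directive>: <value>` line in a robots.txt body.

    The directive name is an assumption for this sketch; the article does not
    say how licensing terms are spelled inside robots.txt.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, value = line.split(":", 1)
        if key.strip().lower() == directive.lower() and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt: ordinary access rules plus a pointer to licensing terms.
SAMPLE = """\
User-agent: *
Allow: /
# hypothetical pointer to machine-readable licensing terms
License: https://example.com/license.xml
"""

if __name__ == "__main__":
    print(find_license_urls(SAMPLE))
```

A crawler built this way would read the referenced terms first and only then decide whether it may fetch the site's content.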
The RSL Standard is spearheaded by the newly formed RSL Collective, led by Eckart Walther, co-creator of the RSS standard and former CardSpring CEO, and Doug Leeds, former CEO of IAC Publishing and Ask.com. According to Walther, the goal is to establish a scalable business model for the web by introducing a universal system for defining licensing and compensation rights. He describes RSL as an evolution of early RSS concepts, adapted for today’s AI-driven content economy.
RSL supports multiple licensing approaches, including free access, subscription models, pay-per-crawl fees, and even pay-per-inference arrangements. The last of these is particularly significant: it enables sites to earn compensation each time an AI model uses their content to generate a response. Bots engaged in non-commercial activities, such as archiving or search indexing, can continue operating without interruption.
While some major publishers like Vox Media, News Corp, and The New York Times have already secured individual licensing agreements with AI companies such as OpenAI and Amazon, the RSL Collective aims to democratize this process. It allows smaller websites and independent creators to monetize their content without needing to negotiate one-off deals.
The success of RSL, however, hinges on widespread adoption by AI companies. Historically, some AI developers have been accused of disregarding robots.txt directives, and without industry cooperation, tracking and enforcing inference-based fees remains challenging. The RSL Collective is betting that uniting influential publishers will encourage AI firms to participate. As Leeds notes, the collective approach offers efficiency and legal clarity: companies can negotiate with many publishers at once, and non-compliance could lead to widespread infringement claims.
Unlike Cloudflare’s existing “pay per crawl” system, RSL does not inherently block unauthorized bots. To address this, the Collective is partnering with content delivery network Fastly, which will act as a gatekeeper, permitting access only to AI crawlers that have agreed to licensing terms. Publishers not using Fastly can still request compliance but lack the technical means to enforce blocking until more providers adopt similar solutions.
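The gatekeeping role described above can be pictured as a simple allow-or-deny decision at the network edge. The sketch below is not Fastly's API or the Collective's implementation; the `CrawlerRequest` type, its fields, and the purpose labels are hypothetical names chosen to mirror the rules the article describes.

```python
from dataclasses import dataclass

# Purposes the article says may continue without a license (labels are assumed).
NON_COMMERCIAL_PURPOSES = {"archiving", "search-indexing"}


@dataclass(frozen=True)
class CrawlerRequest:
    """Hypothetical view of an incoming bot request as seen by a gatekeeper."""
    user_agent: str
    purpose: str                # e.g. "ai-training", "archiving"
    has_accepted_license: bool  # whether the operator agreed to the site's terms


def allow_request(req: CrawlerRequest) -> bool:
    """Admit non-commercial bots, and AI crawlers only if they accepted the terms."""
    if req.purpose in NON_COMMERCIAL_PURPOSES:
        return True
    return req.has_accepted_license


if __name__ == "__main__":
    licensed = CrawlerRequest("ExampleAIBot/1.0", "ai-training", True)
    unlicensed = CrawlerRequest("OtherAIBot/2.0", "ai-training", False)
    archiver = CrawlerRequest("ExampleArchiver/1.0", "archiving", False)
    for req in (licensed, unlicensed, archiver):
        print(req.user_agent, "allowed" if allow_request(req) else "blocked")
```

In practice the hard part is establishing who a crawler really is and whether it has agreed to the terms, which this sketch takes as given.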
Leeds asserts that the RSL Collective can also handle legal enforcement collectively, distributing litigation costs across members, a model he compares to music rights organizations like ASCAP. That said, AI training data usage still occupies a legal gray area, with ongoing lawsuits involving companies like Reddit and Getty Images against major AI developers.
In a joint statement, Leeds and Walther emphasized that RSL brings much-needed transparency to web scraping: crawlers will now be explicitly notified of licensing terms before accessing content. Leeds believes RSL isn’t inventing something entirely new; it’s adapting proven licensing models to a digital environment that has, until now, lacked a unified standard.
The RSL Collective is free for publishers and creators to join, with additional support from brands like O’Reilly, wikiHow, and Ziff Davis, owner of IGN. The initiative represents a significant step toward clarifying rights and payments in an industry where content usage has often occurred without clear permission or compensation.
(Source: The Verge)
