Cloudflare Now Blocks AI Crawlers by Default

Summary
– Cloudflare now blocks AI web crawlers by default and offers a Pay Per Crawl program, allowing customers to charge AI companies for scraping their websites.
– AI-focused web crawlers strain servers by scraping content aggressively, prompting pushback from publishers who want compensation for their work.
– Cloudflare uses behavioral analysis and machine learning to identify and block AI bots, including undisclosed “shadow” scrapers.
– Many AI scrapers ignore the Robots Exclusion Protocol (robots.txt), with over 26 million violations recorded in March 2025 alone.
– Cloudflare’s default blocking could shift power dynamics, forcing AI companies to negotiate content licensing deals, though evasion attempts are expected to persist.

Cloudflare has taken a bold step against unauthorized AI web scraping by making automatic bot blocking the default setting for its customers. The move strengthens website owners’ control over their content and introduces a novel Pay Per Crawl program that lets publishers charge AI companies for access to their data. The shift marks a significant escalation in the ongoing tension between content creators and the artificial intelligence firms harvesting online information.
For years, web crawlers have played an essential role in powering search engines and digital archives. The explosion of AI-driven scraping tools, however, has created new challenges. These bots often operate at overwhelming speeds, generating traffic that resembles a distributed denial-of-service (DDoS) attack and can cripple websites. Beyond the technical strain, many publishers, particularly news organizations, object to having their content freely harvested for AI training without compensation. “We’ve been working tirelessly to defend our content,” says Danielle Coffey of the News Media Alliance, which represents thousands of North American outlets.
Cloudflare reports that over one million websites already use its existing AI bot-blocking tools. With the new default setting, millions more will gain immediate protection. The company says its detection system goes beyond public bot lists, using behavioral analysis, fingerprinting, and machine learning to identify even undisclosed scrapers. The traditional robots.txt mechanism (the Robots Exclusion Protocol) lets websites declare which bots may crawl them, but compliance is voluntary, and evidence suggests some AI firms deliberately bypass these restrictions. Tollbit, a content-licensing platform, recorded 26 million scrapes that ignored robots.txt in March 2025 alone.
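The voluntary nature of robots.txt is easy to see in practice. A minimal sketch: a site can disallow known AI crawlers by user-agent string (GPTBot and CCBot are publicly documented crawler names used by OpenAI and Common Crawl), and a well-behaved crawler checks those rules with something like Python’s standard-library parser before fetching. Nothing in the protocol enforces this check, which is exactly the gap Cloudflare’s network-level blocking closes.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks two documented AI crawlers to stay out
# while leaving the rest of the site open to other agents.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler consults the parsed rules before each fetch.
# A non-compliant one simply skips this step -- compliance is voluntary.
print(parser.can_fetch("GPTBot", "https://example.com/article"))       # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/article")) # True
```

Because the check happens client-side, in the scraper’s own code, server operators who want enforcement rather than a polite request need edge-level blocking of the kind Cloudflare now applies by default.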
Cloudflare’s policy change could shift the balance of power toward publishers. By making blocking the standard and introducing Pay Per Crawl, the company gives content owners leverage to negotiate fair compensation. “This alters the entire dynamic,” notes Nicholas Thompson, CEO of The Atlantic. “AI firms can no longer assume free access; now they’ll need to engage in real negotiations.” Early adopters like ProRata, creator of the Gist.AI search engine, have already joined the program. CEO Bill Gross emphasizes their commitment to compensating publishers when their content fuels AI-generated responses.
The long-term impact depends on whether major AI players participate. While OpenAI and others have struck undisclosed licensing deals with publishers like Condé Nast, it’s unclear whether those agreements include web-crawling permissions. Meanwhile, some scrapers may attempt to circumvent Cloudflare’s defenses; online tutorials already detail workarounds. The company reassures users that blocking remains optional: customers who wish to allow scraping can disable the feature. “Every customer retains full control,” says Will Allen, Cloudflare’s head of AI control and privacy.
As the digital landscape evolves, Cloudflare’s stance could set a precedent for how publishers protect and monetize their content in the AI era. The battle over fair use, compensation, and data access is far from over, but this development signals a turning point in the conversation.
(Source: Wired)