AI Crawlers Blocked by More News Sites by Default

Summary
– Reuters and Time switched to allowlisting AI bots in May, blocking all crawlers except pre-approved ones, joining People Inc. and The Atlantic.
– Reuters says the change hasn’t reduced traffic and has cut server costs, while executives believe it pushes AI companies toward licensing negotiations.
– Blocklists fall short because 30% of AI bot scrapes ignore robots.txt and a blocklist only stops bots a publisher already knows about; switching to an allowlist raised People Inc.'s blocked user agents from roughly 2,100 to more than 30,000.
– Reuters only approves bots that offer a “fair value exchange,” such as licensing payments, referral traffic, site support, or monetization benefits.
– The SPUR Coalition, now with 36 members, aims to create shared licensing standards, but smaller publishers may lack leverage to secure deals.
A growing number of major news publishers are flipping the script on AI crawlers, choosing to block them by default rather than letting them roam freely. Reuters and Time both made the switch in May, adopting allowlists that only grant access to approved bots, according to a Digiday report. They join People Inc. and The Atlantic, which implemented similar restrictions within the past year.
Reuters reports that the change has not hurt its traffic, while reducing the server costs tied to serving bots. Executives say the added friction has also pushed AI companies toward licensing negotiations, making the policy a strategic move as much as a technical one.
Why Blocklists Fall Short
The traditional robots.txt file relies on crawlers voluntarily obeying its rules. That trust is often misplaced. A Tollbit report cited by Digiday found that 30% of all AI bot scrapes ignored explicit robots.txt directives. Executives argue that blocking at other infrastructure levels still carries weight even when it can be circumvented: scrapers that bypass those barriers have to pay for the workaround, and that expense is the point.
A blocklist can only stop bots a publisher already knows about. When People Inc. switched to an allowlist, the number of blocked user agents jumped from roughly 2,100 to more than 30,000. Lindsay Van Kirk, the company’s SVP of innovation, shared those numbers at an IAB Tech Lab event in late May.
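The difference between the two defaults can be shown in a few lines of Python. This is a minimal sketch, not any publisher's actual setup: the bot names and both helper functions are hypothetical, and real deployments typically match on more than the user-agent string, which is easy to spoof.

```python
# Hypothetical bot names, for illustration only.
KNOWN_BAD = {"scraperbot", "harvester"}   # blocklist: bots already identified
APPROVED = {"partnerbot", "searchbot"}    # allowlist: bots already vetted


def blocklist_allows(user_agent: str) -> bool:
    """Default-allow: every agent gets in unless it is on the blocklist."""
    return user_agent.lower() not in KNOWN_BAD


def allowlist_allows(user_agent: str) -> bool:
    """Default-deny: an agent gets in only if it is on the allowlist."""
    return user_agent.lower() in APPROVED


if __name__ == "__main__":
    # "BrandNewBot" is on neither list, so the two policies disagree about it.
    for agent in ["ScraperBot", "PartnerBot", "BrandNewBot"]:
        print(agent, blocklist_allows(agent), allowlist_allows(agent))
```

A bot that appears on neither list passes the blocklist check and fails the allowlist check, which is the gap the People Inc. numbers point to.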
That scale aligns with broader data. A BuzzStream analysis we covered in January showed that 79% of top news publishers block at least one AI training bot. Meanwhile, Anthropic now warns publishers that blocking its search bot can hurt their visibility. In the UK, a new conduct requirement forces Google to let websites opt out of AI search features entirely.
How Publishers Decide Which Bots to Let In
Default-deny flips the decision: instead of choosing which bots to block, publishers decide which bots to allow. Reuters approves a crawler only when it offers a “fair value exchange,” Josh London, head of Reuters Professional, told Digiday. That exchange can take four forms: licensing payments, referral traffic, site maintenance support, and monetization help.
The live Reuters robots.txt file reflects this approach. It lists approved crawlers from Amazon, Google, Bing/Microsoft, Yahoo, and OpenAI, then blocks all other bots from most of the site.
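The general shape of an allowlist-style robots.txt can be checked with Python's standard `urllib.robotparser` module. The sketch below is an assumption-laden illustration, not the Reuters file: the bot names, the `example.com` URL, and the `is_allowed` helper are all hypothetical, and the rules block every path for unlisted bots rather than "most of the site."

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist-style robots.txt: named bots are allowed,
# every other user agent falls through to the catch-all Disallow.
ROBOTS_TXT = """\
User-agent: ApprovedBotA
User-agent: ApprovedBotB
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/news/story") -> bool:
    """Return what a rule-abiding crawler would conclude from ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ["ApprovedBotA", "ApprovedBotB", "SomeUnknownBot"]:
        print(agent, is_allowed(agent))
```

This only models what a compliant crawler would decide for itself. A scraper that ignores robots.txt never runs a check like this, which is why publishers pair the file with blocking at other infrastructure levels.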
Why This Shift Matters
Crawler access has operated the same way since robots.txt was invented: every bot gets in unless a publisher explicitly names and blocks it. Reuters and Time are now reversing that default. The People Inc. figures show why. You cannot block a bot you have never heard of.
Blocking comes with trade-offs, though. Stop a crawler, and you lose whatever it was sending back, such as AI search visibility or referral traffic. That is why both publishers ask what each bot gives them before granting access. It is a question worth asking about your own robots.txt.
What Comes Next
These publishers are betting on collective leverage. One site blocking AI bots is easy to ignore. The SPUR Coalition is building shared standards for licensing and content use. It grew to 36 organizations this month after adding 30 members. Thirty-six publishers blocking together is harder to dismiss than one.
What remains unclear is who this strategy ultimately works for. Reuters came to the table with a newswire business and licensing deals already signed. Smaller publishers face the same choice without that leverage. They can block, but blocking costs AI visibility and does not guarantee anyone will show up to negotiate.
In a deep dive I wrote a few months ago, I found that payment pools remain small relative to traditional search revenue. If deals only come in for the biggest names, default-deny could stay a big-publisher tool.
(Source: Search Engine Journal)




