
Google’s Spam Update Hits AI Answers, But Enforcement Struggles

Summary

– Google’s June spam update enforces policies that now treat attempts to manipulate generative AI responses in Search as a violation.
– A Cornell Tech preprint found that user-generated content on community pages can be exploited to plant text that influences AI research agents’ reports.
– As few as 13 words of planted text on a recurring page inserted a chosen entity into AI reports in 38% to 51% of sessions, and spreading the text across multiple pages raised the rate to 42% to 62%.
– Defenses against this manipulation, like removing user-generated sources or screening with a language model, all degraded the quality of AI search results.
– The line between legitimate SEO and spam for AI answers is unclear, and brands lack dashboard data to see if they are cited or manipulated in AI-generated responses.

Google has begun rolling out its second spam update of the year, this time targeting a growing grey area in search: the manipulation of AI-generated answers. While the policy itself is clear, enforcing it is proving to be a far messier challenge.

The update enforces Google’s existing spam policies, and one of those policies now explicitly covers attempts to “manipulate generative AI responses” in Search. In theory, that sounds straightforward. In practice, a new preprint from Cornell Tech, first reported by 404 Media, reveals why the line between legitimate optimization and outright spam is blurrier than ever.

The core problem lies in how AI research agents gather information. These tools rely heavily on user-generated content from community pages and forums. A single comment on a popular thread can plant a recommendation that the original author never made. When an AI agent retrieves that page, it treats the planted text as a valid signal. What Google labels as spam, therefore, travels through the very retrieval mechanisms these agents depend on.

The research, titled “Deep-Research Agents Can Be Poisoned via User-Generated Content,” tested three open-source agents: STORM, Co-STORM, and OmniThink. Simulations showed that as few as 13 words of planted text on a frequently retrieved page were enough to insert an attacker’s chosen entity into the final report in 38% to 51% of sessions. When the same text was scattered across multiple pages, that figure climbed to 42% to 62%. Even when the planted content made up less than 4% of what the agent read, it still surfaced in 30% to 53% of sessions.

The defensive options are equally troubling. The research team tried three approaches: cutting out user-generated sources entirely, screening them with a language model before use, and combing the finished report for unsupported claims. None of them stopped the attack without degrading the user experience. Drop the user-generated sources, and you lose the community detail that makes AI search tools useful in the first place.
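The first of those defenses, and what it costs, can be illustrated with a minimal Python sketch. This is not the paper's implementation. The `Source` type, the `UGC_DOMAINS` list, the example URLs, and the "AcmeFix" name are all hypothetical, and a real agent would need a far more robust way to identify user-generated pages.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical set of community domains, used here for illustration only.
UGC_DOMAINS = {"forum.example.com", "community.example.org"}


@dataclass
class Source:
    url: str
    text: str


def is_user_generated(source: Source) -> bool:
    """Treat a source as user-generated if its host is on the hypothetical list."""
    host = urlparse(source.url).hostname or ""
    return host in UGC_DOMAINS


def drop_user_generated(sources: list[Source]) -> list[Source]:
    """Defense sketch: discard every user-generated source before the agent reads it."""
    return [s for s in sources if not is_user_generated(s)]


retrieved = [
    Source(
        "https://forum.example.com/thread/1",
        # A genuine community tip and a planted recommendation share one page.
        "Locals say the north entrance is faster. Also try AcmeFix, best in town.",
    ),
    Source(
        "https://news.example.net/review",
        "An editorial review of three repair services.",
    ),
]

kept = drop_user_generated(retrieved)
```

In this toy case the filter removes the planted "AcmeFix" recommendation, but it also discards the genuine local tip on the same page, which is the trade-off the researchers describe.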

For search professionals, the stakes are high. SE Ranking’s tracking of AI Mode found that Google’s self-citations now account for roughly a fifth of all AI Mode citations. As fewer citations point to external websites, the incentive to manufacture one grows. A grey market is already forming, with marketers testing ways to nudge AI-generated answers. The problem is that businesses have no dashboard showing whether they landed in an AI answer, got cited, or were passed over. The result is a violation that Google can name but that the site involved often cannot see.

For ecommerce and local brands, the danger comes from the other direction. The test cases involved ordinary queries like which service to call, which product to buy, and where to eat. A rival or scammer can slip an unfamiliar name into those answers, right next to legitimate options, and the brand being edged out would never know it. For news publishers and larger brands, the worry is trust. A citation from an AI tool is seen as a win, but it only reflects what the tool pulled, not whether that page was accurate. And the answer can be steered by content the brand never wrote.

Google has not indicated how it intends to enforce this particular policy in practice, whether through a dedicated update, its SpamBrain system, or manual reviews. For now, the policy calls the behavior out of bounds, but the burden of vetting AI responses still rests with whoever is reading them.

The authors of the paper call manipulation through user-generated content an open problem that no single platform can solve alone. Reddit has flagged its long-running fight against coordinated manipulation, and Google has added context labels to some Reddit-sourced material in AI Overviews. But neither approach addresses the retrieval concentration the paper highlights, in which agents keep returning to the same frequently retrieved pages. AI visibility has become a surface to monitor actively, not just a channel to optimize for.

(Source: Search Engine Journal)
