Google Updates Robots.txt Docs, Sets Deep Link Rules Amid EU Scrutiny

Summary
– Google published three best practices for “Read more” deep links: content must be immediately visible, use H2 or H3 headings, and snippet text must match page content.
– Google may expand its robots.txt documentation to include the top 10 to 15 most-used unsupported rules and possibly accept more typos for “disallow.”
– The European Commission proposed that Google share ranking, query, click, and view data with rivals, including AI chatbots qualifying as search engines under the DMA.
– Google introduced hotel price tracking for individual hotels and the ability to launch AI agents from AI Mode, shifting user tasks onto its own surfaces.
– The week’s theme is that rules are being formalized, making audit criteria clearer but removing ambiguity as a defense for non-compliance.
Google has updated its robots.txt documentation, introduced new guidelines for deep links in snippets, and begun rolling out task-based search features, all while the European Commission pushes for broader search data sharing with rivals and AI chatbots. Here is a breakdown of what these changes mean for your SEO strategy and day-to-day operations.
Google Clarifies “Read More” Deep Link Best Practices
Google’s revised snippet documentation now includes a dedicated section on “Read more” deep links within search results. To improve the chances of these links appearing, Google recommends three core practices. First, ensure that content is immediately visible to a user when the page loads. Hiding key information behind expandable sections, tabs, or scroll-triggered events can reduce the likelihood of deep links. Second, use H2 or H3 headings to structure your sections. Third, the snippet text must match the content that is visible on the page.
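For a first pass over an existing page, something like the sketch below can surface candidates for review. It assumes `requests` and `beautifulsoup4` are installed; the URL is a placeholder, and the hidden-content heuristics (flagging `<details>` elements, `hidden` attributes, and collapsed `aria-expanded` states) are illustrative guesses, not Google's actual detection logic.

```python
# Minimal audit sketch for two of the three "Read more" practices.
# The heuristics below are illustrative; Google has not published
# how it detects hidden content.
import requests
from bs4 import BeautifulSoup

def audit_deep_link_readiness(url: str) -> None:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Practice 1: content should be immediately visible on load.
    # Flag common click-to-expand patterns as candidates for review.
    hidden = soup.select("details, [hidden], [aria-expanded='false']")
    for el in hidden:
        text = el.get_text(" ", strip=True)[:60]
        print(f"Possibly hidden content (<{el.name}>): {text!r}")

    # Practice 2: sections should be structured with H2 or H3 headings.
    headings = soup.find_all(["h2", "h3"])
    print(f"{len(headings)} H2/H3 section headings found")
    for h in headings:
        print(" -", h.get_text(strip=True))

audit_deep_link_readiness("https://example.com/some-article")
```

The third practice, matching snippet text to what is visible on the page, needs the live snippet for comparison, so it remains a manual check.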
This is the first time Google has offered such specific guidance on this feature. For sites relying on FAQ accordions, tabbed product details, or content that loads after scrolling, this could mean fewer deep links compared to pages that render everything upfront. Slobodan Manić of No Hacks noted on LinkedIn that Google’s language frames this as a general structural preference, not just a tip for read-more links. He argues that the same structure that helps search crawlers also helps AI agents, making the audit process identical for both. The key question for existing pages is whether critical information is trapped inside a click-to-expand element. If one section already earns a deep link, replicating that structure across other sections may improve their performance too. Google describes these as practices that can “increase the likelihood” of deep links, not guarantees.
Google May Expand Robots.txt Rules Based on Real-World Data
Google’s Gary Illyes and Martin Splitt revealed on the Search Off the Record podcast that the company may add new rules to its robots.txt documentation. The team analyzed the most frequently used unsupported directives in robots.txt files across millions of URLs in the HTTP Archive. They plan to document the top 10 to 15 most common unsupported rules beyond the standard `user-agent`, `allow`, `disallow`, and `sitemap` directives. Illyes also hinted that the parser may accept more typos of “disallow” in the future, though no timeline or specific misspellings were named.
If Google formalizes these unsupported directives, site owners will have clearer guidance on what the crawler ignores. Anyone using custom or third-party robots.txt rules should audit their file now. The HTTP Archive data is publicly available via BigQuery, so you can examine the same distribution Google used. The typo tolerance is speculative, but the hint suggests some misspellings are already tolerated. It is safer to correct any spelling variants now than to assume they will be honored.
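Until the expanded documentation lands, a naive audit along these lines can flag anything outside the four documented rules. The site URL below is a placeholder, and the line parsing is deliberately simple; a production check would use a real robots.txt parser.

```python
# Minimal robots.txt audit sketch: separates the directives Google
# documents as supported from everything else, which Google's parser
# currently ignores. The supported set reflects the four rules named
# in the podcast discussion.
import urllib.request

GOOGLE_SUPPORTED = {"user-agent", "allow", "disallow", "sitemap"}

def audit_robots_txt(site: str) -> None:
    with urllib.request.urlopen(f"{site}/robots.txt", timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")

    for lineno, line in enumerate(body.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()  # drop comments, whitespace
        if not line or ":" not in line:
            continue
        directive = line.split(":", 1)[0].strip().lower()
        if directive not in GOOGLE_SUPPORTED:
            # Unsupported today; may be documented (or honored) later.
            # A typo like "disalow" also lands here, so the same pass
            # catches the misspellings Illyes mentioned.
            print(f"line {lineno}: unsupported directive {directive!r}")

audit_robots_txt("https://example.com")
```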
EU Proposes Google Share Search Data with Rivals and AI Chatbots
The European Commission has sent preliminary findings proposing that Google share search data with rival search engines across the EU and EEA, including AI chatbots that qualify as online search engines under the Digital Markets Act (DMA). The proposal covers four data categories: ranking, query, click, and view data. A public consultation is open until May 1, with a final decision due by July 27.
This is a pivotal moment. The proposal explicitly extends search-engine data-sharing eligibility to AI chatbots. If that eligibility survives the consultation, the regulatory definition of “search engine” would include products that most SEO work has treated as a separate category. For sites optimizing for EU/EEA visibility, this could broaden where anonymized search signals flow. AI products competing with Google could use that data to improve their retrieval and ranking, potentially affecting which content they cite. Outside the EU, the direct regulatory effect is zero, but the category definition is likely to be cited in future proceedings. The eligibility question is the story to watch through May 1. If the Commission narrows the AI chatbot criteria, the implications stay regulatory. If it holds the line, that sets a material precedent for how AI search is classified.
Google Introduces New Task-Based Search Features
Google is continuing its shift toward task completion with two new features. Users can now track individual hotel price drops via a new toggle in Search. When prices drop, Google sends an email alert. Additionally, Google is adding the ability to launch AI agents directly from AI Mode, allowing users to initiate tasks handled by AI within the search interface.
Each task-based feature moves a process that previously started on another site into Google’s own surface. Hotel price tracking has existed at the city level for months, but expanding to individual hotels adds a new signal that users can set inside Google rather than on hotel or aggregator sites. Direct-booking visibility depends on being inside Google’s ecosystem. Sites relying on price-drop alerts as a return trigger may see some of that engagement reallocated to Google’s tracking UI. For hotel brands, this raises the stakes for ensuring individual hotel pages are fully populated in Google Business Profile and hotel feeds.
Daniel Foley Carter connected the feature to a broader pattern on LinkedIn, noting that Google’s AI Overviews, AI Mode, and in-frame functionality are all ways Google is consuming more traffic opportunities. The AI agent launch is more speculative. Google has not published detailed documentation explaining what tasks users can delegate or how sources get cited. The feature confirms that agentic search, described by Sundar Pichai as “search as an agent manager,” is arriving incrementally in Search rather than as a single launch.
Theme of the Week: The Rules Are Getting Written
Each story this week spells out something that was previously implicit or underway. Google signaled plans to expand what its robots.txt documentation covers. The company listed specific practices that can increase the likelihood of “Read more” deep links appearing. The European Commission proposed measures that extend search-engine data-sharing eligibility to AI chatbots under the DMA. And task-based features that Sundar Pichai described in interviews are rolling out as toggles in the search bar.
For your day-to-day work, the ground gets firmer. Fewer questions are judgment calls. What does and doesn’t qualify, what Google supports, and what counts as a search engine to a regulator are all getting written down. That works to your advantage when it means clearer audit criteria, and against you when “we weren’t sure” is no longer a defensible answer.
(Source: Search Engine Journal)