The Limits of LLM-Only AI Search

Summary
– In 2026, content teams are creating “LLM-only” pages (like markdown files or JSON feeds) specifically for AI systems, hoping to improve citations in AI search results.
– Industry experts and data show these specialized pages rarely get cited unless they contain unique, useful information not available on the site’s regular pages.
– Google’s representatives state they do not use or support files like llms.txt, comparing the trend to obsolete SEO tactics like the keywords meta tag.
– Large-scale analysis found no correlation between implementing llms.txt files and receiving more AI citations, indicating the format itself offers no advantage.
– The recommended strategy is to focus on creating clean, well-structured HTML content for all users, as this is what both humans and AI systems effectively parse.
In the rapidly shifting world of search, a new tactic has emerged where teams create content specifically for AI systems, not human visitors. This strategy involves developing stripped-down pages in formats like markdown, JSON, or dedicated directories, with the goal of earning more citations from tools like ChatGPT and Google’s AI Overviews. The logic is that by removing ads, navigation, and complex code, the core information becomes easier for large language models to parse and reference. However, evidence suggests this approach may be misguided, as AI systems are trained on standard web pages and show no preference for these machine-only formats. The key to visibility remains creating high-quality, well-structured content that serves both people and algorithms.
The practice is certainly gaining traction. Websites in technology, software, and documentation are deploying various LLM-specific formats. The central issue isn’t whether people are trying it, but whether it delivers the promised results in AI-generated answers.
Content teams are experimenting with several methods. One is the llms.txt file, a plain text document placed at a site’s root to list important pages for AI discovery. Introduced by an AI researcher, this file includes a project name, description, and links to key content. Major companies like Stripe, Cloudflare, and Anthropic have implemented it, hoping to guide AI to their most valuable documentation.
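An `llms.txt` file follows a simple markdown structure described in the original proposal: a top-level heading with the project name, a blockquote summary, and sections of annotated links. The site name and URLs in this sketch are hypothetical:

```markdown
# ExampleCo Docs

> ExampleCo provides a payments API for online stores.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): Set up your first charge
- [API reference](https://example.com/docs/api.md): Endpoints and parameters

## Optional

- [Changelog](https://example.com/changelog.md)
```

The `Optional` section signals links that an AI client may skip when context is limited.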
Another tactic is creating markdown (.md) copies of existing pages. By appending `.md` to a URL, sites serve a version stripped of all styling, menus, and interactive elements, leaving only raw text. The theory is that this removes parsing barriers like CSS and JavaScript, potentially leading to more accurate citations.
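Mechanically, this tactic is usually a routing rule: requests ending in `.md` are mapped to the raw markdown source that the HTML page was rendered from. A minimal sketch in Python, assuming a hypothetical `content/` directory of markdown sources:

```python
# Hypothetical routing sketch: serve raw markdown when a URL ends in ".md",
# assuming each HTML page is rendered from a markdown file under content/.
from pathlib import Path

CONTENT_DIR = Path("content")  # hypothetical source directory

def resolve(url_path: str) -> tuple[str, str]:
    """Map a request path to (file, content type).

    "/docs/setup"    -> the rendered HTML page
    "/docs/setup.md" -> the raw markdown source, no styling or navigation
    """
    if url_path.endswith(".md"):
        src = CONTENT_DIR / (url_path[:-3].lstrip("/") + ".md")
        return str(src), "text/markdown"
    return url_path.lstrip("/") + ".html", "text/html"
```

The point of the sketch is that the `.md` version is a second, parallel URL for the same content, which is exactly why citation tools can count it separately from the HTML page.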
Some organizations go further, building entire shadow content libraries under paths like /ai/ or /llm/. These are separate, bot-friendly versions of standard pages, sometimes containing more detailed information. If a person stumbles upon them, the experience is often akin to browsing a very basic, text-heavy site from decades past.
Finally, companies like Dell have adopted structured JSON metadata files. These feeds, which might be named `/llm-metadata.json`, provide clean product data (specs, pricing, availability) in a format easily digested by AI. This approach is logical for ecommerce and SaaS businesses that already maintain structured product databases; they are simply exposing that data in a new way.
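A feed of this kind might look like the following. The field names are illustrative, not a published schema, since no AI platform has standardized one:

```json
{
  "products": [
    {
      "name": "Example Laptop 15",
      "sku": "EX-1503",
      "price": 999.00,
      "currency": "USD",
      "availability": "in_stock",
      "specs": { "ram_gb": 16, "storage_gb": 512 }
    }
  ]
}
```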
Despite the sound theory and notable adoption, the crucial question is whether these optimized pages actually get referenced by AI. Individual testing by an industry expert involved crafting specific prompts to target content from five sites using these methods. The findings were revealing.
The llms.txt files accounted for a minuscule fraction of citations. Out of nearly 18,000 references, only six pointed to these files. The successful ones contained genuinely useful, technical information about using an API. Files stuffed with keywords for optimization received zero citations.
Markdown page copies fared even worse. While the original HTML pages from these sites were cited thousands of times, not a single citation pointed to the `.md` versions. The sole exception was GitHub, where markdown is the native format and no HTML alternative exists.
Results for /ai pages were mixed, ranging from 0.5% to 16% of citations for tested sites. The significant difference was content. The site with 16% of citations placed substantially more unique information in its AI-specific pages than existed on its regular site. Even with prompts designed to surface this content, most queries still ignored the `/ai` versions.
JSON metadata files showed modest success, with one brand seeing 5% of its citations come from such a feed. Again, the critical factor was that the JSON file contained information unavailable elsewhere on the website, and the query specifically asked for that data.
A broader analysis examined 300,000 domains to see if adopting `llms.txt` correlated with more citations at scale. Only about 10% of domains had implemented the file, far less than universal standards like `robots.txt`. Interestingly, the largest, most established sites were slightly less likely to use it than mid-tier ones. Most tellingly, a machine learning model built to predict citation frequency found that the presence of an `llms.txt` file added no predictive value; removing it from the model actually improved accuracy.
The pattern from both analyses is clear: LLM-optimized pages only get cited when they contain unique, useful information not found on the standard site. The format itself confers no advantage. As one expert concluded, you could name a file `12345.txt` and it would be cited if it held valuable, exclusive content. A well-structured standard page achieves the same result as a specially crafted one. The data shows no correlation between having these special files and receiving more AI citations.
Major AI platforms have not endorsed these methods. A Google representative has been openly skeptical, comparing LLM-only pages to the obsolete keywords meta tag: available for anyone to use, but ignored by the systems it aims to influence. He noted that, to his knowledge, no AI services use `llms.txt` files, and server logs show they don’t even check for them. Another Google official explicitly stated the company does not support `llms.txt` and has no plans to do so.
Google’s official guidance is straightforward: the best practices for traditional SEO remain relevant for AI features in search. There are no special optimizations required to appear in AI Overviews. While companies like OpenAI and Anthropic maintain their own `llms.txt` files for their documentation, they have not announced that their crawlers read these files from other sites. The consistent message is that standard web publishing drives AI visibility.
For SEO and content teams, the evidence points to a definitive strategy: stop building content that only machines will see. The fundamental question remains: why would an AI system want to parse a page that no user ever visits? If AI companies needed special formats, they would publicly say so.
Instead of creating shadow versions, focus on what genuinely works. Build clean, semantic HTML that both humans and AI can easily parse. Reduce heavy JavaScript dependencies for critical content, as complex client-side rendering remains a technical barrier for many AI systems. Use officially supported structured data, like OpenAI’s specifications for ecommerce feeds, when available. Most importantly, improve your core information architecture so that key content is well-organized and discoverable.
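As one concrete example of supported structured data, schema.org Product markup in JSON-LD is embedded directly in the standard HTML page rather than in a shadow file. The values below are illustrative:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Laptop 15",
  "sku": "EX-1503",
  "offers": {
    "@type": "Offer",
    "price": "999.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
</script>
```

Because this markup lives inside the page users actually visit, it avoids the shadow-content problem entirely: one URL serves humans, crawlers, and AI systems alike.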
The optimal page for AI citation is identical to the best page for users: clearly written, logically structured, and technically sound. Until AI platforms publish formal requirements stating otherwise, that is where optimization efforts should be concentrated.
(Source: Search Engine Land)