
Google LLM Patent: New SEO Goal Is Teaching AI Your Identity

Summary

– A 2023 Google patent describes an AI system that builds a “deep, holistic characterization” of entities like businesses and products by extracting and interpreting information from websites and public data.
– The system uses large language models to interpret content, identify attributes (e.g., reputation, services), and organize them into hierarchical graphs, going beyond simple fact extraction.
– The patent details a process of collecting data from a domain and third-party sources (e.g., maps, job listings) to generate entity summaries and models that describe an entity’s identity, positioning, and relationships.
– For SEO, this suggests a shift from optimizing individual webpages to helping Google understand the entity behind the content, as AI search systems need entity models to make recommendations.
– To influence entity understanding, businesses should maintain consistent information across sources, define desired attributes, support claims with evidence, and clarify relationships between products, services, and audiences.

A 2023 Google patent outlines how AI systems could build a detailed picture of businesses, brands, products, and other entities by analyzing data from websites and public sources. The filing describes a process for extracting information, identifying relationships, and synthesizing the result into what Google calls a “deep, holistic characterization” of an entity. If systems like this become more central to search, SEO may need to shift from optimizing documents to helping Google understand the entity behind your content.

The Shift from Documents to Entities

For over two decades, Google has focused on helping users find information on webpages. Whether through traditional results, featured snippets, or AI-generated answers, the starting point has always been understanding documents. However, as Google’s search products evolve to be more conversational and recommendation-driven, understanding individual documents may no longer be sufficient. Before an AI can recommend a business, compare products, or explain a brand, it must first understand the entity behind the content. This is precisely what makes Google’s “Data extraction using LLMs” patent so intriguing.

At first glance, the patent might seem like just another content extraction system. Search engines have been pulling data from webpages for years. But Google describes a much broader goal. According to the filing, the techniques enable AI to “generate and enhance a deep, holistic characterization of a particular entity.” Google defines an entity broadly, encompassing people, companies, businesses, places, objects, and concepts. The system is designed not just to identify facts or index content, but to interpret information, identify relationships, generate summaries, and develop a true understanding of the entity.

How the Patent Creates an Understanding of an Entity

The patent describes a system that collects information from websites and public sources, processes it with an AI system, and synthesizes an understanding of an entity. Here’s a simplified breakdown of the process.

Step 1: Identify the Entity

The process starts by identifying a domain and its associated entity. The system then gathers information from webpages linked to that domain and processes it using an AI system that includes a large language model (LLM).

Step 2: Interpret the Information

Instead of just extracting facts, the system generates what the patent calls a “characterization” of the entity. Google explains this characterization is “an interpretation of the extracted first content and extracted second content rather than a verbatim duplication.” The system goes beyond collecting information; it interprets it and forms conclusions about the entity.

Step 3: Extract Attributes and Relationships

The AI can analyze webpages to extract information like an entity’s presence, age, principles, services, reputation, social media sentiment, and the relationships between different organizational elements. These signals help the system move from understanding individual webpages to understanding the entity itself.

Step 4: Supplement with Third-Party Information

Crucially, the patent isn’t limited to a company’s own website. Google notes that AI systems may use online maps data, job listings, business information, or other third-party data to provide additional context. The goal is to build a more complete understanding of the entity than could be obtained from a single source.
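The four steps above can be sketched as a small pipeline. This is an illustrative outline only, not the patent’s implementation: `EntityProfile`, `build_profile`, the fetcher callables, and `stub_interpret` are all hypothetical names, and the stub stands in for whatever LLM call a real system would make.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EntityProfile:
    """Hypothetical container for what the pipeline learns about one entity."""
    domain: str
    first_party: List[str] = field(default_factory=list)  # text from the entity's own pages
    third_party: List[str] = field(default_factory=list)  # maps data, job listings, etc.
    characterization: str = ""                             # interpreted summary, not a copy


def build_profile(
    domain: str,
    fetch_pages: Callable[[str], List[str]],
    fetch_third_party: Callable[[str], List[str]],
    interpret: Callable[[List[str], List[str]], str],
) -> EntityProfile:
    """Run the steps in order; every callable is an assumed stand-in."""
    profile = EntityProfile(domain=domain)           # Step 1: identify the entity by its domain
    profile.first_party = fetch_pages(domain)        # Step 1: gather pages linked to the domain
    profile.third_party = fetch_third_party(domain)  # Step 4: supplement with outside sources
    # Steps 2-3: a real system would have an LLM interpret both sets of content here.
    profile.characterization = interpret(profile.first_party, profile.third_party)
    return profile


def stub_interpret(first: List[str], third: List[str]) -> str:
    """Placeholder for an LLM call; it only reports how much content it saw."""
    return f"characterization from {len(first)} first-party and {len(third)} third-party items"


profile = build_profile(
    "example.com",
    fetch_pages=lambda d: [f"{d} offers plumbing services", f"{d} was founded in 1998"],
    fetch_third_party=lambda d: ["maps listing: 4.7 stars", "job listing: hiring plumbers"],
    interpret=stub_interpret,
)
print(profile.characterization)
```

Passing the fetchers and the interpreter in as callables keeps the sketch independent of any particular crawler or model, which matches the article’s point that the website is only one of several inputs.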

How the Patent Represents Entities

The system organizes information about an entity into a format that other systems can interpret, expand, and use.

Entity Summaries

After collecting information, the patent describes generating an entity summary. These aren’t page summaries; they read more like descriptions of a company’s identity, positioning, and values. One example describes a hypothetical company’s brand identity, noting associations with simplicity, accessibility, trust, innovation, and social responsibility. Another example presents these same concepts as a set of key attributes like trustworthiness, innovation, accessibility, and social responsibility. The key is the format: the system takes information from multiple sources, transforms it into an interpretation, and synthesizes it into a higher-level understanding.

Entity Graphs

The patent organizes this understanding in hierarchical graph structures. It describes a “hierarchical graph structure that includes at least one parent node representing a first attribute of the characterization and at least one leaf node representing a second attribute.” The figures show examples for both service-based and product-based companies, organizing information into connected relationships rather than isolated facts. For a service, the system associates it with audiences, locations, reputation signals, and differentiators. For a product, it connects features, categories, use cases, and related offerings.
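A minimal sketch of what such a parent-and-leaf structure could look like for a service follows. The `AttributeNode` class and the sample labels are hypothetical and chosen only to mirror the categories named above; the patent’s actual data format is not specified here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class AttributeNode:
    """One attribute in a hypothetical entity graph."""
    label: str
    children: List["AttributeNode"] = field(default_factory=list)

    def add(self, label: str) -> "AttributeNode":
        """Attach a child attribute and return it so calls can be chained."""
        child = AttributeNode(label)
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, str]]:
        """Yield (depth, label) pairs in parent-before-child order."""
        yield depth, self.label
        for child in self.children:
            yield from child.walk(depth + 1)

    def leaves(self) -> List[str]:
        """Return the labels of nodes that have no children."""
        if not self.children:
            return [self.label]
        return [leaf for child in self.children for leaf in child.leaves()]


# Parent node: the service. Intermediate and leaf nodes: attributes tied to it.
service = AttributeNode("Emergency plumbing")
service.add("Audiences").add("Homeowners")
service.add("Locations").add("Springfield")
service.add("Reputation signals").add("Review sentiment")

for depth, label in service.walk():
    print("  " * depth + label)
```

Printing the walk shows each attribute indented under its parent, which makes the difference between connected relationships and a flat list of facts easy to see.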

Entity Models

The patent begins to resemble an entity modeling system more than a content extraction system. Extracting information answers the question: What information appears on this website? Entity modeling answers a different question: What do we understand about this business? The system analyzes information related to an entity’s presence, age, principles, services, reputation, social media sentiment, and relationships, incorporating data from external sources like maps and reviews. The result is a model that can answer broader questions about an organization, developing a contextual understanding of who the entity is, what it does, how it’s perceived, and how it relates to others.

Understanding Information Regardless of Format

Google has long built systems to help machines understand web information through structured data, schema markup, and knowledge graphs. The patent, by contrast, emphasizes extracting information that was never specifically structured for machine consumption. The AI can process content that has “not been structured for parsing,” including webpages not organized for traditional extraction systems. This is a primary advantage: the system can extract and interpret information “irrespective of its format,” generating new content that synthesizes what it finds.

The patent also suggests Google is exploring ways to build a more complete understanding of an entity, one that is not limited to a company’s own website and is supplemented by maps, business information, and job listings. The website remains vital, but it becomes one of several inputs used to construct an understanding of the entity behind it. As AI-powered search focuses on answering questions and making recommendations, the quality of those outputs depends on the quality of the system’s understanding.

From Webpages to Entities: What This Means for SEO

Patents don’t tell us exactly how Google will use a technology, but they reveal how Google is thinking about a problem. In this case, the problem is understanding entities. This isn’t new; Google’s Knowledge Graph and emphasis on E-E-A-T, product reviews, and reputation signals have reflected a similar objective. What makes this patent worth examining is the role large language models now play. The patent describes an AI that can analyze websites and public information, interpret it, and synthesize an understanding of an entity without requiring a specific format. This becomes increasingly important as search moves beyond document retrieval. For a system like AI Overviews to answer a question about a company, it must first determine what that entity is, what it offers, and whether it is relevant.

Webpages Become Evidence

Through an SEO lens, this suggests a change in how webpages function. Traditionally, pages are optimized to rank for queries. But if systems like this become more influential, webpages may increasingly serve a second purpose: they become evidence used to construct an understanding of the entity. A service page helps establish what services a business offers. A case study demonstrates experience. A team page identifies the people behind the organization. Customer reviews contribute reputation. Press coverage provides additional signals. The patent emphasizes combining information from multiple sources to create a more complete picture.

Visibility May Depend on Entity Understanding

Visibility may increasingly depend on how effectively Google understands the entity associated with keywords. This is especially important when AI systems are summarizing options or making recommendations on behalf of a user. The quality of the system’s understanding becomes a critical factor in determining which entities are surfaced and how they are described. The challenge for SEO may no longer be limited to helping Google understand a page; it may increasingly involve helping Google understand who you are.

How Brands Can Influence Entity Understanding

If Google’s goal is to synthesize an understanding of a business from its website and other public sources, the practical question is what organizations can do to shape that understanding. The patent suggests entity understanding emerges from the accumulation and interpretation of information across multiple sources.

Maintain Consistency Across Sources

Because the characterization is an interpretation of content from multiple sources, consistency becomes crucial. Review how your business is described across your website, business profiles, social media accounts, press coverage, job postings, and industry directories. The goal isn’t identical wording everywhere, but ensuring AI systems encounter a consistent understanding of who you are and what you do.
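One way to start such a review is a simple audit of core facts across sources. The sketch below is an assumption-laden illustration: the field names, the sample data, and the `find_inconsistencies` helper are hypothetical, and the exact-match comparison only suits short factual fields such as a business name or primary service, not free-form descriptions.

```python
from collections import defaultdict
from typing import Dict


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def find_inconsistencies(sources: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return each field whose normalized value differs between sources."""
    seen: Dict[str, Dict[str, str]] = defaultdict(dict)
    for source, fields in sources.items():
        for name, value in fields.items():
            seen[name][source] = normalize(value)
    return {
        name: by_source
        for name, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }


# Hypothetical descriptions of the same business collected by hand.
sources = {
    "website": {"name": "Acme Plumbing", "primary_service": "Emergency plumbing"},
    "business_profile": {"name": "ACME  Plumbing", "primary_service": "Drain cleaning"},
    "job_posting": {"name": "Acme Plumbing"},
}

issues = find_inconsistencies(sources)
for field_name, by_source in issues.items():
    print(field_name, by_source)
```

Here the name differs only in case and spacing, so it is treated as consistent, while the two sources disagree on the primary service and that field is flagged for review.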

Define the Attributes You Want Associated with Your Brand

The patent’s example summaries focus on characteristics like trustworthiness, innovation, and accessibility. Ask yourself what you want to be known for, what differentiates you, and what attributes should be associated with your brand. The more clearly these differentiators are communicated, the easier they become for AI systems to identify.

Support Claims with Evidence

The patent describes building an understanding from multiple sources, meaning claims alone may carry less weight than evidence that reinforces them. Examples include customer reviews, case studies, press coverage, industry citations, awards, and author profiles. The goal is providing evidence that supports the attributes you want associated with your entity.

Strengthen Entity Relationships

The patent uses hierarchical graphs to organize relationships between different attributes and concepts. Make it easy for search engines and AI systems to understand relationships between products and services, locations and service areas, audiences and use cases, and brands and people. The easier these relationships are to identify, the easier it becomes for AI systems to understand where an entity fits and when it should be recommended. A useful exercise is to ask: If an AI system had to describe our company using information from our website, reviews, profiles, and third-party mentions, what would it say?
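Schema markup is one existing way to state such relationships explicitly, although the patent stresses that the described system does not depend on structured input. As a hedged illustration, the sketch below builds schema.org JSON-LD that links a service to its provider, service area, and audience. The business name, URLs, and values are placeholders, and nothing in the source says this markup feeds the patented system.

```python
import json

# Placeholder organization; the "@id" gives other nodes something to point at.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#org",
    "name": "Acme Plumbing",
    "url": "https://example.com/",
    "sameAs": ["https://www.example.org/profiles/acme-plumbing"],
}

# The service references the organization by "@id" instead of repeating it.
service = {
    "@context": "https://schema.org",
    "@type": "Service",
    "serviceType": "Emergency plumbing",
    "provider": {"@id": organization["@id"]},
    "areaServed": {"@type": "City", "name": "Springfield"},
    "audience": {"@type": "Audience", "audienceType": "Homeowners"},
}

# Serialize both nodes; the result could be placed in a JSON-LD script tag.
markup = json.dumps([organization, service], indent=2)
print(markup)
```

Referencing the organization by its `@id` keeps the service-to-provider relationship explicit without duplicating the organization’s details on every service page.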

What This Means for Enterprise, Ecommerce, and Local Businesses

The patent’s broad definition of entity suggests the framework could be applied across many search experiences and industries.

Enterprise and B2B Organizations

These organizations often face a consistency challenge, with information distributed across product pages, investor relations, press releases, and social media. If AI systems are synthesizing an understanding from multiple sources, consider whether your positioning is consistent across channels and whether your core differentiators are clearly communicated. Maintaining a coherent entity identity may become as important as maintaining a consistent brand identity.

Ecommerce and Product-Focused Businesses

The patent’s product examples suggest entity understanding may extend to individual products. Users often ask questions requiring evaluation, not just retrieval. For ecommerce brands, ensure product attributes are clearly defined, category and product relationships are easy to understand, and reviews reinforce product strengths and use cases. Product information architecture and supporting content may all contribute to how products are understood in AI-driven experiences.

Local Businesses

Local businesses often face a reputational and specialization challenge. Many attributes referenced in the patent align with local search signals like services, reputation, and business information. Ensure your expertise is clearly communicated, reviews reinforce the services you want to be known for, and your website and Google Business Profile tell the same story. A local business is an entity associated with specific services, locations, and reputation signals gathered from across the web.

The Next Evolution of Entity Understanding

Patents aren’t product announcements. The most useful way to view this patent is as a window into how Google is approaching the challenge of understanding entities in the age of LLMs. Throughout the filing, Google returns to the same objective: using AI to collect information from websites and public sources, interpret it, and synthesize an understanding of an entity. This aligns closely with the direction of Google’s newer search experiences like AI Overviews, AI Mode, and Ask Maps, which all depend on understanding the businesses, products, and concepts they reference. For SEOs, this may be the most important takeaway. Historically, SEO has focused on helping Google understand webpages. Patents like this suggest the next challenge is helping Google understand the entity behind them, as that understanding may influence who gets surfaced, cited, and ultimately chosen.

(Source: Search Engine Land)
