ChatGPT Search Tends to Default to English in Complex Queries: Report

Summary
– A new report finds that ChatGPT Search frequently generates background queries in English, even when responding to non-English prompts; 43% of these “fan-out” queries across the analyzed data were in English.
– The study, which filtered for prompts where the user’s language matched their IP location, found 78% of non-English prompt runs included at least one English-language fan-out query, with Turkish prompts highest (94%) and Spanish lowest (66%).
– Practical examples showed this pattern can lead to responses favoring global, English-language sources over relevant local ones, such as omitting Poland’s Allegro.pl in response to a Polish-language query about auction portals.
– This language bias in query generation may disadvantage non-English SEO and content, as it filters which sources are considered before traditional citation ranking signals are applied.
– The report’s methodology relies on automated platform data, and it remains unclear if this English-language query pattern is an intentional design choice or an emergent behavior of the ChatGPT Search system.

A new analysis of ChatGPT’s search functionality reveals a significant tendency for the system to generate background queries in English, even when responding to prompts written in other languages. This behavior could have major implications for the visibility of local content and brands in non-English markets. The findings come from a report by AI search analytics firm Peec AI, which examined over 10 million prompts and the 20 million subsequent “fan-out” queries they triggered.
When a user asks a question, ChatGPT Search often rewrites the original query into several more specific sub-queries, which it then sends to its search partners to gather information. Peec AI refers to these rewritten sub-queries as “fan-outs.” The company’s research focused on the language used in these automated searches. Across all non-English prompts analyzed, a substantial 43% of the fan-out steps were conducted in English. To refine its data, Peec AI filtered for cases where the user’s IP address location matched the language of their prompt, excluding mixed signals like a German query from a UK IP. Even with this filter, the data showed that 78% of non-English prompt runs included at least one English-language fan-out query.
The prevalence of English background searches varied by language but was consistently high. Turkish-language prompts most frequently triggered English fan-outs, at a rate of 94%. Spanish prompts were the lowest among those studied, yet still reached 66%. The pattern observed suggests ChatGPT typically begins its information gathering in the prompt’s language but then incorporates English queries as it builds a more comprehensive response.
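The report’s two headline figures measure different things: 43% is the share of individual fan-out queries conducted in English, while 78% is the share of prompt runs containing at least one English fan-out. A minimal sketch of how such metrics could be computed from logged fan-out data; the record structure, language tags, and sample values below are hypothetical illustrations, not Peec AI’s actual schema or data:

```python
# Hypothetical fan-out log: each prompt run records the languages of the
# background queries it triggered. Values here are illustrative only.
runs = [
    {"prompt_lang": "pl", "fanout_langs": ["pl", "en", "en"]},
    {"prompt_lang": "de", "fanout_langs": ["de", "de"]},
    {"prompt_lang": "tr", "fanout_langs": ["en", "tr", "en"]},
    {"prompt_lang": "es", "fanout_langs": ["es"]},
]

# Metric 1: share of individual fan-out queries that ran in English.
all_fanouts = [lang for run in runs for lang in run["fanout_langs"]]
english_query_share = all_fanouts.count("en") / len(all_fanouts)

# Metric 2: share of prompt runs with at least one English fan-out.
runs_with_english = sum(1 for run in runs if "en" in run["fanout_langs"])
english_run_share = runs_with_english / len(runs)

print(f"{english_query_share:.0%} of fan-out queries are English")
print(f"{english_run_share:.0%} of runs include an English fan-out")
```

Because a single English query flags an entire run, the per-run figure will always be at least as high as the per-query figure, which is consistent with the report’s 78% versus 43%.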
Practical examples from the report illustrate how this pattern can skew results. A Polish-language query from a Polish IP address about the best auction portals resulted in a response that omitted or buried Allegro.pl, Poland’s dominant ecommerce platform, in favor of global sites like eBay. Similarly, a German prompt about German software companies yielded a list with no German firms, and a Spanish query about cosmetics brands failed to surface any Spanish brands. In the Spanish example, Peec AI showed that ChatGPT’s first fan-out query ran in English. Its second query was in Spanish but added the word “globales,” a qualifier not present in the original user request.
For SEO and content teams operating in non-English markets, this presents a potential disadvantage. If the system’s background searches heavily favor English, it may primarily consider English-language sources, which could favor global brands over local competitors. This language-based filtering happens before traditional citation signals even come into play, potentially limiting the pool of sources ChatGPT evaluates.
The methodology behind the report involves data collected via browser automation, running customer-defined prompts daily through the web interfaces of AI platforms. The 10 million prompts analyzed originated from Peec AI’s own platform, not from a broad panel of consumer sessions. The report’s author, Tomek Rudzki, is a recognized technical SEO practitioner.
OpenAI’s public documentation details the query-rewriting process but does not specify how languages are selected for these background searches. It remains unclear whether the strong tilt toward English is a deliberate design feature or an unintended consequence of the system’s training. This development raises important questions for the future: will optimizing English-language content become a necessary part of AI search strategy, or will the platforms themselves adapt to better source and represent local market information?
(Source: Search Engine Journal)