
Microsoft Shows Search Innovation with Latest Move

Summary

– Bing Webmaster Tools now has two separate reports: Search Performance for traditional ranking and AI Performance for AI citations, confirming that ranking a page and citing a passage are different tasks.
– Web IQ is a suite of grounding APIs re-architected for AI agents. It returns passage-level evidence instead of ranked documents and scores passages with a metric called GDSAT (grounding satisfaction), which evaluates completeness, freshness, and authority.
– First-party reporting, like Microsoft’s Citation Share, provides detailed data on a platform’s own surfaces but is structurally limited to that platform’s view, not the entire field.
– Content must now be built for passage extraction, with each section self-contained and front-loaded with claims, as AI systems score and act on individual sections independently of page ranking.
– The gap between where a page ranks and where it is cited is a key signal to track; this delta should be measured across multiple engines using consistent third-party methods, not a single dashboard.

The company that owns one of the web’s largest indexes has quietly confirmed, through its own product, that ranking a page and citing a passage are fundamentally different tasks. The arrival of Web IQ at Build on June 2, alongside the earlier debut of AI citation data inside Bing Webmaster Tools, was hardly unexpected. Having built the modern version of Webmaster Tools myself, I knew the reporting would inevitably follow the answer once answers were being generated rather than ranked. A tool that only ever described blue links was always going to feel incomplete. The real question was never whether the citation signal would surface. It was what a platform’s own reporting would, and would not, be able to show you once it did.

The Split Stopped Being A Thesis And Became A Navigation Menu

For a couple of years, the argument that human search and AI answering are two distinct problems relied solely on logic. That argument no longer needs to be carried that way, because an index owner has embedded it directly into a product. Bing Webmaster Tools now features two separate reports. Search Performance remains, reporting clicks, impressions, click-through rate, and average position as it always has. Beside it sits AI Performance, which Microsoft launched in public preview in February, extended in March with grounding-query-to-page mapping, and expanded again in June with a Citation Share metric (still in preview) that reports your share of citations for a given grounding query alongside raw counts. Microsoft framed that second report as separating AI citations from traditional search, and this month’s additions as expanding first-party reporting. Two reports, two definitions of success, one tool.

Web IQ is the infrastructure that gives the second report real meaning. It is a suite of grounding APIs built on the Bing index but re-architected end-to-end to serve agents rather than people. Instead of returning ranked documents, it returns passage-level evidence objects and structured context. Jordi Ribas, who runs Search and AI at Microsoft, put the framing plainly: Bing was built for humans, and the next era of search is for agents. Microsoft cites estimates suggesting agents may generate roughly a thousand times more queries than all human search combined within a few years. Microsoft says Web IQ already powers grounding in both Copilot and ChatGPT, that it is model-agnostic and speaks MCP natively, and that it runs at a 164-millisecond P95 latency it describes as roughly two and a half times faster than the nearest alternative. It scores retrieved passages on a metric called GDSAT (grounding satisfaction), evaluating completeness, freshness, and authority. Take those performance figures as vendor claims and verify them against your own use if they ever influence a decision.

The line beneath all of them is the most important one, and it comes directly from the index owner: What makes a page rank well for a human is not the same as what makes a passage useful to an AI. That is the rank-to-citation delta, stated by the company that owns the index. Web IQ itself remains in limited access by invitation, with no general availability or published pricing, so the product signals direction more than something most teams can build on today. The direction is the point. And part of that point is that Bing Webmaster Tools is leading the way again.

The Question The Second Dashboard Quietly Raises Is Who Is Doing The Telling

Here is the part you should pay attention to, because it is not the part already in circulation. A platform reporting on its own surfaces is doing something useful, and Microsoft’s version of it is the best publisher-facing citation reporting that exists right now. It reports on Microsoft’s house, across Copilot, Bing’s AI answers, and a set of partner integrations, using Microsoft’s definitions of what counts and Microsoft’s decisions about what to surface. And as everyone already knows, a platform only ever tells you about its own house. Google is now doing its own version of this, rolling out a separate Generative AI report in Search Console that shows impressions and the pages surfaced in its AI features, though not clicks, and only for its own answer surfaces. The bottom line is that each company is heading in a different direction, and each is staying aligned with its own plans, goals, and objectives, as expected.

What does not get said clearly enough is that this is not a coverage gap that closes with the next release. Watch what shipped this month at Bing. Citation Share, Intents, Topics, and Compare all make the first-party view richer and more useful, and every one of them makes it richer about Microsoft’s own surfaces. The instrument is improving, but the boundary it reports inside is not moving. That is the tell that the distance between first-party revelation and third-party collection is structural rather than temporary. The two are different instruments doing different jobs. First-party reporting gives you what one platform saw on its own surfaces, at a fidelity no outsider can match, bounded by that platform’s incentives about what to reveal. Third-party measurement gives you what the field looks like from outside, across many answer surfaces, held to one method so the numbers compare to each other. Neither is the only truth.

They answer different questions, for the same reason that, as argued recently, you cannot lay rank and citation side by side and read them as one measurement. The conflation to avoid this week sits one level up: Do not mistake the view from inside one platform for the view of the field. Most won’t, but some still will. And if you wait thinking a single unified view will emerge, it won’t. That’s our past, not our future.

So The Work Changes Shape

Start with how your content is built to be read, because passages (chunks) are the unit now, not pages. Web IQ returns passages and scores them on completeness, freshness, and authority independent of where the page sits in any ranking. This means a page can rank perfectly well and still have its individual sections passed over because they do not stand on their own or because they add nothing a fresher source already supplies. Read your most important pages the way a passage selector would, one section at a time, and ask of each one whether it survives being lifted out of the page and dropped into an answer with no surrounding context. A paragraph that opens with “this approach” or “as noted above” fails that test, because the referent is gone the moment the passage is extracted. A section that front-loads its claim and then supports it survives.

The work is unglamorous: make each section self-contained, name the entity rather than leaning on a pronoun three sentences after you last used it, put the answer near the top of the block rather than burying it under throat-clearing, and make sure the passage carries something a dozen other sources have not already said better. None of that is new advice in spirit. What is new is that a system is now scoring it directly, section by section, and acting on the score.
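The lift-out test described above can be roughed out in code. What follows is a minimal sketch, not how any grounding system actually selects passages: the opener list, the blank-line splitting rule, and the function names are all assumptions made for illustration. It only flags sections whose first words depend on context that disappears once the passage is extracted.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical openers that lean on text outside the passage.
# The list is an illustrative assumption, not a published rule.
DANGLING_OPENERS = (
    "this approach",
    "as noted above",
    "as mentioned",
    "it ",
    "they ",
)


def split_sections(text: str) -> List[str]:
    """Split a page into passages on blank lines (an assumed chunking rule)."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def dangling_opener(passage: str) -> Optional[str]:
    """Return the context-dependent opener a passage starts with, if any."""
    lowered = passage.lower()
    for opener in DANGLING_OPENERS:
        if lowered.startswith(opener):
            return opener.strip()
    return None


def audit(text: str) -> List[Tuple[int, str]]:
    """List (section index, opener) for passages that fail the lift-out test."""
    flagged = []
    for index, passage in enumerate(split_sections(text)):
        opener = dangling_opener(passage)
        if opener is not None:
            flagged.append((index, opener))
    return flagged
```

A check like this catches only the most mechanical failures. Whether a section front-loads its claim, or says something other sources have not, still takes a human read.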

To make those three GDSAT dimensions usable rather than abstract, define them the way the grounding system treats them.

Completeness asks whether a passage carries enough of the relevant information to support a claim on its own, rather than gesturing at an answer the reader has to assemble from elsewhere.

Freshness asks whether the information is current enough to be trusted at the moment the answer is generated. This matters more in grounding than in ranking, because a stale page ranks lower while a stale fact grounds wrong.

Authority asks whether the source is credible and attributable enough that an AI system can responsibly stand behind what it lifts.

A page can clear all three for search and still fail one for grounding, which is exactly why the two scores diverge. Microsoft’s own breakdown of how evidence quality gets judged, in its post on the evolving role of the index, is the clearest public reference if you want to go deeper.

Then look at access, and look carefully, because this is the part that hides in familiar territory. Web IQ inherits Bing’s existing robots.txt compliance and publisher controls and introduces no new crawler user-agent. This means your current BingBot configuration is what governs whether Web IQ can reach you at all. A directive you set years ago to protect crawl budget, or to keep a section out of the index for reasons that made sense at the time, may now be quietly deciding your eligibility to be used as evidence inside an answer. Pull your logs, confirm what BingBot is actually fetching against what you intend to expose, and separate the three questions you are really asking: whether you are being crawled, whether you are being indexed, and whether you are being grounded. They are not the same question, and a page can pass the first two and still be invisible to the third. The teams that get caught out here are usually the ones who assume access settings are solved because they were solved for search.
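A first pass at the access question can be done before touching the logs. The sketch below uses Python's standard-library robots.txt parser to ask whether a given URL is open to the bingbot user-agent token. The sample rules and example.com URLs are made up for illustration, and the helper name is hypothetical. One caveat worth stating loudly: the standard-library parser applies the first matching rule in file order, which may differ from how Bing evaluates the same file, so treat the result as a prompt to investigate rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def bingbot_can_fetch(robots_txt: str, url: str, agent: str = "bingbot") -> bool:
    """Check a URL against robots.txt rules for the given user-agent token."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Illustrative rules only: an old directive that keeps bingbot out of /archive/.
SAMPLE_ROBOTS = """\
User-agent: bingbot
Disallow: /archive/

User-agent: *
Disallow:
"""

if __name__ == "__main__":
    for path in ("/archive/2019-report", "/guides/web-iq"):
        url = "https://example.com" + path
        print(path, bingbot_can_fetch(SAMPLE_ROBOTS, url))
```

This answers only whether the rules permit a fetch. Whether the page is actually crawled, indexed, or grounded still has to come from your logs and from the reports themselves.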

Harder, now that the first-party data is this good, is the discipline of not reading your AI visibility off a single screen. Citation Share gives you a clean relative number for how often Microsoft’s surfaces cite you against everyone else cited for the same grounding query. A clean number is exactly the kind of thing that gets pasted into a deck and treated as the answer. Resist that. The number is real, and it is Microsoft’s. It says nothing about whether ChatGPT reached for you on the same question, whether Gemini did, or whether Perplexity did. The fact that Web IQ infrastructure sits under ChatGPT’s grounding does not mean Microsoft’s dashboard reports ChatGPT’s citations back to you. It doesn’t. Treat any single first-party dashboard as one instrument on the bench rather than the readout, and check presence across more than one engine before you draw a conclusion about how you are doing. The surface that hands you the cleanest number is not the only surface forming answers about you.
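To make the point concrete, here is a small sketch of the same share calculation run across more than one engine. It assumes citation share is a simple ratio, your citations divided by all citations observed for the query on that engine. Nothing quoted here spells out Microsoft's exact formula, so that ratio is an assumption, and the engine labels and domains in the usage note are invented sample data from a hypothetical third-party collection.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def citation_share(
    observations: Iterable[Tuple[str, str]], domain: str
) -> Dict[str, float]:
    """Per-engine share of citations that went to `domain` for one query.

    Each observation is an (engine, cited_domain) pair. The simple ratio
    used here is an assumed definition, not a documented platform formula.
    """
    totals: Counter = Counter()
    ours: Counter = Counter()
    for engine, cited in observations:
        totals[engine] += 1
        if cited == domain:
            ours[engine] += 1
    # Every engine in `totals` has at least one observation, so no zero division.
    return {engine: ours[engine] / totals[engine] for engine in totals}
```

With observations such as ("copilot", "example.com") and ("chatgpt", "other.com") collected for one query, the function returns one share per engine. A strong share on one engine sitting next to a zero on another is exactly what a single dashboard cannot show.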

The last move ties the first three to something you can manage over time. Treat the distance between where you rank and where you get cited as a number you track on purpose, not an assumption you carry forward from quarter to quarter. The delta is the signal. When rank and citation move together, the page-level work is carrying into the answer, and your instincts about that page are sound. When they pull apart, something about how your content is being read as evidence has shifted, and you want to catch that as a measured change rather than reconstruct it after the traffic has already moved and someone is asking why. Run the comparison on a regular cadence, on the queries that actually matter to the business rather than the full vanity list, and watch the direction of travel more than any single reading. A gap that is widening tells you more than a gap that is merely large.
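Tracking the delta on purpose requires deciding how to put rank and citation on a comparable scale, and nothing in the reports does that for you. The sketch below makes two loud assumptions: rank is converted to a 0-to-1 score as one divided by average position, and citation is expressed as the fraction of sampled answers that cite the page. The class, the period labels, and the tolerance value are all hypothetical. Other normalizations are equally defensible, so long as the same one is used at every reading.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    period: str           # any label that sorts chronologically, e.g. "week-01"
    avg_position: float   # average search position, where 1.0 is best
    citation_rate: float  # fraction of sampled answers citing the page, 0 to 1


def rank_score(avg_position: float) -> float:
    """Map position to a 0-to-1 score (assumed normalization: 1 / position)."""
    return 1.0 / avg_position


def delta(reading: Reading) -> float:
    """Gap between rank strength and citation presence for one period."""
    return rank_score(reading.avg_position) - reading.citation_rate


def direction(readings: List[Reading], tolerance: float = 0.02) -> str:
    """Compare the size of the gap at the first and last period."""
    ordered = sorted(readings, key=lambda r: r.period)
    change = abs(delta(ordered[-1])) - abs(delta(ordered[0]))
    if change > tolerance:
        return "widening"
    if change < -tolerance:
        return "narrowing"
    return "stable"
```

The function reports direction rather than magnitude on purpose, matching the advice above to watch the direction of travel more than any single reading.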

None of this points to a particular product. It points to a habit. First-party reporting will keep getting better, and you should use every bit of it, because no outsider sees a platform’s own surfaces as clearly as the platform does. But the reading that spans surfaces, holds to one consistent method, and owes nothing to one platform’s incentives is the one that tells you where you actually stand across the field rather than inside one house. Third-party measurement has quietly become robust enough to carry that weight, and learning to read it well is worth more than any single number on any single screen. The index owner just confirmed, from the inside, that ranking and citation are different things, and that traditional search and AI responses have different units of value. Acting on that confirmation means measuring the gap from the outside, where the view is wider.

If you have run that comparison between where you rank and where you get cited and watched the two refuse to line up, I would like to know what you are seeing in your own data. Leave a comment, or reach out directly.

Much of how I think about staying visible, trusted, and chosen as this layer hardens into infrastructure is in my book, The Machine Layer.

(Source: Search Engine Journal)
