Source: ahrefs-ai-mode-vs-ai-overviews-730k-2026-05-19.md — Despina Gavoyannis & Xibeijia Guan, reviewed by Ryan Law. Published December 15, 2025 on the Ahrefs Blog.

Ahrefs analyzed 730,000 AI Mode and AI Overview response pairs using September 2025 US data from their Brand Radar product. The headline finding: the two Google surfaces cite the same URLs only 13.7% of the time, yet their responses average a semantic similarity of 0.86. The divergence suggests that AI Mode and AI Overviews are not a “short version / long version” pair drawing from a shared source pool. They behave like two distinct systems that research the same query independently and converge on similar conclusions through different evidence paths. For SEO practitioners, this undercuts the assumption that a single citation strategy covers both surfaces.

Key Takeaways

  • Citation overlap is 13.7%. For the same query, AI Mode and AI Overviews share the same cited URLs only 13.7% of the time. Looking only at the top 3 citations from each, overlap rises slightly to 16.3%. Either way, roughly 84% to 86% of citations are not shared between the two surfaces for the same query.
  • Word-level overlap is 16%. Jaccard similarity across unique word tokens is 0.16. The two systems produce the same first sentence just 2.51% of the time, and identical responses only 0.51% of the time.
  • Semantic similarity is 86%. Despite minimal citation and word overlap, cosine similarity averages 0.86. Nearly 90% of response pairs (89.7%) score above 0.8 on a 0-to-1 scale — meaning the two surfaces almost always agree on what to say while disagreeing on where they found it.
  • AI Mode responses are 4× longer and cite 2.5× more entities. Average entity mentions: 3.3 per AI Mode response vs. 1.3 per AI Overview response. A query for “cloud storage alternatives” yields 7 brand mentions in the AI Overview and 23 in AI Mode.
  • If your brand appears in an AI Overview, it will usually also appear in AI Mode. In 61% of cases, AI Mode includes every entity the AI Overview mentioned, alongside additional competitors that didn’t make the AI Overview cut. AI Mode appears to build on AI Overview’s entity foundation and then expand it.^[inferred]
  • 59.4% of AI Overview responses contain no brand or entity mentions vs. 34.7% for AI Mode. About one-third (32.8%) of all responses across both surfaces mention no brand or person at all — these are purely informational queries.
  • AI Mode is more reliable for attribution. Only 3% of AI Mode responses lack citations entirely, vs. 11% for AI Overviews. The difference is likely driven by AI Mode’s longer format and interactive-research framing, which sets a higher user expectation for source transparency.^[inferred]
  • Domain preferences diverge. YouTube held the top position in AI Overviews (cited more than encyclopedic sources); Wikipedia appeared in 10% more AI Mode citations than AI Overview citations; Quora appeared 3.5× more in AI Mode; health websites were cited nearly 2× more in AI Mode; Facebook was cited 2× more in AI Mode. AI Mode leans encyclopedic and medical; AI Overviews lean video and community-driven.

Study Design

  • Sample sizes. 540,000 query pairs for citation and URL analysis; 730,000 query pairs for content similarity analysis. Data: September 2025 US SERP captures from Ahrefs Brand Radar.
  • Citation overlap method. For each query, Ahrefs captured both an AI Mode response and an AI Overview response, then counted URLs appearing in both. Reported at the all-citations level (13.7%) and the top-3-citations-per-surface level (16.3%).
  • Semantic similarity method. Cosine similarity on response embeddings, scaled 0 (unrelated) to 1 (identical meaning). Threshold of 0.8 used to define “strong alignment.”
  • Word-level similarity method. Jaccard similarity — unique words in common divided by all unique words across both responses.
  • Entity overlap method. Named-entity recognition on people, organizations, and brands mentioned in each response. The 61% containment finding (AI Mode includes all of AI Overview’s entities in 61% of cases) is the entity-inclusion rate, not a reciprocal overlap.
  • Key limitation — single-generation snapshot. Each query’s responses were captured once. Ahrefs’s own prior research found that 45% of AI Overview citations change between regenerations of the same query. A single snapshot underestimates the potential citation pool available to each system; the 13.7% overlap figure could be lower than the long-run “reachable overlap” if both surfaces are drawing from a larger shared pool that they sample differently on each call.^[inferred]
  • Scope. US-only, English-language, September 2025. AI Mode was in limited availability at that time; the dataset composition by query type is not broken out.^[ambiguous]
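
The four similarity measures above can be sketched in a few lines of Python. This is a minimal illustration, not Ahrefs’ pipeline: the function names, the regex tokenizer, and the Jaccard-style denominator for URL overlap are all assumptions, since the summary does not specify them. In practice the embedding vectors and the entity lists would come from an embedding model and a named-entity recognizer, neither of which is shown here.

```python
import math
import re


def url_overlap(urls_a, urls_b):
    """Shared URLs divided by all distinct URLs cited by either response.

    The Jaccard-style denominator is an assumption; the study summary only
    says that URLs appearing in both responses were counted.
    """
    a, b = set(urls_a), set(urls_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_jaccard(text_a, text_b):
    """Unique words in common divided by all unique words across both texts."""
    # Lowercased alphanumeric tokenization is an illustrative choice.
    words_a = set(re.findall(r"[a-z0-9]+", text_a.lower()))
    words_b = set(re.findall(r"[a-z0-9]+", text_b.lower()))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ai_mode_contains_all(aio_entities, ai_mode_entities):
    """True when every AI Overview entity also appears in the AI Mode response.

    This is one-directional containment, not a reciprocal overlap. An empty
    AI Overview entity list is trivially contained, so a real analysis would
    likely filter those pairs out first.
    """
    return set(aio_entities) <= set(ai_mode_entities)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(url_overlap(["a.com/1", "b.com/2", "c.com/3"], ["c.com/3", "d.com/4"]))
    print(word_jaccard("The best cloud storage", "Best cloud backup tools"))
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
    print(ai_mode_contains_all(["Dropbox"], ["Dropbox", "Box"]))
```

Note that raw cosine similarity can be negative for arbitrary vectors; the 0-to-1 scale described above presumably reflects how the embeddings behave in practice or a rescaling step the summary does not describe.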

Where This Lands in the AI-SEO Cluster

This study provides the cleanest within-Google engine-divergence data in the cluster. Every other cluster article treats Google’s AI surfaces either as a unified system (Google’s own optimization guide) or measures one surface at a time (AI Overviews in the Ahrefs 38% top-10 study; AI Mode self-citation in the SE Ranking study). This is the first study that holds the query constant and directly compares what the two surfaces do differently.

The 13.7% citation overlap figure is the load-bearing number. It means that roughly 86% of the URLs cited for an identical query are not shared between AI Mode and AI Overviews. That is not a minor calibration difference; it is near-independent sourcing. The 0.86 average semantic similarity explains how both can hold: the surfaces apparently agree on the correct answer but reach it through different retrieval paths. Google’s own documentation confirms they use different models and techniques, with both using query fan-out: each expands the original query into sub-queries, but not necessarily the same sub-queries.^[inferred]

Positioning within the cluster:

  • Causal tier (Ahrefs schema DiD study): Measures whether adding schema causes AI Overview citation lift, with a null result. Its methodology is orthogonal to this study’s.
  • Correlational tier (AirOps, Digital Applied, GEO-16): Measures what on-page and off-page signals correlate with citations — primarily in ChatGPT and AIO. This study adds within-Google surface comparison to that tier.
  • Meta-analysis tier (Zyppy 23-factor): Synthesizes 54 studies. This study is one of the highest-quality single-study inputs to a future Zyppy update because of its scale (730K pairs) and its direct comparison methodology.
  • Official guidance (Google): Google’s optimization guide treats AIO and AI Mode as variants of the same AI search surface. This study empirically tests that assumption — and finds that from a citation-sourcing perspective, they behave as largely independent surfaces.

Cluster thesis impact: The cluster’s 10-step AI SEO playbook in the AI SEO hub currently treats “optimize for AI Overviews” and “optimize for AI Mode” as near-equivalent plays. This study argues they should be explicitly separated past a baseline level. The engine-agnostic signals (topical authority, structured data as a maturity marker, retrieval rank in sub-query SERPs) still provide lift across both surfaces — but beyond that baseline, the two surfaces appear to draw on different domain preferences and different source pools. For practitioners with dedicated budgets, this justifies bifurcated tracking and bifurcated optimization experiments.

Pairing with SE Ranking’s Google self-citation finding: The SE Ranking study found that Google cites its own properties heavily in AI Mode responses. The Ahrefs domain-preference data only partly lines up with this. The sources it reports as elevated in AI Mode (Wikipedia, Quora, health websites, Facebook) are not Google-owned, and YouTube, the one Google property named, is reported as holding the top position in AI Overviews. Read together, the two studies sketch AI Mode as a more encyclopedic, reference-heavy surface that also leans on Google properties, and AI Overviews as more community- and video-driven.^[inferred]

Open Questions

  • Is the 13.7% overlap a stable property of the two systems, or a September 2025 snapshot that shifts as AI Mode matures? AI Mode was in limited availability in September 2025. As it rolls out more broadly and its training data updates, the citation pool and overlap rate could shift substantially. The study needs a replication at full AI Mode rollout.^[inferred]
  • Does the 86% semantic similarity mean there is no practical difference in the answers users receive, even though the sources differ? If true, the citation divergence matters for brand visibility tracking but not for user experience. The study measures convergence at the response level but does not measure whether cited sources influenced the response content — i.e., whether the different sources actually produce different answers.^[inferred]
  • Do the domain-preference divergences (YouTube in AIOs, Wikipedia in AI Mode, Quora 3.5× in AI Mode) hold across query intent classes? The study does not break down by informational vs. commercial vs. navigational intent, which likely drives much of the domain-type preference.^[ambiguous]
  • What is the reachable overlap if both surfaces regenerate the same query multiple times? Given that 45% of AI Overview citations change between regenerations, the single-snapshot methodology likely understates the potential shared citation pool. A multi-generation study would bound the “true” overlap range.
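
The “reachable overlap” question above can be made concrete with a small sketch. It assumes you have collected several regenerations of each surface for one query, each as a set of cited URLs, and it compares a single-snapshot overlap against the overlap of the pooled citations. The function names, the Jaccard-style denominator, and the example URLs are hypothetical; the study itself captured only one generation per query.

```python
def overlap(urls_a, urls_b):
    """Shared URLs divided by all distinct URLs across both sets."""
    a, b = set(urls_a), set(urls_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reachable_overlap(aio_runs, ai_mode_runs):
    """Overlap between the pooled citations of repeated generations.

    Each argument is a list of URL collections, one per regeneration.
    Pooling takes the union of everything a surface cited across its runs.
    """
    aio_pool = set().union(*aio_runs)
    ai_mode_pool = set().union(*ai_mode_runs)
    return overlap(aio_pool, ai_mode_pool)


if __name__ == "__main__":
    # Hypothetical citations for one query, two regenerations per surface.
    aio_runs = [{"u1", "u2"}, {"u2", "u3"}]
    ai_mode_runs = [{"u3", "u4"}, {"u1", "u5"}]

    # Overlap from the first capture of each surface only.
    print(overlap(aio_runs[0], ai_mode_runs[0]))
    # Overlap once all regenerations are pooled.
    print(reachable_overlap(aio_runs, ai_mode_runs))
```

In this toy data the first snapshots share nothing, while the pooled sets share two of five URLs, which is the pattern a multi-generation study would be testing for.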

Try It

  1. Set up bifurcated citation tracking. If you’re currently monitoring only one Google AI surface, add the other. Use Ahrefs Brand Radar, BrightEdge, or Semrush’s AI visibility reporting to pull citation data per surface separately. A single combined “Google AI” metric will mask divergence between the two surfaces — the 13.7% overlap means they are behaving as largely independent channels.
  2. Audit your domain-preference alignment per surface. AI Mode leans encyclopedic and medical (Wikipedia, health sites) and also cites Quora and Facebook more heavily; AI Overviews lean community-driven and video (Reddit, with YouTube at the top position). If you are primarily investing in community presence and video transcripts, your content is better aligned with AI Overviews. Encyclopedic, deeply sourced reference content targets AI Mode more directly. Check which surface drives more impressions for your brand before deciding where to concentrate effort.
  3. Treat AI Mode’s 61% entity-inclusion rate as a floor, not a ceiling. If your brand is appearing in AI Overviews, the 61% figure leaves up to a 39% chance (roughly 2 in 5) that you are absent from the longer AI Mode response for the same query. Identify queries where you appear in AI Overviews but not AI Mode. Those gaps are the highest-priority AI Mode optimization targets because the underlying topical authority is already established.
  4. Prioritize attribution infrastructure for AI Mode. With 97% of responses carrying citations vs. 89% for AI Overviews, AI Mode is the more reliable surface for source-level brand tracking. An AI Mode response that draws on your content will almost always include citation links, which makes AI Mode impressions more measurable and more actionable for link-based attribution models.
  5. Do not apply a single AI-SEO playbook to both surfaces unchanged. The engine-agnostic baseline (topical authority, structured data, retrieval rank) still applies to both. Beyond that baseline, optimize for each surface’s domain preferences separately and track results independently. A strategy that improves AI Overview citation rates will not automatically transfer to AI Mode at the same rate, because roughly 86% of cited URLs are not shared between the two.
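
Steps 1 and 3 can be prototyped once per-surface data is exported from whichever tracking tool you use. The sketch below assumes a hypothetical record shape (`QueryObservation`) holding the brands mentioned on each surface for a query; it does not reflect any vendor’s export format or API. It lists the queries where a brand appears in the AI Overview but not in AI Mode, and computes that brand’s own AI Mode inclusion rate for comparison against the study’s 61% figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryObservation:
    """Brands mentioned on each surface for one query (hypothetical shape)."""
    query: str
    aio_brands: frozenset
    ai_mode_brands: frozenset


def ai_mode_gaps(observations, brand):
    """Queries where the brand is in the AI Overview but missing from AI Mode."""
    return [
        o.query
        for o in observations
        if brand in o.aio_brands and brand not in o.ai_mode_brands
    ]


def inclusion_rate(observations, brand):
    """Share of the brand's AI Overview appearances that also show in AI Mode.

    Returns None when the brand never appears in an AI Overview, since the
    rate is undefined in that case.
    """
    aio_hits = [o for o in observations if brand in o.aio_brands]
    if not aio_hits:
        return None
    carried_over = sum(brand in o.ai_mode_brands for o in aio_hits)
    return carried_over / len(aio_hits)


if __name__ == "__main__":
    # Hypothetical tracking data; "Acme" is a placeholder brand.
    observations = [
        QueryObservation("cloud storage alternatives",
                         frozenset({"Acme", "Box"}),
                         frozenset({"Acme", "Box", "Dropbox"})),
        QueryObservation("best backup tools",
                         frozenset({"Acme"}),
                         frozenset({"Dropbox"})),
        QueryObservation("what is object storage",
                         frozenset(),
                         frozenset({"Acme"})),
    ]
    print(ai_mode_gaps(observations, "Acme"))
    print(inclusion_rate(observations, "Acme"))
```

Keeping the two surfaces as separate fields, rather than merging them into one “Google AI” set, is what makes the gap list possible in the first place.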