AI SEO is the discipline of getting your pages cited by AI answer engines — Google AI Overviews, Google AI Mode, ChatGPT, Perplexity, Brave, Gemini — alongside (and often instead of) classical organic ranking. The 2025-26 wave of empirical research is rapidly establishing what actually works, what doesn’t, and — increasingly — how the answer differs from one engine to the next.
This hub is the entry point to the wiki’s AI-citation research thesis cluster. Most studies live in seo-content; two macro-context studies live in ai-industry-research. This page surfaces them all in one place, organized by what they prove and how.
The thesis
Google’s own position settles the framing question: AI search optimization is still SEO (see Google’s Official Generative AI Search Optimization Guide). AI Overviews and AI Mode pull from the same Search index via RAG + query fan-out — to be AI-eligible, a page must first be indexed and snippet-eligible. The studies below test what else moves the needle past that baseline — and reveal that the baseline itself is shifting under model upgrades (Gemini 3) and diverging hard across engines.
The research cluster (14 studies, six evidence types)
The cluster splits along methodology, and the split matters for interpretation.
Causal evidence (1 study)
- Ahrefs Schema → AI Citations Matched DiD Study (Linehan & Guan, 2026-05-11) — only causal study in the cluster. 1,885 pages adding JSON-LD schema vs 4,000 matched controls, Aug 2025-Mar 2026, three surfaces. Adding schema produced no statistically meaningful citation lift on AIO, AI Mode, or ChatGPT.
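The matched difference-in-differences logic behind that result can be sketched in a few lines. This is a minimal illustration, not the Ahrefs pipeline: the function name `did_estimate` and all citation counts are hypothetical, and the matching step (pairing each schema-adding page with similar control pages) is assumed to have happened already.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: change in the treated group minus change in the controls."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical AI-citation counts per page (made-up numbers, not study data).
treated_pre = [2, 4, 3, 5]    # pages before they added JSON-LD
treated_post = [3, 5, 4, 6]   # the same pages afterwards
control_pre = [2, 3, 4, 5]    # matched pages that never added schema
control_post = [3, 4, 5, 6]   # the same controls over the same window

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)  # both groups rose by 1.0, so the estimated schema effect is 0.0
```

The point of the design is visible in the toy data: the treated pages did gain citations, but the controls gained just as many, so none of the rise can be credited to schema.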
Citation geometry — longitudinal (1 study)
- Ahrefs 38% AI Overview Top-10 Update (Linehan & Guan, 2026-03-02) — re-run of the July 2025 “76% from top 10” study. 863K keyword SERPs / 4M AIO URLs. Top-10 overlap fell from 76% to 38% in seven months, attributed to the Gemini 3 rollout (Jan 27, 2026) and expanded query fan-out. ~Two-thirds of AIO citations now come from outside the query’s own top-10 — fan-out coverage moved from optional to load-bearing.
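As a rough sketch of how a top-10 overlap figure like this could be computed, assuming you have the cited URLs and the organic top 10 for each query. The function name `top10_overlap`, the data layout, and the URLs are all hypothetical; Ahrefs's exact method is not described here.

```python
def top10_overlap(aio_citations, top10_by_query):
    """Fraction of AIO-cited URLs that also rank in the organic top 10 for the same query."""
    total = 0
    hits = 0
    for query, cited_urls in aio_citations.items():
        top10 = set(top10_by_query.get(query, []))
        for url in cited_urls:
            total += 1
            hits += url in top10  # True counts as 1
    return hits / total if total else 0.0


# Hypothetical data: two queries, four cited URLs in total.
aio_citations = {
    "best crm": ["a.com/crm", "b.com/review", "c.com/list"],
    "what is rag": ["d.com/rag"],
}
top10_by_query = {
    "best crm": ["a.com/crm", "z.com/other"],
    "what is rag": ["d.com/rag"],
}

print(top10_overlap(aio_citations, top10_by_query))  # 2 of 4 cited URLs are in the top 10: 0.5
```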
Correlational evidence (4 studies)
- AirOps + Kevin Indig Fan-Out Effect ChatGPT Study (2026-04-13, refreshed 2026-05-19) — largest single-engine dataset. 16,851 queries / 50,553 ChatGPT responses / 353,799 pages. Retrieval rank dominates 4.1× (rank-1 cited 58.4% vs rank-10 14.2%). 88.6% of queries trigger exactly 2 fan-out sub-queries. The “From Retrieved to Cited” commercial-content companion adds: comparison pages with 3 tables +25.7%, validation pages with 8 list sections +26.9%, 5-7 stats +20%, ≤10-word sentences +18.8%.
- Digital Applied 1,000 AIO Citation Pattern Study (2026-04-26) — AIO-only with same-query controls. Top 1% of domains capture 47% of citations. Schema 2.3× lift after DA control; HowTo 2.8×. DA Pearson +0.61 on AIO (engine-divergent from AirOps’s ChatGPT-null).
- SE Ranking — 50+ AI Mode Ranking Factors (2025-12-15) — primary research scoring 50+ candidate factors for AI Mode citation, top 20 ranked. Global domain traffic is ~3× more predictive than content-quality factors — the AI-Mode-specific counterpart to Zyppy’s cross-engine meta-analysis.
- GEO-16 Framework (Kumar & Palkhouski, arXiv 2509.10762v1) (2025-09-13) — first academic AEO/GEO study. 1,100 URLs / 1,702 citations / 70 prompts across Brave + AIO + Perplexity. GEO ≥0.70 + ≥12 of 16 pillar hits → 78% cross-engine citation. Structured Data r=0.63 (p<0.001).
Engine-specific evidence (3 studies)
The newest cut in the cluster: studies that isolate one AI surface and show the same signal can behave differently across engines — and even across Google’s own two surfaces.
- Ahrefs — AI Mode vs AI Overviews (730K responses) (Q1 2026) — the cleanest within-Google divergence data. AI Mode and AI Overviews cite the same URLs only 13.7% of the time for the same query (16.3% for top-3). 87% of the time the two Google surfaces pull from entirely different sources. YouTube tops AIO; Wikipedia/Quora/Facebook over-index in AI Mode.
- SE Ranking — Google Self-Citation in AI Mode (1.3M citations) (2026-03-06) — 68,313 keywords / 1,321,398 citations / 20 niches. Google.com is 17.42% of all AI Mode citations — more than YouTube, Facebook, Reddit, Amazon, Indeed, and Zillow combined. Tripled from 5.7% in nine months; composition shifted from 97.9% Google Business Profiles (Jun 2025) to 59% organic Google SERPs (Feb 2026).
- SISTRIX AI Citation Drift (2026-05-01) — 82,619 prompts / 1,548,213 snapshots / 6 countries / 3 platforms / 17 weeks. “Fixed core + carousel” pattern: 86% of prompts hold a stable core of 1-5 domains; the rest rotate. AIO 56% citation rotation/week; ChatGPT 74%. Reframes the GEO question from “am I cited?” to “am I in the core or the rotating set?”
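To make the "core or carousel" question concrete for a single prompt, here is a minimal sketch that splits cited domains by how often they recur across weekly snapshots. The 0.75 threshold, the function names, and the domains are assumptions for illustration; SISTRIX's own definitions of "core" and "rotation" may differ.

```python
from collections import Counter


def split_core_and_carousel(weekly_domains, core_threshold=0.75):
    """Split cited domains into a stable core and a rotating set.

    weekly_domains: list of sets, one set of cited domains per weekly snapshot.
    core_threshold: hypothetical cutoff; a domain cited in at least this
    share of weeks counts as core.
    """
    weeks = len(weekly_domains)
    counts = Counter(domain for week in weekly_domains for domain in week)
    core = {d for d, n in counts.items() if n / weeks >= core_threshold}
    carousel = set(counts) - core
    return core, carousel


def weekly_rotation(prev_week, this_week):
    """Share of this week's cited domains that were absent the week before."""
    if not this_week:
        return 0.0
    return len(this_week - prev_week) / len(this_week)


# Hypothetical snapshots for one prompt over four weeks.
snapshots = [
    {"a.com", "b.com", "x.com"},
    {"a.com", "b.com", "y.com"},
    {"a.com", "b.com", "z.com"},
    {"a.com", "c.com", "x.com"},
]

core, carousel = split_core_and_carousel(snapshots)
print(sorted(core))      # ['a.com', 'b.com']
print(sorted(carousel))  # ['c.com', 'x.com', 'y.com', 'z.com']
print(weekly_rotation(snapshots[0], snapshots[1]))  # one of three domains is new
```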
Meta-analysis (1 study)
- Zyppy AI Citation Ranking Factors Meta-Analysis (Cyrus Shepard, 2026-05-07) — synthesis of 54 studies into a 23-factor ranking. Top tier (9+): URL Accessibility, Search Rank, Fan-out Rank, Preview Control, Query-Answer Match, Intent-Format Match. Structured Data 5.6 (#20). LLMs.txt 2.0 (#23).
User behavior + market share (4 studies)
The macro layer beneath the citation-tactics layer — how much AI search actually matters today, who the audience is, and how they feel.
- Datos + SparkToro State of Search Q1 2026 (2026-04-27) — clickstream panel, millions of US/EU/UK desktop users, 12 months. The reality check: AI tools are <2% of total desktop visits; US zero-click fell 24.5%→22.4%; organic click share rose to 44.9%. Tempers any “AI search is already the channel” framing.
- Similarweb 2026 Generative AI Brand Visibility Index (Jan 2026) — brand-mention share across ChatGPT/Gemini/Copilot/Perplexity across 6 sectors. AI referral traffic is plateauing even as platform usage grew +28.6% — so in-answer visibility matters more than chasing AI referral clicks. AI visitors are higher-value per visit.
- Stanford HAI AI Index 2026 (2026-04) — 53% generative-AI population adoption in three years; $172B/yr consumer value (tripling); 73% expert vs 23% public positive on AI’s job impact. The macro adoption + sentiment context.
- Pew — How Americans View AI (Sept 2025) (n=5,023) — 95% awareness, but 50% say they are more concerned than excited; majorities expect AI to worsen creative thinking and relationships. The audience is large and aware but anxious rather than enthusiastic, so frame AI-driven experiences accordingly.
The cross-method reconciliation
The correlational studies all show schema-using pages are cited more — magnitudes from +6.5pp to 2.3× to r=0.63. The causal study says adding schema doesn’t cause that lift. Both are true if schema is a marker of editorial / technical / publication-infrastructure maturity rather than a cause of citation. Adding JSON-LD to an existing page doesn’t change the underlying publisher characteristics; matched DiD removes those characteristics from the lift calculation; correlation studies leave them in.
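The reconciliation can be illustrated with a small simulation in which schema has no effect on citation by construction, yet schema-using pages are still cited more often. All probabilities here are invented, and the sketch uses simple stratification by a hypothetical "publisher maturity" flag rather than the matched DiD design of the causal study.

```python
import random


def citation_rate(pages):
    """Share of pages in the list that were cited."""
    return sum(p["cited"] for p in pages) / len(pages)


def simulate(n=20000, seed=0):
    """Generate pages where maturity drives both schema adoption and citation."""
    rng = random.Random(seed)
    pages = []
    for _ in range(n):
        mature = rng.random() < 0.5
        has_schema = rng.random() < (0.8 if mature else 0.2)
        cited = rng.random() < (0.6 if mature else 0.2)  # schema plays no role here
        pages.append({"mature": mature, "schema": has_schema, "cited": cited})
    return pages


pages = simulate()

# Naive comparison: schema pages vs. non-schema pages, maturity left in.
with_schema = [p for p in pages if p["schema"]]
without_schema = [p for p in pages if not p["schema"]]
naive_gap = citation_rate(with_schema) - citation_rate(without_schema)


def gap_within(mature):
    """Schema vs. non-schema citation gap among publishers of equal maturity."""
    group = [p for p in pages if p["mature"] == mature]
    schema_pages = [p for p in group if p["schema"]]
    other_pages = [p for p in group if not p["schema"]]
    return citation_rate(schema_pages) - citation_rate(other_pages)


print(f"naive gap: {naive_gap:.3f}")
print(f"gap among mature publishers: {gap_within(True):.3f}")
print(f"gap among other publishers: {gap_within(False):.3f}")
```

The naive gap is large because schema pages skew toward mature publishers. Once pages are compared within the same maturity group, the gap shrinks toward zero, which is the pattern the correlational and causal findings show together.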
Practitioner implication: Ship schema — it’s cheap, it doesn’t hurt, and it still helps Google’s classical surfaces. Don’t expect adding it to be the lever that moves AI citations on its own. Spend the higher-leverage hours on the factors the meta-analysis ranks at 9+ (search rank, fan-out rank, preview control, query-answer match, intent-format match) and — per SE Ranking — on the domain-level authority/traffic that predicts AI Mode citation ~3× more than content quality.
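For the "ship schema" step, a minimal sketch of generating Article, BreadcrumbList, and FAQPage markup as one JSON-LD graph. The helper name `build_jsonld`, the example.com URLs, and the sample question are hypothetical. The property names follow schema.org, but each surface has its own eligibility rules, so treat this as a starting point and validate the output before relying on it.

```python
import json


def build_jsonld(headline, url, breadcrumbs, faqs):
    """Return a JSON-LD string with Article, BreadcrumbList, and FAQPage nodes."""
    graph = [
        {"@type": "Article", "headline": headline, "mainEntityOfPage": url},
        {
            "@type": "BreadcrumbList",
            "itemListElement": [
                {"@type": "ListItem", "position": i, "name": name, "item": link}
                for i, (name, link) in enumerate(breadcrumbs, start=1)
            ],
        },
        {
            "@type": "FAQPage",
            "mainEntity": [
                {
                    "@type": "Question",
                    "name": question,
                    "acceptedAnswer": {"@type": "Answer", "text": answer},
                }
                for question, answer in faqs
            ],
        },
    ]
    return json.dumps({"@context": "https://schema.org", "@graph": graph}, indent=2)


# Hypothetical page details for illustration only.
markup = build_jsonld(
    headline="What is query fan-out?",
    url="https://example.com/fan-out",
    breadcrumbs=[("Home", "https://example.com/"), ("SEO", "https://example.com/seo")],
    faqs=[("What is query fan-out?", "The engine splits one query into several sub-queries.")],
)
print(markup)
```

The resulting string would go inside a `<script type="application/ld+json">` element on the page.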
Where the studies disagree on engines
The same on-page signal can have opposite effects across AI answer engines — and even across Google’s own two surfaces. Domain Authority is the cleanest example:
| Engine | DA / authority effect | Source |
|---|---|---|
| ChatGPT | No positive correlation; slight inverse in highest quartile | AirOps |
| Google AIO | Pearson +0.61; each 10-point bucket adds 1.4× citation odds | Digital Applied |
| Google AI Mode | Global domain traffic ~3× more predictive than content quality | SE Ranking |
| Brave + AIO + Perplexity (aggregated) | Authority & Trust r=0.59 | GEO-16 |
And the surfaces themselves barely overlap: AI Mode and AIO share only 13.7% of cited URLs for the same query (Ahrefs), while Google self-cites heavily in AI Mode (17.42% of all citations) (SE Ranking). Bottom line: a single AI-SEO playbook doesn’t fully exist — tactics must go engine-specific past a certain point.
Practitioner frameworks
- FLUQs — Friction-Inducing Latent Unasked Questions — Citation Labs’ content-strategy framework. EchoBlocks (causal triplets, FAQ entries, checklists) as the LLM-compression-resistant content format.
Companion empirical data
- Similarweb Most-Cited Domains in LLMs — which domains LLMs cite in practice. Overlaps with Digital Applied’s “top 1% capture 47%” and the SISTRIX “fixed core” finding.
The official source
- Google’s Generative AI Search Optimization Guide — Google’s official AI Overviews + AI Mode SEO guidance. Resolves the AEO/GEO terminology debate from the engine’s own perspective.
The pragmatic AI SEO playbook (2026-05-19 synthesis)
Distilled across the cluster — what the empirical evidence actually supports:
- Win classical SEO first, but know that it now yields the densest citations, not the majority. Search Rank is #2 (9.7/10) on Zyppy; AirOps shows a 4.1× rank-1-to-rank-10 gap. But Ahrefs’s top-10 overlap dropped 76%→38%: a top-10 ranking is still the highest-yield position per page, yet the top 10 no longer supplies most citations.
- Fan-out coverage is now load-bearing, not optional. ~Two-thirds of AIO citations come from outside the query’s own top-10 (Ahrefs); 88.6% of ChatGPT queries fan out to exactly 2 sub-queries (AirOps). Win the parent-topic cluster’s sub-query SERPs.
- Match heading-to-query intent. Headings at cosine ≥0.90 to the query cited 41.0% vs 30.2% at <0.50 (AirOps). Format-match: how-to → numbered list; what-is → definition + example; compare → table.
- Structure commercial pages by journey stage. Comparison pages → 3 tables (+25.7%); validation pages → list sections (+26.9%); awareness pages → 5-7 stats (+20%); shortlists → ≤10-word sentences (+18.8%) (AirOps “From Retrieved to Cited”). Lists are the strongest shared signal.
- Use focused content, not exhaustive coverage. 26-50% fan-out coverage beats 100% (AirOps). Word count 1,800-3,500; over 5,000 underperforms.
- Build domain-level authority + traffic, not just page-level. SE Ranking: global domain traffic predicts AI Mode citation ~3× more than content quality. “Top 1% capture 47%” (Digital Applied) and “fixed core of 1-5 domains” (SISTRIX) say the same — this is a domain-concentration game.
- Add inline named-source citations + 5-7 statistics. 2.1× lift (Digital Applied); GEO-16 Evidence & Citations r=0.61; AirOps +20% for stat-dense awareness pages.
- Ship Article + BreadcrumbList + FAQPage schema where editorially valid; HowTo for procedural pages. Correlations are consistent; the causal study says don’t expect a citation lift from this alone.
- Skip LLMs.txt. Score 2.0 (#23) on Zyppy. The most overhyped 2025 tactic.
- Go engine-specific where data diverges, and aim for the “core” not the “carousel.” DA matters on AIO (+0.61) and AI Mode (domain traffic 3×) but not ChatGPT (~0). AI Mode and AIO share only 13.7% of cited URLs. Target SISTRIX’s stable 1-5-domain core, not the weekly-rotating set. And calibrate total AI-SEO investment against Datos’s reality check — AI is still <2% of desktop visits.
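The heading-to-query matching step in the playbook can be approximated locally. The sketch below scores headings against a target query with bag-of-words cosine similarity, which is a stand-in: the AirOps thresholds (≥0.90, <0.50) presumably rest on a different similarity measure, so scores from this sketch are not comparable to theirs. The function names and sample headings are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a crude substitute for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between the token-count vectors of two strings."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[token] * vb[token] for token in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def best_heading(query, headings):
    """Return the heading that most closely matches the query."""
    return max(headings, key=lambda heading: cosine(query, heading))


# Hypothetical query and candidate headings.
query = "how to add faq schema"
headings = ["How to add FAQ schema", "Our company history", "Schema types overview"]

for heading in headings:
    print(f"{cosine(query, heading):.2f}  {heading}")
print(best_heading(query, headings))  # How to add FAQ schema
```

Ranking a page's headings this way is mainly useful for spotting sections whose wording has drifted away from the query they are meant to answer.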
Open thesis-cluster questions
- Q3 cross-engine follow-up from Digital Applied. ChatGPT + Perplexity replication of the 1,000-AIO methodology — will it confirm the AIO-vs-ChatGPT engine divergence?
- How fast is the Datos <2% figure moving, and does desktop-only undercount mobile + in-app AI? The market-context number gates how aggressively to invest.
- Will Gemini 4 move Ahrefs’s 38% top-10 number again? The 76%→38% drop landed five weeks after Gemini 3; treat 38% as a Q1-2026 snapshot, not equilibrium.
- Will v2 publish correlations for GEO-16’s other 10 pillars? Currently only 6 of 16 have measured correlations.
- Does Google’s AI Mode self-citation (17.42%) keep climbing, and is it antitrust-exposed? SE Ranking’s tripling-in-nine-months trend is the one to watch.
Related operations on this wiki
- The seo-content/ topic (topic index) holds most of the article set; two macro studies live in ai-industry-research. This page is the curated AI-SEO subset.
- wiki/watchlist.md tracks digitalapplied.com/blog; the Q3 cross-engine follow-up is the highest-value watchlist signal.
- Related Cloudflare wiki nav: claude-ai for AI tooling, ai-marketing for AI-driven content distribution, ai-industry-research for AI-industry-context studies.
What this hub is NOT
This is not an SEO operations playbook. The wiki’s internal SEO tooling — GSC Autonomous SEO, SEOmator audits, Clawdbot competitive intelligence, Blog-Agent-Worker content pipeline — sits in seo-content under internal article gates. This hub focuses on the publicly-citeable research and frameworks.
The empirical evidence on AI search citation is new. Through 2025 the AEO/GEO market was driven by anecdotal case studies (“we added schema and citations jumped”) that didn’t survive causal testing. The 2026 wave of matched-design + multi-engine + meta-analytical + longitudinal studies — Ahrefs (schema causal + top-10 longitudinal + AIO-vs-AI-Mode), AirOps, Digital Applied, GEO-16, SE Ranking (×2), SISTRIX, Zyppy, plus the Datos / Similarweb / Stanford HAI / Pew market-context layer — is what’s actually load-bearing. Bookmark this page; it’ll update as the cluster grows.