Source: ahrefs-schema-ai-citations-study-2026-05-11.md — Louise Linehan & Xibeijia Guan, reviewed by Ryan Law. Published 2026-05-11 on the Ahrefs Blog.
Ahrefs ran a difference-in-differences causal study tracking 1,885 pages that added JSON-LD schema between August 2025 and March 2026 against a 4,000-page control set. They measured whether the schema additions moved AI citations across Google AI Overviews, Google AI Mode, and ChatGPT in the 30 days following the change. Headline: schema did not produce a statistically meaningful lift on any of the three surfaces. AI Overviews showed a 4.6 percent decline that was statistically significant but inside a larger declining trend, so the authors stop short of calling it a schema-caused drop. This is the first causal-inference (not correlational) study on the schema → AI-citation question.
Key Takeaways
- Causal design. Matched difference-in-differences: 1,885 pages adding schema (Aug 2025-Mar 2026) vs. 4,000 control pages matched on industry, traffic band, and pre-existing AI citation count, with a 30-day post-window. The matched control set is what lets the design attribute any change to the schema addition rather than to background trends.
- Three surfaces, no lift. AI Overviews, AI Mode, and ChatGPT all tested. None showed a positive citation effect from adding schema.
- AIO decline (4.6%) is statistically significant but contextually small. It sits inside a larger declining trend in AIO citation rates across the period, and the authors do not claim schema caused the decline.
- Schema types tested: Article, Product, FAQ, HowTo, Organization, Person, BreadcrumbList. No subtype was a winner.
- Confirms FLUQs framework + Google’s official position. Both FLUQs and Google’s Generative AI Search Optimization guide argue that AI search ranks on the same core signals as classical search — schema helps Google parse your page, but doesn’t independently bump citation odds. Ahrefs’s data is the empirical confirmation.
- Practical implication. Add schema for the structured-data-richness reasons Google still rewards (rich snippets, knowledge panel eligibility, parseability for AI Overviews source attribution). Don’t add schema expecting an AI-citation lift — it isn’t there. Effort better spent on content quality, query-answer matching, and the higher-leverage factors ranked in the Zyppy meta-analysis (search rank, fan-out rank, preview control, intent-format match).
- Contradicts the 2025 anecdotal narrative. Through 2025 the AEO/GEO market was full of “we added schema and AI citations jumped” case studies. Those were uncontrolled. Ahrefs’s matched design makes the case studies look like regression-to-the-mean noise.
Ahrefs causal null vs. four correlational studies showing schema lift
Ahrefs says (this article, matched DiD, 1,885 pages, three surfaces) — adding schema produces no statistically meaningful AI citation lift. Four correlational studies say schema-using pages ARE cited more:
- AirOps (16,851-query ChatGPT study, stratified): +6.5pp citation advantage for JSON-LD pages (38.5% vs 32.0%).
- Digital Applied (1,000-AIO study, regression-style DA control): 2.3× lift for Article + BreadcrumbList; 2.8× for HowTo.
- GEO-16 arXiv (1,702 citations across Brave/AIO/Perplexity, cross-sectional): Structured Data r=0.63, +39% citation impact, 95% CI [0.59, 0.67].
- Zyppy meta-analysis (54 studies aggregated): Structured Data 5.6 / 10 (#20 of 23) — mid-tier but present.
Reconciliation: None of the four correlational studies can control for unobserved publisher characteristics (editorial maturity, technical SEO depth, content team composition) the way Ahrefs’s matched DiD does. DiD compares each page’s change over time against matched controls, so stable page-level traits difference out. The most parsimonious interpretation is that schema is a marker of editorial, technical, and publication-infrastructure maturity that correlates strongly with citation, not the lever itself. Adding schema to an existing page (Ahrefs’s intervention) doesn’t cause the lift because the page’s underlying characteristics determine citation candidacy.
Practitioner takeaway: keep your schema (it still helps Google’s classical surfaces, such as rich snippets and knowledge panel eligibility, and is the right table-stakes signal per AirOps), but don’t expect adding it to be the lever that moves AI citations.
Status: resolved (2026-05-19). The disagreement is methodological, not factual.
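The marker-not-lever reading can be illustrated with a toy simulation. This is a sketch under invented assumptions, not a reconstruction of any of the studies above: the `maturity` variable, the `citation_probability` formula, and every number in it are hypothetical. Citation odds depend only on latent maturity, and mature publishers are simply more likely to already have schema.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def citation_probability(maturity):
    # Assumption: citation odds depend only on latent publisher maturity.
    # Schema deliberately does not appear in this formula.
    return 0.1 + 0.5 * maturity


def cited_rate(maturities):
    """Draw one cited / not-cited outcome per page and return the share cited."""
    hits = sum(random.random() < citation_probability(m) for m in maturities)
    return hits / len(maturities)


# Observational world: mature publishers are more likely to have schema already.
with_schema, without_schema = [], []
for _ in range(20_000):
    maturity = random.random()
    has_schema = random.random() < maturity
    (with_schema if has_schema else without_schema).append(maturity)

correlational_gap = cited_rate(with_schema) - cited_rate(without_schema)

# Intervention world: take pages without schema and add it to a random half.
# Maturity is untouched, so citation odds are untouched too.
random.shuffle(without_schema)
half = len(without_schema) // 2
newly_marked_up, left_alone = without_schema[:half], without_schema[half:]
intervention_gap = cited_rate(newly_marked_up) - cited_rate(left_alone)

print(f"schema vs. no-schema pages (observational): {correlational_gap:+.3f}")
print(f"schema added vs. left alone (intervention): {intervention_gap:+.3f}")
```

Under these assumptions the observational comparison shows a clear gap in favor of schema pages, while the intervention comparison hovers around zero. That is the same pattern as the correlational studies versus the Ahrefs result, produced here without schema having any causal effect at all.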
Study Design Details
- Pages tracked: 1,885 treatment + 4,000 control = 5,885 total.
- Treatment definition: First-time addition of JSON-LD schema between Aug 2025 and Mar 2026.
- Matching variables: Industry vertical, pre-treatment organic traffic band, and pre-treatment AI citation count (control within ±2 citations of the treatment page).
- Outcome variables: AI citation count on AI Overviews, AI Mode, and ChatGPT in the 30-day window post-schema-addition.
- Estimator: Difference-in-differences (DiD) — comparing the change in AI citations between treatment and control, not absolute citation levels.
- Schema types in the treatment set: Article (most common), Product, FAQ, HowTo, Organization, Person, BreadcrumbList. The study breaks results by type and finds no positive subtype.
- Caveats Ahrefs flags: AIO and AI Mode citation tracking has measurement noise from snapshot vs. live divergence; 30-day window may be too short for slow-moving AI training cycles (though ChatGPT-search results are RAG, not training, so this matters less for that surface); some treatment pages added other content concurrently with schema, which DiD partially controls for via the matched control.
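The matching rule and estimator described in this list can be sketched as follows. This is a minimal illustration, not Ahrefs's actual pipeline: the `Page` fields, the band labels, and the toy numbers are all made up, and only the matching criteria (industry, traffic band, pre-treatment citations within ±2) and the difference-in-differences logic come from the description above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Page:
    url: str
    industry: str
    traffic_band: str
    pre_citations: int   # AI citations before the schema change
    post_citations: int  # AI citations in the 30 days after


def find_matches(treated, controls, tolerance=2):
    """Controls in the same industry and traffic band whose pre-period
    citation count is within +/- tolerance of the treated page."""
    return [
        c for c in controls
        if c.industry == treated.industry
        and c.traffic_band == treated.traffic_band
        and abs(c.pre_citations - treated.pre_citations) <= tolerance
    ]


def did_estimate(treatment, controls, tolerance=2):
    """Mean of (treated change) minus (mean change of matched controls)."""
    effects = []
    for t in treatment:
        matched = find_matches(t, controls, tolerance)
        if not matched:
            continue  # this sketch drops treated pages with no match
        control_change = mean(c.post_citations - c.pre_citations for c in matched)
        effects.append((t.post_citations - t.pre_citations) - control_change)
    if not effects:
        raise ValueError("no treated page had a matched control")
    return mean(effects)


# Made-up toy data: every page loses citations over the period.
treatment = [
    Page("t1", "saas", "mid", 10, 9),
    Page("t2", "retail", "high", 20, 18),
]
controls = [
    Page("c1", "saas", "mid", 11, 10),
    Page("c2", "saas", "mid", 9, 8),
    Page("c3", "retail", "high", 21, 19),
    Page("c4", "retail", "low", 20, 25),  # wrong traffic band, never matched
]

effect = did_estimate(treatment, controls)
print(effect)  # 0: treated pages fell, but no more than their matched controls
```

The toy data shows why the estimator compares changes rather than levels. Both treated pages lose citations after adding schema, but their matched controls lose the same amount, so the estimated effect is zero. A raw before/after comparison on the treated pages alone would have misread the background decline as a schema effect.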
Open Questions
- Long-window effect. Could a 90-day or 180-day window detect lift the 30-day window missed? Ahrefs notes this as an explicit follow-up.
- AEO-specific schema (Speakable, ClaimReview, QAPage). The study tested the common schema types. It did not test AEO-specific subtypes that some practitioners argue are the actual lever. ^[inferred]
- Which engines use schema and which don’t. The study aggregates across AIO, AI Mode, and ChatGPT. Per-surface breakdowns would isolate which engine, if any, uses schema in its retrieval.
Related
- AirOps + Kevin Indig Fan-Out Effect ChatGPT Study — Largest single-engine correlational study. +6.5pp schema lift in stratified analysis. Same direction-of-effect, methodologically weaker than Ahrefs’s matched DiD.
- Digital Applied 1,000 AIO Citation Pattern Study — AIO-only correlational study with regression-style DA control. 2.3× lift claim. See contradiction callout for reconciliation.
- GEO-16 Framework (arXiv 2509.10762v1) — Academic cross-sectional study across Brave/AIO/Perplexity. Structured Data r=0.63, +39%, 95% CI [0.59, 0.67]. Authors explicitly self-flag observational design.
- Zyppy AI Citation Ranking Factors — Cyrus Shepard’s parallel meta-analysis of 54 studies. Scores Structured Data 5.6 / 10 (#20 of 23 factors). The most-aggregated view across the literature.
- FLUQs Framework — Core SEO first, AEO-specific tactics second. Ahrefs’s data is the empirical confirmation.
- Google’s Generative AI Search Optimization Guide — Google’s official position: AIO/AIM rely on the same Search index, so AI search optimization is core SEO. Ahrefs’s study is the independent causal validation.
- Similarweb Most-Cited Domains in LLMs — What domains LLMs actually cite, useful for benchmarking what “winning AI citations” looks like.
Try It
- Don’t pull schema if you have it — it still helps Google’s classical surfaces (rich snippets, knowledge panels) and the Ahrefs result is “no lift,” not “negative effect.”
- Don’t add schema primarily to chase AI citations. The opportunity cost is real; spend the engineering hours on higher-ranked factors from Zyppy’s analysis instead.
- Re-evaluate AEO consultants charging premium rates for schema-focused engagements. Ask whether their case studies use a causal design or are merely correlational.
- Tell your client / stakeholder that “schema is required for AI citations” was the 2025 narrative and the 2026 data contradicts it. Frame schema work as Google-parseability infrastructure, not AI-specific.
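Before deciding what to keep or add, it can help to see which schema types a page already carries. The following is a rough sketch using only the Python standard library. The `JsonLdCollector` and `schema_types` names and the sample HTML are hypothetical, and the sketch only handles top-level objects, arrays, and `@graph` containers, not every JSON-LD shape.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def schema_types(html):
    """Return the @type values found in a page's JSON-LD blocks, in order."""
    parser = JsonLdCollector()
    parser.feed(html)
    types = []
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the audit
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            graph = item.get("@graph")
            nodes = graph if isinstance(graph, list) else [item]
            for node in nodes:
                node_type = node.get("@type") if isinstance(node, dict) else None
                if isinstance(node_type, str):
                    types.append(node_type)
                elif isinstance(node_type, list):
                    types.extend(node_type)
    return types


sample_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Example"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org",
 "@graph": [{"@type": "BreadcrumbList"}, {"@type": "Organization"}]}
</script>
</head><body></body></html>
"""

found = schema_types(sample_html)
print(found)  # ['Article', 'BreadcrumbList', 'Organization']
```

An inventory like this supports the framing above: it records what parseability infrastructure is already in place, without implying that filling any gaps will move AI citations.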