Source: wiki synthesis: Agent Wikis, Google’s Generative AI Search Optimization Guide, FLUQs, Every Level of a Claude Second Brain

An LLM-maintained wiki is usually framed as a private retrieval asset: “your moat is your data.” But the properties a compiled wiki has by construction (curated, structured, maintained, densely interlinked) are close to what the AI-citation research says AI search engines reward when deciding what to surface and reuse. This article connects the karpathy-pattern’s publish layer to that research: what to publish, what to keep private, and how to verify that published pages are actually cited.^[inferred] It is distinct from The Agent-Readable Web (a company site’s demand/interface/supply infrastructure) and from Karpathy-Pattern Third-Party Adoption (a survey of who runs the pattern). The subject here is turning the knowledge vault itself into a citable public asset.

Key Takeaways

  • Compilation is the measurable value. Agent Wikis’ controlled eval — same model, same retrieval, same token budget — scored a compiled wiki at 89% correct / 7% hallucination vs 63% / 26% for RAG over the raw sources: “maintenance (curation) is the lever, not retrieval method.” That benchmark measures agents consuming a wiki, not search engines citing one — but it establishes the compiled form as the substrate machines answer best from.^[the extension from consumption to citation is inferred]
  • AI search wants what a maintained wiki produces. Google’s official guide says AI Overviews and AI Mode retrieve via RAG plus query fan-out over the ordinary Search index and reward non-commodity, unique-POV, clearly-organized content (“AI systems take a variety of sources — a unique viewpoint stands out”). FLUQs sharpens the target: the citation currency is net-new facts in compression-survivable structure — restated consensus does not get cited.
  • The publish shape already exists as a product. Agent Wikis publishes every wiki as free raw Markdown plus llms.txt, llms-full.txt, index.json, and an MCP server — and sells a Project tier (100/mo) where a company claims the public wiki for its product. Being the maintained public reference for a niche is already a commercial category, with Karpathy’s gist explicitly credited as the pattern.
  • FLUQ-shaped articles are the highest-yield public pages. FLUQs are the high-friction questions buyers never type into a search bar — zero keyword volume, invisible to Ahrefs — but synthesis-layer AI cites content that resolves them because the resolutions are net-new facts. A wiki whose ingest loop answers unasked questions in causal-triplet/checklist form is manufacturing exactly the fragments the emergent surface reuses.^[inferred]
  • One canonical article per concept is the anti-spam shape. Google explicitly warns that generating a page per query variation violates its scaled-content-abuse policy; the wiki pattern’s shape — one interlinked article per concept, updated in place on each ingest — is structurally the opposite of that failure mode.^[inferred]
  • The publish boundary has an operating rule: context vs connections. Evergreen, “still useful in a year” knowledge goes into the brain; fast-changing or private data (client records, Slack, email) stays out and is accessed on demand instead. The same source also flags that routing client data through a cloud model is not private. Applied at publish time: evergreen concept/tool/synthesis articles are the public GEO asset; client and operational data is the private moat that never gets a publish flag.^[inferred application]
  • Eligibility is boring and mandatory. To appear in AI Overviews / AI Mode a page must be indexed and snippet-eligible in ordinary Google Search. A published wiki that blocks crawlers, noindexes itself, or sets nosnippet is invisible to the RAG layer regardless of content quality.
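
The eligibility takeaway above can be spot-checked locally before anything is submitted to Search Console. The following is a minimal sketch, not Google's own logic: the function name `blocking_directives` is hypothetical, and the set of directives treated as blocking (`noindex`, `nosnippet`, `none`, `max-snippet:0`) is an assumption drawn from the commonly documented robots meta rules. It ignores user-agent-scoped `X-Robots-Tag` values such as `googlebot: noindex`.

```python
from html.parser import HTMLParser

# Assumption: these directives make a page non-indexable or non-snippetable.
BLOCKING = {"noindex", "nosnippet", "none", "max-snippet:0"}


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            for part in attr.get("content", "").split(","):
                # Normalize "max-snippet: 0" and "NoIndex" style variants.
                self.directives.add(part.strip().lower().replace(" ", ""))


def blocking_directives(html, x_robots_tag=""):
    """Return the blocking directives found in the page HTML or header value."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = set(parser.directives)
    for part in x_robots_tag.split(","):
        found.add(part.strip().lower().replace(" ", ""))
    return found & BLOCKING
```

An empty result only means no blocking directive was found in the markup or header passed in; whether the page is actually indexed still has to be confirmed in Google Search Console.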

Resolving the Tension: If Data Is the Moat, Why Publish?

The second-brain framing (“your moat is your data / IP”) and the GEO framing (publish so AI engines cite you) look opposed but split cleanly by where the value lives:^[inferred]

  • Publish when the value is being found. Concept explainers, tool comparisons, technique write-ups, and FLUQ resolutions gain value from citation — they are marketing surface. Google/Danny Sullivan’s advice (“make non-commoditized content, give us new data, ground AI Mode in fact”) comes with no attribution guarantee, so what you publish should be content you’re willing to have reused as influence, not only as traffic.
  • Keep private when the value is exclusivity. Client data, internal ops, win/loss specifics, anything in the “connections” class — the fast-changing private data the second-brain taxonomy says not to ingest into the evergreen layer at all, let alone publish.
  • The pattern supports per-article decisions. Since the vault is just markdown files and folders (tool-agnostic by design), the publish boundary can be an explicit per-article flag rather than an all-or-nothing choice.^[inferred]
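
A per-article flag of this kind can be sketched in a few lines. The example below assumes each article opens with a `---` delimited front-matter block and uses a hypothetical `publish: true` key; neither the key name nor the front-matter convention comes from the sources. Articles without the flag are treated as private, so the default is fail-closed.

```python
from pathlib import Path


def front_matter(text):
    """Parse simple `key: value` lines between leading '---' fences.

    A deliberately small parser, not a YAML implementation.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # unterminated front matter is treated as absent


def is_public(text):
    """True only when the article explicitly sets `publish: true` (hypothetical key)."""
    return front_matter(text).get("publish", "").lower() == "true"


def publishable(vault_dir):
    """Return vault-relative paths of Markdown files flagged for publishing."""
    vault = Path(vault_dir)
    return sorted(
        str(path.relative_to(vault))
        for path in vault.rglob("*.md")
        if is_public(path.read_text(encoding="utf-8"))
    )
```

Calling `publishable("wiki/")` on a vault would list only the flagged articles, which keeps the publish step from silently picking up anything in the "connections" class.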

What the Wiki Pattern Already Does That GEO Guides Ask For

| AI-citation requirement (sourced) | Wiki-pattern property |
| --- | --- |
| Non-commodity, unique POV (Google); net-new facts (FLUQs) | Compiled synthesis rather than mirrored sources: curation is the measured lever (Agent Wikis); the synthesis/connection layer is where net-new fragments come from^[inferred] |
| Organized for human readers: clear headings, sections (Google) | Index → drill-down → article navigation; raw/ → wiki/ → schema structure (Agent Wikis; second-brain level 2) |
| Compression-survivable fragments: causal triplets wrapped in FAQ/checklist containers (FLUQs) | Scannable, bulleted article format is already that container shape^[inferred] |
| Freshness in fast-moving domains | The flagship Agent Wikis case is Hermes, a product releasing every 3–5 days, where the compiled wiki beat live web search 89% to 48% |
| Emergent-surface access: “your MCP” as a publishing surface (FLUQs) | Agent Wikis ships an MCP server plus llms.txt per wiki: the FLUQs emergent surface, implemented |

Verifying You’re Actually Cited

  • Google side: run GSC URL Inspection on top wiki pages; confirm indexed + snippet-eligible; check robots meta, nosnippet, and Google-Extended configuration so you aren’t blocking the surfaces you’re optimizing for.
  • Reuse side: FLUQs’ discipline is to publish, then monitor which fragments get picked up by RAG pipelines, AI Overviews, or agentic workflows. Citation Labs’ XOFU is the reuse-measurement tool being tracked for this (early, not independently validated).
  • Agent side: curl your own llms.txt and point an agent at it — Agent Wikis’ benchmark shows “wiki first, web on gaps” routing scored 93%, above either source alone, which is the consumption pattern a published wiki plugs into.
  • Caveat worth holding: the meta-analysis summarized alongside Google’s guide scores llms.txt near the bottom of citation factors (2.0/10) while search rank scores near the top — so treat llms.txt as an agent-consumption affordance, and indexability + content quality as the actual citation levers.^[inferred reconciliation of two sourced claims]
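
For the crawler-configuration part of the Google-side check, a quick local test of robots.txt rules is possible with Python's standard-library `urllib.robotparser`. This is a rough sketch under stated assumptions: the function name `crawler_access`, the example rules, and the example URL are all hypothetical, and the stdlib parser uses simple prefix matching that may differ from Google's own robots.txt handling. What each user-agent token actually controls should be confirmed against Google's crawler documentation.

```python
from urllib.robotparser import RobotFileParser


def crawler_access(robots_txt, url, agents=("Googlebot", "Google-Extended")):
    """Report, per user-agent token, whether robots.txt allows fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt: one token disallowed site-wide, everything else allowed.
EXAMPLE_ROBOTS = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

report = crawler_access(EXAMPLE_ROBOTS, "https://example.com/wiki/publish-boundary")
```

With the example rules, `report` marks `Googlebot` as allowed and `Google-Extended` as disallowed. The same function can be pointed at the text of a live robots.txt to see which tokens a published wiki currently blocks; GSC URL Inspection remains the authoritative check for indexing itself.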

Try It

  1. Pick 10 evergreen articles from your vault that contain something not already restated across the public web — apply the net-new-fact test, not a word count — and publish them as plain, crawlable pages with clear headings.
  2. Expose llms.txt and full-text Markdown alongside the HTML — the exact agent-native configuration Agent Wikis publishes for every wiki.
  3. Run the eligibility audit: GSC URL Inspection on each published page; fix anything not indexed or not snippet-eligible before touching content.
  4. Mine FLUQs from your own niche using the four identification questions (what’s not being asked; whose voice is missing; where do models hallucinate; what’s missing in currently-cited resources) and write one wiki article per FLUQ, formatted as a causal triplet plus checklist.
  5. Sort context vs connections at every ingest so private client/operational data never drifts toward the public set.
  6. Monitor monthly: prompt the major assistants with your niche’s bottom-funnel questions and track whether your pages or fragments surface; add XOFU if you want fragment-level reuse measurement.
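
Step 2 can be sketched as a small generator that emits an llms.txt index for the published set. The layout below (a title heading, a one-line blockquote summary, then sections of Markdown links) follows the public llms.txt proposal as commonly described; it is an assumption, not a reproduction of the files Agent Wikis ships. The function name `build_llms_txt` and the example URL are hypothetical.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt-style index.

    `sections` maps a section heading to a list of (name, url, note) tuples;
    `note` may be an empty string.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in pages:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


index_text = build_llms_txt(
    "Example Wiki",
    "Compiled, maintained notes on one niche.",
    {
        "Concepts": [
            (
                "Publish boundary",
                "https://example.com/wiki/publish-boundary.md",
                "context vs connections",
            )
        ]
    },
)
```

Feeding the generator only the articles that passed the net-new-fact test in step 1 keeps the index aligned with the public set, so private material never appears in the agent-facing listing.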

Open Questions

  • No source directly measures whether published karpathy-pattern wikis get cited by AI Overviews, ChatGPT, or Perplexity. The Agent Wikis eval measures agent-RAG accuracy over a wiki; the citation claim composes that with Google’s and FLUQs’ guidance but remains a projection until someone instruments it.^[inferred]
  • Attribution economics are unresolved. Google offers no attribution guarantee for grounding AI Mode in your facts; whether wiki publishing wins traffic, zero-click influence, or nothing measurable is an open empirical question the FLUQs sources themselves flag.
  • Freshness expectations versus ingest cadence are untested. The Hermes flagship case updates on a 3–5 day product cycle; whether a slower-moving personal or agency wiki clears the freshness bar that AI surfaces reward is not addressed in these sources.