GBrain — Garry Tan's Open-Source AI Brain (and YC's "Every Company Needs One" Thesis)

Source: raw/Garry_Tan_open-sourced_his_AI_brain._YC_says_every_company_will_need_one..md (youtube.com/watch?v=OssShK7zzVs — strategy / YC framing) + raw/GBrain_Github_Architecture_Breakdown_-_AI_Retrieval_Memory_and_Precision.md (youtube.com/watch?v=xFpeMUDsgF4 — technical architecture & benchmark)

2026-06-28 — refreshed from the live repo github.com/garrytan/gbrain (MIT)

The repo is public and resolves three of this article’s original Open Questions. Confirmed: MIT license, owner-path garrytan/gbrain, and PGLite = Postgres 17 via WASM (the zero-config default). Production stats have grown since the source videos, with page count up roughly 8×: 146,646 pages, 24,585 people, 5,339 companies, 66 cron jobs, 43 curated skills (the videos showed 17,888 / 4,383 / 723 / 21 / 29). The tagline is now “Search gives you raw pages. GBrain gives you the answer.” Most significantly, the “company brain” the YC thesis called unsolved now ships in the repo, at least in basic form: the README says GBrain “works as a company brain too,” with per-user access scoping: “each person on the team gets their own slice of the brain, scoped by login.” The 49.1% P@5 / ~98% recall benchmark below matches the README. (Sources: raw/gh-star-garrytan-gbrain.md (24,391★, 2026-06-27); ai-research/gbrain-readme-facts-2026-06-28.md.)

GBrain is the open-sourced personal AI brain of Garry Tan (President & CEO of Y Combinator): a memory layer that gives a “smart but forgetful” agent persistent, structured recall. Markdown files on disk are the source of truth; a self-wiring knowledge graph and a hybrid vector/lexical retriever sit on top; a nightly “dream cycle” reorganizes the store while the user sleeps. The headline claim: a graph that wires itself with regex (zero LLM calls) on every page write adds 31.4 points of precision@5 over the same system with the graph disabled, and 38 points over a vector-RAG baseline, on Tan’s own benchmark. YC then named the company version of this (“Garry’s GBrain, but for every business in the world”) an investable category in its Summer 2026 Request for Startups.

The two source videos are complementary, not redundant:

Video 1 (OssShK7zzVs) — an “Architect’s Lens” autopsy: the YC / Request-for-Startups framing, the four load-bearing design claims, the Minion vs sub-agent routing test, the dream cycle, and the market implications.
Video 2 (xFpeMUDsgF4) — the technical architecture and the adversarial evals repo: the four baseline retrieval paradigms, the chaotic-corpus benchmark, LLM-as-judge, the ranking “density imperative,” and the precise numbers.

Key Takeaways

What GBrain is, in plain terms: markdown files on disk (source of truth) + a retrieval layer (Postgres with pgvector, or an embedded option Tan calls “PGLite”) + 29 markdown skill files + a nightly dream cycle. Tan’s published production stats at the time of the source videos: 17,888 pages, 4,383 people, 723 companies, 21 cron jobs, built in 12 days — his own real data, not a demo (the live repo now reports ~146,646 pages / 43 skills / 66 crons — see the update note above).
“Skill files are code. The runtime is dumb.” Intelligence lives in markdown a human can read, version, and audit. A single file (the resolver) routes intent to the right skill — there is no orchestrator, no planner, no chain. Tan’s line: “If you can’t read your agent’s instructions, you don’t own them.” Transferable principle: plain text outlives code.
The graph wires itself — regex over LLM. On every page write, GBrain extracts entity references with regex and infers typed edges (attended, works-at, invested-in, founded, advises) from the page’s role and surrounding text. Zero language-model calls — the graph builds cheaply on every save. Transferable principle: the graph wires itself; regex over language model where it can be.
Knowledge graphs beat vector retrieval for relationship questions. Queries like “who works at Acme AI?” or “what has Bob invested in this quarter?” are relationship questions; cosine similarity over text embeddings can’t reach them, but a typed link traversal can.
The benchmark numbers (BrainBench V1, 240-page rich-prose corpus): full system 49.1% precision@5; +31.4 points vs the same codebase with the graph layer disabled (true ablation, same corpus/embedder/pipeline); +38 points vs the vector-RAG baseline; recall@5 ~98%. Graph-only retrieval drops returned document payload by 53% while doubling set precision — Tan’s “density imperative.”
Determinism is a routing decision, not a fallback. In a real ingest task under ~19 concurrent cron jobs: a sub-agent route timed out (>10,000 ms gateway timeout, ~3¢/run, 0% success — couldn’t even spawn under load); a deterministic background job (a “Minion”) ran in 753 ms, $0 token cost, 100% success. Rule: deterministic work → Minions; judgment work → sub-agents. “Most of what we call agent work is the first thing pretending to be the second.”
Memory should be a process, not a snapshot. The dream cycle’s 8 phases (Lint, Backlinks, Sync, Synthesize, Extract, Patterns, Embed, Orphans) distill transcripts → reflections → long-term “25-year patterns,” audit citations, re-link orphans, and refresh stale embeddings overnight. The store is reorganized, not just searched: “I wake up and the brain is smarter than when I went to sleep.”
Two named failure modes, two layers. Most “AI agent” failures aren’t intelligence failures — they’re amnesia (no persistent world model → fix with the knowledge-graph world-knowledge layer) and mis-routing (deterministic work sent through a reasoning model → fix with the Minion operational-logic layer). “Separate world knowledge from operational logic. Markdown is durable. Chains aren’t.”
The YC thesis: YC GP Tom Blomfield (founder of Monzo and GoCardless) named the exact code as the prototype for a category: “We need Garry’s GBrain, but for every business in the world.” Company brain is one of four pieces in YC’s Summer 2026 RFS — and the only piece YC built itself.
Where the thesis might break (narrowing). GBrain started personal — one user, one brain. As of 2026-06-28 the repo also ships a company-brain mode with per-login access scoping (“each person gets their own slice”), so the basic multi-tenancy + access-control gap the YC thesis named is now partly addressed in the open-source code itself. The harder pieces — write-conflict resolution and consensus across teams — are still not described in the README and remain the substance of the investment thesis. ^[inferred — the README confirms per-user scoping; “write-conflict/consensus still unsolved” is inferred from its absence in the README]
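
The self-wiring graph and the relationship queries described in the takeaways above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not GBrain's actual code: the `[[Entity]]` link syntax, the cue phrases, the 40-character lookback window, and the names `extract_edges`, `Graph`, `on_page_write`, and `who` are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed conventions (not GBrain's real syntax): entities are written as
# [[wiki links]], and a cue phrase right before a link implies the edge type.
LINK = re.compile(r"\[\[([^\]]+)\]\]")
CUES = [
    (re.compile(r"\bworks at\s*$", re.I), "works-at"),
    (re.compile(r"\binvested in\s*$", re.I), "invested-in"),
    (re.compile(r"\bfounded\s*$", re.I), "founded"),
    (re.compile(r"\badvises\s*$", re.I), "advises"),
    (re.compile(r"\battended\s*$", re.I), "attended"),
]

def extract_edges(page_title, text):
    """Return (source, edge_type, target) triples using regex only, no LLM calls."""
    edges = []
    for match in LINK.finditer(text):
        # Look at the text immediately before the link for a cue phrase.
        before = text[max(0, match.start() - 40):match.start()]
        edge_type = "mentions"  # fallback when no cue phrase matches
        for pattern, name in CUES:
            if pattern.search(before):
                edge_type = name
                break
        edges.append((page_title, edge_type, match.group(1)))
    return edges

class Graph:
    def __init__(self):
        # (edge_type, target) -> set of source pages
        self.incoming = defaultdict(set)

    def on_page_write(self, title, text):
        """Re-wire the graph cheaply on every save."""
        for source, edge_type, target in extract_edges(title, text):
            self.incoming[(edge_type, target)].add(source)

    def who(self, edge_type, target):
        """Answer a relationship question by typed link traversal."""
        return sorted(self.incoming[(edge_type, target)])

graph = Graph()
graph.on_page_write("Alice", "Alice works at [[Acme AI]] and advises [[Beta Labs]].")
graph.on_page_write("Bob", "Bob invested in [[Acme AI]] after he met [[Alice]].")
print(graph.who("works-at", "Acme AI"))
print(graph.who("invested-in", "Acme AI"))
```

A question like “who works at Acme AI?” becomes a dictionary lookup on a typed edge, which is why it does not depend on embedding similarity at all.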

The Benchmark — GBrain Evals (Video 2)

A separate GBrain evals repository holds a deterministic, adversarial, public benchmark suite for personal knowledge agents — strictly separated from any real install so a user’s actual graph is never poisoned with synthetic entities.

Corpus: the Amara Life v1 corpus — ~4 MB of fictionalized text simulating a week in the life of a fictional philosopher of science: 300 Slack messages, overlapping meeting transcripts, dense theoretical discussions requiring exact entity linking across abstract concepts. Built to replicate the messy, contradictory reality of a real digital footprint, unlike clean encyclopedic/corporate benchmarks.
Planted adversarial traps: stale facts (chronologically obsolete but lexically close — to expose pure vector similarity), paraphrased semantic-poison injections, and direct logical contradictions — forcing the system to weigh chronological evidence rather than hallucinate a merged answer.
Four baseline paradigms tested: (1) grep-only (exact lexical matching, lowest tier); (2) vector RAG (industry standard — solves vocabulary gaps, fails precise temporal/relational queries); (3) hybrid = “GBrain without the graph” (vector + lexical merged via reciprocal rank fusion, but blind to ontology); (4) full GBrain stack (typed entity links extracted at ingest). The full stack is compared against its own graph-disabled twin to isolate the delta from relational mapping alone.
LLM-as-judge under a structured evidence contract: the judge is barred from returning a boolean pass/fail; it must extract the exact source evidence and bind its ruling to hidden ground-truth metadata arrays in the corpus — eliminating false positives where a model hallucinates correct trivia from its own pretraining.
The ranking “density imperative”: models read only the top-K results, so GBrain forces exact, relationally-typed graph hits to the absolute top of the array, then backfills the remaining context window with looser vector/grep results — returning drastically fewer but flawlessly structured documents.
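
To make the reported metrics concrete, here is a small sketch of precision@5 and recall@5 using their conventional definitions. The source does not spell out BrainBench's exact formulas, so treat these as the standard textbook versions; the toy query and the function names are hypothetical. The last lines derive the baseline scores implied by the reported deltas, assuming the “points” are absolute percentage points.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k returned documents that are relevant."""
    top = ranked[:k]
    return sum(1 for doc in top if doc in relevant) / k

def recall_at_k(ranked, relevant, k=5):
    """Fraction of all relevant documents that appear in the top-k."""
    top = set(ranked[:k])
    return sum(1 for doc in relevant if doc in top) / len(relevant)

# Toy query: two relevant pages exist, both are retrieved, three slots are noise.
ranked = ["a", "x", "b", "y", "z"]
relevant = {"a", "b"}
p5 = precision_at_k(ranked, relevant)
r5 = recall_at_k(ranked, relevant)
print(p5, r5)

# Baselines implied by the reported numbers (derived here by subtraction,
# not quoted from the source).
full_p5 = 49.1
implied_graph_disabled = round(full_p5 - 31.4, 1)
implied_vector_rag = round(full_p5 - 38.0, 1)
print(implied_graph_disabled, implied_vector_rag)
```

The toy query scores 0.4 precision with 1.0 recall, which shows how a system can report near-complete recall@5 alongside a much lower precision@5 when each query has only a few relevant pages.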

The YC Thesis — Every Company Needs a Brain (Video 1)

When YC’s CEO ships open-source code and a YC partner names that exact code as the prototype for a fundable category, “that isn’t coincidence — that’s a thesis.” ^[inferred — framing is the explainer’s, attributed to the YC RFS]
Three market shifts if Blomfield is right: (1) vector databases compress as a category, with knowledge graphs as the default substrate; (2) chain-of-LLM-call orchestration frameworks become legacy infrastructure — “the framework that makes the agent dumber wins”; (3) sub-agent spawning gets a Minions filter in front of it, dropping token bills an estimated 50–80% on workloads that are mostly queued shell work pretending to be reasoning.
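
The “Minions filter in front of sub-agent spawning” can be pictured as a tiny router. The sketch below is an assumption-laden illustration, not GBrain's implementation: `Task`, `MINIONS`, `normalize_transcript`, `fake_sub_agent`, and `route` are hypothetical names, and the sub-agent is a stub standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class Task:
    name: str
    payload: str
    deterministic: bool  # declared by whoever defines the task, not inferred by a model

def normalize_transcript(payload: str) -> str:
    """A hypothetical Minion: plain code, no model call, no token cost."""
    return " ".join(payload.split()).lower()

# Registry of deterministic background jobs, keyed by task name.
MINIONS: Dict[str, Callable[[str], str]] = {
    "normalize_transcript": normalize_transcript,
}

def fake_sub_agent(task: Task) -> str:
    """Stand-in for a reasoning model call; a real one would cost tokens."""
    return f"[model judgment on: {task.payload}]"

def route(task: Task, sub_agent: Callable[[Task], str] = fake_sub_agent) -> Tuple[str, str]:
    """Send deterministic work to a Minion; everything else goes to the sub-agent."""
    handler = MINIONS.get(task.name) if task.deterministic else None
    if handler is not None:
        return "minion", handler(task.payload)
    return "sub-agent", sub_agent(task)

print(route(Task("normalize_transcript", "  Hello   WORLD ", True)))
print(route(Task("assess_pitch", "Is this deck fundable?", False)))
```

The design choice worth noting is that the `deterministic` flag is set explicitly by the task author, so the routing decision itself never requires a model call.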

Implementation

Repo: github.com/garrytan/gbrain (MIT; TypeScript ~97%). README tagline (2026-06-28): “Search gives you raw pages. GBrain gives you the answer.” A separate gbrain-evals repo holds the public BrainBench benchmark scorecards. (Confirmed from the live repo; the source videos did not state the URL/license.)
Stack / Setup: markdown files on disk (source of truth) → Postgres + pgvector retrieval (or embedded “PGLite”) → markdown skill files (29 at the time of the source videos; the live repo reports 43) → a resolver intent-router (no orchestrator/planner/chain) → an 8-phase dream cycle on a cron schedule. Pluggable storage engines scale the same architecture from an in-memory test sandbox to persistent cloud without code rewrites.
Retrieval architecture: hybrid (vector + lexical/grep merged with reciprocal rank fusion) plus a self-wiring typed knowledge graph. On every write, regex extracts entities and infers typed edges (zero LLM calls). Ranking forces exact typed graph hits to the top, then backfills with looser semantic/grep hits.
Memory architecture: the dream cycle (Lint, Backlinks, Sync, Synthesize, Extract, Patterns, Embed, Orphans) runs overnight — distilling transcripts → reflections → long-term “25-year patterns,” auditing citations, re-linking orphans, refreshing stale embeddings. Memory is reorganized, not merely appended.
Precision results: 49.1% P@5 (240-page BrainBench V1); +31.4 pts vs the graph-disabled twin; +38 pts vs vector-RAG baseline; recall@5 ~98%; graph-only retrieval cuts payload 53% while doubling set precision.
Operational routing: deterministic jobs → Minions (753 ms, $0 tokens, 100% under ~19-cron load); judgment jobs → sub-agents (route timed out >10 s, ~3¢, 0% under load).
Production skill example: a “CEO review mode” skill that, when a user submits an engineering architecture, pushes back on unstated assumptions and locks down architectural drift by referencing complete historical data — contrasted against passive chatbots that “suffer from sycophancy and lose focus over long contexts.”
Cost: open source (MIT), self-hosted; graph extraction is regex (no per-write inference cost).
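
The retrieval architecture above combines two ideas: reciprocal rank fusion (RRF) of vector and lexical rankings, and a ranking step that puts exact typed graph hits first and backfills with the fused results. A minimal sketch follows. The constant `k=60` is the value commonly used for RRF, not one confirmed for GBrain, and the function names and page IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists: each document scores sum(1 / (k + rank))."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

def rank_graph_first(graph_hits, vector_hits, lexical_hits, top_k=5):
    """Place typed graph hits at the top, then backfill with fused hybrid results."""
    fused = reciprocal_rank_fusion([vector_hits, lexical_hits])
    ordered = list(dict.fromkeys(graph_hits))  # de-duplicate, keep order
    ordered += [doc for doc in fused if doc not in ordered]
    return ordered[:top_k]

vector_hits = ["p3", "p1", "p7"]
lexical_hits = ["p1", "p9", "p3"]
graph_hits = ["p7"]

fused = reciprocal_rank_fusion([vector_hits, lexical_hits])
final = rank_graph_first(graph_hits, vector_hits, lexical_hits)
print(fused)
print(final)
```

In this toy case the hybrid ranking alone places `p7` last, while the graph-first ordering moves it to the top, which is the behavior the “density imperative” describes.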

Related

matt-wolfe-ai-second-brain — another personal “second brain” (Codex automations over a vault); GBrain is the knowledge-graph-backed cousin.
agentwikis-platform — hosted multi-wiki service; closest analog to the unsolved company/multi-tenant “brain” layer YC wants funded.
synthadoc — a self-maintaining vault with lint/ingest agents; GBrain’s dream cycle is the same “memory is a process” idea on a cron.
build-llm-wiki-for-business-walkthrough — building an LLM-maintained wiki for a business; directly parallels the “every company needs a brain” pitch.
karpathy-pattern-third-party-adoption — GBrain is a high-profile (YC CEO) adoption of the markdown-source-of-truth pattern.
garrytan-gstack — Garry Tan’s other open-source release (his 23-tool Claude Code skill pack); same author, the operational sibling to GBrain’s memory.
qmd-hybrid-search — BM25 + vector + LLM-rerank over markdown (the engine this vault runs); a direct point of comparison to GBrain’s hybrid retriever — minus the typed graph.
Wiring Gbrain into Hermes Agent as an MCP Server — the concrete Hermes MCP-integration how-to this architecture article doesn’t cover.
Dmitry Shapiro’s Multi-Agent Hermes Architecture — a real production deployment running GBrain as the CRM/relationship layer behind three coordinated Hermes agents.

Try It

Study the structure, not just the repo: markdown source-of-truth + flat skill files + a resolver, instead of a chain or planner. The bet is auditability — instructions you can read and version.
Add a regex entity-extraction pass on write to your own vault to build a typed knowledge graph alongside existing vectors, then rank exact typed hits above semantic ones. ^[inferred application]
Audit your agent stack for “deterministic work pretending to be judgment work.” Move queued/shell steps to deterministic background jobs (the Minion pattern) before reaching for another model.
Run an overnight “dream cycle” over your knowledge base — lint, backlink, synthesize reflections, re-embed stale vectors, re-link orphans — so the store appreciates rather than ages.
Builders: the company-brain layer is YC’s explicitly named RFS opportunity. Per-login access scoping now ships in the repo, but write-conflict resolution and team consensus are not described in the README and appear to remain open.
What to ingest first (operator checklist, @shannholmberg, 2026-06-25): for a company GBrain, prioritize in this order — (1) source-of-truth docs (positioning, pricing, ICP, brand rules, FAQs); (2) workflows/SOPs (so agent specialists reference existing process); (3) taste examples (“what good looks like” + a note on why); (4) decision logs (what changed, why, who decided, rejected alternatives — stops the agent relitigating settled calls); (5) ownership map (workflow/decision/approval owners, SMEs, escalation path); (6) customer & market context (calls, tickets, churn reasons, objections, competitor/win-loss notes); (7) permissions & boundaries (read/write/publish/send/approval scopes); (8) feedback loops (human edits, QA notes, what changed after review). ^[inferred — single-operator checklist] Mirrors the same Capture/Source-Truth/Permissions/Feedback layering in Eric Siu’s “Company Brain”.
Compare to what this vault already runs: qmd-hybrid-search gives you hybrid retrieval today; GBrain’s distinctive add is the self-wiring typed graph on top.
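
For the overnight “dream cycle” suggestion above, a starting point might look like the sketch below. It is an assumption-heavy illustration rather than GBrain's pipeline: the vault is an in-memory dict, the `[[Page]]` link syntax is assumed, only the Backlinks and Orphans phases are implemented, and running the eight phases in the order the source lists them is itself an assumption.

```python
import re

LINK = re.compile(r"\[\[([^\]]+)\]\]")  # assumed wiki-link syntax

# Phase names from the source, in the order it lists them.
PHASES = ["lint", "backlinks", "sync", "synthesize",
          "extract", "patterns", "embed", "orphans"]

def backlinks(vault):
    """Map each page title to the pages that link to it."""
    index = {title: [] for title in vault}
    for title, text in vault.items():
        for target in LINK.findall(text):
            if target in index and title not in index[target]:
                index[target].append(title)
    return index

def orphans(vault):
    """Pages that no other page links to, candidates for re-linking."""
    return sorted(title for title, sources in backlinks(vault).items() if not sources)

def dream_cycle(vault, handlers):
    """Run every phase in order; phases without a handler are reported as skipped."""
    report = {}
    for phase in PHASES:
        handler = handlers.get(phase)
        report[phase] = handler(vault) if handler else "skipped"
    return report

vault = {
    "Alice": "Works with [[Bob]].",
    "Bob": "See [[Alice]].",
    "Notes": "Loose thoughts with no inbound links.",
}
report = dream_cycle(vault, {"backlinks": backlinks, "orphans": orphans})
print(report["backlinks"])
print(report["orphans"])
```

Scheduling `dream_cycle` from a nightly cron job and filling in the remaining handlers one at a time keeps each phase independently testable.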

Open Questions

~~Exact GitHub URL / owner-path~~ Resolved (2026-06-28): github.com/garrytan/gbrain; the benchmark repo is gbrain-evals.
~~License~~ Resolved (2026-06-28): MIT (matches Tan’s gstack).
~~“PGLite” naming~~ Resolved (2026-06-28): the README describes PGLite as “Postgres 17 via WASM” — i.e. the existing PGlite WASM-Postgres approach, used as the zero-config default.
Independent verification — all benchmark numbers (49.1% P@5, +31.4/+38 pts, ~98% recall@5) are Tan’s own (now in the repo README + gbrain-evals), relayed via third-party explainers; not independently reproduced.
Reconciliation with this wiki’s stance — the Karpathy LLM-wiki pattern historically favored grep + index files over vector DBs; GBrain argues the next step is a typed knowledge graph on top of vectors. Worth tracking as the pattern evolves.

Jonathon's AI Wiki
