Source: raw/Garry_Tan_open-sourced_his_AI_brain._YC_says_every_company_will_need_one..md (youtube.com/watch?v=OssShK7zzVs — strategy / YC framing) + raw/GBrain_Github_Architecture_Breakdown_-_AI_Retrieval_Memory_and_Precision.md (youtube.com/watch?v=xFpeMUDsgF4 — technical architecture & benchmark)
GBrain is Garry Tan’s (President & CEO of Y Combinator) open-sourced personal AI brain — a memory layer that gives a “smart but forgetful” agent persistent, structured recall. Markdown files on disk are the source of truth; a self-wiring knowledge graph and a hybrid vector/lexical retriever sit on top; a nightly “dream cycle” reorganizes the store while the user sleeps. The headline claim: a graph that wires itself with regex (zero LLM calls) on every page write beats vector RAG by 31.4 points of precision@5 on Tan’s own benchmark. YC then named the company version of this — “Garry’s GBrain, but for every business in the world” — an investable category in its Summer 2026 Request for Startups.
The two source videos are complementary, not redundant:
- Video 1 (OssShK7zzVs) — an “Architect’s Lens” autopsy: the YC / Request-for-Startups framing, the four load-bearing design claims, the Minion vs sub-agent routing test, the dream cycle, and the market implications.
- Video 2 (xFpeMUDsgF4) — the technical architecture and the adversarial evals repo: the four baseline retrieval paradigms, the chaotic-corpus benchmark, LLM-as-judge, the ranking “density imperative,” and the precise numbers.
Key Takeaways
- What GBrain is, in plain terms: markdown files on disk (source of truth) + a retrieval layer (Postgres with pgvector, or an embedded option Tan calls “PGLite”) + 29 markdown skill files + a nightly dream cycle. Tan’s published production stats: 17,888 pages, 4,383 people, 723 companies, 21 cron jobs, built in 12 days — his own real data, not a demo.
- “Skill files are code. The runtime is dumb.” Intelligence lives in markdown a human can read, version, and audit. A single file (the resolver) routes intent to the right skill — there is no orchestrator, no planner, no chain. Tan’s line: “If you can’t read your agent’s instructions, you don’t own them.” Transferable principle: plain text outlives code.
- The graph wires itself: regex over LLM. On every page write, GBrain extracts entity references with regex and infers typed edges (attended, works-at, invested-in, founded, advises) from the page’s role and surrounding text. Zero language-model calls, so the graph builds cheaply on every save. Transferable principle: the graph wires itself; regex over language model where it can be.
- Knowledge graphs beat vector retrieval for relationship questions. Queries like “who works at Acme AI?” or “what has Bob invested in this quarter?” are relationship questions; cosine similarity over text embeddings can’t reach them, but a typed link traversal can.
- The benchmark numbers (BrainBench V1, 240-page rich-prose corpus): full system 49.1% precision@5; +31.4 points vs the same codebase with the graph layer disabled (true ablation, same corpus/embedder/pipeline); +38 points vs the vector-RAG baseline; recall@5 ~98%. Graph-only retrieval drops returned document payload by 53% while doubling set precision — Tan’s “density imperative.”
- Determinism is a routing decision, not a fallback. In a real ingest task under ~19 concurrent cron jobs: a sub-agent route timed out (>10,000 ms gateway timeout, ~3¢/run, 0% success — couldn’t even spawn under load); a deterministic background job (a “Minion”) ran in 753 ms, $0 token cost, 100% success. Rule: deterministic work → Minions; judgment work → sub-agents. “Most of what we call agent work is the first thing pretending to be the second.”
- Memory should be a process, not a snapshot. The dream cycle’s 8 phases (Lint, Backlinks, Sync, Synthesize, Extract, Patterns, Embed, Orphans) distill transcripts → reflections → long-term “25-year patterns,” audit citations, re-link orphans, and refresh stale embeddings overnight. The store is reorganized, not just searched: “I wake up and the brain is smarter than when I went to sleep.”
- Two named failure modes, two layers. Most “AI agent” failures aren’t intelligence failures — they’re amnesia (no persistent world model → fix with the knowledge-graph world-knowledge layer) and mis-routing (deterministic work sent through a reasoning model → fix with the Minion operational-logic layer). “Separate world knowledge from operational logic. Markdown is durable. Chains aren’t.”
- The YC thesis: YC GP Tom Blomfield (founder of Monzo and GoCardless) named the exact code as the prototype for a category: “We need Garry’s GBrain, but for every business in the world.” Company brain is one of four pieces in YC’s Summer 2026 Request for Startups (RFS), and the only piece YC built itself.
- Where the thesis might break: GBrain is personal. One user with one brain is the solved case. The company version (multi-tenancy, access control, write-conflict resolution, consensus across teams) has not been solved publicly, and that gap is the entire investment thesis.
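The self-wiring step and the relationship query from the takeaways above can be sketched in a few lines. This is a minimal illustration, not GBrain's code: the `[[Entity]]` link syntax, the cue patterns, the `mentions` fallback edge, and the names `extract_edges` and `who` are all assumptions. Only the edge type names come from the source.

```python
import re
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    kind: str
    target: str


# Edge names are from the source; the cue patterns are hypothetical.
CUES = [
    ("works-at", re.compile(r"\b(?:works|working) at\b", re.I)),
    ("invested-in", re.compile(r"\binvest(?:ed|s|ing)? in\b", re.I)),
    ("founded", re.compile(r"\b(?:co-)?founded\b", re.I)),
    ("advises", re.compile(r"\badvis(?:es|ed|ing)\b", re.I)),
    ("attended", re.compile(r"\battend(?:ed|s|ing)\b", re.I)),
]
LINK = re.compile(r"\[\[([^\]|]+)\]\]")  # assumed [[Entity]] link syntax


def extract_edges(page_title: str, text: str, window: int = 40) -> list[Edge]:
    """Infer one typed edge per link from the text just before it (no LLM calls)."""
    edges = []
    for link in LINK.finditer(text):
        context = text[max(0, link.start() - window): link.start()]
        kind, best_end = "mentions", -1  # fallback when no cue matches
        for name, cue in CUES:
            for hit in cue.finditer(context):
                if hit.end() > best_end:  # the cue closest to the link wins
                    kind, best_end = name, hit.end()
        edges.append(Edge(page_title, kind, link.group(1).strip()))
    return edges


def who(edges: list[Edge], kind: str, target: str) -> list[str]:
    """Answer a relationship question by traversing typed edges."""
    return sorted({e.source for e in edges if e.kind == kind and e.target == target})
```

Calling `extract_edges("Bob", "Bob works at [[Acme AI]] and invested in [[Widget Co]].")` yields a `works-at` edge to Acme AI and an `invested-in` edge to Widget Co, after which `who(edges, "works-at", "Acme AI")` is a plain filter rather than a similarity search.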
The Benchmark — GBrain Evals (Video 2)
A separate GBrain evals repository holds a deterministic, adversarial, public benchmark suite for personal knowledge agents — strictly separated from any real install so a user’s actual graph is never poisoned with synthetic entities.
- Corpus: the Amara Life v1 corpus — ~4 MB of fictionalized text simulating a week in the life of a fictional philosopher of science: 300 Slack messages, overlapping meeting transcripts, dense theoretical discussions requiring exact entity linking across abstract concepts. Built to replicate the messy, contradictory reality of a real digital footprint, unlike clean encyclopedic/corporate benchmarks.
- Planted adversarial traps: stale facts (chronologically obsolete but lexically close — to expose pure vector similarity), paraphrased semantic-poison injections, and direct logical contradictions — forcing the system to weigh chronological evidence rather than hallucinate a merged answer.
- Four baseline paradigms tested: (1) grep-only (exact lexical matching, lowest tier); (2) vector RAG (industry standard — solves vocabulary gaps, fails precise temporal/relational queries); (3) hybrid = “GBrain without the graph” (vector + lexical merged via reciprocal rank fusion, but blind to ontology); (4) full GBrain stack (typed entity links extracted at ingest). The full stack is compared against its own graph-disabled twin to isolate the delta from relational mapping alone.
- LLM-as-judge under a structured evidence contract: the judge may not return a bare pass/fail verdict. It must extract the exact source evidence and bind its ruling to hidden ground-truth metadata arrays in the corpus, which eliminates false positives where a model hallucinates correct trivia from its own pretraining.
- The ranking “density imperative”: models read only the top-K results, so GBrain forces exact, relationally-typed graph hits to the absolute top of the array, then backfills the remaining context window with looser vector/grep results — returning drastically fewer but flawlessly structured documents.
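The hybrid baseline and the graph-first ordering described above can be sketched as follows. Reciprocal rank fusion is the standard formula (each list contributes `1 / (k + rank)`, with `k = 60` as the conventional default). The function names, the tie-break by document id, and the conventional precision@k definition are assumptions for illustration; the source does not give GBrain's exact scoring code.

```python
from collections import defaultdict


def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: sum 1 / (k + rank) across all input rankings."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def graph_first(graph_hits: list[str], vector_hits: list[str],
                lexical_hits: list[str], top_k: int = 5) -> list[str]:
    """Put exact typed-graph hits on top, then backfill with fused hybrid hits."""
    ordered = list(dict.fromkeys(graph_hits))  # dedupe, keep graph order
    ordered += [d for d in rrf([vector_hits, lexical_hits]) if d not in ordered]
    return ordered[:top_k]


def precision_at_k(returned: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the top-k slots filled with relevant documents."""
    return sum(doc in relevant for doc in returned[:k]) / k
```

Because the model reads only the top of the array, the ordering in `graph_first` matters more than the total number of candidates: a single exact graph hit displaces a looser semantic match from the first slot.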
The YC Thesis — Every Company Needs a Brain (Video 1)
- When YC’s CEO ships open-source code and a YC partner names that exact code as the prototype for a fundable category, “that isn’t coincidence — that’s a thesis.” ^[inferred — framing is the explainer’s, attributed to the YC RFS]
- Three market shifts if Blomfield is right: (1) vector databases compress as a category, with knowledge graphs as the default substrate; (2) chain-of-LLM-call orchestration frameworks become legacy infrastructure — “the framework that makes the agent dumber wins”; (3) sub-agent spawning gets a Minions filter in front of it, dropping token bills an estimated 50–80% on workloads that are mostly queued shell work pretending to be reasoning.
Implementation
- Repo: GBrain (open-sourced by Garry Tan; styled “G Brain” in the repo’s opening line — “Your AI agent is smart but forgetful. GBrain gives it a brain.”). A separate GBrain evals repo holds the public benchmark. The explicit GitHub URL / owner-path is not stated in either transcript. ^[inferred — only the project name is given in-source]
- Stack / Setup: markdown files on disk (source of truth) → Postgres + pgvector retrieval (or embedded “PGLite”) → 29 markdown skill files → a resolver intent-router (no orchestrator/planner/chain) → an 8-phase dream cycle on a cron schedule. Pluggable storage engines scale the same architecture from an in-memory test sandbox to persistent cloud without code rewrites.
- Retrieval architecture: hybrid (vector + lexical/grep merged with reciprocal rank fusion) plus a self-wiring typed knowledge graph. On every write, regex extracts entities and infers typed edges (zero LLM calls). Ranking forces exact typed graph hits to the top, then backfills with looser semantic/grep hits.
- Memory architecture: the dream cycle (Lint, Backlinks, Sync, Synthesize, Extract, Patterns, Embed, Orphans) runs overnight — distilling transcripts → reflections → long-term “25-year patterns,” auditing citations, re-linking orphans, refreshing stale embeddings. Memory is reorganized, not merely appended.
- Precision results: 49.1% P@5 (240-page BrainBench V1); +31.4 pts vs the graph-disabled twin; +38 pts vs vector-RAG baseline; recall@5 ~98%; graph-only retrieval cuts payload 53% while doubling set precision.
- Operational routing: deterministic jobs → Minions (753 ms, $0 tokens, 100% success under ~19 concurrent cron jobs); judgment jobs → sub-agents (in the same test the sub-agent route timed out at >10 s, ~3¢/run, 0% success under load).
- Production skill example: a “CEO review mode” skill that, when a user submits an engineering architecture, pushes back on unstated assumptions and locks down architectural drift by referencing complete historical data — contrasted against passive chatbots that “suffer from sycophancy and lose focus over long contexts.”
- Cost: open source, self-hosted; graph extraction is regex (no per-write inference cost). License not stated in either source. ^[inferred]
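The operational routing rule above (deterministic work to Minions, judgment work to sub-agents) amounts to a registry lookup placed in front of the model call. A minimal sketch, assuming hypothetical names throughout (`Task`, `MINIONS`, `minion`, `route`, and the `ingest_page` task kind); the source does not describe GBrain's actual job API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    kind: str
    payload: dict


# Deterministic handlers ("Minions"): plain functions, no model calls.
MINIONS: dict[str, Callable[[dict], str]] = {}


def minion(kind: str):
    """Register a deterministic handler for one task kind."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        MINIONS[kind] = fn
        return fn
    return register


@minion("ingest_page")
def ingest_page(payload: dict) -> str:
    # Stand-in for real deterministic work (parse, write, index).
    return f"ingested {payload['path']}"


def route(task: Task, sub_agent: Callable[[Task], str]) -> tuple[str, str]:
    """Use a Minion when one is registered; spend tokens only otherwise."""
    handler = MINIONS.get(task.kind)
    if handler is not None:
        return ("minion", handler(task.payload))
    return ("sub-agent", sub_agent(task))
```

The design choice is that the sub-agent is the fallback, not the default: a task reaches a reasoning model only when nobody has written a deterministic handler for its kind.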
Related
- matt-wolfe-ai-second-brain — another personal “second brain” (Codex automations over a vault); GBrain is the knowledge-graph-backed cousin.
- agentwikis-platform — hosted multi-wiki service; closest analog to the unsolved company/multi-tenant “brain” layer YC wants funded.
- synthadoc — a self-maintaining vault with lint/ingest agents; GBrain’s dream cycle is the same “memory is a process” idea on a cron.
- build-llm-wiki-for-business-walkthrough — building an LLM-maintained wiki for a business; directly parallels the “every company needs a brain” pitch.
- karpathy-pattern-third-party-adoption — GBrain is a high-profile (YC CEO) adoption of the markdown-source-of-truth pattern.
- garrytan-gstack — Garry Tan’s other open-source release (his 23-tool Claude Code skill pack); same author, the operational sibling to GBrain’s memory.
- qmd-hybrid-search — BM25 + vector + LLM-rerank over markdown (the engine this vault runs); a direct point of comparison to GBrain’s hybrid retriever — minus the typed graph.
Try It
- Study the structure, not just the repo: markdown source-of-truth + flat skill files + a resolver, instead of a chain or planner. The bet is auditability — instructions you can read and version.
- Add a regex entity-extraction pass on write to your own vault to build a typed knowledge graph alongside existing vectors, then rank exact typed hits above semantic ones. ^[inferred application]
- Audit your agent stack for “deterministic work pretending to be judgment work.” Move queued/shell steps to deterministic background jobs (the Minion pattern) before reaching for another model.
- Run an overnight “dream cycle” over your knowledge base — lint, backlink, synthesize reflections, re-embed stale vectors, re-link orphans — so the store appreciates rather than ages.
- Builders: the company-brain layer (multi-tenancy, ACLs, write-conflict resolution, team consensus) is YC’s explicitly named, unsolved RFS opportunity.
- Compare to what this vault already runs: qmd-hybrid-search gives you hybrid retrieval today; GBrain’s distinctive add is the self-wiring typed graph on top.
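For the overnight cycle suggested above, a runner can be as small as an ordered list of phase names plus whichever handlers you have written so far. The sketch below uses the eight phase names from the source and implements only backlink and orphan detection. The `[[Entity]]` link syntax, the in-memory `Vault` dict, the function names, and the assumption that phases run in the listed order are all illustrative, not GBrain's implementation.

```python
import re
from typing import Callable

# Phase names as listed in the source; execution order is assumed.
PHASES = ["Lint", "Backlinks", "Sync", "Synthesize",
          "Extract", "Patterns", "Embed", "Orphans"]

LINK = re.compile(r"\[\[([^\]|]+)\]\]")  # assumed [[Entity]] link syntax

Vault = dict[str, str]  # page title -> markdown body


def backlinks(vault: Vault) -> dict[str, set[str]]:
    """Map each page to the set of pages that link to it."""
    inbound: dict[str, set[str]] = {title: set() for title in vault}
    for title, body in vault.items():
        for target in LINK.findall(body):
            if target in inbound and target != title:
                inbound[target].add(title)
    return inbound


def orphans(vault: Vault) -> list[str]:
    """Pages with no inbound links: candidates for re-linking."""
    return sorted(t for t, sources in backlinks(vault).items() if not sources)


def run_cycle(vault: Vault, handlers: dict[str, Callable[[Vault], object]]) -> dict[str, object]:
    """Run every phase in order; phases without a handler are marked skipped."""
    report: dict[str, object] = {}
    for phase in PHASES:
        handler = handlers.get(phase)
        report[phase] = handler(vault) if handler else "skipped"
    return report
```

Scheduling `run_cycle` from a nightly cron job, starting with just the `Backlinks` and `Orphans` handlers, gives a working skeleton to which lint, synthesis, and re-embedding steps can be added one phase at a time.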
Open Questions
- Exact GitHub URL / owner-path for GBrain and GBrain-evals — not stated in either transcript.
- License is unconfirmed (Tan’s gstack is MIT; GBrain’s license isn’t stated).
- “PGLite” naming — whether this is the existing PGlite (WASM Postgres) project or a GBrain-specific variant is unclear from the source.
- Independent verification — all benchmark numbers (49.1% P@5, +31.4/+38 pts, ~98% recall@5) are Tan’s own, relayed via third-party explainers; not independently reproduced.
- Reconciliation with this wiki’s stance — the Karpathy LLM-wiki pattern historically favored grep + index files over vector DBs; GBrain argues the next step is a typed knowledge graph on top of vectors. Worth tracking as the pattern evolves. ^[inferred]