Source: raw/gh-star-rohitg00-agentmemory.md (gh-stars puller stub) + ai-research/rohitg00-agentmemory-readme-2026-05-30.md (WebFetch-derived README summary, fetched 2026-05-30) + raw/The_5-Tool_Fix_for_Claude_Code_s_Worst_Habits.md (operator demo, video hqcZZuvBUSY)
agentmemory (github.com/rohitg00/agentmemory, Apache-2.0, TypeScript, ~19.8K stars) is a persistent-memory server for AI coding agents. It listens to your agent sessions in the background, compresses what happened into searchable memory, and injects the relevant pieces back into future sessions — so you stop re-explaining your project’s architecture, conventions, and past decisions every time you start fresh. It clears the wiki’s repo bar on the substance that usually trips up high-star projects: Apache-2.0 license, 950+ passing tests, a reproducible benchmark harness, and 46 releases of maintenance history.
Key Takeaways
- What it replaces: the CLAUDE.md + rules/ folder pattern, but as a living layer. Instead of hand-maintaining static memory files, agentmemory accumulates and decays memory automatically from session activity.
- Four memory tiers (the same shape Hermes and OpenClaw use, productized as a drop-in server): working (raw tool-use observations), episodic (compressed session summaries), semantic (extracted facts/patterns about your project), procedural (workflows + decision patterns). Memories decay: infrequently used ones lose retrieval priority, so stale context (e.g. a framework quirk fixed in a newer model) auto-evicts.
- Retrieval is hybrid, not just vector. Three indexes (BM25 keyword, vector similarity, and knowledge-graph traversal) are fused via Reciprocal Rank Fusion, then the top results are injected at SessionStart. This is the same RRF + multi-stream pattern the wiki’s stronger memory toolkits converge on.
- Pipeline: Capture (hooks intercept tool events, filter secrets) → Compress (LLM summarization, or synthetic BM25 compression with no LLM) → Index (3 streams) → Retrieve (RRF + SessionStart injection).
- Benchmarks (project’s own, reproducible): 95.2% R@5 on LongMemEval-S (ICLR 2025, 500 questions) vs grep’s 86.2%; ~170K tokens/year vs ~650K for LLM-summarized competitors; 2.2× precision over grep; 14 ms p50. Harness in eval/README.md, scorecards in docs/benchmarks/. ^[inferred: the “#1 persistent memory” claim rests on selected metrics; competitors lead on some dimensions, per the repo’s own honesty note]
- No infra tax. Built on an in-house iii function-trigger-worker primitive that replaces Express/SQLite/pm2/Prometheus; only dependency is SQLite + the iii engine. Deployment templates for Fly.io, Railway, Render, Coolify.
- Works across 15+ agents via MCP, native plugins, or REST API; Claude Code, Codex, Cursor, and Copilot are among the named topics. Real-time viewer at localhost:3113; server on 3111.
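To make the hybrid-retrieval bullet concrete, here is a minimal sketch of Reciprocal Rank Fusion over three ranked lists. It is not agentmemory's code: the function name, the memory IDs, and the smoothing constant k=60 (a value commonly used for RRF) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several best-first ranked lists of memory IDs into one ranking.

    Each list contributes 1 / (k + rank) for every ID it contains.
    k=60 is a commonly used default, not a value confirmed from agentmemory.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [memory_id for memory_id, _ in fused[:top_n]]


# Hypothetical hits from the three indexes (BM25, vector, knowledge graph).
bm25_hits = ["m3", "m1", "m7"]
vector_hits = ["m1", "m4", "m3"]
graph_hits = ["m1", "m9"]

top_memories = reciprocal_rank_fusion([bm25_hits, vector_hits, graph_hits], top_n=3)
print(top_memories)
```

Because RRF uses only rank positions, the three indexes do not need comparable score scales, which is one reason the pattern suits mixing keyword, vector, and graph results.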
Why it matters
The wiki has tracked the “memory is the agent bottleneck” thesis from several angles: Memory & Dreaming, Managed Agents memory stores, and the build-your-own memory architecture comparison. agentmemory is the off-the-shelf, benchmarked, multi-agent answer in that space: you don’t design a memory stack, you npm install one and connect it. It is architecturally a sibling to Reflexio and AutoAgent (harness extracts a reusable artifact from past runs) and to Hermes MemoryKit (RRF router + tiered memory). Unlike those early/solo repos, it reports results on a peer-reviewed benchmark (LongMemEval-S) and ships 950+ tests, which is why it clears the strict repo bar where similar high-star projects get deferred.
Schema discipline — the structure counterpoint (Graphiti / Zep)
agentmemory’s lever is decay — solving what to forget. A complementary school argues the harder problem is what to never store, and that the fix is structure, not retrieval. ^[source: @akshay_pachaar thread, raw/x-bookmarks-recent-digest-2026-05-31.md, promoting Zep AI’s open-source Graphiti] The failure mode it targets: the default pipeline hands an LLM raw text and lets it pick entity types, labels, and relationships on its own — so the knowledge graph “behaves like an expensive vector store,” entity types collapse into generic labels, and every relationship flattens into a single RELATES_TO. The graph holds the data but no query reaches it with precision.
The proposed fix is to constrain the output space before generation, not after:
- Entities define what the agent is allowed to remember — Pydantic models with typed fields + descriptive docstrings replace the LLM’s guesswork with domain vocabulary.
- Edges define how things connect: source/target constraints mean an invalid relationship (e.g. Project → Competitor with no such edge in the schema) simply cannot form.
- Temporal resolution separates what was true from what is true: fact resolution invalidates outdated edges while preserving history, so the graph never silently serves stale state. (This is the same hazard agentmemory’s decay addresses, reached from the opposite direction.)
- Heuristic: ~10 entity types, 10 edge types, 10 fields per type — start with 3–4 of each and expand only when retrieval fails. Forces modeling the 80% that matters instead of chasing completeness.
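The three constraints above can be sketched in a few lines. This is an illustration only, not Graphiti's API: Graphiti uses Pydantic models, while this sketch uses standard-library dataclasses so it runs without dependencies. The entity types, relation names, and the Graph class are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical edge schema: the only (source type, relation, target type) triples allowed.
ALLOWED_EDGES = {
    ("Project", "DEPENDS_ON", "Library"),
    ("Project", "OWNED_BY", "Team"),
}


@dataclass
class Entity:
    name: str
    entity_type: str  # e.g. "Project", "Library", "Team"


@dataclass
class Edge:
    source: Entity
    relation: str
    target: Entity
    valid_from: datetime
    invalid_at: Optional[datetime] = None  # None means the fact is still true

    def __post_init__(self):
        # Reject any relationship the schema does not define.
        triple = (self.source.entity_type, self.relation, self.target.entity_type)
        if triple not in ALLOWED_EDGES:
            raise ValueError(f"edge not in schema: {triple}")


class Graph:
    def __init__(self):
        self.edges = []

    def assert_fact(self, edge):
        """Add a fact, closing any still-valid edge with the same source and relation.

        The superseded edge is kept as history; only its invalid_at is set.
        """
        for old in self.edges:
            same_slot = (old.source.name == edge.source.name
                         and old.relation == edge.relation)
            if same_slot and old.invalid_at is None:
                old.invalid_at = edge.valid_from
        self.edges.append(edge)

    def current(self, source_name, relation):
        """Return targets of edges that are true now."""
        return [e.target.name for e in self.edges
                if e.source.name == source_name
                and e.relation == relation
                and e.invalid_at is None]


app = Entity("billing-app", "Project")
graph = Graph()
graph.assert_fact(Edge(app, "OWNED_BY", Entity("platform", "Team"), datetime(2025, 1, 1)))
graph.assert_fact(Edge(app, "OWNED_BY", Entity("payments", "Team"), datetime(2026, 1, 1)))

try:
    Edge(app, "COMPETES_WITH", Entity("acme", "Competitor"), datetime(2026, 1, 1))
    rejected = False
except ValueError:
    rejected = True  # Project -> Competitor is not in the schema, so it cannot form
```

Here the old ownership edge stays in the graph with an end date, so a "who owned this in 2025" query still works while current-state queries see only the new owner. Treating every relation as single-valued is a simplification of this sketch.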
Zep AI’s Graphiti packages this as a fully open-source temporal knowledge-graph library (Pydantic-based ontology, schema-guided extraction at both entity- and fact-extraction points, entity/fact resolution, temporal windowing). It is the named “schema-first” alternative to agentmemory’s “capture-then-decay” approach — worth comparing directly if your memory needs domain-specific query precision rather than general session recall.
Implementation
- Tool/Service: agentmemory, at github.com/rohitg00/agentmemory, npm @agentmemory/agentmemory, homepage agent-memory.dev.
- Setup: npm install -g @agentmemory/agentmemory → run agentmemory (server on :3111) → agentmemory connect claude-code (wires the MCP server). Viewer at localhost:3113.
- Cost: free / open-source (Apache-2.0). LLM-compression mode incurs model cost; the synthetic BM25 compression mode runs with no LLM if you want zero marginal cost.
- Integration notes: MCP, native plugin, or REST API; 15+ agents supported. Memory decay is automatic, with no manual pruning. Secrets are filtered at the capture hook, but treat the local memory store as sensitive (it mirrors your session content).
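One way to picture "automatic decay" is a recency weight applied to each memory's relevance score at retrieval time. The source says only that infrequently used memories lose retrieval priority. The exponential form, the 30-day half-life, and the field names below are assumptions, not agentmemory's actual mechanism.

```python
import math


def retrieval_priority(base_score, days_since_last_use, half_life_days=30.0):
    """Scale a relevance score by an exponential recency decay.

    The half-life is a hypothetical parameter: after that many idle days
    a memory's priority is halved.
    """
    decay = math.pow(0.5, days_since_last_use / half_life_days)
    return base_score * decay


# Hypothetical memories: a strong but long-unused match vs a weaker, recently used one.
memories = [
    {"id": "framework-quirk", "score": 0.9, "idle_days": 120},
    {"id": "db-naming-convention", "score": 0.7, "idle_days": 3},
]

ranked = sorted(
    memories,
    key=lambda m: retrieval_priority(m["score"], m["idle_days"]),
    reverse=True,
)
print([m["id"] for m in ranked])
```

Under these assumptions the stale framework quirk falls below the recently used convention despite its higher raw score, which is the eviction behavior the key takeaways describe.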
Try It
- Install and connect to one project’s Claude Code: npm i -g @agentmemory/agentmemory, agentmemory, then agentmemory connect claude-code. Run a normal session and watch the localhost:3113 viewer populate.
- Run the same recurring task across two fresh sessions (e.g. “add a setting + persist it”) and check whether the second session skips the re-discovery work the first did. That is the value test.
- If you want zero marginal cost, configure the BM25 (no-LLM) compression mode and compare retrieval quality against the LLM mode on your own repo.
- Vet before depending on it: clone the eval/ harness and reproduce a slice of the LongMemEval-S numbers rather than taking the “#1” claim at face value.
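When comparing modes or reading the scorecards, it helps to be precise about what R@5 measures. The sketch below assumes the "any relevant item appears in the top k" definition, averaged over questions. The project's harness may define recall differently (for example, as the fraction of relevant items retrieved), so check eval/README.md before comparing numbers. The session IDs are made up.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Return 1.0 if any relevant ID is among the top k retrieved IDs, else 0.0."""
    return 1.0 if set(retrieved[:k]) & set(relevant) else 0.0


def mean_recall_at_k(results, k=5):
    """Average recall@k over (retrieved_ids, relevant_ids) pairs, one per question."""
    return sum(recall_at_k(ret, rel, k) for ret, rel in results) / len(results)


# Two hypothetical questions with best-first retrieval results.
sample = [
    (["s1", "s2", "s3", "s4", "s5", "s6"], ["s3"]),  # relevant item ranked 3rd: hit
    (["s9", "s8", "s7", "s6", "s5", "s4"], ["s4"]),  # relevant item ranked 6th: miss
]

score = mean_recall_at_k(sample, k=5)
print(score)
```

Running the same function over the BM25 and LLM compression modes on your own question set gives a like-for-like comparison, independent of the published figures.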
Related
- Claude Code Memory Architecture Comparison — the build-your-own framing agentmemory packages into a server.
- Memory & Dreaming — Self-Learning Agents — the memory-as-bottleneck thesis this productizes.
- Auto Memory — Claude Code’s native cross-session memory; agentmemory is the multi-agent, retrieval-heavy alternative.
- Reflexio — sibling “extract a reusable artifact from past runs” harness (playbooks vs memory tiers).
- Hermes MemoryKit — the Hermes-native tiered-memory + RRF-router counterpart.
- Five OSS Tools That Fix Claude Code’s Blind Spots — agentmemory is the persistent-memory tool (#4) in that operator workflow.
Open Questions
- The “#1 persistent memory … based on real-world benchmarks” claim rests on LongMemEval-S R@5 + token-efficiency; independent reproduction (and head-to-heads vs Mem0/Zep/Letta on the same harness) would confirm or temper it.
- The iii runtime primitive (replacing Express/SQLite/pm2/Prometheus) is novel and load-bearing; its maturity and failure modes under concurrent multi-agent load are unverified here.
- Privacy posture of the local store under team/shared use (the capture hook filters secrets, but the store mirrors session content) is worth a closer look before connecting it to client-data sessions.