Source: raw/Hermes_Agent_Update_v0.15_is_POWERFUL_Velocity_Release.md — YouTube creator walkthrough (GL2FhteoPBA, Ron/Shawn). Figures below are creator-reported from the release notes, not pulled from Nous Research’s official changelog — verify load-bearing numbers against hermes-agent.nousresearch.com/docs before quoting in production. Post-v0.15 deltas (see ## Post-v0.15 increments) compiled from x-account-teknium-2060839436330655775 (~14% read-file token savings), x-account-nousresearch-2060722721516872020 (Step 3.7 Flash + Nous Portal), and x-bookmarks-hermes-digest-2026-05-31 (Kanban+/goal, Qwen 3.7 Max, Krea, session-storage disk savings), plus raw/x-bookmarks-recent-digest-2026-06-04.md (Web Dashboard overhaul, hermes -c directory fix).

Hermes Agent v0.15 (“the Velocity Release”) bundles a wave of architecture, speed, orchestration, integration, and security changes that had been rolling out individually over the prior two weeks into one official release, plus a same-day hotfix. The through-line: Hermes “feels less like one giant script that kept growing and more like a real functioning system” — a core refactor that makes updates safe and the agent measurably faster on both startup and mid-conversation.

Key Takeaways

  • Core refactor — the headline. The main loop (run_agent.py) went from 16,000 → 3,800 lines, redistributed across 14 focused modules (it had already been halved to ~8,000 in the speed-upgrade ~8 days prior). User-facing behavior stays compatible; the payoff is that future hermes update runs are far less likely to break things — explicitly framed as fixing the failure mode where OpenClaw updates “almost always break something” because its main loop was a monolith.
  • Startup + per-turn speed. hermes --version dropped 700ms → 258ms; per-turn framework overhead fell from 399K → 213K function calls, so the agent spends less effort on its own machinery and more on your task. Reported to feel “more deterministic” at finishing simple tasks without needing /goal.
  • Session search rebuilt without an auxiliary LLM: ~90s → ~20ms (the creator’s “4,500× faster”), which also removes the extra model cost. Paired with full verbatim session resume: hermes sessions list → take an ID → resume → see the prior conversation in full (previously hidden), so you no longer dig through noisy logs.
  • Kanban Swarm (hermes kanban swarm) — the standout feature. Builds a task graph from a root task with parallel workers + a verifier + a synthesizer coordinating around it. Operational hardening makes it viable as a coding harness, not just a general one: worktree-per-task (each job gets its own branch/working path — explore multiple fixes in parallel without collisions), claim TTL (a task doesn’t stay stuck if a worker hangs), retry fingerprinting (avoid repeating the same failure loop), and stale-task detection with respawn guards.
  • Security hardening as Hermes gets more capable. A “brainworm”-class prompt-ware attack, which hijacks the agent through untrusted content, is now blocked at three choke points (tool output, recalled memory, and stored skills), alongside 15 new threat patterns and a security-guidance plugin. The more connected the agent, the more dangerous untrusted inputs become. Pairs with the Hermes Security Model.
  • Integrations: Notify added as the 23rd messaging platform (the creator’s own take: stick to TUI/CLI, use OpenClaw if you want chat platforms). Krea image-gen added as an image-generation API provider exposing Krea 2, a from-scratch foundation model with style transfer, moodboard input, and adjustable creativity; it needs a Krea API key and is not covered by the Nous Portal plan ^[inferred]. (The creator’s auto-captioned video rendered the name as “Korea”; Nous Research’s own announcement confirms it is Krea.) Bitwarden secret management: one bootstrap token pulls provider keys cleanly instead of a sprawling .env, for easier rotation and better hygiene. Also the new MCP Catalog (one-click install of Nous-approved MCP servers) and Skill Bundles (shipped earlier in the v0.14.x line).
  • TUI session orchestrator — switch between live sessions without leaving the interface.
  • Same-day hotfix: fixed a dashboard infinite-reload loop in loopback/Docker mode, patched Kanban worker behavior, restored MCP bare-command resolution in Docker, and brought back the full ~19,000-entry skills catalog.
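
The before/after figures in the takeaways above can be turned into speedup factors and percentage reductions with a few lines of arithmetic. The sketch below only restates the creator-reported numbers; the helper names (speedup, reduction_pct) are illustrative and not part of Hermes. With seconds converted to milliseconds, 90s → 20ms works out to the quoted 4,500×.

```python
# Creator-reported before/after figures from the takeaways (not official benchmarks).
FIGURES = {
    "run_agent.py lines": (16_000, 3_800),
    "hermes --version (ms)": (700, 258),
    "per-turn function calls": (399_000, 213_000),
    "session search (ms)": (90_000, 20),  # ~90 s expressed in milliseconds
}


def speedup(before: float, after: float) -> float:
    """How many times smaller the 'after' value is than the 'before' value."""
    return before / after


def reduction_pct(before: float, after: float) -> float:
    """Percentage drop from 'before' to 'after'."""
    return 100 * (before - after) / before


for name, (before, after) in FIGURES.items():
    print(f"{name}: {speedup(before, after):,.1f}x, "
          f"{reduction_pct(before, after):.1f}% lower")
```

Read this way, the startup and per-turn gains are roughly 2.7× and 1.9×, while the session-search change is a different order of magnitude because an entire model call was removed from the path.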

Post-v0.15 increments

Deltas that landed on main after the v0.15 cut, surfaced from Nous Research / @Teknium posts and X-bookmark digests (2026-05-31, 2026-06-04). All hermes update-gated; creator-reported figures flagged as such.

  • Kanban jobs now work with /goal (@Teknium, 2026-05-31) — Kanban-board jobs can be driven through the /goal autonomy command, joining the two flagship orchestration surfaces. Previously /goal and the Kanban were separate paths. hermes update to access. ^[inferred]
  • ~14% input-token savings on read-file operations (@Teknium, 2026-05-30) — attributed to removing line-number gutter / padding overhead from file reads, so every tool-use file read costs fewer input tokens. Pairs with the startup-overhead win above (the wrapper, not the model, getting optimized). Falsifiable creator-reported number; now on main.
  • Session-storage rework — ~20-40% disk savings (@Teknium / @yoniebans, 2026-05-21) — a re-architecture of how sessions are stored and accessed; reported to cut Hermes’ on-disk footprint by roughly 20-40%, speed up session loading, and simplify the codebase. Creator-reported range; shipped to main ahead of the next major release.
  • Qwen 3.7 Max supported (Nous Research, 2026-05-26) — added as a selectable model in Hermes Agent.
  • Step 3.7 Flash — free 30 days via Nous Portal (Nous Research, 2026-05-30) — StepFun’s new 198B sparse-MoE vision-language model (~11B active params, 256K context, ~400 TPS, Apache-2.0 weights), pitched for agent efficiency, coding, search, and multimodal (UI/charts/docs) with reliable tool use. Cited as top on ClawEval, SimpleVQA Search, SWE-PRO, V* Python, and τ²-bench, and explicitly compatible with Hermes Agent (plus Claude Code, KiloCode, OpenClaw, MCP). Verify the StepFun benchmark figures before asserting as fact. ^[inferred]
  • Nous Portal (Nous Research, 2026-05-30) — a one-stop subscription for Hermes Agent builders: 300+ models plus bundled tools (web search, scraping, image generation, browser, code execution, voice), with exclusive discounts and free tiers. The single-subscription access layer the Krea/Step-3.7 free-tier offers route through. ^[inferred]
  • NVIDIA RTX Spark / DGX Spark + OpenShell — the local-agent hardware tier (NVIDIA GTC Taipei @ COMPUTEX, 2026-06-01). NVIDIA unveiled RTX Spark, a new class of Windows PCs purpose-built for personal agents (up to 1 petaflop AI compute, 128GB unified memory), and NVIDIA OpenShell, an agent policy/runtime layer — define what agents can and cannot do, route queries to local models by the user’s privacy policy, and disguise personal information in cloud-bound queries. Hermes Agent and OpenClaw are the named adopters in their new Windows apps; Nous CEO Dillon Rolnick: “RTX Spark and NVIDIA OpenShell give Hermes users a powerful and secure environment for agents to run and work alongside you.” Hermes ships LM Studio + Ollama support out of the box and runs alongside Qwen 3.6 (27B/35B) via llama.cpp/LM Studio/Ollama. NVIDIA cites Hermes crossing 140K GitHub stars in under three months and being “the most used agent in the world according to OpenRouter.” The “Build It Yourself” agentic-AI series pairs OpenShell with NemoClaw. Sources: NVIDIA blogs + newsroom (ai-research/hermes-nvidia-rtx-spark-computex-2026.md) + @NousResearch status/2061323987804713083.
  • Web Dashboard overhaul — feature-complete admin panel (2026-06-03). The Hermes Web Dashboard now surfaces “a complete management plane” run entirely from the browser — sessions, MCP servers, webhooks, and system health — with the stated goal of reducing or eliminating the need to run CLI commands directly (@NousResearch + @Teknium announcement posts, 2026-06-03). The browser-admin counterpart to the official Hermes Desktop app’s no-terminal pitch.
  • Session resume returns to the original directory (2026-06-03). Resuming a session — including hermes -c for the most recent one — now relaunches it in the directory it was originally launched from (@Teknium).
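
To see why dropping a line-number gutter from file reads saves input tokens, the sketch below compares a file's raw text with the same text framed by a gutter. It is a rough illustration only: the gutter format (six-character right-aligned number plus a "|" separator) is a hypothetical stand-in, not Hermes's actual read-file format, and character count is used as a crude proxy for tokens. The real saving depends on the tokenizer and on the files being read.

```python
def with_gutter(text: str, width: int = 6) -> str:
    """Prefix every line with a right-aligned line number and '|' (hypothetical format)."""
    lines = text.splitlines()
    return "\n".join(f"{i:>{width}}|{line}" for i, line in enumerate(lines, start=1))


def overhead_pct(text: str) -> float:
    """Share of the framed output's characters that belong to the gutter."""
    plain = len(text)
    framed = len(with_gutter(text))
    return 100 * (framed - plain) / framed


# A synthetic 200-line file; short lines make the gutter proportionally costlier.
sample = "\n".join(f"value_{n} = compute(n={n})" for n in range(200))
print(f"gutter share of characters: {overhead_pct(sample):.1f}%")
```

The overhead is a fixed cost per line, so files with many short lines pay the most and files with long lines pay the least.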

Why it matters

This is a maintainability-and-speed release more than a feature release, and the refactor is the load-bearing change: a 16K → 3.8K-line main loop across 14 modules is what makes the “installs and runs anywhere, updates without breaking” promise of the v0.14 Foundation Release actually hold. The Kanban Swarm is the most consequential new capability: root-task → parallel-workers → verifier → synthesizer with worktree-per-task is the structural answer to multifaceted jobs (research → deck → site) that “rarely stay one clean prompt,” and it pushes Hermes toward being a credible coding harness. The session-search rebuild (~90s with an auxiliary LLM → ~20ms without one) is the kind of unglamorous infra win that changes whether a feature actually gets used.
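
Two of the Kanban Swarm hardening ideas, claim TTL and retry fingerprinting, can be illustrated with a toy in-memory board. This is a minimal sketch under stated assumptions, not Hermes's implementation: the Board and Task classes, the claim_ttl parameter, and the fingerprint scheme (SHA-256 of the normalized error text) are all hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Task:
    task_id: str
    claimed_by: Optional[str] = None
    claimed_at: float = 0.0
    failure_fingerprints: Set[str] = field(default_factory=set)


class Board:
    """Toy task board (hypothetical; names are not taken from Hermes)."""

    def __init__(self, claim_ttl: float) -> None:
        self.claim_ttl = claim_ttl
        self.tasks: Dict[str, Task] = {}

    def add(self, task_id: str) -> None:
        self.tasks[task_id] = Task(task_id)

    def claim(self, task_id: str, worker: str, now: Optional[float] = None) -> bool:
        """Claim a task if it is free or if the current claim is older than the TTL."""
        now = time.monotonic() if now is None else now
        task = self.tasks[task_id]
        expired = task.claimed_by is not None and now - task.claimed_at > self.claim_ttl
        if task.claimed_by is None or expired:
            task.claimed_by, task.claimed_at = worker, now
            return True
        return False

    def record_failure(self, task_id: str, error: str) -> bool:
        """Release the task; return True if this failure is new, False if it repeats."""
        fingerprint = hashlib.sha256(error.strip().lower().encode()).hexdigest()[:12]
        task = self.tasks[task_id]
        is_new = fingerprint not in task.failure_fingerprints
        task.failure_fingerprints.add(fingerprint)
        task.claimed_by = None
        return is_new


board = Board(claim_ttl=30)
board.add("fix-login")
print(board.claim("fix-login", "worker-1", now=0))    # first claim succeeds
print(board.claim("fix-login", "worker-2", now=10))   # still held, claim is refused
print(board.claim("fix-login", "worker-2", now=31))   # TTL elapsed, task is reclaimed
```

A scheduler built on this shape could stop retrying a task once record_failure returns False, since a repeated fingerprint suggests the same failure loop rather than new progress.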

Try It

  1. hermes update to v0.15, then time hermes --version and a couple of simple turns — the startup/overhead drop is the most immediately felt change.
  2. Try hermes kanban swarm on a genuinely multi-step job (e.g. “research X → summarize into a deck → build a landing page”) and watch the worker/verifier/synthesizer split + worktree-per-task in action.
  3. Resume an old session via hermes sessions list → pick an ID → resume, and confirm the full prior conversation now renders (no more log-spelunking).
  4. If you manage many provider keys, wire up Bitwarden secret management and rotate via the bootstrap token instead of editing .env.
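
For step 1, a small timing harness gives a steadier reading than a single stopwatch run. The sketch below measures the median wall-clock time of any command over several runs; time_command is a hypothetical helper, and the demo times the Python interpreter so that it runs anywhere. Swap in ["hermes", "--version"] once Hermes is installed and on your PATH. Results will vary by machine and will not necessarily match the creator-reported 700ms → 258ms.

```python
import shlex
import statistics
import subprocess
import sys
import time


def time_command(command, runs: int = 5) -> float:
    """Return the median wall-clock time in milliseconds over `runs` executions."""
    args = shlex.split(command) if isinstance(command, str) else list(command)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command exits non-zero, so failures are not timed silently.
        subprocess.run(args, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)


# Demo with the current Python interpreter; replace with ["hermes", "--version"].
print(f"median: {time_command([sys.executable, '--version']):.0f} ms")
```

Using the median rather than the mean keeps one cold-cache run from skewing the comparison between versions.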

Open Questions

  • All performance figures (16K→3.8K lines, 700→258ms, 399K→213K calls, 90s→20ms session search) are creator-reported from the release notes — confirm against the official Nous Research changelog / GitHub release for v0.15.
  • The Kanban Swarm worktree-per-task + verifier/synthesizer design warrants its own deep-dive (the creator promised a dedicated video) — how conflict resolution and synthesizer merging behave under real multi-file coding load is unverified here.