Source: Hermes Agent docs — Codex App-Server Runtime (full content cached at ai-research/hermes-codex-app-server-runtime-2026-05-14.md)
Publisher: Nous Research (Hermes Agent project)
Section: User Guide → Features
Captured: 2026-05-14

Opt-in Hermes runtime that delegates openai/* and openai-codex/* turns to OpenAI’s Codex CLI instead of running Hermes’ own tool loop. Hermes keeps the orchestration layer (session DB, memory, cron jobs, slash commands, review forks); Codex executes the turn with its sandbox, built-in toolset, and plugin ecosystem. The headline value prop: “Run OpenAI agent turns against your ChatGPT subscription (no API key required) using the same auth flow Codex CLI uses.” Wired up via JSON-RPC over stdio with a bidirectional MCP callback channel so Codex can still call Hermes-side capabilities (browser automation, vision, image generation, skills, TTS) during the turn.

Key Takeaways

  • Authentication path: ChatGPT subscription, not OpenAI API key. Reuses Codex CLI’s existing OAuth flow against ~/.codex/auth.json. Significant for operators who pay for ChatGPT but don’t want to manage a separate API-key budget. Closes a long-standing cost-management gap for Hermes deployments running OpenAI-side turns.
  • Three independent tool sources on each turn: (1) Codex built-ins (shell, apply_patch, update_plan, view_image, web_search), (2) auto-migrated Codex plugins (Linear, GitHub, Gmail, Google Calendar, Outlook, Canva, etc. — whatever was installed via codex plugin survives the runtime switch), (3) Hermes capabilities exposed via MCP callback (web fetch/extract, browser automation suite, vision analyze, image generate, skill view/list, text-to-speech).
  • Four Hermes-specific tools become unavailable because they require live agent-loop context: delegate_task, memory, session_search, todo. Background review processes work around this by downgrading to the codex_responses runtime for review forks, where agent-loop tools remain accessible.
  • Enable per-session with one slash command: /codex-runtime codex_app_server verifies the Codex CLI installation, persists the setting to ~/.hermes/config.yaml, discovers and migrates installed Codex plugins, registers Hermes as an MCP server, and configures sandbox permissions. Takes effect on the next session. Alternatives: /codex-runtime on, /codex-runtime off, /codex-runtime auto (default), and bare /codex-runtime (status check).
  • Workspace-write permission by default. Runtime auto-writes default_permissions = ":workspace" to skip prompt-on-every-write. Three profiles exist: :read-only (prompts every command), :workspace (Hermes default), :danger-no-sandbox (off-by-default, not recommended). The Codex CLI’s familiar “Dangerous Command” prompt surface still applies for command + patch execution with Allow-once / Allow-for-session / Deny.
  • CLI version floor: Codex 0.130.0 or newer. npm i -g @openai/codex then codex login to seed the auth file. Below 0.130 the App-Server interop spec doesn’t apply.
  • Goals (/goal) are fully compatible — each continuation is a fresh Codex turn, matching the [[claude-ai/whats-new-2026-w20|W20 /goal semantics on Claude Code]]. Kanban is supported via MCP callback so workers can hand off and report results. Cron jobs aren’t specifically tested but inherit the same tool-availability rules.
  • Self-improvement loop is preserved. “Memory and skill nudges keep firing exactly as they would otherwise” — Hermes projects Codex items into synthetic messages so review agents see the same shape they expect. The review fork itself downgrades to codex_responses to access the agent-loop tools.
  • Auxiliary tasks (title generation, summarization, etc.) also flow through your ChatGPT subscription by default when the runtime is on with the openai-codex provider. To route specific auxiliary tasks to cheaper models (e.g. Gemini 3 Flash via OpenRouter), add per-task auxiliary: overrides in config.yaml.
  • Multi-profile isolation requires explicit CODEX_HOME. By default Hermes uses ~/.codex/ regardless of active profile (matching codex CLI behavior). For per-profile separation set CODEX_HOME=~/.hermes/profiles/<name>/codex and run codex login once with that env var to establish profile-specific auth.
  • Safe config-editing convention. Hermes brackets its managed content in ~/.codex/config.toml with # managed by hermes-agent / # end hermes-agent managed section markers. User content outside the markers persists through Hermes-driven migrations — useful for hand-rolled MCP servers, custom permission overrides, or Codex-specific tweaks the operator wants to keep.
  • Disabling is non-destructive. /codex-runtime auto reverts to default behavior; the Codex managed block stays in config for one-command re-enable.
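
The managed-markers convention above can be sketched as a marker-preserving rewrite. The marker strings are the ones the docs name; the helper function itself is a hypothetical illustration, not Hermes' actual implementation:

```python
# Sketch of a marker-preserving config rewrite (hypothetical helper,
# not Hermes' actual code). The marker strings come from the docs.
BEGIN = "# managed by hermes-agent"
END = "# end hermes-agent managed section"

def replace_managed_block(config_text: str, new_block: str) -> str:
    """Swap the content between the markers, leaving user content intact.

    If no managed block exists yet, append one at the end.
    """
    managed = f"{BEGIN}\n{new_block}\n{END}"
    start = config_text.find(BEGIN)
    end = config_text.find(END)
    if start == -1 or end == -1:
        sep = "" if (not config_text or config_text.endswith("\n")) else "\n"
        return config_text + sep + managed + "\n"
    return config_text[:start] + managed + config_text[end + len(END):]

# A user's hand-rolled MCP server entry outside the markers survives
# repeated Hermes-driven migrations:
user_toml = '[mcp_servers.my_server]\ncommand = "./serve"\n\n'
updated = replace_managed_block(user_toml, '[mcp_servers.hermes]\ncommand = "hermes-mcp"')
```

The same property explains why disabling is non-destructive: the block stays in place between the markers, ready to be rewritten on re-enable.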

Architecture

JSON-RPC over stdio between Hermes and the Codex subprocess. Hermes maintains the session database, slash-command interface, memory system, and background review processes; Codex maintains the turn-execution loop, sandbox, and patch tracking. Three independent tool sources merge at the turn level:

| Tool source | Examples | Who exposes |
| --- | --- | --- |
| Codex built-ins | shell, apply_patch, update_plan, view_image, web_search | Codex CLI |
| Auto-migrated Codex plugins | Linear, GitHub, Gmail, Google Calendar, Outlook, Canva | Codex plugin marketplace (preserved) |
| Hermes MCP callback | web_search / web_extract, browser automation, vision_analyze, image_generate, skill_view / skills_list, text_to_speech | Hermes (acts as MCP server to Codex) |

The bidirectional MCP callback channel is the load-bearing piece — without it, Codex wouldn’t have access to Hermes’ richer toolset (browser automation, vision, skills) during a delegated turn.
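
The wire shape can be sketched as follows, assuming newline-delimited JSON-RPC 2.0 messages over stdio. The method names (turn/start, tools/call) and the exact params are illustrative assumptions, not taken from the docs — tools/call mirrors the MCP convention for invoking a server-side tool:

```python
import json

# Assumed framing: one JSON-RPC 2.0 object per line over stdio.
def encode(msg: dict) -> bytes:
    return (json.dumps(msg) + "\n").encode()

def decode(line: bytes) -> dict:
    return json.loads(line)

# Hermes -> Codex: kick off a delegated turn (hypothetical method name).
request = {"jsonrpc": "2.0", "id": 1, "method": "turn/start",
           "params": {"prompt": "summarize the repo"}}

# Codex -> Hermes, mid-turn, over the MCP callback channel: an MCP-style
# tools/call asking Hermes to run one of its richer tools.
callback = {"jsonrpc": "2.0", "id": 7, "method": "tools/call",
            "params": {"name": "vision_analyze",
                       "arguments": {"image": "screenshot.png"}}}

roundtrip = decode(encode(callback))
```

The key point the sketch makes concrete: requests flow in both directions on the same transport, which is what lets a Codex-executed turn reach back into Hermes-side capabilities.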

Environment handling

Hermes preserves the real user HOME so Codex’s shell tool finds .gitconfig, .gh/, .aws/, .npmrc and similar user config files naturally. Codex’s own state is isolated through CODEX_HOME. This split matters for any workflow where Codex needs to invoke gh, aws, git, or npm against the operator’s real credentials.

Known Limitations (per the docs)

  • Hermes and Codex authentication remain separate sessions. Two auth flows, two token stores. Operators need both.
  • Four agent-loop tools unavailable via MCP callback (delegate_task, memory, session_search, todo) — covered by the review fork downgrading to codex_responses.
  • Inline patch previews unavailable when Codex doesn’t track changesets. Patches still apply; the preview surface is the gap.
  • Sub-second cancellation not guaranteed during streaming responses. Operationally relevant for any “hit Ctrl-C and unwind cleanly” expectation.

How it fits the Hermes ecosystem

This runtime is the commercial-economics counterpart to the rest of the Hermes story. Hermes’ default value prop (per user stories and Nate Herk’s course) is a stateful business agent on self-hosted infrastructure. The Codex App-Server runtime adds: “and you can use your existing ChatGPT subscription instead of paying for OpenAI API credits.” That changes the unit-economics calculus for any operator running a 24/7 Hermes deployment that includes OpenAI-side turns: it moves the marginal cost per turn closer to zero on the ChatGPT-subscription side while preserving Hermes’ orchestration value above the turn.

It also concretizes one half of the CLI > MCP > API tool-hierarchy thesis from the operator’s perspective: Hermes itself remains the orchestration layer, but for OpenAI-side turns the Codex CLI is now the preferred runtime, leveraging Codex’s specialized sandbox and tool ecosystem rather than re-implementing them in Hermes. It’s the same pattern Higgsfield’s skills bundle embraced for creative-AI surfaces: the vendor CLI as the operator runtime, with orchestration one tier up.

Try It

For Hermes operators considering the Codex runtime:

  1. Verify Codex CLI is current. codex --version — must be 0.130.0+. If older, npm i -g @openai/codex to bump.
  2. Authenticate Codex against your ChatGPT subscription. codex login — writes tokens to ~/.codex/auth.json. Confirm this works standalone before wiring into Hermes.
  3. (Optional) Pre-install Codex plugins you want available. codex plugin marketplace add openai-curated then any specific plugins (Linear, Gmail, etc.). These auto-migrate when you flip the Hermes runtime.
  4. Inside a Hermes session: /codex-runtime codex_app_server. The command discovers + migrates plugins, registers Hermes as MCP server to Codex, persists the config, and takes effect on the next session.
  5. Audit ~/.hermes/config.yaml for model.openai_runtime: codex_app_server to confirm. Also inspect ~/.codex/config.toml for the # managed by hermes-agent block — anything outside that block is yours to edit.
  6. Test a small /goal task. Each continuation is a fresh Codex turn under the new runtime — useful baseline for behavior + cost comparison vs the prior runtime.
  7. Override auxiliary tasks if you want title generation or summarization off the ChatGPT subscription. Add an auxiliary: block in ~/.hermes/config.yaml pointing specific tasks at cheaper providers (e.g. OpenRouter + Gemini 3 Flash).
  8. For multi-profile setups: set CODEX_HOME=~/.hermes/profiles/<name>/codex in your shell env per profile and run codex login once per profile. Without this, all Hermes profiles share ~/.codex/.
  9. To revert: /codex-runtime auto. Managed config block remains for one-command re-enable.
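
Steps 5 and 7 land in the same file. A hypothetical ~/.hermes/config.yaml sketch — only model.openai_runtime: codex_app_server is confirmed wording from the docs; the auxiliary: task keys and provider strings are illustrative shapes, not documented names:

```yaml
# ~/.hermes/config.yaml (sketch)
model:
  openai_runtime: codex_app_server   # step 5: written by /codex-runtime

# Step 7 (illustrative shape): route cheap auxiliary tasks off the
# ChatGPT subscription to a cheaper provider/model.
auxiliary:
  title_generation:
    provider: openrouter
    model: google/gemini-3-flash
  summarization:
    provider: openrouter
    model: google/gemini-3-flash
```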

Open Questions

  • What’s the actual rate-limit on ChatGPT-subscription-driven Codex turns under heavy Hermes usage? Codex CLI alone gets ChatGPT-Plus / Pro limits; the docs don’t quantify what happens when a long-running Hermes session burns through those limits mid-turn. Worth a community-measurement pass once operators have run multi-day deployments.
  • Cron jobs are “not specifically tested but should work.” Open thread until first community report on cron + Codex runtime in production. Hermes cron jobs are load-bearing for the self-improvement loop, so a regression here would be material.
  • MCP callback channel: token cost? Hermes-side tools fired via MCP callback presumably cost ChatGPT tokens (since the turn runs under the Codex/ChatGPT auth). The docs don’t disaggregate cost between Codex-native tool calls and MCP-callback tool calls. Worth measuring.
  • Plugin upgrade flow. When codex plugin upgrade runs outside a Hermes session, do the upgrades land in the next Hermes-driven Codex turn automatically? Or does Hermes need a /codex-runtime codex_app_server re-run to re-discover?
  • What does the codex_responses runtime look like? The docs reference it as the runtime the review fork downgrades to, but it isn’t documented on the same page. Open thread for a follow-up ingest if Nous Research publishes the spec separately.
  • Auth file conflicts. If Codex CLI is updated and rotates auth-file format, does Hermes recover gracefully or does the operator need to manually re-codex login? Operational hygiene worth surfacing.