Cost & Intelligence Levers for Agent Workflows
Source: wiki synthesis: Advisor Strategy, Extended Thinking, Opus 4.7 Best Practices, Claude Prompting Best Practices, Hermes Productivity Workflows, SEO Content, Marketing Automation Use Cases
Three independent levers shape the cost/intelligence profile of any Claude-driven agent workflow. Most teams treat “more intelligence” as one dial — pick a bigger model, pay more. That misses the real optimization surface. The three levers are Effort, Adaptive Thinking, and Advisor — they compose multiplicatively, and the right combination depends on the workload shape, not the prestige of the model name.
The three levers
Lever 1 — Effort (per-call intelligence budget)
The effort parameter (see Opus 4.7 Best Practices) trades capability for speed and cost at each call:
| Level | Use when |
|---|---|
| max | Genuinely hard problems only; diminishing returns and overthinking risk elsewhere |
| xhigh (new default) | Most coding and agentic uses — Anthropic’s recommended starting point |
| high | Balanced intelligence and cost; the minimum for intelligence-sensitive work |
| medium | Cost-sensitive work that accepts an intelligence tradeoff |
| low | Short, tightly scoped, latency-sensitive tasks |
Opus 4.7 respects effort strictly: low and medium now scope work to exactly what was explicitly asked. The main risk is under-thinking at low on moderately complex tasks, so raise effort rather than prompting around it.
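As a rule of thumb, the table above can be encoded as a small selection helper. This is an illustrative sketch only; the workload traits (hard, latency_sensitive, cost_sensitive) are hypothetical names for this example, not API parameters.

```python
def pick_effort(hard: bool, latency_sensitive: bool, cost_sensitive: bool) -> str:
    """Map workload traits to an effort level per the table above (illustrative)."""
    if hard:
        return "max"      # genuinely hard problems; watch for overthinking
    if latency_sensitive:
        return "low"      # short, tightly scoped, latency-sensitive tasks
    if cost_sensitive:
        return "medium"   # accepts an intelligence tradeoff
    return "xhigh"        # recommended starting point for coding and agents
```

The ordering encodes the table’s priority: genuine hardness overrides everything, and xhigh is the fall-through default.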
Lever 2 — Adaptive Thinking (per-step reasoning depth)
Adaptive thinking (thinking: {type: "adaptive"}) lets the model decide at each step whether to think and how much (see Extended Thinking for the full reference). On Opus 4.7 it is the only supported mode; requests that set a manual budget_tokens are rejected with a 400 error.
Adaptive is the right default for:
- Autonomous multi-step agents
- Long-horizon coding sessions
- Bimodal workloads (mix of easy + hard tasks)
Display modes compose with this: display: "omitted" reduces streaming latency without reducing cost. Guidance: pair adaptive thinking with an explicit effort level and the display mode suited to your UX surface.
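A minimal request sketch combining the two levers so far. Field names follow this article’s description of effort, adaptive thinking, and display; the model id and the placement of display inside the thinking block are assumptions, not verified API shapes.

```python
# Illustrative request body, not a verified API schema.
request = {
    "model": "claude-opus-4-7",      # hypothetical model id for this sketch
    "max_tokens": 4096,
    "effort": "xhigh",               # lever 1: explicit per-call budget
    "thinking": {
        "type": "adaptive",          # lever 2: model decides when/how much to think
        "display": "omitted",        # assumed placement; cuts latency, same cost
    },
    "messages": [{"role": "user", "content": "Refactor the retry logic."}],
}
```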
Lever 3 — Advisor (on-demand intelligence boost)
The advisor_20260301 tool (see Advisor Strategy) inverts the typical sub-agent pattern: a cheap executor (Sonnet or Haiku) drives the loop and consults Opus only when stuck. It is one API call with a max_uses budget, and the advisor never calls tools or produces user-facing output.
Result: Anthropic measured Sonnet + Opus advisor at +2.7pp on SWE-bench with 11.9% lower cost than Sonnet solo. Haiku + Opus advisor doubles Haiku’s BrowseComp score at 85% lower cost than Sonnet solo.
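Attaching an advisor might look like the sketch below. The tool type comes from this article; the surrounding field names and the model-id placeholders are assumptions made for illustration.

```python
# Illustrative: cheap executor with an Opus advisor attached.
# Exact tool-definition fields are assumptions; only advisor_20260301
# and max_uses are named by this article.
executor_request = {
    "model": "<haiku-model-id>",       # cheap model drives the loop
    "max_tokens": 2048,
    "tools": [
        {
            "type": "advisor_20260301",  # as named in this article
            "model": "<opus-model-id>",  # expensive model, consulted on demand
            "max_uses": 1,               # hard cap on escalations
        },
        # ...the executor's ordinary tools go here
    ],
    "messages": [{"role": "user", "content": "Tag these 500 support tickets."}],
}
```

The key property is the cost shape: every loop iteration bills at the executor’s rate, and the Opus rate applies at most max_uses times.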
The decision surface
These levers compose independently — any combination is valid:
| Workload | Effort | Thinking | Advisor | Why |
|---|---|---|---|---|
| High-volume tagging / extraction | low / medium | adaptive | Haiku + Opus advisor (max_uses=1) | Cheap base + rare deep reasoning |
| Agentic coding, hard problems | xhigh | adaptive | Sonnet + Opus advisor (max_uses=3) | Quality priority, cost still controlled |
| Long-horizon autonomous work | high | adaptive | Opus solo | Don’t add complexity; effort handles it |
| Short lookups, simple responses | low | adaptive (rarely triggers) | none | Minimum cost, nothing to escalate |
| Unpredictable complexity | xhigh | adaptive | Sonnet + Opus advisor | Let the executor decide when to escalate |
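The table above can be read as a small dispatch function. The workload labels and the returned config shape are illustrative inventions for this sketch, not an API.

```python
def lever_config(workload: str) -> dict:
    """Return effort/thinking/advisor settings per the decision surface above."""
    table = {
        "high_volume_extraction":  {"effort": "medium", "thinking": "adaptive",
                                    "advisor": ("haiku+opus", 1)},
        "agentic_coding_hard":     {"effort": "xhigh",  "thinking": "adaptive",
                                    "advisor": ("sonnet+opus", 3)},
        "long_horizon_autonomous": {"effort": "high",   "thinking": "adaptive",
                                    "advisor": None},  # Opus solo; effort handles it
        "short_lookup":            {"effort": "low",    "thinking": "adaptive",
                                    "advisor": None},  # nothing to escalate
        "unpredictable":           {"effort": "xhigh",  "thinking": "adaptive",
                                    "advisor": ("sonnet+opus", 3)},
    }
    return table[workload]
```

Note that thinking is "adaptive" in every row: on Opus 4.7 it is the only mode, so the real decisions are effort and advisor.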
Applying to existing workflows in this wiki
Hermes Agent workflows (productivity-workflows, marketing-applications)
Hermes’ self-improving agent runs many cheap decisions and occasional hard ones — a classic fit for the advisor pattern: Haiku executor + Opus advisor with max_uses=1 or 2. Hermes’ learning-loop memory adds its own adaptive-reasoning layer on top, and the advisor tool composes cleanly with it.
SEO content pipeline (seo-content topic)
GSC Autonomous SEO engine, Blog-Agent-Worker pipeline, Clawdbot competitive intel — all are fan-out workflows where 90% of decisions are routine and 10% are architectural. Sonnet with Opus advisor at max_uses=2-3 is the sweet spot. Run at xhigh effort during content generation passes.
Marketing automation (marketing-automation-use-cases)
Per-campaign workflows at scale should run at medium effort on Haiku or Sonnet, with advisor gated on complex decisions only. This profile matches the “high-volume tagging / extraction” row above: Haiku + Opus advisor at low-to-medium effort.
Agent workflow patterns (agent-workflow-patterns)
The three workflow shapes (Sequential, Parallel, Evaluator-Optimizer) sit orthogonal to these three levers. Choose the workflow shape first, then tune the levers inside each stage. Evaluator-Optimizer loops specifically benefit from asymmetric lever settings — cheap executor for generation, heavy advisor for evaluation.
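An Evaluator-Optimizer loop with asymmetric lever settings might look like the skeleton below. The call_model stub stands in for a real API call, and the model ids and advisor argument are placeholders; the loop structure, not the signatures, is the point.

```python
def call_model(prompt, model, effort, advisor=None):
    """Stub standing in for a real API call; returns canned output."""
    return "ACCEPT" if prompt.startswith("Critique") else f"draft({effort})"

def evaluator_optimizer(task, rounds=3):
    """Evaluator-Optimizer loop with asymmetric lever settings (sketch)."""
    # Generation stage: cheap executor, modest effort.
    draft = call_model(task, model="<sonnet-id>", effort="medium")
    for _ in range(rounds):
        # Evaluation stage: higher effort, with an advisor escalation budget.
        verdict = call_model("Critique:\n" + draft,
                             model="<sonnet-id>", effort="high",
                             advisor={"model": "<opus-id>", "max_uses": 1})
        if "ACCEPT" in verdict:
            break
        draft = call_model("Revise per critique:\n" + verdict + "\n\n" + draft,
                           model="<sonnet-id>", effort="medium")
    return draft
```

Generation runs cheap because bad drafts are recoverable; evaluation runs heavy because a wrong accept/reject verdict is what actually costs you.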
Key Takeaways
- Three levers, not one dial. Effort (per-call budget), Adaptive Thinking (per-step depth), Advisor (on-demand escalation). They compose multiplicatively.
- Opus 4.7 collapses two levers together. No manual thinking budgets: you get adaptive thinking plus effort. Simpler, but it means budget_tokens-based harnesses break.
- Advisor inverts the cost curve. The traditional “get a better model” move pays the full rate on every call. Advisor pays the cheap rate on the loop and the expensive rate only on escalation.
- High-volume workloads favor Haiku + advisor. 85% cheaper than Sonnet solo, 2× Haiku’s solo benchmark. This is the pattern for tagging, routing, extraction, summarization at scale.
- Interactive coding favors xhigh + Sonnet executor + Opus advisor. Quality on the hard problems, cost control on the rest.
- Workflow shape precedes lever tuning. The Sequential / Parallel / Evaluator-Optimizer decision comes first; lever settings tune each stage within it.
Related
- The Advisor Strategy (advisor_20260301)
- Extended Thinking (API Reference)
- Opus 4.7 Best Practices for Claude Code
- Claude Prompting Best Practices
- Agent Workflow Patterns
- Claude Managed Agents — pricing structure ($0.08/session-hour + token rates) is a fourth dimension to consider
- Hermes Productivity Workflows
- SEO Content topic index
- Marketing Automation Use Cases
- All connection articles
Try It
- Characterize one workload. Pick a real Claude-driven flow (Hermes task, SEO pipeline stage, marketing campaign). Classify it in the decision-surface table above. Confirm the lever combination you’re using matches.
- A/B one lever at a time. Change only effort on a representative 10-task sample. Measure cost, latency, quality. Repeat for thinking config. Repeat for advisor. Singular changes teach you the shape of each lever; compound changes obscure it.
- Move a high-volume workflow to Haiku + advisor. If you’re running Sonnet on anything that tags, routes, extracts, or summarizes, try Haiku + Opus advisor with max_uses=1. Per Anthropic’s own numbers it should be 85% cheaper at double the quality of Haiku alone.
- Add advisor to an existing agentic loop. Take one Claude Code skill or routine you run frequently. Add advisor_20260301 with max_uses=3. Measure whether “stuck” cases now resolve without human intervention.
- Read the three source articles in order: Opus 4.7 → Extended Thinking → Advisor Strategy. This builds the model from the cheapest lever (effort) to the most complex (advisor).