Cost & Intelligence Levers for Agent Workflows

Source: wiki synthesis: Advisor Strategy, Extended Thinking, Opus 4.7 Best Practices, Claude Prompting Best Practices, Hermes Productivity Workflows, SEO Content, Marketing Automation Use Cases

Three independent levers shape the cost/intelligence profile of any Claude-driven agent workflow. Most teams treat “more intelligence” as one dial — pick a bigger model, pay more. That misses the real optimization surface. The three levers are Effort, Adaptive Thinking, and Advisor — they compose multiplicatively, and the right combination depends on the workload shape, not the prestige of the model name.

The three levers

Lever 1 — Effort (per-call intelligence budget)

The effort parameter (see Opus 4.7 Best Practices) trades capability for speed and cost at each call:

| Level | Use when |
|---|---|
| max | Genuinely hard problems; diminishing returns; prone to overthinking |
| xhigh (new default) | Most coding and agentic uses — Anthropic’s recommended starting point |
| high | Balance of intelligence and cost; minimum for intelligence-sensitive work |
| medium | Cost-sensitive work with an intelligence tradeoff |
| low | Short scoped tasks; latency-sensitive |

Opus 4.7 respects effort strictly: low and medium now scope work to what was explicitly asked. There is an under-thinking risk at low on moderately complex tasks, so raise the effort level rather than prompting around it.
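
As a minimal sketch of the effort lever, the helper below builds a Messages-API-style request payload with an explicit effort level. The effort field and the model id are taken from this article's description of Opus 4.7, not verified against a released SDK, so treat both as assumptions.

```python
# Sketch: attach a per-call intelligence budget to a request payload.
# The "effort" field and model id follow this article's description
# and are assumptions, not a verified SDK shape.

VALID_EFFORT = {"low", "medium", "high", "xhigh", "max"}

def build_request(prompt: str, effort: str = "xhigh") -> dict:
    """Return a request payload with an explicit effort level.

    xhigh is the default per the article's recommendation."""
    if effort not in VALID_EFFORT:
        raise ValueError(f"unknown effort level: {effort!r}")
    return {
        "model": "claude-opus-4-7",  # hypothetical model id
        "max_tokens": 4096,
        "effort": effort,            # per-call cost/capability dial
        "messages": [{"role": "user", "content": prompt}],
    }
```

Validating the level up front keeps a misconfigured harness from silently falling back to a default budget.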

Lever 2 — Adaptive Thinking (per-step reasoning depth)

Adaptive thinking (thinking: {type: "adaptive"}) lets the model decide at each step whether to think and how much (see Extended Thinking for the full reference). On Opus 4.7 it is the only supported mode; passing a manual budget_tokens returns a 400 error.

Adaptive is the right default for:

  • Autonomous multi-step agents
  • Long-horizon coding sessions
  • Bimodal workloads (mix of easy + hard tasks)

Display modes compound with the levers above: display: "omitted" reduces streaming latency without reducing cost. Guidance: pick adaptive thinking, an explicit effort level, and the display mode suited to your UX surface.
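
Combining the first two levers might look like the sketch below: one config dict carrying both the effort level and the adaptive-thinking block. The field placement and the display values other than "omitted" are assumptions layered on the article's description, not a documented schema.

```python
# Sketch: combine the effort lever with adaptive thinking in one config.
# Field names follow this article; "full" as a display value is an
# assumption (the article only names "omitted").

def agent_call_config(effort: str = "xhigh", display: str = "omitted") -> dict:
    """Per-call settings: effort budget plus adaptive per-step thinking.

    display="omitted" trims streaming latency without changing cost."""
    return {
        "effort": effort,
        "thinking": {"type": "adaptive", "display": display},
    }
```

Note there is no budget_tokens field at all: on Opus 4.7 the model owns the per-step depth decision.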

Lever 3 — Advisor (on-demand intelligence boost)

The advisor_20260301 tool (see Advisor Strategy) inverts the typical sub-agent pattern: a cheap executor (Sonnet, Haiku) drives the loop and escalates to Opus only when stuck. Each consultation is one API call under a max_uses budget, and the advisor never calls tools or produces user-facing output.

The results: Anthropic measured Sonnet + Opus advisor at +2.7pp on SWE-bench with 11.9% lower cost than Sonnet solo, and Haiku + Opus advisor doubles Haiku’s BrowseComp score at 85% lower cost than Sonnet solo.
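
A sketch of wiring the advisor into a cheap executor's request follows. The tool type advisor_20260301 and the max_uses budget come from this article; the executor model id and the remaining tool-entry fields are assumptions for illustration.

```python
# Sketch: a Haiku-class executor request that can escalate to an Opus
# advisor. Only the tool type and max_uses come from the article; the
# model id and other fields are illustrative assumptions.

def executor_request(prompt: str, max_uses: int = 1) -> dict:
    """Cheap executor drives the loop; advisor is a budgeted escape hatch."""
    if max_uses < 1:
        raise ValueError("advisor needs at least one permitted use")
    return {
        "model": "claude-haiku-4-5",     # hypothetical executor id
        "max_tokens": 2048,
        "tools": [{
            "type": "advisor_20260301",  # on-demand Opus consultation
            "name": "advisor",
            "max_uses": max_uses,        # hard budget on escalations
        }],
        "messages": [{"role": "user", "content": prompt}],
    }
```

The key design point is that the expensive model appears only inside the tool entry: every loop iteration bills at the executor's rate unless the executor itself decides it is stuck.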

The decision surface

These levers compose independently — any combination is valid:

| Workload | Effort | Thinking | Advisor | Why |
|---|---|---|---|---|
| High-volume tagging / extraction | low / medium | adaptive | Haiku + Opus advisor (max_uses=1) | Cheap base + rare deep reasoning |
| Agentic coding, hard problems | xhigh | adaptive | Sonnet + Opus advisor (max_uses=3) | Quality priority, cost still controlled |
| Long-horizon autonomous work | high | adaptive | Opus solo | Don’t add complexity; effort handles it |
| Short lookups, simple responses | low | adaptive (rarely triggers) | none | Minimum cost, nothing to escalate |
| Unpredictable complexity | xhigh | adaptive | Sonnet + Opus advisor | Let the executor decide when to escalate |
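
The decision surface can be encoded as a straight lookup, sketched below. The workload labels and settings mirror the table's rows; where the table gives a range or leaves a value open (the low/medium row, the unpredictable row's max_uses), the choice here is an illustrative assumption.

```python
# Sketch: the decision-surface table as a lookup. "medium" for the
# high-volume row and max_uses=3 for the unpredictable row are
# illustrative picks where the table gives a range or no number.

LEVER_SETTINGS = {
    "high_volume_extraction":  {"effort": "medium", "thinking": "adaptive",
                                "advisor": ("haiku+opus", 1)},
    "hard_agentic_coding":     {"effort": "xhigh", "thinking": "adaptive",
                                "advisor": ("sonnet+opus", 3)},
    "long_horizon_autonomous": {"effort": "high", "thinking": "adaptive",
                                "advisor": None},  # Opus solo
    "short_lookup":            {"effort": "low", "thinking": "adaptive",
                                "advisor": None},
    "unpredictable":           {"effort": "xhigh", "thinking": "adaptive",
                                "advisor": ("sonnet+opus", 3)},
}

def levers_for(workload: str) -> dict:
    """Return the lever combination for a classified workload."""
    try:
        return LEVER_SETTINGS[workload]
    except KeyError:
        raise ValueError(f"unclassified workload: {workload!r}") from None
```

Raising on an unclassified workload is deliberate: forcing the classification step is most of the value of the table.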

Applying to existing workflows in this wiki

Hermes Agent workflows (productivity-workflows, marketing-applications)

Hermes’ self-improving agent runs many cheap decisions and occasional hard ones: a classic advisor-pattern fit, with a Haiku executor plus an Opus advisor at max_uses=1 or 2. Hermes’ learning-loop memory adds its own adaptive-reasoning layer on top; the advisor tool composes cleanly with it.

SEO content pipeline (seo-content topic)

The GSC Autonomous SEO engine, the Blog-Agent-Worker pipeline, and Clawdbot competitive intel are all fan-out workflows where 90% of decisions are routine and 10% are architectural. Sonnet with an Opus advisor at max_uses=2-3 is the sweet spot; run at xhigh effort during content-generation passes.

Marketing automation (marketing-automation-use-cases)

Per-campaign workflows at scale should run at medium effort on Haiku or Sonnet, with advisor gated on complex decisions only. The marketing workloads profile matches “high-volume tagging / extraction” row above — Haiku + Opus advisor at low-medium effort.

Agent workflow patterns (agent-workflow-patterns)

The three workflow shapes (Sequential, Parallel, Evaluator-Optimizer) sit orthogonal to these three levers. Choose the workflow shape first, then tune the levers inside each stage. Evaluator-Optimizer loops specifically benefit from asymmetric lever settings — cheap executor for generation, heavy advisor for evaluation.
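
The asymmetric Evaluator-Optimizer setup above might be sketched as two configs: cheap generation, heavy evaluation. Model ids and field shapes here are assumptions carried over from the earlier sketches; the point is only the asymmetry.

```python
# Sketch: asymmetric lever settings inside an Evaluator-Optimizer loop.
# Generation runs cheap; evaluation gets high effort plus an Opus
# advisor. Model ids and field shapes are illustrative assumptions.

def generator_config() -> dict:
    """Drafting stage: medium effort, no advisor, fast and cheap."""
    return {"model": "claude-sonnet-4-5", "effort": "medium",
            "thinking": {"type": "adaptive"}}

def evaluator_config(max_uses: int = 2) -> dict:
    """Judging stage: quality burden lives here, so it gets the
    higher effort level and a budgeted Opus advisor."""
    return {"model": "claude-sonnet-4-5", "effort": "xhigh",
            "thinking": {"type": "adaptive"},
            "tools": [{"type": "advisor_20260301", "name": "advisor",
                       "max_uses": max_uses}]}
```

Because the evaluator runs once per loop iteration while the generator may produce several candidates, concentrating cost in evaluation keeps the total bill dominated by the cheap stage.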

Key Takeaways

  • Three levers, not one dial. Effort (per-call budget), Adaptive Thinking (per-step depth), Advisor (on-demand escalation). They compose multiplicatively.
  • Opus 4.7 collapses two levers together. No manual thinking budgets — you get adaptive thinking + effort together. Simpler, but means budget_tokens-based harnesses break.
  • Advisor inverts the cost curve. Traditional “get better model” pays full rate on every call. Advisor pays cheap rate on the loop, expensive rate only on escalation.
  • High-volume workloads favor Haiku + advisor. 85% cheaper than Sonnet solo, 2× Haiku’s solo benchmark. This is the pattern for tagging, routing, extraction, summarization at scale.
  • Interactive coding favors xhigh + Sonnet executor + Opus advisor. Quality on the hard problems, cost control on the rest.
  • Workflow shape precedes lever tuning. Evaluator-Optimizer decisions come first; lever settings tune each stage within them.

Try It

  1. Characterize one workload. Pick a real Claude-driven flow (Hermes task, SEO pipeline stage, marketing campaign). Classify it in the decision-surface table above. Confirm the lever combination you’re using matches.
  2. A/B one lever at a time. Change only effort on a representative 10-task sample. Measure cost, latency, quality. Repeat for thinking config. Repeat for advisor. Singular changes teach you the shape of each lever; compound changes obscure it.
  3. Move a high-volume workflow to Haiku + advisor. If you’re running Sonnet on anything that tags, routes, extracts, or summarizes, try Haiku + Opus advisor with max_uses=1. Per Anthropic’s own numbers it should be 85% cheaper at double the quality of Haiku alone.
  4. Add advisor to an existing agentic loop. Take one Claude Code skill or routine you run frequently. Add advisor_20260301 with max_uses=3. Measure whether “stuck” cases now resolve without human intervention.
  5. Read the three source articles in order: Opus 4.7 → Extended Thinking → Advisor Strategy. This builds the model from the cheapest lever (effort) to the most complex (advisor).
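
The one-lever-at-a-time A/B from step 2 reduces to simple bookkeeping, sketched below. run_task is a stand-in you supply (in practice it would call the API and return cost, latency, and quality for one task), so nothing here is Anthropic-specific.

```python
# Sketch: A/B one lever (effort) across a fixed task sample, per the
# "Try It" steps. run_task is a user-supplied callable returning
# {"cost": ..., "latency": ..., "quality": ...} per task.
import statistics
from typing import Callable

def ab_effort(tasks: list, run_task: Callable,
              levels: tuple = ("medium", "xhigh")) -> dict:
    """Run the same sample at each effort level; report mean metrics."""
    report = {}
    for level in levels:
        results = [run_task(t, effort=level) for t in tasks]
        report[level] = {
            "cost":    statistics.mean(r["cost"] for r in results),
            "latency": statistics.mean(r["latency"] for r in results),
            "quality": statistics.mean(r["quality"] for r in results),
        }
    return report
```

Holding the task sample and every other lever fixed is what makes the comparison meaningful; rerun the same harness changing only the thinking config, then only the advisor, to map each lever separately.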