Tool use, planning, multi-agent patterns, agent frameworks, and practical agent deployments. Covers both Claude-specific agent features and general agentic architecture patterns.

Articles

  • Claude Agent Hierarchy — When to Use Which — Comparison of Claude’s three agent tiers (Managed Agents, Agent Teams, Subagents) with decision framework for choosing the right one.
  • Agent Workflow Patterns — Sequential, Parallel, Evaluator-Optimizer — Anthropic’s official taxonomy of the three workflow shapes that keep showing up in production, plus the decision framework. Default to sequential. “Start with the simplest pattern that solves your problem.”
  • AI Agents Unleashed — 2026 Playbook (Mindstream × Futurepedia) — Platform-agnostic implementation guide: chatbot-vs-agent reframe, precision framework, “Is this an agent job?” decision tree, 4-phase roadmap, 7 pitfalls, human-AI relationship timeline, 7 training competencies. Authors: Adam Biddlecombe + Kevin Hutson.
  • Nous Research Hermes Agent — Self-hosted autonomous agent with persistent memory, auto-generated skills, 47+ tools, 6 sandbox backends, 15+ messaging platforms, and MCP integration. Model-agnostic (Nous Portal, OpenRouter, OpenAI, or any OAI-compatible endpoint). MIT, 97K+ stars.
  • Adaline — End-to-End AI Agent Platform — Single platform for the four-stage agent lifecycle: iterate, evaluate, deploy, monitor. Provider-agnostic prompt management, multi-modal + dynamic-variable testing, AI-assisted test-suite generation, multi-environment deployments with smart diffing and instant rollbacks, full traces/spans, human-annotation loop tied directly to monitoring. Recently went GA with $1MM API-credit promotion. Customers: McKinsey (Lilli), Discord, Coframe, Reforge. Stats claimed: 200M+ API calls/day, 5B+ tokens/day, 300+ models, 99.998% uptime. Sits alongside LangSmith / PromptLayer / Helicone / Braintrust / Galileo in the LLMOps space.
  • TinyFish — Web Infrastructure APIs for AI Agents — Four-product platform under one API key: Search, Fetch, Browser, Agent. Search + Fetch went free May 4 2026 across REST / MCP / SDKs / CLI / Skill (free-tier 5 q/min Search, 25 URLs/min Fetch). Custom Chromium fleet with 28 C++-level anti-bot mechanisms; sub-250ms browser cold start, P50 488ms search. Vendor-reported 87% token reduction and 2× completion rate when using CLI + Skill over MCP — concrete data on context-window economics. $47M Series A from ICONIQ; customers Google / DoorDash / Cigna / Volkswagen / Grubhub / NEC; integrates with Hermes / OpenClaw / Cline / Goose / Antigravity / n8n / Dify / LangChain / CrewAI. Direct competitors named in launch coverage: Browserbase (uses Exa for search), Firecrawl (agent reliability issues).
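    The free-tier quotas above (5 search queries/min, 25 fetch URLs/min) imply client-side pacing if an agent fans out many calls. A minimal sketch of a sliding-window limiter sized to those published numbers — the class and its API are illustrative, not part of any TinyFish SDK:

    ```python
    import time

    class FreeTierLimiter:
        """Sliding-window limiter sized to TinyFish's stated free tier.

        This is a hypothetical client-side sketch; TinyFish's SDKs may
        handle pacing differently.
        """
        def __init__(self, per_minute: int):
            self.per_minute = per_minute
            self.calls: list[float] = []   # timestamps of recent calls

        def allow(self, now: float | None = None) -> bool:
            now = time.monotonic() if now is None else now
            # Keep only timestamps inside the trailing 60-second window.
            self.calls = [t for t in self.calls if now - t < 60.0]
            if len(self.calls) < self.per_minute:
                self.calls.append(now)
                return True
            return False

    search_limiter = FreeTierLimiter(5)    # Search: 5 q/min
    fetch_limiter = FreeTierLimiter(25)    # Fetch: 25 URLs/min
    ```

    An agent loop would call `allow()` before each request and sleep or queue when it returns False.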
  • ScrapeCreators — Social Media Scraping API for AI Pipelines — Adrian Horning’s (Austin, TX) social-scraping API across 20+ platforms (TikTok 20 endpoints, Instagram 12, YouTube 12, Facebook 9 + Ad Library, X/Twitter 6, LinkedIn 4 + Ad Library, Reddit 5, Pinterest 4, Threads 5; plus Bluesky/Truth Social/Twitch/Spotify/Snapchat/Kick + 4 ad libraries + 5 link-in-bio platforms). 100 free credits no-card; pay-as-you-go from 497/500k credits → enterprise. Single x-api-key header, no rate limits, JSON-only, ~3.1s avg response, claimed 1M+ req/day at 98.2% success. Ships official MCP server (@scrape-creators/mcp) + CLI + first-party Claude Code skill. Sister to TinyFish (web infra) — ScrapeCreators is the social-platform-deep counterpart. Karpathy last30days SessionStart hook calls it out by name as the gap-filler for Reddit comments + TikTok + Instagram (note: hook quotes 10k free credits, landing page shows 100 — flagged for refresh).
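    The surface area is wide but the wire protocol is deliberately flat: one x-api-key header and JSON-only responses. A hedged sketch of what a call looks like — the base URL and route are placeholders, not documented endpoints:

    ```python
    import urllib.request
    from urllib.parse import urlencode

    API_KEY = "YOUR_KEY_HERE"  # one key covers every platform endpoint

    def build_request(path: str, params: dict) -> urllib.request.Request:
        # ScrapeCreators-style call: a single x-api-key header, JSON-only
        # responses. The host and route below are hypothetical
        # illustrations, not the service's actual URLs.
        url = f"https://api.scrapecreators.example/v1/{path}?{urlencode(params)}"
        return urllib.request.Request(url, headers={"x-api-key": API_KEY})

    req = build_request("tiktok/profile", {"handle": "someuser"})
    ```

    The single-header design is what makes the MCP server and Claude Code skill thin wrappers rather than full clients.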
  • Crabbox — Remote Testbox for OpenClaw Maintainers and AI Agents — github.com/openclaw/crabbox (MIT, Go, 299★ at 10 days old, created 2026-04-30, last push same-day as ingest). Short-lived Linux box for every run on shared cloud capacity: lease, sync, run, release. CLI (Go binary on the laptop) + Broker (Cloudflare Worker + 1 Durable Object) + Runner (Hetzner / AWS Spot / Azure / static-SSH / Blacksmith-testbox). Brokered mode keeps provider creds off laptops; CLI carries only a bearer token. Cost guardrails first-class — TTL caps + monthly spend caps + per-user/org/provider tracking via crabbox usage. Ships as standalone CLI (brew install openclaw/tap/crabbox) AND native OpenClaw plugin exposing 5 agent tools (crabbox_run / _warmup / _status / _list / _stop). The OpenClaw answer to the infrastructure-was-the-wall thesis Anthropic’s Platform team articulates — open + self-hosted + multi-cloud counterpart to Anthropic’s Managed Agents. Notable design choice: crabbox actions hydrate reuses existing GitHub Actions setup steps, so local Crabbox runs land in the same hydrated workspace as CI (no duplicate local + CI bootstrap config). Same loop for agents and humans.
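    The lease → sync → run → release loop pairs naturally with a guaranteed-release wrapper, which is also where the TTL-cap guardrail bites. A sketch of that lifecycle as a context manager — class and method names mirror the description above, not Crabbox's real CLI or plugin API:

    ```python
    from contextlib import contextmanager

    class Testbox:
        # Hypothetical stand-in for a leased Crabbox runner; real runs
        # go through the CLI or the crabbox_* agent tools.
        def __init__(self, ttl_minutes: int):
            self.ttl_minutes = ttl_minutes   # TTL cap: first-class cost guardrail
            self.state = "released"
        def lease(self):
            self.state = "leased"
        def sync(self, workspace: str):
            self.state = "synced"
        def run(self, cmd: str) -> int:
            self.state = "done"
            return 0                         # exit code of cmd on the box (stubbed)
        def release(self):
            self.state = "released"

    @contextmanager
    def leased_box(ttl_minutes: int = 30):
        box = Testbox(ttl_minutes)
        box.lease()
        try:
            yield box
        finally:
            box.release()  # release even on failure, keeping spend inside the cap

    with leased_box() as box:
        box.sync(".")
        exit_code = box.run("go test ./...")
    ```

    The finally-block release is the point: agents and humans share the same loop, and neither can leak a box past its TTL.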
  • Paperclip — Multi-Agent Company Orchestration Platform — Paperclip frames AI agent management as running an AI company rather than configuring a single coding assistant. Heartbeat system (9-step protocol per agent: receive task → check budget → load skills → plan → execute → log → checkpoint → return → sleep), goal cascade (org-level → team-level → agent-level), full org-chart UI showing reporting structure and inter-agent message volume, five agent configuration areas (Instructions / Configuration / Skills / Budget / Runs), 16 pre-built example “companies” including Agency Agents and Fullstack Forge. Native Claude Code REST API integration on localhost:3100 — Paperclip can dispatch work to a local Claude Code instance instead of going through the Anthropic API directly. Closes the “I want one agent that runs the whole company while I sleep” loop that single-agent frameworks struggle with. AIS+ resource bundle entry; companion course to Codex 1-Hour and Hermes 1-Hour from the same operator.
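    The 9-step heartbeat is easiest to see as a fixed pipeline with one early exit at the budget check. A sketch under stated assumptions — step names follow the description above, but the dict schema and halt behavior are illustrative, not Paperclip's actual API:

    ```python
    HEARTBEAT = ["receive_task", "check_budget", "load_skills", "plan",
                 "execute", "log", "checkpoint", "return_result", "sleep"]

    def heartbeat(agent: dict, task: str) -> list[str]:
        # One pass of the 9-step per-agent protocol; the budget check
        # halts before any paid work happens. Field names ("spent",
        # "budget") are hypothetical.
        trace = []
        for step in HEARTBEAT:
            if step == "check_budget" and agent["spent"] >= agent["budget"]:
                trace.append("halt:budget_exhausted")
                break
            trace.append(step)
        return trace

    ok = heartbeat({"spent": 2.0, "budget": 10.0}, "triage inbox")
    broke = heartbeat({"spent": 10.0, "budget": 10.0}, "triage inbox")
    ```

    Putting check_budget second, before skills load or planning starts, is what makes per-agent Budget a hard cap rather than an advisory setting.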
  • Autobrowse — Self-Improving Browser-Agent Harness (Browserbase) — Browserbase’s harness that runs a browser agent against a real task on a real site, iterates the strategy via a strategy.md scratchpad until the workflow converges, then graduates the winning approach into a markdown SKILL.md plus deterministic helper scripts. Frames the loop as the Karpathy autoresearch ratchet applied to browser-skill discovery. Concrete benchmarks (Browserbase-reported): Craigslist task graduated at $0.12 / 27 s; form-fill at $0.24 in 4 iterations; a federal grants portal collapsed a 28-page scrape into a single browse fetch after Autobrowse surfaced an undocumented JSON endpoint. Cap iterations low (~3-5) and short-circuit aggressively. Honest failure mode: deterministic-parsing tasks (a 167-row static HTML state catalog cost ~$24 across 4 iterations before pivoting to 200 lines of Python with browse fetch + BeautifulSoup); the lesson is written into the skill itself: probe with fetch first, escalate to Autobrowse only if the response is empty, dynamic, or gated. Output is small, readable markdown (frontmatter with recommended_method + alternative_methods + a source trace listing iterations, convergence date, and cross-region prod validation; body sections for Purpose / When to Use / Workflow / Site-Specific Gotchas) — the same format Browserbase’s internal generalist agent bb already loads on demand for feature requests, session investigations, PRs, and sales triage. Skills as customer handoff: durable, debuggable, human-auditable, ownable; both engineers and non-engineers (technical PM, VP of tech, grants manager) can read them. Same memory-as-bottleneck thesis as Memory & Dreaming and the Platform team interview, applied to browser agents specifically. Roadmap: smarter stopping (let the agent reason about its own convergence from trace structure, not just cost/turns), better priors (push the agent toward fetch/search primitives before browser sessions, and inspect network events / CDP logs to discover internal APIs), and recursive Autobrowse (improving the harness itself).
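    The escalation rule Autobrowse writes into its skills — probe with fetch first, escalate only if the response is empty, dynamic, or gated — can be sketched as a small gate. The string checks here are illustrative stand-ins for whatever heuristics the harness really applies:

    ```python
    def choose_method(fetch_body: str | None) -> str:
        # Probe-first escalation: prefer a cheap deterministic fetch,
        # fall back to a browser session only when the fetch result is
        # unusable. Detection heuristics below are hypothetical.
        if not fetch_body or not fetch_body.strip():
            return "autobrowse"            # empty response: needs a real browser
        lowered = fetch_body.lower()
        gated = "captcha" in lowered or "log in" in lowered
        dynamic = "<noscript" in lowered   # content likely rendered client-side
        return "autobrowse" if gated or dynamic else "fetch"
    ```

    This is exactly the pivot the state-catalog failure taught: a static 167-row HTML page should route to "fetch" and a parser, never to a $24 iteration loop.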

Adjacent: long-running agent showcases

  • ClaudePlaysPokemon — [Reddit signal: r/ClaudeCode, 2026-05-07] Opus 4.7 run currently streaming live at twitch.tv/claudeplayspokemon. Passion project by David Hershey (Anthropic Applied AI team), started June 2024 to learn agent development; went public when Sonnet 3.7 launched February 2025. Anthropic doesn’t own it but promotes it and subsidizes the API costs since Claude is the model. Useful as a publicly-observable benchmark of long-horizon agent capability — what the model does on a single complex environment given multi-day continuous compute. Source: raw/reddit-1t5y55h.md (r/ClaudeCode, 41 upvotes).

10 items under this folder.