
The official Anthropic exam guide for the Claude Certified Architect — Foundations (CCA-F) certification, marked “Confidential Need to Know.” This document is the definitive reference for exam content: it lists every domain, every task statement, all knowledge and skill items, the 6 exam scenarios, scoring mechanics, and sample questions. If you study one document, study this one.

Target Candidate

The ideal candidate is a solution architect who designs and implements production applications with Claude. Expected experience:

  • 6+ months hands-on with Claude APIs, Agent SDK, Claude Code, and MCP
  • Building agentic applications with multi-agent orchestration, subagent delegation, tool integration, and lifecycle hooks
  • Configuring Claude Code for team workflows (CLAUDE.md, Agent Skills, MCP servers, plan mode)
  • Designing MCP tool and resource interfaces for backend system integration
  • Engineering prompts that produce reliable structured output (JSON schemas, few-shot, extraction patterns)
  • Managing context windows across long documents, multi-turn conversations, and multi-agent handoffs
  • Integrating Claude into CI/CD pipelines for automated code review, test generation, and PR feedback
  • Making escalation and reliability decisions (error handling, human-in-the-loop, self-evaluation)

Exam Format

  • 60 multiple choice questions — one correct answer, three distractors
  • Pass/fail with a scaled score of 100 to 1000
  • Passing score: 720
  • No penalty for guessing — unanswered questions scored as incorrect
  • Scaled scoring equalizes difficulty across different exam forms
  • 4 of 6 scenarios randomly selected per exam sitting

The Six Exam Scenarios

Each scenario presents a realistic production context. Questions are framed within these scenarios.

Scenario 1: Customer Support Resolution Agent

  • Building a customer support agent using the Claude Agent SDK
  • Handles returns, billing disputes, account issues via MCP tools (get_customer, lookup_order, process_refund, escalate_to_human)
  • Target: 80%+ first-contact resolution while knowing when to escalate
  • Primary domains: Agentic Architecture, Tool Design & MCP, Context Management & Reliability

Scenario 2: Code Generation with Claude Code

  • Using Claude Code for code generation, refactoring, debugging, documentation
  • Custom slash commands, CLAUDE.md configurations, plan mode vs direct execution
  • Primary domains: Claude Code Configuration & Workflows, Context Management & Reliability

Scenario 3: Multi-Agent Research System

  • Coordinator agent delegates to specialized subagents: web search, document analysis, report synthesis
  • Produces comprehensive, cited reports
  • Primary domains: Agentic Architecture, Tool Design & MCP, Context Management & Reliability

Scenario 4: Developer Productivity with Claude

  • Agent helps engineers explore unfamiliar codebases, understand legacy systems, generate boilerplate
  • Uses built-in tools (Read, Write, Bash, Grep, Glob) and MCP servers
  • Primary domains: Tool Design & MCP, Claude Code Configuration, Agentic Architecture

Scenario 5: Claude Code for CI/CD

  • Claude Code integrated into CI/CD pipeline for automated code reviews, test generation, PR feedback
  • Design prompts for actionable feedback with minimal false positives
  • Primary domains: Claude Code Configuration, Prompt Engineering & Structured Output

Scenario 6: Structured Data Extraction

  • Extracts information from unstructured documents, validates with JSON schemas, maintains high accuracy
  • Must handle edge cases gracefully and integrate with downstream systems
  • Primary domains: Prompt Engineering & Structured Output, Context Management & Reliability

Domain 1: Agentic Architecture & Orchestration (27%)

The highest-weighted domain. Tests how to design, implement, and manage autonomous agent systems.

Task 1.1: Design and implement agentic loops for autonomous task execution

Knowledge of:

  • The agentic loop lifecycle: send request, inspect stop_reason ("tool_use" vs "end_turn"), execute tools, return results for next iteration
  • How tool results are appended to conversation history so the model reasons about next action
  • Model-driven decision-making (Claude reasons about which tool to call) vs pre-configured decision trees
  • Anti-patterns: parsing natural language to determine loop termination, arbitrary iteration caps as primary stopping mechanism, checking assistant text for completion indicators

Skills in:

  • Implementing control flow that continues when stop_reason is "tool_use" and terminates on "end_turn"
  • Adding tool results to conversation context between iterations
  • Avoiding anti-patterns like natural language parsing for termination
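
A minimal sketch of this loop using the Python Anthropic SDK. The `run_tool` dispatcher and tool definitions are hypothetical; the point is that control flow keys off `stop_reason`, never off parsing assistant text or an arbitrary iteration cap.

```python
import anthropic

client = anthropic.Anthropic()

def run_tool(name: str, args: dict) -> str:
    """Hypothetical dispatcher that executes a tool and returns its result as text."""
    ...

def agentic_loop(messages: list, tools: list) -> str:
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        # Append the assistant turn so the model sees its own tool calls next iteration.
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # "end_turn": the model is done; return its final text.
            return "".join(b.text for b in response.content if b.type == "text")

        # Execute every requested tool and feed the results back as the next user turn.
        tool_results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": run_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": tool_results})
```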

Task 1.2: Orchestrate multi-agent systems with coordinator-subagent patterns

Knowledge of:

  • Hub-and-spoke architecture: coordinator manages all inter-subagent communication, error handling, information routing
  • Subagents operate with isolated context — they do NOT inherit the coordinator’s conversation history automatically
  • Coordinator role: task decomposition, delegation, result aggregation, deciding which subagents to invoke based on query complexity
  • Risks of overly narrow task decomposition leading to incomplete coverage of broad research topics

Skills in:

  • Designing coordinators that dynamically select subagents rather than always routing through the full pipeline
  • Partitioning research scope across subagents to minimize duplication
  • Implementing iterative refinement loops (coordinator evaluates synthesis, re-delegates for gaps, re-invokes until sufficient)
  • Routing all communication through coordinator for observability and consistent error handling

Task 1.3: Configure subagent invocation, context passing, and spawning

Knowledge of:

  • The Task tool for spawning subagents; allowedTools must include "Task" for coordinators
  • Subagent context must be explicitly provided in the prompt — subagents do NOT automatically inherit parent context or share memory
  • AgentDefinition configuration: descriptions, system prompts, tool restrictions per subagent type
  • fork_session for exploring divergent approaches from a shared analysis baseline

Skills in:

  • Including complete findings from prior agents in the subagent’s prompt (e.g., passing web search results to synthesis subagent)
  • Using structured data formats to separate content from metadata (source URLs, page numbers) for attribution
  • Spawning parallel subagents by emitting multiple Task calls in a single coordinator response
  • Designing coordinator prompts with research goals and quality criteria, NOT step-by-step procedural instructions
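
A sketch of coordinator and subagent configuration with the Agent SDK's `AgentDefinition`. The agent names, prompts, and tool lists are illustrative, and the exact option field names may differ slightly from this shape; treat it as a sketch of the pattern, not the definitive API.

```python
from claude_agent_sdk import AgentDefinition, ClaudeAgentOptions

options = ClaudeAgentOptions(
    # The coordinator can only spawn subagents if "Task" is among its allowed tools.
    allowed_tools=["Task"],
    agents={
        "web-searcher": AgentDefinition(
            description="Searches the web for sources relevant to a research question.",
            prompt=(
                "Search for sources on the assigned question. Return findings as a "
                "structured list with source URLs and publication dates so downstream "
                "agents can preserve attribution."
            ),
            tools=["WebSearch"],
        ),
        "synthesizer": AgentDefinition(
            description="Synthesizes findings passed in its prompt into a cited report.",
            # Subagents do not inherit the coordinator's history: every finding the
            # synthesizer needs must be included explicitly in the delegation prompt.
            prompt="Write a cited report using only the findings provided in the prompt.",
            tools=[],
        ),
    },
)
# The coordinator's own prompt should state research goals and quality criteria,
# then emit multiple Task calls in a single response to run subagents in parallel.
```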

Task 1.4: Implement multi-step workflows with enforcement and handoff patterns

Knowledge of:

  • Programmatic enforcement (hooks, prerequisite gates) vs prompt-based guidance for workflow ordering
  • When deterministic compliance is required (e.g., identity verification before financial operations), prompt instructions alone have a non-zero failure rate, so programmatic enforcement is needed
  • Structured handoff protocols: customer ID, root cause analysis, recommended actions for escalation

Skills in:

  • Implementing programmatic prerequisites that block downstream tool calls until prior steps complete (e.g., blocking process_refund until get_customer returns verified ID)
  • Decomposing multi-concern customer requests into distinct items, investigating each in parallel, synthesizing unified resolution
  • Compiling structured handoff summaries when escalating to humans
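
A framework-agnostic sketch of a programmatic prerequisite gate. The hook wiring (a callback invoked before and after each tool call) is an assumption, but the blocking logic is the point: downstream financial tools are refused until `get_customer` has returned a verified ID.

```python
# Session-level state tracked by the application layer, not by the prompt.
verified_customer_id: str | None = None

GATED_TOOLS = {"lookup_order", "process_refund"}

def pre_tool_use(tool_name: str, tool_input: dict) -> dict:
    """Hypothetical pre-tool-use callback: allow the call, or block it with a reason."""
    if tool_name in GATED_TOOLS and verified_customer_id is None:
        return {
            "allow": False,
            "reason": "Identity not verified. Call get_customer and confirm the "
                      "customer ID before order lookups or refunds.",
        }
    return {"allow": True}

def post_tool_use(tool_name: str, result: dict) -> None:
    """Record the verified ID once get_customer succeeds."""
    global verified_customer_id
    if tool_name == "get_customer" and result.get("verified"):
        verified_customer_id = result["customer_id"]
```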

Task 1.5: Apply Agent SDK hooks for tool call interception and data normalization

Knowledge of:

  • PostToolUse hooks that intercept tool results for transformation before the model processes them
  • Hooks that intercept outgoing tool calls to enforce compliance (e.g., blocking refunds above a threshold)
  • Hooks provide deterministic guarantees vs prompt instructions which are probabilistic

Skills in:

  • Implementing PostToolUse hooks to normalize heterogeneous data formats (Unix timestamps, ISO 8601, numeric status codes) from different MCP tools
  • Implementing interception hooks that block policy-violating actions and redirect to alternative workflows (e.g., human escalation)
  • Choosing hooks over prompt-based enforcement when business rules require guaranteed compliance
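
A sketch of a PostToolUse-style normalization step; the field names, status mapping, and formats are illustrative. The value of the hook is that it is deterministic: heterogeneous tool results are converted to one canonical shape before the model ever reasons over them.

```python
from datetime import datetime, timezone

STATUS_CODES = {1: "open", 2: "pending", 3: "resolved"}  # hypothetical mapping

def normalize_tool_result(tool_name: str, result: dict) -> dict:
    """Convert Unix timestamps and numeric status codes into one canonical format
    before the result is appended to the conversation."""
    normalized = dict(result)
    ts = normalized.get("created_at")
    if isinstance(ts, (int, float)):  # Unix timestamp -> ISO 8601 UTC
        normalized["created_at"] = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    status = normalized.get("status")
    if isinstance(status, int):
        normalized["status"] = STATUS_CODES.get(status, f"unknown({status})")
    return normalized
```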

Task 1.6: Design task decomposition strategies for complex workflows

Knowledge of:

  • Fixed sequential pipelines (prompt chaining) vs dynamic adaptive decomposition based on intermediate findings
  • Prompt chaining for breaking reviews into sequential steps (analyze each file individually, then cross-file integration pass)
  • Adaptive investigation plans that generate subtasks based on what is discovered at each step

Skills in:

  • Selecting prompt chaining for predictable multi-aspect reviews, dynamic decomposition for open-ended investigation
  • Splitting large code reviews into per-file local analysis passes plus a separate cross-file integration pass
  • Decomposing open-ended tasks by first mapping structure, identifying high-impact areas, then creating a prioritized plan that adapts

Task 1.7: Manage session state, resumption, and forking

Knowledge of:

  • --resume <session-name> to continue a specific prior conversation
  • fork_session for creating independent branches from a shared analysis baseline
  • Informing the agent about changes to previously analyzed files when resuming sessions
  • Starting fresh with structured summaries is more reliable than resuming with stale tool results

Skills in:

  • Using --resume with session names to continue named investigation sessions
  • Using fork_session to create parallel exploration branches (comparing testing strategies or refactoring approaches)
  • Choosing between resumption (prior context mostly valid) and fresh start with injected summaries (prior tool results stale)
  • Informing resumed sessions about specific file changes for targeted re-analysis
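
A sketch of resumption and forking via Agent SDK options. The option names (`resume`, `fork_session`) follow the guide's terminology, but the exact fields and the session name are assumptions; the session name shown is hypothetical.

```python
from claude_agent_sdk import ClaudeAgentOptions

# Resume a named investigation and tell the agent exactly what changed since then,
# so it re-analyzes only the affected files instead of trusting stale tool results.
resume_options = ClaudeAgentOptions(resume="billing-incident-42")
resume_prompt = (
    "Since the last session, payments/retry.py and payments/webhooks.py changed. "
    "Re-check only the conclusions that depended on those two files."
)

# Fork the shared analysis baseline into an independent branch, e.g. to compare
# two refactoring approaches without contaminating the original session.
fork_options = ClaudeAgentOptions(resume="billing-incident-42", fork_session=True)
```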

Domain 2: Tool Design & MCP Integration (18%)

Task 2.1: Design effective tool interfaces with clear descriptions and boundaries

Knowledge of:

  • Tool descriptions are the primary mechanism LLMs use for tool selection; minimal descriptions lead to unreliable selection
  • Including input formats, example queries, edge cases, and boundary explanations in descriptions
  • Ambiguous or overlapping descriptions (e.g., analyze_content vs analyze_document with near-identical descriptions) cause misrouting
  • System prompt wording can create unintended keyword-sensitive tool associations

Skills in:

  • Writing descriptions that clearly differentiate each tool’s purpose, expected inputs/outputs, and when to use it vs alternatives
  • Renaming tools and updating descriptions to eliminate functional overlap
  • Splitting generic tools into purpose-specific tools with defined I/O contracts
  • Reviewing system prompts for keyword-sensitive instructions that override well-written tool descriptions

Task 2.2: Implement structured error responses for MCP tools

Knowledge of:

  • MCP isError flag pattern for communicating tool failures to the agent
  • Distinction between transient errors (timeouts), validation errors (invalid input), business errors (policy violations), and permission errors
  • Uniform error responses (generic “Operation failed”) prevent appropriate recovery decisions
  • Retryable vs non-retryable errors — structured metadata prevents wasted retry attempts

Skills in:

  • Returning structured error metadata: errorCategory (transient/validation/permission), isRetryable boolean, human-readable descriptions
  • Including isRetryable: false and customer-friendly explanations for business rule violations
  • Local error recovery in subagents for transient failures; propagating only unresolvable errors to coordinator
  • Distinguishing between access failures (needing retry) and valid empty results (no matches)
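
A sketch of an MCP tool handler returning structured error payloads with the `isError` flag. The field names (`errorCategory`, `isRetryable`) follow the guide; the refund thresholds, backend call, and server wiring are hypothetical.

```python
import json

def refund_backend(order_id: str, amount: float) -> dict:
    """Stand-in for the real payment-system call."""
    return {"order_id": order_id, "amount": amount, "status": "refunded"}

def process_refund(order_id: str, amount: float) -> dict:
    """Hypothetical MCP tool handler returning MCP-style content with isError set."""
    if amount > 500:
        # Business-rule violation: not retryable, explained in customer-friendly terms.
        error = {
            "errorCategory": "business_rule",
            "isRetryable": False,
            "message": "Refunds above $500 require a human agent. "
                       "Use escalate_to_human with the order details.",
        }
        return {"isError": True, "content": [{"type": "text", "text": json.dumps(error)}]}
    try:
        result = refund_backend(order_id, amount)
    except TimeoutError:
        error = {
            "errorCategory": "transient",
            "isRetryable": True,
            "message": "Payment service timed out; retrying is appropriate.",
        }
        return {"isError": True, "content": [{"type": "text", "text": json.dumps(error)}]}
    return {"isError": False, "content": [{"type": "text", "text": json.dumps(result)}]}
```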

Task 2.3: Distribute tools appropriately across agents and configure tool choice

Knowledge of:

  • Exposing too many tools (e.g., 18 instead of 4-5) degrades tool selection reliability by increasing decision complexity
  • Agents with tools outside their specialization tend to misuse them (e.g., synthesis agent attempting web searches)
  • Scoped tool access: give agents only the tools needed for their role
  • tool_choice options: "auto" (may return text), "any" (must call a tool), forced selection ({"type": "tool", "name": "..."})

Skills in:

  • Restricting each subagent’s tool set to its role, preventing cross-specialization misuse
  • Replacing generic tools with constrained alternatives (e.g., fetch_url replaced by load_document that validates document URLs)
  • Providing scoped cross-role tools (e.g., verify_fact for synthesis agent) while routing complex cases through coordinator
  • Using tool_choice forced selection to ensure a specific tool runs first in a pipeline
  • Setting tool_choice: "any" to guarantee tool use rather than conversational text
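
A sketch of the three `tool_choice` modes with the Python Anthropic SDK; the tool definition and prompt are illustrative.

```python
import anthropic

client = anthropic.Anthropic()

tools = [{
    "name": "extract_metadata",
    "description": "Extract document metadata (title, author, date) as structured fields.",
    "input_schema": {
        "type": "object",
        "properties": {"title": {"type": "string"}, "author": {"type": "string"}},
        "required": ["title"],
    },
}]

common = dict(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Process this document: ..."}],
)

# "auto": Claude may call a tool or simply answer in text.
client.messages.create(tool_choice={"type": "auto"}, **common)

# "any": Claude must call some tool (guarantees tool use, not which tool).
client.messages.create(tool_choice={"type": "any"}, **common)

# Forced selection: Claude must call this specific tool, useful for pipeline ordering.
client.messages.create(tool_choice={"type": "tool", "name": "extract_metadata"}, **common)
```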

Task 2.4: Integrate MCP servers into Claude Code and agent workflows

Knowledge of:

  • Project-level .mcp.json for shared team tooling vs user-level ~/.claude.json for personal/experimental servers
  • Environment variable expansion in .mcp.json (e.g., ${GITHUB_TOKEN}) for credential management without committing secrets
  • Tools from all configured MCP servers discovered at connection time and available simultaneously
  • MCP resources as content catalogs (issue summaries, documentation hierarchies, database schemas) to reduce exploratory tool calls

Skills in:

  • Configuring shared MCP servers in project-scoped .mcp.json with environment variable expansion
  • Configuring personal/experimental servers in user-scoped ~/.claude.json
  • Enhancing MCP tool descriptions to prevent agents from preferring built-in tools (like Grep) over more capable MCP tools
  • Choosing existing community MCP servers over custom implementations for standard integrations
  • Exposing content catalogs as MCP resources for agent visibility without exploratory tool calls
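
A sketch of generating a project-scoped .mcp.json from a setup script. The server name and package are illustrative; the `${GITHUB_TOKEN}` expansion is the pattern the guide describes for keeping credentials out of version control while sharing the config with the team.

```python
import json
from pathlib import Path

# Project-scoped config, committed to the repo so the whole team gets the same MCP tooling.
mcp_config = {
    "mcpServers": {
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            # Expanded from each developer's environment at runtime; never a literal secret.
            "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"},
        }
    }
}
Path(".mcp.json").write_text(json.dumps(mcp_config, indent=2) + "\n")
```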

Task 2.5: Select and apply built-in tools (Read, Write, Edit, Bash, Grep, Glob) effectively

Knowledge of:

  • Grep for content search (function names, error messages, import statements)
  • Glob for file path pattern matching (finding files by name or extension)
  • Read/Write for full file operations; Edit for targeted modifications using unique text matching
  • When Edit fails due to non-unique matches, use Read + Write as a fallback

Skills in:

  • Selecting Grep for searching code content, Glob for finding files matching naming patterns
  • Using Read + Write when Edit cannot find unique anchor text
  • Building codebase understanding incrementally: Grep to find entry points, Read to follow imports and trace flows
  • Tracing function usage across wrapper modules by identifying exported names, then searching across the codebase

Domain 3: Claude Code Configuration & Workflows (20%)

Task 3.1: Configure CLAUDE.md files with appropriate hierarchy, scoping, and modular organization

Knowledge of:

  • CLAUDE.md hierarchy: user-level (~/.claude/CLAUDE.md), project-level (.claude/CLAUDE.md or root CLAUDE.md), directory-level (subdirectory CLAUDE.md files)
  • User-level settings apply only to that user and are not shared via version control
  • @import syntax for referencing external files to keep CLAUDE.md modular
  • .claude/rules/ directory for topic-specific rule files as an alternative to monolithic CLAUDE.md

Skills in:

  • Diagnosing hierarchy issues (e.g., instructions in user-level rather than project-level config)
  • Using @import to selectively include standards files in each package’s CLAUDE.md
  • Splitting large CLAUDE.md into focused topic-specific files in .claude/rules/ (e.g., testing.md, api-conventions.md)
  • Using /memory to verify which files are loaded and diagnose inconsistent behavior

Task 3.2: Create and configure custom slash commands and skills

Knowledge of:

  • Project-scoped commands in .claude/commands/ (shared via version control) vs user-scoped in ~/.claude/commands/
  • Skills in .claude/skills/ with SKILL.md frontmatter: context: fork, allowed-tools, argument-hint
  • context: fork runs skills in isolated sub-agent, preventing output pollution of main conversation
  • Personal skill customization in ~/.claude/skills/ with different names to avoid affecting teammates

Skills in:

  • Creating project-scoped slash commands in .claude/commands/ for team-wide availability
  • Using context: fork to isolate verbose or exploratory skill output
  • Configuring allowed-tools to restrict tool access during skill execution
  • Using argument-hint to prompt for required parameters when invoking without arguments
  • Choosing between skills (on-demand) and CLAUDE.md (always-loaded universal standards)

Task 3.3: Apply path-specific rules for conditional convention loading

Knowledge of:

  • .claude/rules/ files with YAML frontmatter paths fields containing glob patterns
  • Path-scoped rules load only when editing matching files, reducing irrelevant context and token usage
  • Glob-pattern rules are more efficient than directory-level CLAUDE.md files for conventions spanning multiple directories

Skills in:

  • Creating .claude/rules/ files with frontmatter path scoping (e.g., paths: ["terraform/**/*"])
  • Using glob patterns to apply conventions by file type regardless of location (e.g., **/*.test.tsx)
  • Choosing path-specific rules over subdirectory CLAUDE.md when conventions apply across the codebase

Task 3.4: Determine when to use plan mode vs direct execution

Knowledge of:

  • Plan mode for complex tasks: large-scale changes, multiple valid approaches, architectural decisions, multi-file modifications
  • Direct execution for simple, well-scoped changes (single-file bug fix, adding a validation check)
  • Plan mode enables safe exploration before committing to changes
  • Explore subagent for isolating verbose discovery and returning summaries

Skills in:

  • Selecting plan mode for architectural tasks (microservice restructuring, 45+ file library migrations)
  • Selecting direct execution for well-understood changes with clear scope
  • Using Explore subagent for verbose discovery to prevent context exhaustion
  • Combining plan mode for investigation with direct execution for implementation

Task 3.5: Apply iterative refinement techniques for progressive improvement

Knowledge of:

  • Concrete input/output examples are the most effective way to communicate expected transformations when prose is interpreted inconsistently
  • Test-driven iteration: write test suites first, iterate by sharing test failures
  • The interview pattern: Claude asks questions to surface considerations the developer may not have anticipated
  • Batching interacting problems in a single message vs sequential iteration for independent problems

Skills in:

  • Providing 2-3 concrete I/O examples to clarify transformation requirements
  • Writing test suites covering expected behavior, edge cases, and performance before implementation
  • Using the interview pattern to surface design considerations in unfamiliar domains
  • Providing specific test cases with example input and expected output to fix edge cases
  • Addressing interacting issues in a single message; sequential iteration for independent issues

Task 3.6: Integrate Claude Code into CI/CD pipelines

Knowledge of:

  • -p (or --print) flag for running Claude Code non-interactively in automated pipelines
  • --output-format json and --json-schema CLI flags for structured output in CI contexts
  • CLAUDE.md provides project context (testing standards, review criteria) to CI-invoked Claude Code
  • Session context isolation: the same session that generated code is less effective at reviewing its own changes

Skills in:

  • Running Claude Code in CI with -p flag to prevent interactive input hangs
  • Using --output-format json with --json-schema for machine-parseable structured findings
  • Including prior review findings when re-running reviews after new commits (avoid duplicate comments)
  • Providing existing test files in context to avoid suggesting duplicate test scenarios
  • Documenting testing standards and available fixtures in CLAUDE.md for CI quality
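
A sketch of a CI step invoking Claude Code non-interactively. The flags (-p, --output-format json, --json-schema) are the ones the guide lists; the prompt, schema path, and output fields are hypothetical and would depend on the schema you define.

```python
import json
import subprocess

# -p prints the result to stdout and exits, so the pipeline never hangs
# waiting for interactive input.
result = subprocess.run(
    [
        "claude", "-p",
        "Review the changes in this branch against the standards in CLAUDE.md. "
        "Report only bugs and security issues, not style nits.",
        "--output-format", "json",
        "--json-schema", "ci/review-findings.schema.json",  # hypothetical schema file
    ],
    capture_output=True, text=True, check=True,
)

# The parsed structure depends on the schema above (hypothetical fields shown).
findings = json.loads(result.stdout)
blocking = [f for f in findings.get("findings", []) if f.get("severity") == "blocker"]
if blocking:
    raise SystemExit(f"{len(blocking)} blocking issue(s) found")
```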

Domain 4: Prompt Engineering & Structured Output (20%)

Task 4.1: Design prompts with explicit criteria to improve precision and reduce false positives

Knowledge of:

  • Explicit criteria over vague instructions (e.g., “flag comments only when claimed behavior contradicts actual code behavior” vs “check that comments are accurate”)
  • General instructions like “be conservative” or “only report high-confidence findings” are far less effective than specific categorical criteria
  • High false positive rates in one category undermine confidence in accurate categories

Skills in:

  • Writing specific review criteria defining which issues to report (bugs, security) vs skip (minor style, local patterns)
  • Temporarily disabling high false-positive categories to restore developer trust
  • Defining explicit severity criteria with concrete code examples for each level

Task 4.2: Apply few-shot prompting to improve output consistency and quality

Knowledge of:

  • Few-shot examples are the most effective technique for consistently formatted, actionable output
  • Few-shot examples demonstrate ambiguous-case handling (tool selection, coverage gaps)
  • Few-shot examples enable generalization to novel patterns, not just matching pre-specified cases
  • Effective for reducing hallucination in extraction tasks (informal measurements, varied document structures)

Skills in:

  • Creating 2-4 targeted few-shot examples for ambiguous scenarios showing reasoning for chosen action
  • Including examples that demonstrate specific desired output format (location, issue, severity, suggested fix)
  • Providing few-shot examples distinguishing acceptable code patterns from genuine issues
  • Using few-shot for varied document structures (inline citations vs bibliographies, methodology sections vs embedded details)
  • Adding few-shot examples showing correct extraction with varied formats to address empty/null fields

Task 4.3: Enforce structured output using tool use and JSON schemas

Knowledge of:

  • tool_use with JSON schemas is the most reliable approach for guaranteed schema-compliant structured output
  • tool_choice: "auto" (may return text), "any" (must call a tool), forced selection (must call a specific named tool)
  • Strict JSON schemas via tool use eliminate syntax errors but do NOT prevent semantic errors (line items do not sum to total, values in wrong fields)
  • Schema design: required vs optional fields, enum fields with "other" + detail string for extensible categories

Skills in:

  • Defining extraction tools with JSON schemas as input parameters, extracting structured data from tool_use response
  • Setting tool_choice: "any" to guarantee structured output when multiple schemas exist
  • Forcing a specific tool with tool_choice: {"type": "tool", "name": "extract_metadata"} for pipeline ordering
  • Designing optional (nullable) schema fields to prevent fabrication when source documents lack information
  • Adding "unclear" and "other" enum values with detail fields for extensible categorization
  • Including format normalization rules in prompts alongside strict output schemas
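
A sketch of schema-enforced extraction via tool use with the Python SDK. The invoice fields are illustrative; the nullable `purchase_order` field and the "other" enum value follow the schema-design guidance above.

```python
import anthropic

client = anthropic.Anthropic()

invoice_text = """INVOICE #INV-2041
Vendor: Acme Cloud Services
Total: $1,312.50"""

extract_invoice = {
    "name": "record_invoice",
    "description": "Record structured fields extracted from an invoice.",
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total": {"type": "number"},
            # Nullable: missing information should be null, not fabricated.
            "purchase_order": {"type": ["string", "null"]},
            "category": {"enum": ["hardware", "software", "services", "other"]},
            "category_detail": {
                "type": ["string", "null"],
                "description": "Free-text detail when category is 'other'.",
            },
        },
        "required": ["invoice_number", "total", "purchase_order", "category"],
    },
}

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=[extract_invoice],
    tool_choice={"type": "tool", "name": "record_invoice"},  # guarantees this schema
    messages=[{"role": "user", "content": f"Extract the fields from this invoice:\n\n{invoice_text}"}],
)

extracted = next(block.input for block in response.content if block.type == "tool_use")
```

Tool use guarantees the output parses against the schema, but not that it is semantically correct; that is what the validation and retry loop in the next task addresses.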

Task 4.4: Implement validation, retry, and feedback loops for extraction quality

Knowledge of:

  • Retry-with-error-feedback: append specific validation errors to the prompt on retry to guide correction
  • Retries are ineffective when information is simply absent from the source document (vs format/structural errors)
  • detected_pattern field in findings to enable systematic analysis of dismissal patterns
  • Difference between semantic validation errors (values do not sum) and schema syntax errors (eliminated by tool use)

Skills in:

  • Implementing follow-up requests that include the original document, the failed extraction, and specific validation errors
  • Identifying when retries will be ineffective (missing information) vs effective (format mismatches)
  • Adding detected_pattern fields to enable analysis of false positive patterns
  • Designing self-correction validation: calculated_total alongside stated_total, conflict_detected booleans
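
A sketch of a retry-with-error-feedback loop. `extract_fn` stands in for the tool-use extraction call from the previous sketch, and the validation rule is a hypothetical semantic check; the key move is re-sending the original document, the failed extraction, and the specific validation errors.

```python
from typing import Callable

def validate(extracted: dict) -> list[str]:
    """Hypothetical semantic validation; schema syntax is already guaranteed by tool use."""
    errors = []
    items = extracted.get("line_items") or []
    if items and abs(sum(i["amount"] for i in items) - extracted.get("total", 0)) > 0.01:
        errors.append("Line items do not sum to the stated total.")
    return errors

def extract_with_retry(document: str,
                       extract_fn: Callable[[str], dict],
                       max_attempts: int = 3) -> dict:
    prompt = document
    extracted = extract_fn(prompt)
    for _ in range(max_attempts - 1):
        errors = validate(extracted)
        if not errors:
            break
        # Feed back the document, the failed output, and the *specific* errors.
        prompt = (
            f"{document}\n\nYour previous extraction failed validation:\n"
            + "\n".join(f"- {e}" for e in errors)
            + f"\n\nPrevious extraction: {extracted}\nCorrect only these issues."
        )
        extracted = extract_fn(prompt)
    return extracted
```

If the errors stem from information that is simply absent from the document, no amount of retrying will help; that case should surface as a null field or a human-review route instead.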

Task 4.5: Design efficient batch processing strategies

Knowledge of:

  • Message Batches API: 50% cost savings, up to 24-hour processing window, no guaranteed latency SLA
  • Appropriate for non-blocking, latency-tolerant workloads (overnight reports, weekly audits, nightly test generation)
  • NOT appropriate for blocking workflows (pre-merge checks)
  • Batch API does not support multi-turn tool calling within a single request
  • custom_id fields for correlating batch request/response pairs

Skills in:

  • Matching API approach to latency requirements: synchronous for blocking, batch for overnight/weekly
  • Calculating batch submission frequency based on SLA constraints (e.g., 4-hour windows for 30-hour SLA)
  • Handling batch failures: resubmit only failed documents (identified by custom_id) with modifications
  • Using prompt refinement on a sample set before batch-processing large volumes
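
A sketch of overnight batch submission with the Message Batches API; the workload and prompts are illustrative. `custom_id` is what lets you resubmit only the failed documents afterwards.

```python
import anthropic

client = anthropic.Anthropic()

documents = {"doc-001": "legacy module A ...", "doc-002": "legacy module B ..."}  # hypothetical

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": doc_id,  # correlates each result with its source document
            "params": {
                "model": "claude-sonnet-4-5",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": f"Summarize tech debt in:\n{text}"}],
            },
        }
        for doc_id, text in documents.items()
    ]
)

# Later, once processing has ended (it can take up to 24 hours):
failed_ids = [
    r.custom_id
    for r in client.messages.batches.results(batch.id)
    if r.result.type != "succeeded"
]
# Resubmit only failed_ids, with any prompt or schema adjustments, in the next batch.
```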

Task 4.6: Design multi-instance and multi-pass review architectures

Knowledge of:

  • Self-review limitations: a model retains reasoning context from generation, making it less likely to question its own decisions
  • Independent review instances (without prior reasoning context) are more effective at catching subtle issues
  • Multi-pass review: per-file local analysis passes plus cross-file integration passes to avoid attention dilution

Skills in:

  • Using a second independent Claude instance to review generated code without the generator’s reasoning context
  • Splitting large multi-file reviews into per-file passes plus integration passes for cross-file data flow
  • Running verification passes where the model self-reports confidence alongside each finding

Domain 5: Context Management & Reliability (15%)

Task 5.1: Manage conversation context to preserve critical information across long interactions

Knowledge of:

  • Progressive summarization risks: condensing numerical values, percentages, dates, and customer-stated expectations into vague summaries loses actionable detail
  • The “lost in the middle” effect: models reliably process information at the beginning and end of long inputs but may omit findings from middle sections
  • How tool results accumulate in context and consume tokens disproportionately to their relevance (e.g., 40+ fields per order lookup when only 5 are relevant)
  • The importance of passing complete conversation history in subsequent API requests to maintain conversational coherence

Skills in:

  • Extracting transactional facts (amounts, dates, order numbers, statuses) into a persistent “case facts” block included in each prompt, outside summarized history
  • Extracting and persisting structured issue data (order IDs, amounts, statuses) into a separate context layer for multi-issue sessions
  • Trimming verbose tool outputs to only relevant fields before they accumulate in context
  • Placing key findings summaries at the beginning of aggregated inputs and organizing detailed results with explicit section headers to mitigate position effects
  • Requiring subagents to include metadata (dates, source locations, methodological context) in structured outputs to support accurate downstream synthesis
  • Modifying upstream agents to return structured data (key facts, citations, relevance scores) instead of verbose content when downstream agents have limited context budgets
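
A sketch of keeping a persistent "case facts" block outside the summarized history. How facts get extracted is assumed (a cheap model call or simple parsing of tool results); the point is that exact amounts, dates, and IDs never pass through lossy summarization.

```python
case_facts: dict[str, str] = {}   # persists for the whole session, never summarized

def record_fact(key: str, value: str) -> None:
    """Called whenever a tool result or customer message yields a transactional fact."""
    case_facts[key] = value

def build_prompt(summarized_history: str, latest_message: str) -> str:
    facts_block = "\n".join(f"- {k}: {v}" for k, v in case_facts.items())
    return (
        "CASE FACTS (authoritative, do not restate approximately):\n"
        f"{facts_block}\n\n"
        "CONVERSATION SUMMARY:\n"
        f"{summarized_history}\n\n"
        "CUSTOMER:\n"
        f"{latest_message}"
    )

# Example (hypothetical values)
record_fact("order_number", "ORD-88412")
record_fact("refund_amount", "$129.99")
record_fact("promised_delivery_date", "2025-03-14")
prompt = build_prompt("Customer reported a late delivery and asked about a refund.",
                      "Where is my refund?")
```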

Task 5.2: Design effective escalation and ambiguity resolution patterns

Knowledge of:

  • Appropriate escalation triggers: customer requests for a human, policy exceptions/gaps (not just complex cases), and inability to make meaningful progress
  • The distinction between escalating immediately when a customer explicitly demands it versus offering to resolve when the issue is straightforward
  • Why sentiment-based escalation and self-reported confidence scores are unreliable proxies for actual case complexity
  • How multiple customer matches require clarification (requesting additional identifiers) rather than heuristic selection

Skills in:

  • Adding explicit escalation criteria with few-shot examples to the system prompt demonstrating when to escalate versus resolve autonomously
  • Honoring explicit customer requests for human agents immediately without first attempting investigation
  • Acknowledging frustration while offering resolution when the issue is within the agent’s capability, escalating only if the customer reiterates their preference
  • Escalating when policy is ambiguous or silent on the customer’s specific request (e.g., competitor price matching when policy only addresses own-site adjustments)
  • Instructing the agent to ask for additional identifiers when tool results return multiple matches, rather than selecting based on heuristics

Task 5.3: Implement error propagation strategies across multi-agent systems

Knowledge of:

  • Structured error context (failure type, attempted query, partial results, alternative approaches) enables intelligent coordinator recovery decisions
  • The distinction between access failures (timeouts needing retry decisions) and valid empty results (successful queries with no matches)
  • Why generic error statuses (“search unavailable”) hide valuable context from the coordinator
  • Why silently suppressing errors (returning empty results as success) and terminating entire workflows on single failures are both anti-patterns

Skills in:

  • Returning structured error context including failure type, what was attempted, partial results, and potential alternatives to enable coordinator recovery
  • Distinguishing access failures from valid empty results in error reporting so the coordinator can make appropriate decisions
  • Having subagents implement local recovery for transient failures and only propagate errors they cannot resolve, including what was attempted and partial results
  • Structuring synthesis output with coverage annotations indicating which findings are well-supported versus which topic areas have gaps due to unavailable sources

Task 5.4: Manage context effectively in large codebase exploration

Knowledge of:

  • Context degradation in extended sessions: models start giving inconsistent answers and referencing “typical patterns” rather than specific classes discovered earlier
  • The role of scratchpad files for persisting key findings across context boundaries
  • Subagent delegation for isolating verbose exploration output while the main agent coordinates high-level understanding
  • Structured state persistence for crash recovery: each agent exports state to a known location, and the coordinator loads a manifest on resume

Skills in:

  • Spawning subagents to investigate specific questions (e.g., “find all test files,” “trace refund flow dependencies”) while the main agent preserves high-level coordination
  • Having agents maintain scratchpad files recording key findings, referencing them for subsequent questions to counteract context degradation
  • Summarizing key findings from one exploration phase before spawning sub-agents for the next phase, injecting summaries into initial context
  • Designing crash recovery using structured agent state exports (manifests) that the coordinator loads on resume and injects into agent prompts
  • Using /compact to reduce context usage during extended exploration sessions when context fills with verbose discovery output

Task 5.5: Design human review workflows and confidence calibration

Knowledge of:

  • The risk that aggregate accuracy metrics (e.g., 97% overall) may mask poor performance on specific document types or fields
  • Stratified random sampling for measuring error rates in high-confidence extractions and detecting novel error patterns
  • Field-level confidence scores calibrated using labeled validation sets for routing review attention
  • The importance of validating accuracy by document type and field segment before automating high-confidence extractions

Skills in:

  • Implementing stratified random sampling of high-confidence extractions for ongoing error rate measurement and novel pattern detection
  • Analyzing accuracy by document type and field to verify consistent performance across all segments before reducing human review
  • Having models output field-level confidence scores, then calibrating review thresholds using labeled validation sets
  • Routing extractions with low model confidence or ambiguous/contradictory source documents to human review, prioritizing limited reviewer capacity

Task 5.6: Preserve information provenance and handle uncertainty in multi-source synthesis

Knowledge of:

  • How source attribution is lost during summarization steps when findings are compressed without preserving claim-source mappings
  • The importance of structured claim-source mappings that the synthesis agent must preserve and merge when combining findings
  • How to handle conflicting statistics from credible sources: annotating conflicts with source attribution rather than arbitrarily selecting one value
  • Temporal data: requiring publication/collection dates in structured outputs to prevent temporal differences from being misinterpreted as contradictions

Skills in:

  • Requiring subagents to output structured claim-source mappings (source URLs, document names, relevant excerpts) that downstream agents preserve through synthesis
  • Structuring reports with explicit sections distinguishing well-established findings from contested ones, preserving original source characterizations and methodological context
  • Completing document analysis with conflicting values included and explicitly annotated, letting the coordinator decide how to reconcile before passing to synthesis
  • Requiring subagents to include publication or data collection dates in structured outputs to enable correct temporal interpretation
  • Rendering different content types appropriately in synthesis outputs — financial data as tables, news as prose, technical findings as structured lists — rather than converting everything to a uniform format

Sample Questions

The exam guide includes 9 sample questions across 3 scenarios. They illustrate the exam format, difficulty level, and reasoning patterns.

Scenario: Customer Support Resolution Agent

Q1 — Programmatic prerequisites beat prompt enforcement. Agent skips get_customer 12% of the time, calling lookup_order using the customer’s stated name, causing misidentified accounts. Answer: Add a programmatic prerequisite that blocks lookup_order and process_refund calls until get_customer returns a verified customer ID. Prompts and few-shot examples are probabilistic (Options B/C); routing classifiers address tool availability, not ordering (Option D). (Tasks 1.4, 1.5)

Q2 — Tool descriptions fix misrouting. Agent calls get_customer when users ask about orders (“check my order #12345”) because both tools have minimal, overlapping descriptions. Answer: Expand each tool’s description to include input formats, example queries, edge cases, and boundaries explaining when to use it versus alternatives. Few-shot examples (A) add token overhead without fixing the root cause; routing layers (C) are over-engineered; consolidation (D) requires more effort than the problem warrants. (Task 2.1)

Q3 — Explicit escalation criteria with few-shot. Agent achieves 55% first-contact resolution (target: 80%), escalating straightforward cases while attempting complex policy exceptions autonomously. Answer: Add explicit escalation criteria with few-shot examples demonstrating when to escalate vs resolve. Self-reported confidence (B) is poorly calibrated on hard cases; separate classifier (C) is over-engineered; sentiment analysis (D) does not correlate with case complexity. (Task 5.2)

Scenario: Code Generation with Claude Code

Q4 — Project-scoped commands in .claude/commands/. Creating a /review slash command available to every developer who clones the repo. Answer: Store it in .claude/commands/ in the project repository. ~/.claude/commands/ (B) is personal and not shared; CLAUDE.md (C) is for instructions, not command definitions; .claude/config.json with commands array (D) does not exist. (Task 3.2)

Q5 — Plan mode for complex restructuring. Restructuring a monolith into microservices, involving dozens of files and architectural decisions. Answer: Enter plan mode to explore the codebase, understand dependencies, and design an approach before making changes. Direct execution (B/C/D) risks costly rework when dependencies are discovered late. (Task 3.4)

Q6 — .claude/rules/ for path-specific conventions. Codebase has distinct coding conventions (React functional, async API handlers, repository-pattern database models). Test files follow the same conventions regardless of location. Answer: Create .claude/rules/ files with YAML frontmatter glob patterns (e.g., **/*.test.tsx) for automatic conditional application. Root CLAUDE.md (B) relies on inference; skills (C) require manual invocation; subdirectory CLAUDE.md (D) cannot handle files spread across directories. (Task 3.3)

Scenario: Multi-Agent Research System

Q7 — Coordinator decomposition is the root cause. Research on “AI in creative industries” covers only visual arts (digital art, graphic design, photography), completely missing music, writing, and film. Each subagent works correctly within its assigned scope. Answer: The coordinator’s task decomposition is too narrow. It decomposed “creative industries” into only visual arts subtasks. The subagents executed correctly — the problem is what they were assigned. (Task 1.2)

Q8 — Structured error context for recovery. Web search subagent times out during research. Answer: Return structured error context to the coordinator including failure type, attempted query, partial results, and potential alternative approaches. Generic “search unavailable” status (B) hides context; catching timeout and returning empty results (C) suppresses the error; propagating exceptions (D) terminates the entire workflow unnecessarily. (Task 5.3)

Q9 — Scoped verify_fact tool for synthesis agents. Synthesis agent needs to verify facts (85% are simple lookups). Currently round-trips through coordinator to web search agent, adding 2-3 round trips and 40% latency. Answer: Give the synthesis agent a scoped verify_fact tool for simple lookups, while complex verifications continue through the coordinator. Batching all verifications (B) delays synthesis; full web search access (C) gives too many tools, risking cross-specialization misuse; proactive caching (D) requires anticipating verification needs. (Task 2.3)

Additional Sample Questions (from the study guide)

Q10 — -p flag for CI pipelines. Pipeline hangs because Claude Code waits for interactive input. Answer: Use claude -p "..." flag for non-interactive mode. It processes the prompt, prints to stdout, and exits. (Task 3.6)

Q11 — Batch API for overnight work only. Manager proposes moving both pre-merge checks and overnight tech-debt reports to batch for 50% savings. Answer: Use batch only for tech-debt reports; keep real-time for pre-merge checks. Batch API has up to 24-hour processing with no latency SLA. (Task 4.5)

Q12 — Multi-pass review for large PRs. A 14-file single-pass review produces inconsistent depth, missed bugs, and contradictory feedback. Answer: Split into per-file local analysis passes plus a separate cross-file integration pass. This directly addresses attention dilution. (Task 4.6)

Sample Question Patterns

These patterns recur across the sample questions above:

  • Deterministic enforcement beats prompt instructions. When a business rule must be guaranteed (e.g., refund limits), the answer is always application-layer intercepts or hooks, never prompt-based restrictions
  • Tool descriptions are the fix for misrouting. When an agent selects the wrong tool, the answer is to improve tool descriptions (rename, add boundary explanations, differentiate from similar tools)
  • Plan mode for complex tasks. Multi-file changes, architectural decisions, and tasks with multiple valid approaches should use plan mode
  • .claude/rules/ for path-specific conventions. When conventions apply to specific file types across a codebase, path-specific rules beat directory-level CLAUDE.md files
  • Coordinator decomposition is the root cause. When a multi-agent system produces incomplete results, the fix is usually in how the coordinator decomposes the task (too narrow scope, missing source types)
  • Structured error context for recovery. When agents fail, returning structured error information (category, retryability, human description) enables appropriate recovery decisions
  • Scoped verify_fact tools for synthesis agents. Giving a synthesis agent a limited fact-verification tool (not full search) prevents cross-specialization misuse
  • Proportionate solutions win. The correct answer is the most targeted fix, not the most elaborate architecture. Avoid over-engineering when a simpler solution addresses the root cause

Key Takeaways

  • The CCA-F is a 60-question multiple choice exam requiring a scaled score of 720/1000 to pass
  • Agentic Architecture & Orchestration (27%) is the single most heavily weighted domain, accounting for more than a quarter of the exam
  • The exam uses 6 realistic production scenarios; you get 4 randomly selected per sitting, so you must prepare for all 6
  • Every domain has specific task statements with explicit knowledge and skill requirements — this guide is the authoritative checklist
  • The exam consistently favors deterministic/structural solutions (hooks, schemas, machine IDs) over probabilistic approaches (prompts, retries)
  • Sample questions demonstrate a pattern: the correct answer is the most targeted, architecturally sound fix, not the quickest patch or the most elaborate design
  • No penalty for guessing means you should answer every question

Try It

  1. Print this article and use it as a checklist — every task statement is a study topic. Mark each one as you study it
  2. For each task statement, write down one concrete example from your own experience that demonstrates the knowledge or skill
  3. Identify your weakest 3 task statements and build a small project that exercises each one
  4. Practice with all 6 scenarios — for each scenario, identify which task statements are most likely to be tested and prepare accordingly
  5. Take the sample questions and for each one, identify the exact task statement it maps to. This builds the mental model for recognizing what the exam is testing
  6. Cross-reference with The Architect’s Playbook for visual pattern explanations that reinforce these concepts