Source: wiki synthesis: certification-technical-reference

Consolidated reference for the core prompt-engineering patterns Anthropic teaches in the CCA-F curriculum. Six techniques cover most production use cases: few-shot prompting, explicit criteria, prompt chaining, the interview pattern, validation with retry, and self-correction. Every pattern below is production-proven at Anthropic and its customers, not theoretical.

Key Takeaways

  • Examples beat descriptions. 2–4 input/output pairs show Claude what you want more reliably than prose instructions.
  • Explicit lists of flag/don’t-flag criteria outperform vague guidance like “be conservative.”
  • Break complex tasks into chained steps — attention dilutes across too many responsibilities in a single prompt.
  • Retries with feedback fix format errors, not missing information. Know which failure modes are retryable.
  • Self-correction via redundant extraction (extract stated_total and calculated_total, compare) catches arithmetic and reasoning errors before they reach the user.
  • The interview pattern — having Claude ask clarifying questions before implementing — is the right default for unfamiliar domains.

Few-Shot Prompting

Include 2–4 input/output examples to demonstrate expected behavior. Examples are more effective than textual descriptions because they unambiguously show format and decision logic.

Use cases:

  • Ambiguous scenario handling (show how to classify edge cases)
  • Output formatting (show exact JSON/markdown/CSV shape)
  • Acceptable vs problematic code (show both positive and negative examples)
  • Extraction from different document formats (show one PDF-style example, one HTML-style example, etc.)
  • Informal measurement normalization (show “about 3 feet” → 0.91 m)
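A few-shot prompt is just demonstration pairs assembled ahead of the real input. A minimal sketch, with hypothetical example pairs and labels (not from the source):

```python
# Sketch of assembling a few-shot classification prompt from input/output
# pairs. The messages and category labels below are hypothetical.
EXAMPLES = [
    ("Return item arrived damaged, want refund", "refund_request"),
    ("How do I reset my password?", "account_support"),
    ("Your app deleted my data!!", "escalate_to_human"),  # edge case shown on purpose
]

def build_few_shot_prompt(task_input: str) -> str:
    """Build a prompt from 2-4 demonstration pairs, then append the real input."""
    parts = ["Classify each support message into a category.\n"]
    for text, label in EXAMPLES:
        parts.append(f"Message: {text}\nCategory: {label}\n")
    parts.append(f"Message: {task_input}\nCategory:")
    return "\n".join(parts)

prompt = build_few_shot_prompt("I was charged twice this month")
```

The prompt ends mid-pattern ("Category:") so the model's most natural continuation is a bare label in the demonstrated format.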

Format normalization rules belong alongside strict schemas to prevent semantic errors:

  • Dates to ISO 8601
  • Currency to numeric + ISO 4217 code
  • Percentages to decimal fractions
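The three rules above can also be enforced deterministically after extraction. A standard-library sketch (the MM/DD/YYYY input format and the USD assumption are illustrative):

```python
# Post-extraction normalizers for the three rules above, stdlib only.
from datetime import datetime

def normalize_date(raw: str) -> str:
    """US-style date to ISO 8601 (assumes MM/DD/YYYY input)."""
    return datetime.strptime(raw, "%m/%d/%Y").date().isoformat()

def normalize_currency(raw: str) -> dict:
    """'$1,234.50' to numeric amount plus ISO 4217 code (USD assumed here)."""
    return {"amount": float(raw.replace("$", "").replace(",", "")), "currency": "USD"}

def normalize_percent(raw: str) -> float:
    """'42%' to the decimal fraction 0.42."""
    return float(raw.rstrip("%")) / 100

normalize_date("03/15/2024")      # '2024-03-15'
normalize_currency("$1,234.50")   # {'amount': 1234.5, 'currency': 'USD'}
normalize_percent("42%")          # 0.42
```

Putting the same rules in the prompt (via few-shot examples) and in code gives two chances to catch a semantic error.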

Explicit Criteria vs Vague Instructions

Bad: “Check code comments for accuracy. Be conservative.”

Good: “Flag a comment as problematic ONLY if: (1) described behavior contradicts actual code behavior, (2) references a non-existent function, (3) TODO refers to a fixed bug. Do NOT flag: stylistically outdated, minor wording inaccuracies, missing comments.”

The good version defines three flag conditions and three don’t-flag conditions. Claude cannot “interpret” its way into a wrong answer.

Prompt Chaining

Break complex tasks into sequential focused steps to avoid attention dilution:

Step 1: Analyze auth.ts (local issues) -> list
Step 2: Analyze database.ts (local issues) -> list
Step 3: Integration pass (cross-file) -> cross-cutting issues
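The three steps above can be sketched as a small harness. `call_model` here is a placeholder for whatever model API you use (e.g. a wrapper around the Anthropic Messages API); the prompts are illustrative:

```python
# Sketch of the per-file-then-integration chain above.
# `call_model` is an injected stand-in for a real model API call.
def run_chain(call_model, files: dict) -> dict:
    """Run one focused analysis pass per file, then one cross-file pass."""
    per_file = {
        name: call_model(f"List local issues in this file:\n\n{source}")
        for name, source in files.items()
    }
    combined = "\n\n".join(f"{name}:\n{issues}" for name, issues in per_file.items())
    cross = call_model(
        "Given these per-file issue lists, identify cross-file "
        f"(integration) issues only:\n\n{combined}"
    )
    return {"per_file": per_file, "cross_file": cross}

# Usage with a stub model:
fake = lambda prompt: f"[analysis of {len(prompt)} chars]"
result = run_chain(fake, {"auth.ts": "...", "database.ts": "..."})
```

Each call sees only one responsibility, which is the whole point: no single prompt has to hold every file and every concern at once.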

When to chain: predictable pipelines with fixed structure (code review, extraction, multi-document analysis).

When to use dynamic decomposition instead: open-ended investigations where the steps are not known in advance (let an agent loop decide its own next step).

The Interview Pattern

Claude asks clarifying questions before starting implementation. Useful when:

  • The domain is unfamiliar to Claude
  • Task implications are non-obvious
  • Multiple viable approaches exist

Prompt seed: “Before writing any code, ask me up to 5 clarifying questions about requirements, constraints, and preferences you’re uncertain about.”
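The seed fits naturally into a two-phase flow: one call to collect questions, one call to implement once they are answered. A sketch with an injected `call_model` stand-in and an `answer_fn` that could be a human or a spec lookup (both names are hypothetical):

```python
# Two-phase interview flow. `call_model` and `answer_fn` are injected
# placeholders, not a specific API.
INTERVIEW_SEED = (
    "Before writing any code, ask me up to 5 clarifying questions about "
    "requirements, constraints, and preferences you're uncertain about."
)

def interview_then_implement(call_model, task: str, answer_fn) -> str:
    """Phase 1: collect clarifying questions. Phase 2: implement with answers."""
    questions = call_model(f"{INTERVIEW_SEED}\n\nTask: {task}")
    answers = answer_fn(questions)  # e.g. ask a human, or consult a spec
    return call_model(
        f"Task: {task}\n\nYour questions:\n{questions}\n\n"
        f"Answers:\n{answers}\n\nNow implement the task."
    )
```

The second prompt carries the questions and answers forward explicitly, so the implementation call never depends on hidden conversational state.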

Validation and Retry-with-Feedback

When extracted data fails validation, retry with: (1) the original document, (2) the previous extraction, and (3) the specific validation error.

Retries work for:

  • Format errors (wrong type, missing required field)
  • Structural errors (nesting, ordering)
  • Arithmetic inconsistencies

Retries do NOT work for:

  • Information genuinely absent from the source
  • Context that lives only in external documents not provided

Validation stack (Python): Pydantic for structural validation (types, required fields), custom validators for semantic validation, validate-retry loops, and JSON Schema generation that feeds into tool_use definitions.
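A minimal sketch of the retry loop, passing back all three pieces (document, previous extraction, specific error). In production the validator would be Pydantic per the stack above; here a plain callable stands in so the sketch stays dependency-free, and `call_model` is again a placeholder:

```python
# Retry-with-feedback loop. `call_model` is an injected model-API stand-in;
# `validate` returns None on success or an error message on failure.
def extract_with_retry(call_model, validate, document: str, max_retries: int = 2):
    """Retry failed extractions with the document, prior output, and error."""
    attempt = call_model(f"Extract the invoice fields as JSON:\n\n{document}")
    for _ in range(max_retries):
        error = validate(attempt)
        if error is None:
            return attempt
        # Feed back all three: original doc, previous output, specific error.
        attempt = call_model(
            f"Document:\n{document}\n\nYour previous extraction:\n{attempt}\n\n"
            f"It failed validation: {error}\nReturn corrected JSON only."
        )
    if validate(attempt) is None:
        return attempt
    raise ValueError("extraction failed validation after retries")
```

Note what this loop cannot fix: if the field is absent from `document`, no amount of feedback will conjure it, which is exactly the retryable/non-retryable split above.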

Self-Correction Pattern

Design the schema so Claude’s output can check itself. Classic example: ask for both stated_total (what the document says) and calculated_total (what the line items sum to). If they differ, set conflict_detected: true and flag for human review.

This moves error detection from post-hoc validation into the extraction itself — Claude catches its own arithmetic mistakes because the schema forced it to compute both sides.
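Downstream code can also recompute the comparison the schema asks Claude to make, as a second line of defense. A sketch using the field names from the example above (the rounding tolerance is an assumption):

```python
# Recheck the redundant-extraction pair: stated_total vs the line-item sum.
# Field names follow the example in the text; tolerance is an assumption.
def check_totals(extraction: dict, tolerance: float = 0.01) -> dict:
    """Compare stated_total against the summed line items, flag conflicts."""
    calculated = round(sum(item["amount"] for item in extraction["line_items"]), 2)
    conflict = abs(calculated - extraction["stated_total"]) > tolerance
    return {
        **extraction,
        "calculated_total": calculated,
        "conflict_detected": conflict,  # True -> route to human review
    }

invoice = {"stated_total": 110.00,
           "line_items": [{"amount": 40.00}, {"amount": 60.00}]}
check_totals(invoice)["conflict_detected"]  # True: items sum to 100, not 110
```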

Try It

  • Rewrite one vague prompt you already use in production with explicit flag/don’t-flag criteria. Measure delta in false positives.
  • Add 3 few-shot examples to your highest-volume extraction prompt. Cover: happy path, edge case, failure case.
  • Add a conflict_detected boolean to the next structured-output schema you design that involves arithmetic or cross-referencing.
  • Chain a prompt that’s currently doing two things at once — split into two prompts and measure whether accuracy improves on each.

Open Questions

  • How do these six patterns interact with prompt caching? Are chained prompts cache-friendly if they share a system prefix?
  • What’s the measured quality delta between few-shot and zero-shot for Claude Opus 4.6 specifically? (The CCA-F source doesn’t cite numbers.)
  • Does extended thinking reduce the need for self-correction schemas, or do they compound?