Source: X Bookmark 1959298080156598637 (God of Prompt tweet pointing to the cookbook) → OpenAI GPT-5 Prompting Guide, 2026-05-02 (full guide content extracted from cookbook.openai.com)

This wiki is Claude-first, but several practical prompt patterns in OpenAI’s official GPT-5 Prompting Guide transfer directly to Anthropic models — particularly the eagerness-control patterns, tool preamble discipline, Cursor’s prompt-tuning case study, and the instruction-conflict warning. This article captures the cross-vendor takeaways without re-summarizing OpenAI-specific API parameter names. For the canonical version, see https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide.

Key Takeaways

  • Eagerness is steerable. GPT-5 (like Claude Opus 4.7) operates anywhere from “delegate most decisions” to “tight programmatic leash.” Patterns (sketched in code after this list):
    • Less eagerness: lower reasoning_effort, define explicit early-stop criteria (“top hits converge ~70% on one path”), set fixed tool-call budgets, give an escape hatch (“even if it might not be fully correct”).
    • More eagerness: higher reasoning_effort, agentic-persistence prompts (“keep going until the user’s query is completely resolved”, “never stop or hand back when you encounter uncertainty”, “decide what the most reasonable assumption is, proceed, document”).
  • Tool preambles are universal. Pattern: rephrase the user’s goal → outline a structured plan → narrate each step → summarize completed work distinct from the upfront plan. Improves long-rollout UX in any reasoning-model interface (drop-in block sketched after this list).
  • GPT-5’s reasoning_effort and verbosity parameters separate thinking from output length. Claude has the analogous split via Extended Thinking effort tiers and natural-language verbosity overrides — cross-reference Opus 4.7 best practices for the equivalent on the Anthropic side.
  • Contradictory prompts hurt GPT-5 more than they hurt non-reasoning models. GPT-5 expends reasoning tokens trying to reconcile conflicting instructions rather than picking one at random. Concrete example given: a healthcare prompt with “never schedule without explicit patient consent” alongside “auto-assign earliest same-day slot without contacting the patient.” Rule of thumb: audit every multi-stakeholder prompt for internal contradictions; the older the prompt, the higher the odds. This is the same warning the Claude troubleshooting reference gives for Opus 4.7.
  • Cursor’s prompt-tuning case study is portable (verbosity-split and proactive-edit sketch after this list).
    1. Verbosity split: set the global verbosity parameter low for chat-friendly status updates, then override in the prompt (“Use high verbosity for writing code and code tools”) so the diffs stay readable. Single-letter variable names disappear.
    2. Don’t defer; act: make explicit that “code edits will be displayed as proposed changes” so the model knows it can be proactive — “almost never ask the user whether to proceed with a plan; instead proactively attempt the plan and ask if they want to accept the implemented changes.”
    3. Lighten thoroughness language for newer models: old “Be THOROUGH… maximize…” prompts that worked on GPT-4-class models over-call search on GPT-5. Soften the language and drop the maximize_ prefix from spec section names. Same lesson likely applies to Opus 4.7 — it’s already proactive at gathering context, and aggressive thoroughness exhortations push it past the useful point.
    4. Structured XML specs (<[instruction]_spec>) improve instruction adherence and let you reference categories/sections elsewhere in the prompt cleanly.
  • Self-rubric prompting (zero-to-one app generation): “First, spend time thinking of a rubric until you are confident. Create 5–7 categories — do not show the user. Use the rubric internally to think and iterate; if not hitting top marks across all categories, start again.” Generic technique — works on Opus 4.7 too for high-stakes generation (prompt sketch after this list).
  • Codebase-rules block. Without prompting, GPT-5 already searches reference context (reads package.json). Behavior is sharpened by an explicit rules block summarizing engineering principles, directory structure, and design taste. Same pattern aligns with Simon Scrapes’ static-context split and the standard CLAUDE.md file.
  • Markdown formatting is opt-in. GPT-5 in the API does not format final answers in Markdown by default (max compatibility). Prompt explicitly when you want it. For long conversations, append a Markdown reminder every 3–5 user messages — adherence degrades over rollouts (snippet after this list).
  • Metaprompting works. Use the model itself to optimize prompts — feed it the prompt + the desired/undesired behavior gap and ask for minimal edits. Generic across reasoning models.
  • Responses API gives measurable agentic gains on the OpenAI side (Tau-Bench Retail 73.9% → 78.2% just by switching to Responses API + previous_response_id; chaining sketch after this list). On the Anthropic side, the closest equivalent is Extended Thinking with reasoning persistence and prompt caching.
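
A minimal sketch of the two ends of the eagerness dial, assuming the OpenAI Python SDK’s Responses API and its reasoning-effort parameter; the prompt blocks paraphrase the guide’s context-gathering and persistence examples, and the XML tag names are illustrative, not fixed API constructs:

```python
from openai import OpenAI

client = OpenAI()

# Less eager: low reasoning effort, explicit early-stop criteria, a hard
# tool-call budget, and an escape hatch so the model answers instead of
# continuing to search.
LOW_EAGERNESS_RULES = """
<context_gathering>
- Search broadly once, then fan out only if results conflict.
- Stop as soon as the top hits converge (~70%) on one path.
- Hard budget: at most 2 tool calls before answering.
- It is acceptable to answer even if it might not be fully correct;
  state your assumptions rather than searching further.
</context_gathering>
"""

# More eager: high reasoning effort plus an agentic-persistence preamble.
HIGH_EAGERNESS_RULES = """
<persistence>
- Keep going until the user's query is completely resolved before yielding.
- Never stop or hand back when you encounter uncertainty.
- Decide what the most reasonable assumption is, proceed, and document it
  for the user afterwards.
</persistence>
"""

def run(task: str, eager: bool):
    return client.responses.create(
        model="gpt-5",
        reasoning={"effort": "high" if eager else "low"},
        instructions=HIGH_EAGERNESS_RULES if eager else LOW_EAGERNESS_RULES,
        input=task,
    )
```

The prompt blocks themselves are model-agnostic; on the Anthropic side the effort knob would be the Extended Thinking budget rather than an API reasoning-effort field.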
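
The tool-preamble pattern as a drop-in system-prompt constant; wording paraphrases the guide’s example and the tag name follows its convention:

```python
# Drop-in system-prompt block implementing the four-part preamble pattern:
# restate goal -> plan -> narrate -> summarize.
TOOL_PREAMBLE_SPEC = """
<tool_preambles>
- Always begin by rephrasing the user's goal in a friendly, clear manner
  before calling any tools.
- Then outline a structured plan detailing each logical step you will follow.
- As you execute tool calls, narrate each step succinctly and sequentially,
  marking progress clearly.
- Finish by summarizing the completed work distinctly from your upfront plan.
</tool_preambles>
"""
```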
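
A sketch of the Cursor-style verbosity split plus the proactive-edit spec, assuming the Responses API’s text-verbosity parameter; the spec tag name and the task string are illustrative:

```python
from openai import OpenAI

client = OpenAI()

# Structured spec block (the <[instruction]_spec> pattern): the prompt raises
# verbosity for code output and tells the model to act rather than defer,
# while the API-level verbosity stays low for terse status updates.
CODE_EDITING_SPEC = """
<code_editing_spec>
- Code edits will be displayed to the user as proposed changes. Almost never
  ask whether to proceed with a plan; proactively attempt the plan and ask
  whether the user wants to accept the implemented changes.
- Use high verbosity when writing code and calling code tools: descriptive
  variable names, clear structure, readable diffs.
</code_editing_spec>
"""

response = client.responses.create(
    model="gpt-5",
    text={"verbosity": "low"},       # global: chat-friendly status updates
    instructions=CODE_EDITING_SPEC,  # prompt-level override for code output
    input="Refactor the session cache to evict expired entries lazily.",
)
```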
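
The self-rubric preamble as a reusable constant, paraphrasing the guide’s wording; adjust the task noun for whatever you are generating:

```python
# Prepend to the prompt for zero-to-one generation tasks (apps, docs, designs).
SELF_RUBRIC_PREAMBLE = """
First, spend time thinking of a rubric until you are confident.
Then, think deeply about every aspect of what makes for a world-class result
for this task, and use that to create a rubric with 5-7 categories. This
rubric is critical to get right, but do not show it to the user.
Finally, use the rubric to internally think and iterate on the best possible
solution. If your response is not hitting the top marks across all categories
in the rubric, start again.
"""
```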
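
A Markdown opt-in instruction plus the periodic reminder, paraphrasing the guide’s snippet; the reminder cadence is prompt text, not an API feature:

```python
MARKDOWN_SPEC = """
- Use Markdown only where semantically correct (inline code, code fences,
  lists, tables).
- When using Markdown, use backticks to format file, directory, function,
  and class names.
"""

# In long conversations, re-append MARKDOWN_SPEC as a system-level reminder
# every 3-5 user turns, since adherence degrades over long rollouts.
```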
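
A minimal sketch of the previous_response_id chaining behind the Tau-Bench gain, assuming the OpenAI Python SDK’s Responses API; the task strings are placeholders:

```python
from openai import OpenAI

client = OpenAI()

# First turn of a multi-turn agentic task.
first = client.responses.create(
    model="gpt-5",
    input="Plan the refund flow for order #1234, then wait for my confirmation.",
)

# Follow-up turn: previous_response_id chains the turns so the model can reuse
# its earlier reasoning context instead of re-deriving the plan from scratch.
followup = client.responses.create(
    model="gpt-5",
    previous_response_id=first.id,
    input="Confirmed. Execute the plan.",
)
print(followup.output_text)
```

On the Anthropic side the analogous move is preserving thinking blocks across turns plus prompt caching, per the Extended Thinking API reference linked below.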

OpenAI-only details (skip for Claude work)

These are GPT-5–specific and don’t transfer:

  • apply_patch is the canonical file-edit tool format that GPT-5 is trained against.
  • minimal reasoning effort tier as the latency-sensitive replacement for GPT-4.1.
  • Specific frontend framework recommendations: Next.js + Tailwind + shadcn + Radix + Motion + Lucide.
  • The Tau-Bench Retail and Terminal-Bench example prompts in the appendix.

(For Claude-side equivalents on tool-use formats, see Claude Code CLI reference and Extended Thinking API.)

Cross-vendor decision: when to read the GPT-5 guide

  • Always read it once if you author prompts for any reasoning model. The sections on eagerness control, tool preambles, instruction conflicts, and metaprompting are the highest-leverage material, and they’re vendor-agnostic.
  • Read it again before authoring a new long-rollout agent prompt. Even if you’ll deploy on Opus 4.7, the failure modes GPT-5 exposes (reasoning-token waste on contradictions, over-eager search) are common to all modern reasoning models.
  • Skim the Cursor case study before tuning a coding-agent system prompt — the verbosity-split pattern and the don’t-defer-to-the-user rule generalize.

Try It

  1. Audit a long-running WEO Claude prompt (e.g. the Hermes system prompt) for internal contradictions using GPT-5’s example healthcare prompt as the template — look for “always X” + “always not X” pairs that drift in over edits.
  2. On a coding skill prompt, try the verbosity split: globally low + explicit-high for code blocks. Compare diff readability.
  3. On any zero-to-one generation task, try self-rubric prompting: “First, think of a 5–7-category rubric for a world-class result. Don’t show the rubric. Use it internally to iterate. If not hitting top marks, start again.” Compare quality to a non-rubric baseline.
  4. For your most-painful prompt, try metaprompting: paste the prompt + the desired/undesired behavior gap to Opus 4.7 (or GPT-5) and ask for minimal edits (see the sketch after this list). Iterate.
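
A minimal metaprompting harness for item 4, assuming the Anthropic Python SDK; the model ID is a placeholder and the template paraphrases the guide’s metaprompt:

```python
import anthropic

client = anthropic.Anthropic()

METAPROMPT = """Here's a prompt: {prompt}

The desired behavior from this prompt is for the agent to {desired}, but
instead it {undesired}. While keeping as much of the existing prompt intact
as possible, what are some minimal edits or additions you would make to
encourage the agent to more consistently address these shortcomings?
"""

def suggest_prompt_edits(prompt: str, desired: str, undesired: str) -> str:
    # Ask the model for minimal prompt edits that close the behavior gap.
    msg = client.messages.create(
        model="claude-opus-4-7",  # placeholder model ID; any strong reasoning model works
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": METAPROMPT.format(
                prompt=prompt, desired=desired, undesired=undesired
            ),
        }],
    )
    return msg.content[0].text
```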

Open Questions

  • The OpenAI guide’s claim that GPT-5 is “extraordinarily receptive to prompt instructions” — does this hold equally for Opus 4.7? Anecdotally yes, but no head-to-head A/B on identical prompts has been published.
  • Does Opus 4.7’s effort-tier system show measurable gains analogous to the Tau-Bench 73.9% → 78.2% result with the Responses API? Worth a benchmark experiment.