Source: raw/reddit-1um5er6.md (r/Anthropic surfacing, score 380) + ai-research/openai-gpt-5-6-sol-preview.md, verified against the official primary source https://openai.com/index/previewing-gpt-5-6-sol/ and the OpenAI Help Center (help.openai.com/en/articles/20001325).

OpenAI’s GPT-5.6 ships as a three-tier family instead of one model: Sol (flagship), Terra (balanced), and Luna (high-volume, cheapest). It was announced in limited preview in late June 2026^[inferred: multiple secondary reports date it June 2026; the official-page extract does not state a date] and surfaced on r/Anthropic on 2026-07-03 as “competition for Fable 5.” The practical story for anyone routing spend across models is the pricing spread: Sol lands at $5 / $30 per 1M tokens (input / output) against Claude Fable 5’s $10 / $50. That is half the input price and 40% less on output, while Terra and Luna open two cheaper rungs inside the same generation. This article treats GPT-5.6 as a competitor datapoint for model/cost routing, not as an endorsement.

Key Takeaways

  • Three tiers, per-1M-token pricing (VERIFIED against the official OpenAI page and Help Center): Sol $5 / $30, Terra $2.50 / $15, Luna $1 / $6 (input / output). Terra is exactly half of Sol; Luna is the new low-cost rung.
  • “Sol Ultra” is a mode, not a tier. The Reddit stub labeled the $5/$30 price “Sol Ultra.” Corrected on verification: $5/$30 is the price of Sol. max and ultra are reasoning-effort settings (ultra spins up subagents); “GPT-5.6 Sol Ultra” is Sol run at the top effort level, which reportedly posts the top Terminal-Bench score^[secondary: DataCamp, explainx; not in the official-page extract].
  • Sol matches GPT-5.5’s rate card exactly ($5/$30), so the generational value is in the cheaper Terra/Luna rungs, not a top-tier price cut^[secondary: eesel, finout, VentureBeat].
  • The naming drops “mini”/“nano.” OpenAI reportedly reframed the family around use case rather than model size; there is no mini/nano variant and no “Pro” tier yet^[secondary: VentureBeat, eesel].
  • More predictable prompt caching (5.6+): explicit cache breakpoints, 30-minute minimum cache life, cache writes billed at 1.25x the uncached input rate, cache reads keep the 90% discount (official).
  • Cyber-forward: Sol is OpenAI’s “most capable model yet for cybersecurity” — on ExploitBench, competitive with Anthropic’s Mythos Preview using only ~1/3 the output tokens; on UC-Berkeley-built ExploitGym all three tiers improve with more reasoning (official). This is the same capability class that triggers Anthropic’s Mythos-family gating.
  • Availability is gated. Preview is API + Codex only, restricted to ~20 organizations whose participation was cleared under a US-government frontier-model review; broad ChatGPT access has no firm date^[secondary: Handy AI, VentureBeat] — see Mythos 5 federal story.
  • Speed play: GPT-5.6 Sol on Cerebras at up to 750 tokens/sec in July, initially for select customers (official).

Pricing

GPT-5.6 tiers (VERIFIED — official OpenAI page + Help Center), with the closest wiki-tracked model neighbors for routing context:

| Model | Input /1M | Output /1M | Role / notes |
|---|---|---|---|
| GPT-5.6 Sol (gpt-5.6-sol) | $5.00 | $30.00 | OpenAI flagship; frontier coding / cyber / agentic. max + ultra effort modes. |
| GPT-5.6 Terra (gpt-5.6-terra) | $2.50 | $15.00 | Balanced production; exactly half of Sol. |
| GPT-5.6 Luna (gpt-5.6-luna) | $1.00 | $6.00 | High-volume / cheap; classification, routing, moderation, summarization. |
| Claude Fable 5 | $10.00 | $50.00 | Anthropic Mythos-class flagship (the comparison anchor). |
| Claude Opus 4.8 | $5.00 | $25.00 | Fable’s safeguard-fallback model; Sol matches it on input but costs more on output ($30 vs $25). |
| Claude Sonnet 5 | $3.00 | $15.00 | Std pricing post-Aug-31 (10 promo before); ties Terra on output. |
| GLM-5.2 (Z.ai) | ~$1.40 | ~$4.40 | Open-weight value tier; undercuts even Luna on output, though not on input^[creator-cited/approximate; corroborated by VentureBeat’s comparison table]. |
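
To make the table usable for routing arithmetic, here is a minimal Python sketch that prices a single request at each listed rate. The `gpt-5.6-*` identifiers come from the table; the other dictionary keys (`claude-fable-5`, `claude-opus-4.8`, `claude-sonnet-5`, `glm-5.2`) are hypothetical labels chosen for illustration, not confirmed API model names, and the GLM-5.2 figures are the approximate ones cited above. Caching, long-context surcharges, and promo pricing are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Rates copied from the pricing table. Non-OpenAI keys are illustrative
# labels (assumption), and the GLM-5.2 numbers are approximate.
RATES = {
    "gpt-5.6-sol": Rate(5.00, 30.00),
    "gpt-5.6-terra": Rate(2.50, 15.00),
    "gpt-5.6-luna": Rate(1.00, 6.00),
    "claude-fable-5": Rate(10.00, 50.00),
    "claude-opus-4.8": Rate(5.00, 25.00),
    "claude-sonnet-5": Rate(3.00, 15.00),
    "glm-5.2": Rate(1.40, 4.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Sticker cost in USD for one request, with no caching applied."""
    rate = RATES[model]
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


def cheapest_first(input_tokens: int, output_tokens: int) -> list[tuple[float, str]]:
    """All models as (cost, name) pairs, sorted from cheapest to most expensive."""
    return sorted(
        (request_cost(model, input_tokens, output_tokens), model) for model in RATES
    )


if __name__ == "__main__":
    for cost, model in cheapest_first(200_000, 20_000):
        print(f"{model:18s} ${cost:.3f}")
```

The ranking depends on the input/output mix. With these figures, Luna is cheaper than GLM-5.2 only when input tokens exceed roughly four times the output tokens (1.00·in + 6.00·out < 1.40·in + 4.40·out reduces to in > 4·out), which is why "undercuts Luna" holds for output-heavy workloads but not for input-heavy ones.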

Caching detail (5.6+, official): explicit cache breakpoints, 30-min minimum cache life, cache writes at 1.25x uncached input, cache reads at the 90% discount. For agent loops that reuse system prompts and tool schemas, the caching terms move total cost as much as the sticker rate — especially in ultra mode where subagent calls multiply.
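
The caching terms can be turned into a rough break-even estimate. The sketch below uses only the two multipliers stated above (writes at 1.25x, reads at a 90% discount) and rests on simplifying assumptions that are not from the source: the first call writes the whole shared prefix, every later call is a full cache hit inside the cache life, nothing expires and is rewritten, and output tokens and non-shared input are left out. The function names are hypothetical.

```python
WRITE_MULTIPLIER = 1.25  # cache write: 1.25x the uncached input rate
READ_MULTIPLIER = 0.10   # cache read: 90% discount on the input rate


def prefix_cost_uncached(prefix_tokens: int, calls: int, input_per_m: float) -> float:
    """USD cost of resending the same prefix on every call with no caching."""
    return calls * prefix_tokens * input_per_m / 1_000_000


def prefix_cost_cached(prefix_tokens: int, calls: int, input_per_m: float) -> float:
    """USD cost when call 1 writes the cache and calls 2..n read it (assumption)."""
    if calls <= 0:
        return 0.0
    billed_units = WRITE_MULTIPLIER + (calls - 1) * READ_MULTIPLIER
    return billed_units * prefix_tokens * input_per_m / 1_000_000


def break_even_calls() -> int:
    """Smallest number of calls at which caching is cheaper than not caching."""
    calls = 1
    while WRITE_MULTIPLIER + (calls - 1) * READ_MULTIPLIER >= calls:
        calls += 1
    return calls


if __name__ == "__main__":
    # Example: a 20,000-token system prompt + tool schema reused 50 times on Sol.
    print(prefix_cost_uncached(20_000, 50, 5.00))
    print(prefix_cost_cached(20_000, 50, 5.00))
    print(break_even_calls())
```

Under these assumptions a single call is 25% more expensive with caching, and the second call already pays the write premium back. Real agent loops will land somewhere worse than this model whenever prefixes change or cache entries expire between calls.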

What It Means for Model & Cost Routing

The competitor read against Fable 5, framed the way the wiki already frames Anthropic’s own lineup:

  • Sol vs Fable 5 at the frontier tier: Sol is $5/$30 vs Fable 5’s $10/$50 — 50% cheaper input, 40% cheaper output — and OpenAI’s own ExploitBench line claims frontier-competitive cyber output at ~1/3 the tokens. Sticker price alone doesn’t decide it: Fable 5 is token-hungry (500k–1M+ token sessions) and the wiki’s standing guidance is to reserve it for the heavy/long-horizon tail. If Sol holds capability, it pressures exactly that “worth the token cost” calculus.
  • Terra/Luna reproduce the “route down a tier” pattern OpenAI-side. The wiki’s Fable 5 guidance (“route routine sessions (compiles, inbox-refresh, lint sweeps) back to Opus 4.8, keep Fable 5 for the heavy tail”) has a direct GPT-5.6 analogue: keep Sol for the hard tail, push balanced production to Terra ($2.50/$15, undercutting Fable 5 ~4x on input), and dump high-volume classification/routing on Luna ($1/$6). One vendor, three cost rungs, no cross-provider integration cost.^[inferred routing synthesis]
  • Safety-classifier routing is the Reddit poster’s actual point. GPT-5.6’s preview safeguards include real-time cyber/biology misuse classifiers that can pause generation for larger-model review, plus account-level abuse review^[secondary: explainx]. That mechanically parallels Fable 5’s topic-gated Opus 4.8 fallback (documented in claude-fable-5-mythos-5): both frontier vendors now interpose a routing/gating layer that can divert or slow a request mid-flight. Budget for the friction (blocks, refusals, latency on legitimate dual-use security work) on either stack.
  • Availability caveat that outranks price. Sol/Terra/Luna are preview-gated to ~20 US-cleared orgs via API + Codex only^[secondary]. For most teams the routable choice today is still Fable 5 / Opus 4.8 / Sonnet 5 / GLM-5.2; GPT-5.6 is a datapoint to price against, not yet a lever to pull. See fable-5-mythos-5-federal-shutdown for the mirror-image export-control story on Anthropic’s side.
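
The routing pattern in the bullets above can be sketched as a small lookup with an availability switch. This is an illustrative sketch, not a recommended policy: the task classes, the `has_gpt56_preview` flag, and the fallback mapping are all assumptions made for the example (in particular, sending high-volume work to Sonnet 5 is a guess; GLM-5.2 would be an equally plausible floor), and the non-OpenAI model labels are hypothetical identifiers.

```python
from enum import Enum


class TaskClass(Enum):
    HEAVY_TAIL = "heavy_tail"    # long-horizon agentic, hard coding, cyber
    PRODUCTION = "production"    # balanced everyday workloads
    HIGH_VOLUME = "high_volume"  # classification, routing, moderation, summarization


# Tier mapping described in the text: Sol for the hard tail, Terra for
# balanced production, Luna for high-volume work.
GPT56_ROUTE = {
    TaskClass.HEAVY_TAIL: "gpt-5.6-sol",
    TaskClass.PRODUCTION: "gpt-5.6-terra",
    TaskClass.HIGH_VOLUME: "gpt-5.6-luna",
}

# Fallback for teams outside the preview (assumed mapping, hypothetical labels).
FALLBACK_ROUTE = {
    TaskClass.HEAVY_TAIL: "claude-fable-5",
    TaskClass.PRODUCTION: "claude-opus-4.8",
    TaskClass.HIGH_VOLUME: "claude-sonnet-5",
}


def route(task: TaskClass, has_gpt56_preview: bool) -> str:
    """Pick a model label for a task class, honoring preview availability."""
    table = GPT56_ROUTE if has_gpt56_preview else FALLBACK_ROUTE
    return table[task]


if __name__ == "__main__":
    for task in TaskClass:
        print(task.value, route(task, has_gpt56_preview=False))
```

Keeping availability as an explicit input reflects the caveat above: for most teams the flag is false today, so the GPT-5.6 table is something to price against rather than to route to.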

Benchmarks & Safeguards

  • ExploitBench (official): Sol competitive with Mythos Preview at ~1/3 the output tokens; evaluated with the ExploitBench API harness, 5 seeds, reasoning continuity.
  • ExploitGym (official, UC Berkeley + OpenAI + other labs): all three tiers show strong cyber gains as reasoning increases.
  • Terminal-Bench: “GPT-5.6 Sol Ultra” reportedly posts the top score^[secondary: DataCamp, explainx]. For the live board the wiki already tracks, where Codex CLI + GPT-5.5 (83.4%) and Claude Code + Fable 5 (83.1%) were effectively tied for the lead at the last capture, see terminal-bench.
  • Preparedness Framework^[secondary: Handy AI, single source]: all three rated High for Cybersecurity and High for Biological & Chemical, Below High for AI Self-Improvement. The system card reportedly notes that Sol was “unable to carry out autonomous, end-to-end attacks against hardened targets” and shows “a greater tendency than GPT-5.5 to go beyond the user’s intent” in internal coding traffic, an overeager/destructive-action note that rhymes with the Mythos-family self-guards.
  • Safeguard stack (preview)^[secondary: explainx]: model-trained cyber refusals; real-time cyber/biology misuse classifiers (can pause for larger-model review); account-level review; 700,000+ A100-equivalent GPU-hours of automated red-teaming for universal jailbreaks; ongoing third-party human red-teaming.

For the independent counter-read on whether these Mythos-class cyber gains are overhyped, the same lens applies here as to Anthropic: see Epoch AI’s Cyber-ECI analysis (exploit-development jumps are real; vulnerability-discovery gains are easy to overstate).

Related

  • claude-fable-5-mythos-5: Anthropic’s Mythos-class flagship ($10/$50) and the direct frontier-tier comparison anchor; also the source of the topic-gated Opus 4.8 fallback this piece parallels.
  • claude-sonnet-5: Anthropic’s mid-tier ($3/$15 std) that ties Terra on output; the “route down a tier” neighbor on the Claude side.
  • glm-5-series-zai: open-weight value frontier (~$1.40/~$4.40) that undercuts even Luna on output; the price-floor competitor.
  • gemini-cli: Google’s competing frontier-model CLI surface; the third major closed-vendor agentic stack.
  • fable-5-mythos-5-federal-shutdown: the mirror-image US-government frontier-model gate on Anthropic; context for GPT-5.6’s ~20-org preview restriction.
  • terminal-bench: the agentic benchmark backdrop where GPT-5.x and Claude trade the lead.
  • epoch-mythos-cyber-capabilities-overhyped: independent cyber-capability read; the counter-lens for Sol’s ExploitBench claims.
  • _index: competitor-model / benchmark tracking topic index.

Open Questions

  • Standalone benchmark numbers. The official-page extract gives cyber framing (ExploitBench/ExploitGym) but no SWE-bench / Terminal-Bench figures; the “Sol Ultra tops Terminal-Bench” and Preparedness ratings are secondary-sourced. Verify against the GPT-5.6 system card when broadly available.
  • Context window. Sol’s 1.5M-token window (up ~43% from GPT-5.5 Pro’s 1.05M) is single-secondary-sourced (Handy AI) — confirm on the official model page. Terra/Luna context not stated.
  • Long-context pricing tier. GPT-5.5 raised its input rate for prompts past ~272K tokens; the GPT-5.6 preview lists a single flat rate per model. Whether a long-context surcharge returns at GA is unknown^[secondary: eesel].
  • GA date and ChatGPT availability. Preview is API + Codex only for ~20 US-cleared orgs; no firm broad-availability date.
  • Head-to-head vs Fable 5 on non-cyber tasks. OpenAI’s public comparison is cyber-weighted; coding/agentic parity vs Fable 5 / Opus 4.8 outside the security domain is not yet established from primary sources.