Source: raw/html-effectiveness-thariq-2026-05-11.mdthariqs.github.io/html-effectiveness/ (fetched 2026-05-11). Companion gallery to a blog post by Thariq Shihipar (handle thariqs on GitHub Pages, @trq212 on X, Claude Code team at Anthropic) showcasing 20 self-contained .html files an agent produced instead of a wall of markdown, grouped into 9 use-case sections. 2026-05-20 refresh adds the author’s first-person walkthrough on Claire Vo’s How I AI podcast (raw/Why_this_Claude_Code_engineer_uses_HTML_files_as_AI_specs_Thariq_Shihipar_Anthropic.md) — see § Author Walkthrough below for the brainstorm-in-HTML → plan-in-HTML → throwaway-custom-UI flow + compute-allocator framing + designsystem.html pattern. 2026-05-31 refresh adds an AI-Garage (onchainaigarage.com) creator breakdown (raw/HTML_vs._Markdown_-_Pushing_AI_Agent_Responses_to_the_Next_Level.md) — see § Karpathy’s Endorsement + Counterargument Ledger below for Karpathy’s tweet ratification, the output-format progression, the vision-bandwidth argument, and the explicit objection-and-rebuttal list (including Thariq’s diff-noise concession).

A short, opinionated argument — backed by 20 working examples — that the default “agent returns a wall of markdown” output mode wastes the medium. Spatial information (diffs, module maps, design tokens, timelines, flowcharts, animation curves, multi-pane comparisons) is flattened by markdown but rendered natively by HTML. The companion gallery lays out concrete demos: “ask the agent for this instead, and you’ll actually read it.” Pairs naturally with Claude Design (which makes the same argument as a productized canvas), frontend-slides (one HTML preset family for decks), beautiful-html-templates (30-template AGENTS.md gallery), and Impeccable v3 (37-pattern AI-slop catalog for what NOT to do when generating HTML).

Key Takeaways

  • Thesis in one line. “Twenty self-contained .html files an agent produced instead of a wall of markdown. Each one trades a document you’d skim for one you’d actually read.” The gallery itself is the evidence — open any of the 20 files in a browser; none of them require a build step.
  • 9 use-case sections × 20 demos total. Each section opens with a one-paragraph rule of thumb explaining when HTML beats markdown for that kind of work. The categories are: Exploration & Planning (3), Code Review & Understanding (3), Design (2), Prototyping (2), Illustrations & Diagrams (2), Decks (1), Research & Learning (2), Reports (2), Custom Editing Interfaces (3).
  • Spatial-information argument. “Diffs and call-graphs are spatial information; markdown flattens them.” The pitch is structural — anything with side-by-side comparison, dependency arrows, motion, timestamps, or interactive controls loses information when serialized as a single linear scroll.
  • One-file artifacts, no toolchain. Every demo is a single .html file. No bundler, no React framework, no asset pipeline — the agent emits one file you double-click in Finder. The footer underscores it: “Everything on this page is itself a single .html file.”
  • HTML as the design-system medium. The Design section makes a load-bearing point: “HTML is the medium your design system ships in, so it’s the natural format for talking about it. Tokens become swatches, components become contact sheets, and the artifact can be fed straight back into the next prompt.” HTML closes the loop between design discussion and design output.
  • Custom editors as a category. The most underused: ask the agent for a throwaway editing UI for the specific thing you’re working on (ticket triage board, feature-flag toggles, prompt-template tuner) and end with a “copy back as markdown / JSON / diff” button. You stay in the loop and the loop gets tighter.
  • HTML where motion is the point. “Motion and interaction can’t be described, only felt. A throwaway page with the real easing curve or the real click-through tells you in five seconds what a paragraph of prose never could.” Animation sandboxes and clickable flows are five-second decisions in HTML, paragraphs of speculation in markdown.
  • SVG as the agent’s pen. “Inline SVG gives the agent a real pen.” Diagrams, figure sheets, and flowcharts where the LLM produces vector art directly — no Mermaid handoff, no DOT export, no Excalidraw round-trip.
  • No claimed metrics, no benchmarks. This is an aesthetic and informational-density argument, not a benchmark study. The “unreasonable effectiveness” framing borrows from the Wigner essay tradition — the point is that a default behavior most people overlook turns out to dominate when measured by did the reader actually consume it.
  • Visual taste signature is itself the demo. Thariq’s masthead pairs an ivory/slate/clay/oat/olive palette + serif headline (italic emphasized in clay) + mono microtype against a hero figure that literally draws “markdown pane” (tilted grey lines) vs “HTML pane” (cards, bar chart). The page is its own first example.

The 9 categories — when to ask for HTML

CategoryDemosWhen HTML wins
Exploration & Planning3Fan out 2-3 approaches; place them side-by-side; let the operator point at one. Then turn the pick into an implementer-ready plan (timeline + data-flow + mockups + risk table).
Code Review & Understanding3Annotated diffs, PR writeups with the why, and module maps drawn as boxes-and-arrows. Diffs and call-graphs are spatial; markdown flattens them.
Design2Living design system (tokens → swatches, type scale → live samples) and component variant sheets. HTML is already the medium the design system ships in.
Prototyping2Animation sandboxes with tunable sliders; clickable 4-screen interaction flows. Motion can’t be described, only felt.
Illustrations & Diagrams2Inline SVG figure sheets for blog posts; annotated flowcharts where you can click any step. The agent’s actual pen.
Decks1Arrow-key navigable slide deck as one HTML file. No Keynote, no build step.
Research & Learning2Feature explainers (TL;DR + collapsible request-path steps + tabbed config snippets + FAQ); concept explainers with live interactive simulations (consistent-hashing ring, comparison table, hover-glossary).
Reports2Weekly status with a small chart; incident timelines with log excerpts and follow-up checklists. Recurring documents that benefit from structure.
Custom Editing Interfaces3Ticket triage board (drag across Now/Next/Later/Cut → copy out as markdown), feature-flag editor (dependency warnings + diff button), prompt tuner (template + live-render samples).

The 20 demos (verbatim filenames)

Exploration & Planning: 01-exploration-code-approaches.html, 02-exploration-visual-designs.html, 16-implementation-plan.html · Code Review: 03-code-review-pr.html, 17-pr-writeup.html, 04-code-understanding.html · Design: 05-design-system.html, 06-component-variants.html · Prototyping: 07-prototype-animation.html, 08-prototype-interaction.html · Diagrams: 10-svg-illustrations.html, 13-flowchart-diagram.html · Decks: 09-slide-deck.html · Research: 14-research-feature-explainer.html, 15-research-concept-explainer.html · Reports: 11-status-report.html, 12-incident-report.html · Editors: 18-editor-triage-board.html, 19-editor-feature-flags.html, 20-editor-prompt-tuner.html.

Each file is reachable as a relative link from the index page (e.g. https://thariqs.github.io/html-effectiveness/03-code-review-pr.html).

Implementation

Tool/Service: None — the gallery is a static GitHub Pages site. The artifact this article reflects on is a pattern for agent output, not a runnable tool. Apply it inside Claude Code, Claude Cowork, or Claude Design by asking for HTML where you would have asked for markdown.

Setup: Add to your CLAUDE.md or system prompt a rule like:

When the answer involves any of:
- side-by-side comparison of 2+ options,
- spatial relationships (diffs, call graphs, dependency maps),
- motion or interaction,
- design-system content (tokens, components, swatches),
- a recurring report (weekly status, post-mortem),
- a throwaway editing UI tied to my current task,

prefer a single self-contained .html file over a markdown wall.
Include inline CSS + SVG; no build step. End custom editors with a
"copy back as markdown / JSON" export button.

Cost: Output token count is comparable to equivalent markdown, so prompt economics barely move — but Thariq’s own framing (per the 2026-05-31 AI-Garage breakdown) is that HTML is roughly 2-4× slower to generate because the model emits more structural markup. He treats that as acceptable because you usually generate the artifact once and a few thousand extra tag tokens are noise against a million-token context window. The win is in operator-side time-to-decision and information density, not API economics.

Integration notes:

  • Pairs with Claude Design — Claude Design is the productized canvas version of this thesis. Use Claude Design when the artifact needs iteration through five input modes (chat, voice, hover, draw, screenshot); use raw HTML output when the operator just needs the file once and gone.
  • Pairs with frontend-slides — the Decks category here is one demo; frontend-slides is the 12-preset productized version of the same idea for Karpathy-style wiki decks.
  • Pairs with beautiful-html-templates — 30-template AGENTS.md gallery by zarazhangrui. Same shape (single-file HTML), broader visual range, opinionated 6-step shortlisting flow with 3-preview comparison.
  • Pairs with Impeccable v3 — Jay Yang’s 37-pattern AI-slop catalog tells the agent what NOT to do when generating HTML; this gallery tells it what TO do.
  • Anti-pattern guard rail. Don’t ask for HTML when the actual answer is short and linear (a one-line fix, a yes/no, a quick summary). The thesis is for information-dense output where structure is load-bearing. The wall of markdown is itself a tell — if the agent is about to write 2000 words with three side-by-side sections, intercept and ask for the HTML.

Try It

  • One-click test on your next code review. Next time you ask Claude to summarize a diff, append “render this as a single-file HTML annotated PR review with margin notes, severity tags, and jump links — exactly like thariqs.github.io/html-effectiveness/03-code-review-pr.html.” Compare reading time and decision speed against the markdown version.
  • Replace your next weekly status with 11-status-report.html. Ask Claude to take your week’s git log + Linear ticket close-outs + KPI snapshot and emit a single-file weekly status with a small inline chart. Skim it once and decide whether it’s faster than the markdown version you usually send.
  • The custom-editor recipe. Pick one task this week where you wish you had a 10-minute throwaway UI — sorting 30 backlog items, tuning a prompt template, picking among 5 visual directions for a hero. Ask Claude for it as a single-file HTML editor with an export button. The pattern from 18-editor-triage-board.html / 19-editor-feature-flags.html / 20-editor-prompt-tuner.html is reusable.
  • Pair with Impeccable v3’s /critique skill. Generate a candidate HTML artifact, then run it through the Impeccable 37-pattern catalog to catch AI-slop visual defaults (gradient overuse, soft-shadow Stripe-clone aesthetics, lazy emoji icons). The combination gets you better-looking artifacts faster than either tool alone.
  • For WEO Marketly client comms. Replace plain-markdown audit deliverables with HTML status reports (a single-file site audit, an incident timeline of a deployment regression, a competitive teardown rendered as a side-by-side comparison). The artifact is more polished, but the cost is the same prompt.
  • For the karpathy wiki publication layer. Most wiki articles want to stay markdown for editability + Quartz portability. But specific high-density artifacts — Claude Code surface comparisons, the weekly digests feature matrices, the claude-vision vs claude-video comparison table — could ship as <topic>/decks/<name>.html artifacts living next to the markdown. The companion-file pattern would extend slot.

Author Walkthrough — Thariq on “How I AI” with Claire Vo (2026-05-20 refresh)

[YouTube — Claire Vo / Thariq Shihipar 2026-05-20] Thariq’s own first-person walkthrough of the pattern on Claire Vo’s How I AI podcast (Qrpm7E80wQ0), recorded at Anthropic’s first developer conference Code with Claude. New framings the standalone gallery doesn’t articulate:

  • “HTML is the new Markdown” — explicit headline thesis. Markdown was the right substrate when Opus 4 and Claude 3.5 produced ~50-line plans you’d actually read. With Opus 4.5 / 4.7 running hour-long agentic plans that produce 1000-line markdown files, “I honestly have stopped reading them. And this was honestly a mistake.” HTML preserves the read-it-yourself loop at the new context scale.
  • Compute-allocator framing. “When you say, ‘Okay, Claude can run for 8 hours,’ what you’re really saying is Claude can spend like 500 bucks.” The spec/plan/PRD is the compute-allocation decision, not a coordination artifact. Plans matter MORE as the model gets stronger because the marginal $$ of every agentic loop goes up. Claire Vo’s response is the wiki-quotable line: “You said product management is dead. What’s next? You’re a compute allocator, babe. That’s the job now.”
  • The brainstorm-in-HTML opening move. Thariq’s canonical prompt for new work: “I’m on a Claire Vo podcast. I want to do a demo. Can you brainstorm me some ideas in an HTML file?” — produced eight visual demos with mockups, risk callouts, and “why” sections. The rule of thumb stated: “I’m not going to read a longer output than the screen on Claude Code.” HTML wins because diagrams + visuals + scrollable scaffolding survive long outputs; markdown collapses.
  • The plan-in-HTML follow-up. After picking demo #8 (CSV → interactive dashboards skill), Thariq prompted “Interview me about number 8”, answered questions, then “Create an HTML file as a plan that helps me visualize what the implementation plan is. Include excerpts, mockups, code, whatever is needed to give me maximum context” (note: “excerpts” misspelled in his real prompt — caveat from the live podcast that simple prompts work). Resulting plan included: scripted podcast script, file system, sample skill.md excerpt, mood board, example components, helper scripts. “This is something that I will actually read.”
  • Throwaway custom UIs on the plan itself — “micro-software on top of micro-software.” When the HTML plan had a section Thariq didn’t like (a rendering-rules table for CSV→viz), instead of editing it through Claude Code, he prompted: “Create an HTML artifact to help me define the decision rules. Design the ideal interface for this problem.” Result: a custom gamified UI with editable fields, hide/copy/add controls, and a “copy back as markdown” export button he could paste back into the plan. The framing: “This is not even personal software. It’s like sub… It’s like micro software on top of micro software.” Maps directly to 18-editor-triage-board.html / 19-editor-feature-flags.html / 20-editor-prompt-tuner.html from the original gallery.
  • HTML as living design system. Thariq’s pattern when starting a new app in the Anthropic design system: instead of design.md, ship designsystem.html — colors, typography, spacing, radius, core components all rendered as inline samples. Point Claude at any project folder with “Find a design system here. Create an HTML artifact and pass it around.” The compressed-understanding artifact then becomes the reference for every downstream session. Claire Vo’s complementary move: ask Claude Design to extract a design system from both marketing-site repo + app repo, generate at the component level so each component has its variations + states + interaction tweaks, drop into the repo, then drive marketing collateral (download transparent PNGs from the interactive page, drop in decks / Remotion videos). Headline: “Design.md is dead. Long live design.html.”
  • Use it for cross-team communication, not just specs. Thariq sends his weekly status update to his manager (Cat) as HTML. “I get Cat to read my Slack and just create this message. And she actually gets to read it, and I don’t have to spend that much time on it.” The new competitive frame: who is building the best product to represent themselves to their manager — same compute-allocator move, applied to upward communication.
  • “Give Claude an out” prompting style. Thariq’s stated prompting heuristic: give enough information to constrain the answer, then end with “whatever is needed to give me maximum context” — explicit permission to deviate from the constraints. “I trust you here. I want to just like be in the loop with you.” Pairs with Claire Vo’s signature “I believe in you and trust you!!!” closer.
  • HTML enables build-time verification. After the plan is HTML, you can have a verification agent compare “what did I intend to do” (from the plan’s mockup section) against the actual implementation output. Markdown can’t host the mockup; HTML can. Verification gets cheaper.
  • Just-in-time documentation as antidote to spec-source-of-truth-debates. Old PM struggle: where do specs live? what template? centralized repository? Thariq’s reframe — content cost ≈ 0, consumption cost ≈ 0, discovery cost ≈ 0 (the models find it). The argument moves up the stack: “Is it a good idea? Do we feel like it’s going to be executed well?” The throwaway high-quality artifact is now affordable.

The most operationally important addition for this wiki: the designsystem.html artifact as a passable reference object that compresses an entire design system into one file you can drop into any new Claude Code project context. Pairs with Claude Design’s “extract design system from GitHub repo” feature — Claude Design generates it, then designsystem.html is the portable artifact you can pass between sessions / share with marketers / hand to Remotion. This is the practical handoff the gallery hints at without naming.

Open Questions

  • Where is the companion blog post? The page eyebrow says “Companion to the blog post” but the HTML contains no link to the originating essay. The blog post itself was not fetched in this ingest — a follow-up fetch on Thariq’s primary site (x.com handle, personal blog) would close that loop. (The 2026-05-20 podcast appearance corroborates that the gallery is a companion to a longer blog post Thariq has been working on but doesn’t link it; his X handle is @trq212.)
  • Token-cost calibration. No measurement here on whether HTML output costs more output tokens than equivalent markdown. Anecdotally similar for the same information density, but worth a 5-run benchmark on a representative task (e.g. a weekly status report) to confirm.
  • When does HTML lose? The article argues for HTML in 9 categories but doesn’t articulate the anti-cases. Partially answered by the 2026-05-31 refresh — see § Karpathy’s Endorsement + Counterargument Ledger. Explicit anti-cases now on record: machine-only files (config / agent-only, no human reader), short linear answers, and cases where noisy git diffs outweigh the readability gain. Thariq concedes diff-noise is “the real cost.” Still open: a quantified threshold for when the 2-4× generation-time penalty stops being worth it.
  • Author scope. Thariq’s broader stack and prior work weren’t probed by this ingest. Whether the gallery is part of a larger LLM-engineering body of work is open. ^[inferred] Based on the GitHub Pages handle alone, no other context fetched.
  • No measurement of operator decision speed. The qualitative claim “a document you’d actually read” is the load-bearing argument but lacks A/B data. A small in-team experiment (5 operators, 1 markdown variant vs 1 HTML variant of the same content, measure time-to-decision) would either calibrate or refute the thesis.

Refresh — Anthropic first-party ratification (2026-05-22)

[Reddit signal — r/ClaudeCode 2026-05-22 post 1tkcci7, score 91 / 49 comments] Anthropic shipped a first-party blog post on the same thesis: claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html — “The unreasonable effectiveness of HTML.” This is the official ratification of the pattern Thariq’s standalone gallery + Claire Vo podcast appearance previewed: HTML beats markdown for long agentic outputs because plans/specs/reports become “documents you’d actually read.” For wiki users, the practical implication is that ~/.claude/templates/html/ + the use-case → demo-number table in ~/.claude/CLAUDE.md § HTML-first agent output (the existing house-style infrastructure) now has Anthropic-side cover for being a default-on workflow, not a niche experiment. Source: raw/reddit-1tkcci7.md.

Karpathy’s Endorsement + Counterargument Ledger (2026-05-31 refresh)

[YouTube — AI Garage / onchainaigarage.com, A8ceddKFvdg] A creator breakdown of Thariq’s piece that adds two things the prior sections don’t: a Karpathy tweet ratifying the thesis, and an explicit objection-and-rebuttal ledger that answers this article’s own “When does HTML lose?” open question.

Karpathy’s tweet (the namesake ratification). Andrej Karpathy publicly agreed with Thariq the same week: “This works really well, by the way. At the end of your query, ask your LLM to structure responses in HTML, and view the generated file in your browser. Also had some success asking LLM to present its output as slides, etc.” He then enlarged the claim into a progression of output formats in increasing order of human-friendliness: (1) raw text → (2) Markdown (the current default) → (3) HTML (the emerging new default — graphics, layout, interactivity) → (4-6) further steps ending at “interactive neural video” (diffusion-generated, clickable). The wiki-load-bearing argument is the vision-bandwidth point: roughly a third of the human cortex is a massively parallel visual processor — “the highest-bandwidth pipe into the brain we have.” If you are optimizing for a human absorbing information, you should target the channel that can absorb at speed; walls of text can’t. This is the strongest first-principles case in the cluster for why HTML wins, and it comes from the wiki’s namesake. ^[inferred] (the AI-Garage video paraphrases the tweet; the verbatim tweet text was not separately fetched in this ingest)

The five arguments, named. The breakdown crystallizes Thariq’s case into five labeled arguments — useful as a checklist when deciding whether to ask for HTML: (1) information density — HTML’s element vocabulary (tables, inputs, CSS, images, video, embedded scripts) dwarfs markdown’s bullets-and-fenced-code ceiling, so a capable model stops improvising around the constraint with ASCII trees and shaded-unicode “color”; (2) visual clarity — tabs, sidebars, tables-of-contents, collapsible details, mobile-responsive layout make a 500-line spec finishable instead of a mindless scroll; (3) shareability — browsers don’t render markdown natively, so a markdown file forces a non-technical colleague to copy-paste into a Google Doc, screenshot it, or install a viewer, whereas an HTML file drops on any static host as a link anyone can open; (4) two-way interaction — sliders/toggles/drop-downs plus a “copy as prompt” button turn the artifact into the interface (the existing custom-editor category); (5) data ingestion — the agent in your terminal reads your filesystem, hits MCP servers (Slack, Linear), walks git history, and drives a browser, so the HTML page is grounded in your actual data, code, and conversations rather than a generic template.

The counterargument ledger (answers “When does HTML lose?”). The article’s Open Questions asked for the explicit anti-cases; the breakdown enumerates Thariq’s own concessions and rebuttals:

ObjectionThariq’s rebuttalVerdict
Costs more tokensA few thousand tag tokens are noise against million-token context windowsRebutted
Slower to produce~2-4× slower, but the artifact is dramatically more useful and you generate it onceRebutted
Harder to view (must open somewhere; markdown opens in any editor)Open locally or push to S3 for a shareable URL — neither is hardRebutted
Harder to edit (markdown you just open and tweak; HTML you must ask the agent to change, or hand-edit code)(acknowledged, no strong rebuttal)Partial
Noisy git diffs (tags/attributes churn)Conceded — “this one is the real cost.”Conceded

The single clean anti-case both the creator and Thariq agree on: machine-only files (configuration files, anything only an agent needs to read and no human will open) stay markdown — they’re trivially agent-readable and gain nothing from HTML. Combined with this article’s existing guardrail (don’t ask for HTML for short, linear, grep-bound answers), the “use markdown here” boundary is now explicit: config/agent-only files, short linear answers, and anything where noisy diffs outweigh the readability gain.

HTML as a launchpad, not a terminus. The creator’s worked example shows the escalation chain the gallery only hints at: the same MTP-vs-DeepSeek-Flash report rendered as (a) markdown (faster, but “still a lot of text” even in a proper VS Code preview, with broken ASCII diagrams), (b) single-file HTML (color-coded charts, clickable table of contents, side-by-side cards — ~2× slower but readable), then (c) an HTML slideshow with GPT Image 2 illustrations, then (d) a vertical short-form video via HyperFrames. The takeaway: HTML is the pivot format downstream production (decks, Remotion/HyperFrames video) is launched from — reinforcing the existing § Author Walkthrough point about designsystem.html as a passable reference object and pairing with HyperFrames for the video step.

The operationally useful net-new for this wiki: the counterargument ledger is now citable when an operator pushes back on HTML output, and the machine-only-files exception is the missing half of the article’s anti-pattern guardrail. Source: raw/HTML_vs._Markdown_-_Pushing_AI_Agent_Responses_to_the_Next_Level.md.