HTML Is the Canvas — One Design Skillset, Three Output Media

Source: wiki synthesis of HeyGen HyperFrames, Beautiful HTML Templates, and Frontend Design Deep Dive

HTML is no longer just a web-page format. It has become the shared authoring medium for three output media (web pages, slide decks, and video), and the same design skillset produces all three. A landing page, a deck slide, and a rendered MP4 frame are all HTML + CSS underneath, so the frontend-design skill’s rules (modular type scales, oklch color, CSS Grid, motion easing, the AI Slop Test) govern each identically. HyperFrames renders video from HTML; Beautiful HTML Templates ships decks as HTML; frontend-design is the aesthetic discipline for any HTML. The one seam, named by HyperFrames itself, is that design skills are spatial, and video needs a temporal layer bolted on top.

Key Takeaways

One canvas, three outputs. HyperFrames’ tagline is literally “Write HTML. Render video. Built for agents.” Beautiful HTML Templates is a 30-template library of decks-as-HTML. frontend-design is the master skill that decides how any HTML looks. Same source language, three deliverables.
HTML won because it’s the LLM’s native visual language. HyperFrames’ origin story starts with a JSON/XML video model that hit a dead end (“agents have no visual intelligence over JSON”) and ends with a pivot to HTML. HTML is portable, readable, and diffable: a non-engineer can review a composition line by line, which is not true of a React tree or a proprietary timeline.
One design system feeds all three media. HyperFrames’ design.md → frame.md tooling takes a brand’s webpage design guide and reformats it for video — first-party proof that the same brand system spans two media. A template’s :root CSS variables are the slide-deck instance of the same closed-system idea.
The seam is spatial vs temporal aesthetics. HyperFrames’ builder thesis: LLMs are trained on spatial (web-layout) aesthetics but under-trained on temporal (video-timing) aesthetics — which is why “PPT videos” fail as launch videos. Spatial design skills transfer through HTML; the temporal layer (timing, easing, entrance choreography) is the added cost of the video output.
Anti-slop discipline transfers too. frontend-design’s AI Slop Test (“would someone immediately believe AI made this?”) applies to a video frame as much as a web hero — the banned patterns (default fonts, purple-on-dark, equal card grids) don’t stop being tells because the HTML is destined for MP4. ^[inferred]
This is the authoring/skill-transfer story, distinct from two neighbors. Not The Edit Is Text (edit-as-code mechanics for cutting footage) and not the anti-slop taxonomy (the offender catalog) — this is HTML as the authoring surface and one design skillset producing three media.

Three outputs, one source language

The three articles describe what look like three unrelated tools, but they stack into a single HTML-authoring column:

| Article | Output medium | What it contributes |
| --- | --- | --- |
| Frontend Design | any HTML | the master design discipline: directions, type/color/layout/motion rules, banned-pattern DON’T lists, the AI Slop Test |
| Beautiful HTML Templates | slide decks | 30 closed visual systems in HTML (fonts, `:root` palette, layout grid, decorative vocabulary), with a strict preserve/replace contract |
| HyperFrames | video (MP4) | plain HTML with `data-*` timing attributes → rendered via headless Chrome + FFmpeg; multiple animation runtimes via the Frame Adapter |

frontend-design is the layer that governs how any of these look: commit to one bold aesthetic direction, use a modular type scale with clamp(), tint neutrals, build palettes on the 60/30/10 rule, use CSS Grid for expressive 2D layout, and pass the AI Slop Test. Its own scope names “web components, pages, artifacts, posters, or applications” — and a deck slide or a video frame is exactly such an HTML artifact, so the same rules carry across. ^[inferred — the skill is described as UI-focused; extending it explicitly to decks and video frames is this article’s bridge]
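As a rough illustration of the "modular type scale with clamp()" rule, the sketch below generates fluid font sizes from a base size and a ratio. The helper names, the 320px to 1280px viewport range, and the 1.25 growth factor are all assumptions made for this example; none of them come from the frontend-design skill itself.

```typescript
// Hypothetical helper, not part of any skill or library named in this article.
interface ScaleStep {
  name: string;
  css: string; // a CSS clamp() expression usable as a font-size value
}

/** Round to 3 decimals so the emitted CSS stays readable. */
function round3(value: number): number {
  return Math.round(value * 1000) / 1000;
}

/**
 * Build a modular type scale: step n has a minimum size of baseRem * ratio^n.
 * Each step is emitted as clamp(min, preferred, max), where the preferred
 * value interpolates linearly from min at minViewportPx to max at maxViewportPx.
 */
function modularScale(
  names: string[],
  baseRem: number,
  ratio: number,
  fluidGrowth = 1.25, // assumed: max size is 25% larger than min size
  minViewportPx = 320, // assumed viewport range
  maxViewportPx = 1280,
  remPx = 16, // assumed root font size
): ScaleStep[] {
  return names.map((name, step) => {
    const minRem = round3(baseRem * Math.pow(ratio, step));
    const maxRem = round3(minRem * fluidGrowth);
    // Slope in px of font size per px of viewport width.
    const slope = ((maxRem - minRem) * remPx) / (maxViewportPx - minViewportPx);
    const interceptRem = round3((minRem * remPx - slope * minViewportPx) / remPx);
    const slopeVw = round3(slope * 100);
    return {
      name,
      css: `clamp(${minRem}rem, ${interceptRem}rem + ${slopeVw}vw, ${maxRem}rem)`,
    };
  });
}

// Example: a three-step scale on a 1.25 ratio.
const scale = modularScale(["body", "h3", "h2"], 1, 1.25);
```

Because the output is plain CSS, the same generated values could be dropped into a page, a deck template, or a video frame stylesheet without change.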

Beautiful HTML Templates is the deck instance of the same design-system-in-HTML idea. Each template is a “closed visual system” (Google Fonts, a :root CSS-variable palette, a layout grid, a decorative vocabulary), and the agent must preserve those tokens while replacing only content; a missing layout must be “designed in-system,” never grafted from another template. That is a design system expressed as HTML, and it comes from the same author (zarazhangrui) whose Frontend Slides skill produces single-file HTML decks.

HyperFrames proves the surprising third medium: the video composition is HTML. The tagline “Write HTML. Render video.” is meant literally: timing lives in data-start / data-duration attributes, there is no proprietary DSL and no React components, and the result is rendered to MP4. Every design decision that would style a web page (type, color, layout) styles the frame.
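To make "timing lives in attributes" concrete, here is a minimal sketch that turns a list of timed clips into HTML elements. Only the data-start and data-duration attribute names come from the source. The Clip shape, the helper names, the choice of a div element, and the assumption that values are in seconds are hypothetical, and this is not HyperFrames' actual composition schema.

```typescript
// Hypothetical data shape for illustration only.
interface Clip {
  id: string;
  text: string;
  start: number; // assumed unit: seconds
  duration: number; // assumed unit: seconds
}

/** Escape text so it is safe inside HTML content and attribute values. */
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

/** Render one clip as an element carrying its timing in data-* attributes. */
function clipToHtml(clip: Clip): string {
  return (
    `<div id="${escapeHtml(clip.id)}" data-start="${clip.start}" ` +
    `data-duration="${clip.duration}">${escapeHtml(clip.text)}</div>`
  );
}

/** The composition ends when the last clip finishes. */
function compositionEnd(clips: Clip[]): number {
  return Math.max(0, ...clips.map((clip) => clip.start + clip.duration));
}

const clips: Clip[] = [
  { id: "title", text: "Ship <fast>", start: 0, duration: 2.5 },
  { id: "tagline", text: "One canvas, three outputs", start: 2, duration: 3 },
];
const markup = clips.map(clipToHtml).join("\n");
```

The point of the sketch is that the result is ordinary markup: it can be read, diffed, and styled with the same CSS as a page.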

One design system, three media

The strongest evidence that this is one skillset and not three is that a single brand system already feeds multiple media in first-party tooling:

design.md → frame.md (from the HyperFrames builder interview): design.md is the brand guideline for webpages; frame.md is the same guideline reformatted for video — maximize the frame, larger elements, motion. It’s now first-party tooling at hyperframes.dev/design, and the builders call it the key brand-matching input. One brand system, deliberately projected onto two media.
Template :root variables (Beautiful HTML Templates): the deck’s fonts and palette live in CSS variables the agent must keep intact — the same “define the system once, replace only content” move, scoped to slides.
frontend-design’s committed direction is the throughline: pick one bold aesthetic and let it govern every decision. Whether the HTML becomes a page, a slide, or a frame, the direction, the banned patterns, and the type/color system are the same. ^[inferred]

The payoff is brand consistency by construction: define the design tokens once and ship a landing page, a pitch deck, and a promo video that visibly belong to the same brand — because they were authored from the same HTML/CSS system. ^[inferred]
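A small sketch of "define the tokens once" follows: one token object is projected into a :root block per medium. The token names, the oklch values, and the per-medium size multipliers are illustrative assumptions. The source says only that frame.md favors larger elements; it does not give numbers.

```typescript
type Medium = "page" | "deck" | "frame";

// Hypothetical token shape; real design.md files may carry far more.
interface BrandTokens {
  fontDisplay: string;
  colorBg: string;
  colorAccent: string;
  baseSizePx: number;
}

// Assumed multipliers: slides and video frames use larger type than pages.
const SIZE_MULTIPLIER: Record<Medium, number> = { page: 1, deck: 1.5, frame: 2 };

/** Emit a :root block: fonts and colors are shared, only the size changes. */
function rootBlock(tokens: BrandTokens, medium: Medium): string {
  const sizePx = tokens.baseSizePx * SIZE_MULTIPLIER[medium];
  return [
    ":root {",
    `  --font-display: ${tokens.fontDisplay};`,
    `  --color-bg: ${tokens.colorBg};`,
    `  --color-accent: ${tokens.colorAccent};`,
    `  --size-base: ${sizePx}px;`,
    "}",
  ].join("\n");
}

const brand: BrandTokens = {
  fontDisplay: '"Fraunces", serif',
  colorBg: "oklch(0.2 0.02 260)",
  colorAccent: "oklch(0.75 0.18 60)",
  baseSizePx: 18,
};

const pageCss = rootBlock(brand, "page");
const deckCss = rootBlock(brand, "deck");
const frameCss = rootBlock(brand, "frame");
```

This is a one-way projection per medium, which matches how the design.md → frame.md reformat is described; it does not claim a lossless round trip.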

The seam: spatial skills transfer, temporal skills don’t (yet)

The transfer is not free, and HyperFrames is unusually candid about why. Its builders’ spatial-vs-temporal-aesthetics thesis is that LLMs are trained on spatial (web-layout) aesthetics but under-trained on temporal (video-timing) aesthetics. That gap is the reason “PPT videos” fail as launch videos, and it is why HeyGen builds internal temporal evals and self-check loops (open-sourced into the skills) and is “working with frontier labs to train for it.”

So the video output needs a layer the web and deck outputs don’t:

A timing/easing vocabulary. HyperFrames adds natural-language easing (smooth, snappy, bouncy, dramatic → GSAP power2.out, back.out, expo.out) and timing shorthand (fast 0.2s / cinematic 1–2s) — a temporal design language on top of the spatial one.
Motion rules the skill enforces — entrance animations on every scene, scene-to-scene transitions, synchronous timeline construction — that have no equivalent in a static page.
frontend-design’s own Motion section (exponential easing, 150–300ms micro-interactions, 500–800ms entrances, transform/opacity over layout properties) is the closest bridge the spatial skillset already offers into that temporal layer — but HyperFrames’ timeline/keyframe model goes further than any static-UI motion guidance. ^[inferred]

Put simply: HTML carries your spatial taste into video intact; the temporal taste is the new thing you (or the tool’s evals) have to supply.
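One way to picture a temporal self-check is a tiny lint over declared animations. The sketch below uses only the numbers quoted above from frontend-design's Motion section (150–300ms micro-interactions, 500–800ms entrances, transform/opacity over layout properties). The Motion shape and function names are hypothetical, and this is not how HeyGen's internal evals work; a video scene may also legitimately run longer than these UI ranges.

```typescript
type MotionKind = "micro" | "entrance";

// Hypothetical description of one animation.
interface Motion {
  name: string;
  kind: MotionKind;
  durationMs: number;
  properties: string[]; // CSS properties being animated
}

// Ranges quoted from frontend-design's Motion section.
const DURATION_RANGE_MS: Record<MotionKind, [number, number]> = {
  micro: [150, 300],
  entrance: [500, 800],
};

// Properties that can animate without triggering layout.
const PREFERRED_PROPERTIES = ["transform", "opacity"];

/** Return human-readable problems; an empty array means the motion passes. */
function lintMotion(motion: Motion): string[] {
  const problems: string[] = [];
  const [minMs, maxMs] = DURATION_RANGE_MS[motion.kind];
  if (motion.durationMs < minMs || motion.durationMs > maxMs) {
    problems.push(
      `${motion.name}: ${motion.durationMs}ms is outside ${minMs}-${maxMs}ms for ${motion.kind}`,
    );
  }
  for (const property of motion.properties) {
    if (PREFERRED_PROPERTIES.indexOf(property) === -1) {
      problems.push(`${motion.name}: animates "${property}"; prefer transform/opacity`);
    }
  }
  return problems;
}

const report = lintMotion({
  name: "hero-title",
  kind: "entrance",
  durationMs: 1200,
  properties: ["opacity", "height"],
});
```

A check like this covers only the part of temporal taste that can be written as a range; choreography and pacing across scenes would still need human or eval judgment.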

Why this is one skillset, not three

The three media share the same CSS, the same type/color/layout discipline, the same anti-slop gate, and the same agent-authoring contract (an AGENTS.md or an installed skill telling the agent the rules before it writes a line). Learn HTML design once, and you can ship three media from it. Two nearby connection articles cover adjacent ground, and this one deliberately does not duplicate them:

Not The Edit Is Text. That article is about edit-as-code mechanics — transcripts, EDLs, cheap re-transcribe verification — for cutting existing footage. This one is about HTML as the authoring medium for generating designed content, and about design-skill transfer. HyperFrames appears in both, but there it’s the render target for an edit loop; here it’s proof that design-HTML becomes video. ^[inferred]
Not the anti-slop defense layers. That article is the slop taxonomy (prescribe → generate → detect, the convergence table of offenders). Here, anti-slop is just one of the shared design disciplines that rides HTML across all three media, not the subject itself.

Try It

Author one brand system, render three media. Write a minimal design.md (fonts, palette, one bold direction per frontend-design). Build a landing section from it, a deck cover from a Beautiful HTML Templates template with those tokens, and a 10s HyperFrames intro via design.md → frame.md. Confirm they read as one brand.
Run the AI Slop Test on a video frame. Screenshot a rendered HyperFrames still and apply frontend-design’s gate — default font, purple-on-dark, or centered symmetry are tells in an MP4 frame just as on a web page.
Feel the spatial/temporal seam directly. Take a static HTML layout you like and ask an agent to animate it as a HyperFrames scene. Where it looks like a “moving PowerPoint,” that’s the temporal-aesthetics gap the builders name — fix it with the easing/timing vocabulary, not more layout work.
Reuse, don’t re-typeset. Point the agent at a prior deck template or a HyperFrames launch-video component library and pull a named element into a new medium — the closed-system tokens travel; only the content changes.

Related

HeyGen HyperFrames — video authored as HTML; the design.md→frame.md tooling and the spatial-vs-temporal thesis.
Beautiful HTML Templates — decks as closed HTML design systems; the preserve/replace and design-in-system contract.
Frontend Design Deep Dive — the master design discipline governing any HTML output.
Frontend Slides — same author’s single-file HTML deck skill; the deck sibling to the template library.
The Edit Is Text — the adjacent edit-as-code mechanics story (cutting footage), distinct from this authoring/skill-transfer one.
Anti-AI-Slop Defense Layers — the slop taxonomy; here anti-slop is one shared discipline, not the subject.
Remotion Motion Graphics — the React-based motion alternative HyperFrames positions HTML against.
Open Design — agent-consumable design systems (DESIGN.md) spanning web and decks; the multi-medium design-system pattern.

Open Questions

How much of the spatial→temporal gap closes with model training vs tooling? HeyGen is “working with frontier labs to train for” temporal aesthetics; whether video-timing taste becomes a native model skill (like layout) or stays a tool-supplied eval layer is unresolved.
Does the same design system round-trip losslessly across all three media? design.md→frame.md is a one-way reformat; no source demonstrates a single token set edited once and reflected automatically in web + deck + video without per-medium reformatting. ^[inferred]
Where does the AI Slop Test need temporal extensions? frontend-design’s banned patterns are spatial; the video analog (generic transitions, default-timed reveals, “PPT motion”) is implied by HyperFrames’ evals but not codified as a slop checklist anywhere in the sources. ^[inferred]

Jonathon's AI Wiki
