HyperFrames Rendering

Source: Rendering — HeyGen HyperFrames docs

HyperFrames renders HTML compositions to video through a frame-by-frame, seek-driven pipeline: bundled Chromium (Puppeteer) captures each frame, then FFmpeg encodes the result. This page is the render-and-ops deep dive for the HeyGen HyperFrames cluster: output formats, the full render flag surface, the local-vs-Docker determinism tradeoff, GPU/worker/concurrency tuning, batch runs, and transparent-overlay output. For the broader CLI command set, see HyperFrames Quickstart & CLI.

Key Takeaways

Five formats from one command. --format chooses MP4 (H.264), MOV (ProRes 4444, alpha), WebM (VP9, alpha), GIF, or PNG sequence (RGBA, lossless). Default is MP4.
Local mode (default) is fast but non-reproducible. Bundled Chromium + system FFmpeg; output varies across machines due to font and Chrome-version differences. Not safe for CI.
Docker mode (--docker) is the determinism path. Pinned Chrome + font set + FFmpeg yield identical output on every platform, via chrome-headless-shell + BeginFrame frame-perfect capture. Use it for CI/CD, team-shared renders, and AI-agent-driven rendering.
Quality is a CRF preset. draft (CRF 28 / ultrafast), standard (CRF 18 / medium — visually lossless at 1080p), high (CRF 15 / slow). Override with --crf or --video-bitrate (mutually exclusive).
Two independent GPU surfaces. --gpu = hardware FFmpeg encode (NVENC/VideoToolbox/AMF/VAAPI/QSV); browser GPU = local Chrome/WebGL capture (on locally, off in Docker; --no-browser-gpu to opt out).
Parallelism has two knobs. Workers (one Chrome process per worker, ~256 MB each; default = half cores capped at 4) and a producer-server concurrency semaphore (--max-concurrent-renders, default 2, FIFO queue with a pollable GET /render/queue).
Transparent overlays: MOV ProRes 4444 for editors, WebM VP9 for Chromium browsers only, PNG sequence for lossless compositing — and leave html/body background unset.
The render path is explicitly agent-friendly: determinism via Docker, batch preflight + manifest.json, queue polling, and --json/SSE progress streaming.

Output formats

--format selects the container/codec (default mp4); --fps sets frame rate (24/30/60, default 30).

| Format | Codec | Alpha | Editors | Browsers | Size |
| --- | --- | --- | --- | --- | --- |
| MP4 | H.264 | No | All | All | Small |
| MOV | ProRes 4444 | Yes | CapCut, Final Cut, Premiere, DaVinci, After Effects | No | Large |
| WebM | VP9 | Yes | None (alpha → black) | Chrome, Firefox | Small |
| PNG sequence | RGBA PNGs (no encode) | Yes (lossless) | After Effects, Nuke, Fusion | No | Largest |
| GIF | palette | 1-bit only | n/a | All | Large |

MP4 / H.264 — the universal default; smallest; no transparency.
MOV / ProRes 4444 — the industry-standard transparent format; large files (5-40 MB for short clips) because ProRes is an editing intermediate, not a delivery codec.
WebM / VP9 — small with alpha, but only Chromium browsers decode the alpha (editors and Safari show it black) — browser playback only.
GIF — for autoplay inline in PRs/READMEs/docs; two-pass palette encode (palettegen + paletteuse, Sierra dithering); capped at 30fps; no audio, 1-bit transparency; much larger than MP4/WebM, so keep comps short.
PNG sequence — --output is a directory of zero-padded frame_NNNNNN.png; lossless; for AE/Nuke/Fusion or a custom encode.

--video-frame-format (auto/jpg/png) controls how source video layers are extracted before capture — use png for UI recordings, screen captures, or color-sensitive footage; auto uses PNG for alpha sources and JPG for opaque ones.
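The format guidance above can be condensed into a small lookup. This is only an illustrative sketch: `FORMATS`, `FormatInfo`, `pick_format`, and the target labels (`editor`, `browser`, `compositor`) are hypothetical names invented here, and the rules simply restate this page's table rather than anything HyperFrames exposes.

```python
from typing import NamedTuple


class FormatInfo(NamedTuple):
    codec: str
    full_alpha: bool          # GIF's 1-bit transparency is treated as "no"
    plays_in_browsers: bool


# Values copied from the output-formats table on this page.
FORMATS = {
    "mp4": FormatInfo("H.264", False, True),
    "mov": FormatInfo("ProRes 4444", True, False),
    "webm": FormatInfo("VP9", True, True),
    "gif": FormatInfo("palette", False, True),
    "png-sequence": FormatInfo("RGBA PNGs", True, False),
}


def pick_format(needs_alpha: bool, target: str) -> str:
    """Return a --format value for a target of 'editor', 'browser', or 'compositor'."""
    if not needs_alpha:
        return "mp4"            # universal default, smallest files
    if target == "browser":
        return "webm"           # small, alpha decoded in-browser only
    if target == "compositor":
        return "png-sequence"   # lossless RGBA frames for AE/Nuke/Fusion
    return "mov"                # ProRes 4444 for video editors


print(pick_format(needs_alpha=True, target="editor"))
```

The returned string is meant to be passed straight to `--format`.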

Local rendering

The default mode — bundled Chromium captures frames, system FFmpeg encodes:

npx hyperframes doctor    # verify deps: Node 22, FFmpeg/FFprobe 7.x, bundled Chrome, Docker
npx hyperframes preview   # eyeball the composition first
npx hyperframes render --output output.mp4

A 240-frame 1080p clip renders in ~16s in the docs example (8.2s capture + 8.0s encode). Local mode has fast startup and can use the system GPU for both capture and (--gpu) encode — best for iteration. The cost: output varies across machines (fonts, Chrome version), so it is not safe for pipelines that require reproducibility.
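To put the docs example in perspective, the timing figures can be turned into throughput numbers. The arithmetic below uses only the values quoted above; the clip duration additionally assumes the default `--fps 30`, which the example does not state explicitly.

```python
frames = 240
capture_s = 8.2
encode_s = 8.0

total_s = capture_s + encode_s            # wall-clock render time
capture_fps = frames / capture_s          # frames captured per second
overall_fps = frames / total_s            # frames delivered per second end to end
clip_s = frames / 30                      # clip length, assuming the default 30 fps
realtime_factor = total_s / clip_s        # render seconds per second of video

print(f"total render time: {total_s:.1f} s")
print(f"capture throughput: {capture_fps:.1f} frames/s")
print(f"overall throughput: {overall_fps:.1f} frames/s")
print(f"render cost: {realtime_factor:.2f}x clip duration")
```

Under that assumption the clip is 8 s long, so the local render costs roughly twice real time, split almost evenly between capture and encode.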

Render flags

| Flag | Values | Default | Notes |
| --- | --- | --- | --- |
| `--output` | path | `renders/.mp4` | Output file path (a template in batch mode) |
| `--format` | mp4, mov, webm, gif, png-sequence | mp4 | Container/codec |
| `--fps` | 24, 30, 60 | 30 | Frame rate |
| `--gif-loop` | 0-65535 | 0 | GIF loop count (`0` = forever) |
| `--quality` | draft, standard, high | standard | CRF preset |
| `--crf` | 0-51 | none | Override CRF (lower = better). Excludes `--video-bitrate` |
| `--video-bitrate` | e.g. `10M`, `5000k` | none | Target bitrate. Excludes `--crf` |
| `--video-frame-format` | auto, jpg, png | auto | Source-video frame extraction |
| `--workers` | 1-8 or `auto` | auto | Parallel capture workers |
| `--max-concurrent-renders` | 1-10 | 2 | Producer-server simultaneous renders |
| `--batch` | path | none | JSON rows → one output per row |
| `--batch-concurrency` | integer | 1 | Batch rows rendered at once |
| `--batch-fail-fast` | (no value) | off | Stop after first row failure |
| `--gpu` | (no value) | off | Hardware FFmpeg encode |
| `--browser-gpu` / `--no-browser-gpu` | (no value) | on locally / off in Docker | Host GPU for Chrome/WebGL capture |
| `--hdr` / `--sdr` | (no value) | off | Force HDR/SDR output (HDR is MP4-only) |
| `--docker` | (no value) | off | Deterministic Docker render |
| `--quiet` | (no value) | off | Suppress verbose output |

Quality presets

| Preset | CRF | x264 preset | Best for |
| --- | --- | --- | --- |
| `draft` | 28 | ultrafast | Quick previews, iteration |
| `standard` | 18 | medium | General use: visually lossless at 1080p |
| `high` | 15 | slow | Final delivery, near-lossless |

The default standard (CRF 18) is visually lossless at 1080p. For finer control use --crf or --video-bitrate — they override the preset and cannot be combined.
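A wrapper script can enforce the "one override, not both" rule before the CLI is ever invoked. The helper below is a hypothetical sketch (`PRESETS` and `build_render_args` are names made up here); it only assembles an argument list from flags listed on this page and does not reflect how HyperFrames validates flags internally.

```python
from typing import List, Optional

# Preset -> (CRF, x264 preset), copied from the quality presets table.
PRESETS = {
    "draft": (28, "ultrafast"),
    "standard": (18, "medium"),
    "high": (15, "slow"),
}


def build_render_args(
    output: str,
    quality: str = "standard",
    crf: Optional[int] = None,
    video_bitrate: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `npx hyperframes render`."""
    if quality not in PRESETS:
        raise ValueError(f"unknown quality preset: {quality!r}")
    if crf is not None and video_bitrate is not None:
        raise ValueError("--crf and --video-bitrate cannot be combined")
    if crf is not None and not 0 <= crf <= 51:
        raise ValueError("--crf must be between 0 and 51")

    args = ["npx", "hyperframes", "render", "--output", output, "--quality", quality]
    if crf is not None:
        args += ["--crf", str(crf)]
    if video_bitrate is not None:
        args += ["--video-bitrate", video_bitrate]
    return args


print(" ".join(build_render_args("out.mp4", quality="high", crf=12)))
```

The resulting list can be handed to `subprocess.run(args, check=True)` on a machine where the CLI is installed.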

Workers (capture parallelism)

Each worker is a separate Chrome process (~256 MB RAM + significant CPU).
Default = half your cores, capped at 4 (M1 Air 8c → 4; M3 Pro 12c → 4 capped; 4c laptop → 2; 2c VM → 1).
Use 1 worker for short comps (<2s / 60 frames), low-RAM machines (≤4 GB), or when running alongside other heavy jobs. Increase for 30s+ comps on 8+ core / 16 GB+ boxes, dedicated render machines, or CI.
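The default-worker rule is easy to reproduce when sizing a machine. This sketch assumes integer division for "half your cores" and a floor of one worker, both inferred from the examples above rather than confirmed from the HyperFrames source; the function names are hypothetical.

```python
import os

MAX_DEFAULT_WORKERS = 4     # cap stated on this page
RAM_PER_WORKER_MB = 256     # approximate per-worker figure from this page


def default_workers(cores: int) -> int:
    """Half the cores, capped at 4, never fewer than 1."""
    return max(1, min(MAX_DEFAULT_WORKERS, cores // 2))


def estimated_worker_ram_mb(workers: int) -> int:
    """Rough RAM needed by the Chrome worker processes alone."""
    return workers * RAM_PER_WORKER_MB


cores = os.cpu_count() or 1
workers = default_workers(cores)
print(f"{cores} cores -> {workers} workers, ~{estimated_worker_ram_mb(workers)} MB for Chrome")
```

The estimate covers the capture workers only; FFmpeg and the composition's own assets come on top of it.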

Docker rendering (the deterministic path)

npx hyperframes render --docker --output output.mp4

Docker mode is the determinism contract. It pins an exact Chrome version, font set, and FFmpeg so the same composition produces identical output on every platform. Under the hood it uses chrome-headless-shell with BeginFrame control for frame-perfect, deterministic capture, and the browser capture stays on the deterministic software-GL path (no host-GPU variance). The full mechanism is documented on the determinism concept page that this guide links — see HyperFrames Core Concepts.

When to pick Docker over local:

| Scenario | Mode |
| --- | --- |
| Local development / iteration | Local |
| Quick preview export | Local |
| Benchmarking performance | Local |
| CI/CD pipeline | Docker |
| Sharing renders with a team | Docker |
| AI agent-driven rendering | Docker |

Tradeoffs: slower startup (container init); GPU encode needs Docker-host GPU passthrough and is not cross-platform on Docker Desktop; browser GPU is off in Docker by design. Determinism is exactly what makes Docker the right default for subagent-driven and scheduled renders — same inputs, same MP4, no flaky replays.
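One way to check the "same inputs, same MP4" claim in a pipeline is to hash two renders of the same composition and compare digests. This is a minimal sketch under the assumption that "identical output" means byte-identical files; if the container embeds timestamps or other metadata, a frame-level comparison would be needed instead. The function names and file paths are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large renders never load fully into memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def renders_match(first: str, second: str) -> bool:
    """True when both files have the same SHA-256 digest."""
    return sha256_of(first) == sha256_of(second)
```

For example, `renders_match("run1/output.mp4", "run2/output.mp4")` could gate a CI job after two `--docker` renders on different runners.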

GPU acceleration

Two independent surfaces:

--gpu — hardware video encoder in FFmpeg when available: VideoToolbox (macOS), NVENC (NVIDIA), AMD AMF (Windows), VAAPI (Linux), Intel QSV. Can meaningfully speed up encoding of many-frame comps.
Browser GPU — host GPU for local Chrome/WebGL capture; on automatically for local renders, off in Docker; --no-browser-gpu to opt out. Maps to Metal (macOS) / D3D11 (Windows) / EGL (Linux), and is local-mode only.

Use --no-browser-gpu or Docker when cross-machine reproducibility matters more than local speed.

Batch and concurrent renders (ops)

Batch — one render per data row, for templated/variant output:

npx hyperframes render --batch rows.json --output "renders/{name}.mp4" --strict-variables

--output is a template ({index} or any scalar row key). HyperFrames preflights the whole batch — malformed rows, missing placeholders, duplicate output paths, and strict-variable mismatches fail before the first video. A manifest.json records per-row status, path, render time, duration, and errors. Rows continue past failures by default; --batch-fail-fast stops after the first; --json streams progress.
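The kind of check the preflight performs can be illustrated in a few lines. The sketch below is not HyperFrames' implementation: it assumes `rows.json` is a JSON array of flat objects, that `{index}` is the zero-based row position, and that any other `{key}` must be a scalar row value. `expand_output` and `preflight` are hypothetical names.

```python
import re
from typing import Any, Dict, List

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand_output(template: str, row: Dict[str, Any], index: int) -> str:
    """Fill {index} and {row_key} placeholders in an output template."""
    def substitute(match):
        key = match.group(1)
        if key == "index":
            return str(index)
        if key not in row:
            raise ValueError(f"missing placeholder value for {{{key}}}")
        value = row[key]
        if isinstance(value, (dict, list)):
            raise ValueError(f"placeholder {{{key}}} is not a scalar")
        return str(value)

    return PLACEHOLDER.sub(substitute, template)


def preflight(template: str, rows: List[Any]) -> List[str]:
    """Collect every problem up front instead of failing mid-batch."""
    errors: List[str] = []
    seen: Dict[str, int] = {}
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            errors.append(f"row {index}: malformed row (expected an object)")
            continue
        try:
            path = expand_output(template, row, index)
        except ValueError as exc:
            errors.append(f"row {index}: {exc}")
            continue
        if path in seen:
            errors.append(f"row {index}: duplicate output path {path} (same as row {seen[path]})")
        else:
            seen[path] = index
    return errors


rows = [{"name": "intro"}, {"name": "outro"}, {"name": "intro"}]
print(preflight("renders/{name}.mp4", rows))
```

Running it on the sample rows reports the third row as a duplicate of the first, which is the class of error the real preflight is described as catching before the first video renders.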

Concurrency — the producer server gates simultaneous renders (each spawns its own Chrome workers) with a request-level semaphore: only --max-concurrent-renders (default 2) run at once, the rest wait FIFO.

Set via --max-concurrent-renders N or env PRODUCER_MAX_CONCURRENT_RENDERS.
GET /render/queue → {maxConcurrentRenders, activeRenders, queuedRenders}; agents poll it to decide submit-vs-wait.
POST /render/stream emits a queued event ({"type":"queued","position":N}) so agents can report queue position instead of looking stuck.
Recommended limit by box: 4c → 1, 8c → 2, 16c → 3-4, 32c → 5-6. When unsure, use 1 (renders queue and each gets full CPU).
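An agent's submit-vs-wait decision can be sketched from the queue payload alone. The field names come from the `GET /render/queue` response described above; treating them as integer counts, the `base_url` argument, and the function names are assumptions, since this page does not give the producer server's address or a full response schema.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen


def has_free_slot(queue: Dict[str, Any]) -> bool:
    """Submit only when nothing is waiting and a render slot is open."""
    return (
        queue["queuedRenders"] == 0
        and queue["activeRenders"] < queue["maxConcurrentRenders"]
    )


def fetch_queue(base_url: str, timeout: float = 5.0) -> Dict[str, Any]:
    """Read the queue status from a producer server at base_url (hypothetical address)."""
    with urlopen(f"{base_url}/render/queue", timeout=timeout) as response:
        return json.load(response)


sample = {"maxConcurrentRenders": 2, "activeRenders": 2, "queuedRenders": 1}
print("submit now" if has_free_slot(sample) else "wait")
```

Because the server queues FIFO anyway, waiting is a courtesy rather than a requirement: an agent that submits while the queue is full will simply receive a `queued` event with its position.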

Transparent video (overlays)

For lower thirds, subscribe cards, and elements composited over other footage:

MOV / ProRes 4444 (--format mov) — works in every major editor; large files (expected, same tradeoff as Remotion).
WebM / VP9 alpha (--format webm) — small, but only Chromium browsers decode the alpha; editors and Safari render it black. Browser playback only.
PNG sequence (--format png-sequence) — a directory of RGBA frame_NNNNNN.png (+ an audio.aac sidecar when the comp has audio); lossless for AE/Nuke/Fusion.

How alpha output works: HyperFrames captures each frame as a PNG with alpha (not JPEG), sets Chrome’s page background transparent via Emulation.setDefaultBackgroundColorOverride, then encodes with an alpha-capable codec (ProRes 4444 / VP9); png-sequence skips encoding. Your composition’s HTML must leave html/body background unset. Verify with a checkerboard page in Chrome, an editor track above other footage, or rotato.app/tools/transparent-video.
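For a `png-sequence` render, a quick scripted sanity check is to confirm that a frame actually declares an alpha channel. The sketch below reads the PNG IHDR header, where color type 6 is RGBA and 4 is grayscale with alpha. It only proves the channel exists, not that any pixel is transparent, and it ignores palette images with a tRNS chunk. The function name is hypothetical.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
ALPHA_COLOR_TYPES = {4, 6}   # grayscale+alpha, truecolor+alpha (RGBA)


def png_has_alpha_channel(header: bytes) -> bool:
    """Inspect the first 26 bytes of a PNG file for an alpha-capable color type.

    Layout: 8-byte signature, 4-byte chunk length, 4-byte 'IHDR' tag,
    4-byte width, 4-byte height, 1-byte bit depth, 1-byte color type.
    """
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    return header[25] in ALPHA_COLOR_TYPES
```

Usage would look like `png_has_alpha_channel(open(frame_path, "rb").read(26))`, where `frame_path` points at any `frame_NNNNNN.png` in the output directory.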

Audio

Audio handling is format-dependent at render time:

GIF carries no audio (and only 1-bit transparency).
PNG sequence has no container, so when the composition has audio HyperFrames writes a separate audio.aac sidecar next to the frame PNGs.
No audio-mixing or level flags appear in the render options — audio is part of the composition, authored upstream, not a render-time control.
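When a PNG sequence is destined for a custom encode, the frames and the sidecar have to be recombined outside HyperFrames. The sketch below builds a generic FFmpeg command for that; it is not a HyperFrames feature. It assumes six-digit frame numbering (from `frame_NNNNNN.png`), a frame rate matching the render's `--fps`, numbering that starts at or near zero, and an `audio.aac` that FFmpeg can stream-copy into a MOV container. The function name and paths are hypothetical.

```python
from pathlib import Path
from typing import List


def mux_sequence_args(frames_dir: str, output: str, fps: int = 30) -> List[str]:
    """Build an ffmpeg argv that encodes RGBA frames to ProRes 4444 and copies the sidecar audio."""
    frames = Path(frames_dir)
    return [
        "ffmpeg",
        "-framerate", str(fps),                      # must match the --fps used for the render
        "-i", str(frames / "frame_%06d.png"),        # six-digit, zero-padded frame names
        "-i", str(frames / "audio.aac"),             # sidecar written next to the frames
        "-c:v", "prores_ks", "-profile:v", "4444",   # alpha-capable ProRes profile
        "-pix_fmt", "yuva444p10le",                  # keep the alpha plane
        "-c:a", "copy",                              # no audio re-encode
        "-shortest",
        output,
    ]


print(" ".join(mux_sequence_args("renders/overlay", "overlay.mov")))
```

For a composition without audio, the second `-i` and the `-c:a copy` arguments would need to be dropped, since no sidecar is written in that case.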

Cloud rendering

This guide covers local and Docker rendering. Its Next Steps reference a hosted option — “render on HeyGen’s hosted cloud — no local Chrome or FFmpeg” — but the cloud-render backend flags (HeyGen-hosted plus AWS Lambda deployment) are documented on the CLI page, not on this one. See HyperFrames Quickstart & CLI for the cloud and Lambda render surface.

Try It

Run npx hyperframes doctor to confirm Node 22 + FFmpeg/FFprobe 7.x + bundled Chrome + Docker, then npx hyperframes render --output out.mp4 for a baseline local render.
Re-render the same composition with --docker and diff the two MP4s to feel the local-vs-deterministic difference before wiring renders into CI or an agent.
Export a transparent lower third: npx hyperframes render --format mov --output overlay.mov (leave html/body background unset) and drop it on a track above footage in CapCut or Premiere.
For a shorts pipeline, drive --batch rows.json --json and have your agent poll GET /render/queue to pace submissions.
Run npx hyperframes benchmark to find the worker/quality sweet spot for your machine.
