Source: Rendering — HeyGen HyperFrames docs
HyperFrames renders HTML compositions to video through a frame-by-frame, seek-driven pipeline: bundled Chromium (Puppeteer) captures each frame, then FFmpeg encodes. This is the render-and-ops deep-dive for the HeyGen Hyperframes cluster — output formats, the full render flag surface, the local-vs-Docker determinism tradeoff, GPU/worker/concurrency tuning, batch runs, and transparent-overlay output. For the broader CLI command set see HyperFrames Quickstart & CLI.
Key Takeaways
- Five formats from one command. `--format` chooses MP4 (H.264), MOV (ProRes 4444, alpha), WebM (VP9, alpha), GIF, or PNG sequence (RGBA, lossless). Default is MP4.
- Local mode (default) is fast but non-reproducible. Bundled Chromium + system FFmpeg; output varies across machines due to font and Chrome-version differences. Not safe for CI.
- Docker mode (`--docker`) is the determinism path. Pinned Chrome + font set + FFmpeg yield identical output on every platform, via `chrome-headless-shell` + BeginFrame frame-perfect capture. Use it for CI/CD, team-shared renders, and AI-agent-driven rendering.
- Quality is a CRF preset. `draft` (CRF 28 / ultrafast), `standard` (CRF 18 / medium, visually lossless at 1080p), `high` (CRF 15 / slow). Override with `--crf` or `--video-bitrate` (mutually exclusive).
- Two independent GPU surfaces. `--gpu` = hardware FFmpeg encode (NVENC/VideoToolbox/AMF/VAAPI/QSV); browser GPU = local Chrome/WebGL capture (on locally, off in Docker; `--no-browser-gpu` to opt out).
- Parallelism has two knobs. Workers (one Chrome process per worker, ~256 MB each; default = half cores capped at 4) and a producer-server concurrency semaphore (`--max-concurrent-renders`, default 2, FIFO queue with a pollable `GET /render/queue`).
- Transparent overlays: MOV ProRes 4444 for editors, WebM VP9 for Chromium browsers only, PNG sequence for lossless compositing; leave `html`/`body` background unset.
- The render path is explicitly agent-friendly: determinism via Docker, batch preflight + `manifest.json`, queue polling, and `--json`/SSE progress streaming.
Output formats
--format selects the container/codec (default mp4); --fps sets frame rate (24/30/60, default 30).
| Format | Codec | Alpha | Editors | Browsers | Size |
|---|---|---|---|---|---|
| MP4 | H.264 | No | All | All | Small |
| MOV | ProRes 4444 | Yes | CapCut, Final Cut, Premiere, DaVinci, After Effects | No | Large |
| WebM | VP9 | Yes | None (alpha → black) | Chrome, Firefox | Small |
| PNG sequence | RGBA PNGs (no encode) | Yes (lossless) | After Effects, Nuke, Fusion | No | Largest |
| GIF | palette | 1-bit only | — | All | Large |
- MP4 / H.264: the universal default; smallest; no transparency.
- MOV / ProRes 4444: the industry-standard transparent format; large files (5-40 MB for short clips) because ProRes is an editing intermediate, not a delivery codec.
- WebM / VP9: small with alpha, but only Chromium browsers decode the alpha (editors and Safari show it black), so it is for browser playback only.
- GIF: for autoplay inline in PRs/READMEs/docs; two-pass palette encode (`palettegen` + `paletteuse`, Sierra dithering); capped at 30fps; no audio, 1-bit transparency; much larger than MP4/WebM, so keep comps short.
- PNG sequence: `--output` is a directory of zero-padded `frame_NNNNNN.png`; lossless; for AE/Nuke/Fusion or a custom encode.
--video-frame-format (auto/jpg/png) controls how source video layers are extracted before capture — use png for UI recordings, screen captures, or color-sensitive footage; auto uses PNG for alpha sources and JPG for opaque ones.
Local rendering
In the default mode, bundled Chromium captures frames and system FFmpeg encodes:

    npx hyperframes doctor    # verify deps: Node 22, FFmpeg/FFprobe 7.x, bundled Chrome, Docker
    npx hyperframes preview   # eyeball the composition first
    npx hyperframes render --output output.mp4

A 240-frame 1080p clip renders in ~16s in the docs example (8.2s capture + 8.0s encode). Local mode has fast startup and can use the system GPU for both capture and (`--gpu`) encode, which makes it best for iteration. The cost: output varies across machines (fonts, Chrome version), so it is not safe for pipelines that require reproducibility.
Render flags
| Flag | Values | Default | Notes |
|---|---|---|---|
| `--output` | path | renders/.mp4 | Output file path (a template in batch mode) |
| `--format` | mp4, mov, webm, gif, png-sequence | mp4 | Container/codec |
| `--fps` | 24, 30, 60 | 30 | Frame rate |
| `--gif-loop` | 0-65535 | 0 | GIF loop count (0 = forever) |
| `--quality` | draft, standard, high | standard | CRF preset |
| `--crf` | 0–51 | — | Override CRF (lower = better). Excludes `--video-bitrate` |
| `--video-bitrate` | e.g. 10M, 5000k | — | Target bitrate. Excludes `--crf` |
| `--video-frame-format` | auto, jpg, png | auto | Source-video frame extraction |
| `--workers` | 1-8 or auto | auto | Parallel capture workers |
| `--max-concurrent-renders` | 1-10 | 2 | Producer-server simultaneous renders |
| `--batch` | path | — | JSON rows → one output per row |
| `--batch-concurrency` | integer | 1 | Batch rows rendered at once |
| `--batch-fail-fast` | — | off | Stop after first row failure |
| `--gpu` | — | off | Hardware FFmpeg encode |
| `--browser-gpu` / `--no-browser-gpu` | — | on locally / off in Docker | Host GPU for Chrome/WebGL capture |
| `--hdr` / `--sdr` | — | off | Force HDR/SDR output (HDR is MP4-only) |
| `--docker` | — | off | Deterministic Docker render |
| `--quiet` | — | off | Suppress verbose output |
Quality presets
| Preset | CRF | x264 preset | Best for |
|---|---|---|---|
| `draft` | 28 | ultrafast | Quick previews, iteration |
| `standard` | 18 | medium | General use; visually lossless at 1080p |
| `high` | 15 | slow | Final delivery, near-lossless |
The default standard (CRF 18) is visually lossless at 1080p. For finer control use --crf or --video-bitrate — they override the preset and cannot be combined.
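
The preset table and the override rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not HyperFrames' actual implementation: `resolve_encode_settings` and the shape of its return value are hypothetical names, and the idea that a bitrate target replaces CRF entirely is an assumption. Only the preset values, the 0-51 CRF range, and the `--crf`/`--video-bitrate` exclusivity come from this page.

```python
# Illustrative only: mirrors the preset table above, not HyperFrames internals.
PRESETS = {
    "draft": {"crf": 28, "x264_preset": "ultrafast"},
    "standard": {"crf": 18, "x264_preset": "medium"},
    "high": {"crf": 15, "x264_preset": "slow"},
}


def resolve_encode_settings(quality="standard", crf=None, video_bitrate=None):
    """Return effective rate-control settings for one render (hypothetical helper)."""
    if crf is not None and video_bitrate is not None:
        raise ValueError("--crf and --video-bitrate cannot be combined")
    if quality not in PRESETS:
        raise ValueError(f"unknown quality preset: {quality!r}")
    settings = dict(PRESETS[quality])
    if crf is not None:
        if not 0 <= crf <= 51:
            raise ValueError("--crf must be in the range 0-51")
        settings["crf"] = crf  # explicit CRF overrides the preset's CRF
    elif video_bitrate is not None:
        # Assumption: a bitrate target replaces CRF rate control entirely.
        settings.pop("crf")
        settings["video_bitrate"] = video_bitrate
    return settings
```

Rejecting the conflicting pair up front, before any frames are captured, keeps a mistyped command from costing a full render.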
Workers (capture parallelism)
- Each worker is a separate Chrome process (~256 MB RAM + significant CPU).
- Default = half your cores, capped at 4 (M1 Air 8c → 4; M3 Pro 12c → 4 capped; 4c laptop → 2; 2c VM → 1).
- Use 1 worker for short comps (<2s / 60 frames), low-RAM machines (≤4 GB), or when running alongside other heavy jobs. Increase for 30s+ comps on 8+ core / 16 GB+ boxes, dedicated render machines, or CI.
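
The default worker heuristic is easy to reproduce when sizing a render box. The sketch below assumes the rule is exactly "half the cores, rounded down, capped at 4, never below 1", which matches the four examples above; the function names are hypothetical and the rounding for odd core counts is an assumption.

```python
import os


def default_worker_count(cpu_cores=None):
    """Half the cores, capped at 4, never below 1 (heuristic as described above)."""
    if cpu_cores is None:
        # os.cpu_count() can return None on some platforms; fall back to 1 core.
        cpu_cores = os.cpu_count() or 1
    return max(1, min(4, cpu_cores // 2))


def estimated_worker_memory_mb(workers, per_worker_mb=256):
    """Rough RAM for the Chrome workers alone, using the ~256 MB per-worker figure."""
    return workers * per_worker_mb
```

For example, `default_worker_count(12)` gives 4 (6 capped to 4), and four workers come to roughly 1024 MB of Chrome memory before FFmpeg and the composition's own assets are counted.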
Docker rendering (the deterministic path)

    npx hyperframes render --docker --output output.mp4

Docker mode is the determinism contract. It pins an exact Chrome version, font set, and FFmpeg so the same composition produces identical output on every platform. Under the hood it uses `chrome-headless-shell` with BeginFrame control for frame-perfect, deterministic capture, and the browser capture stays on the deterministic software-GL path (no host-GPU variance). The full mechanism is documented on the determinism concept page that this guide links; see HyperFrames Core Concepts.
When to pick Docker over local:
| Scenario | Mode |
|---|---|
| Local development / iteration | Local |
| Quick preview export | Local |
| Benchmarking performance | Local |
| CI/CD pipeline | Docker |
| Sharing renders with a team | Docker |
| AI agent-driven rendering | Docker |
Tradeoffs: slower startup (container init); GPU encode needs Docker-host GPU passthrough and is not cross-platform on Docker Desktop; browser GPU is off in Docker by design. Determinism is exactly what makes Docker the right default for subagent-driven and scheduled renders — same inputs, same MP4, no flaky replays.
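
One way to check the "same inputs, same MP4" claim in a pipeline is to hash two renders of the same composition and compare digests. The sketch below is a generic file comparison, not a HyperFrames feature; the helper names are hypothetical. It assumes that "identical output" means byte-identical files, which this page does not state explicitly: if container metadata differs between runs, a frame-level comparison would be needed instead.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large renders are not loaded into memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def renders_match(path_a, path_b):
    """True when both files have the same SHA-256 digest (byte-identical content)."""
    return file_sha256(path_a) == file_sha256(path_b)
```

A CI job could render twice with `--docker` on different runners and fail the build when `renders_match` returns `False`.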
GPU acceleration
Two independent surfaces:
- `--gpu`: hardware video encoder in FFmpeg when available: VideoToolbox (macOS), NVENC (NVIDIA), AMD AMF (Windows), VAAPI (Linux), Intel QSV. Can meaningfully speed up encoding of many-frame comps.
- Browser GPU: host GPU for local Chrome/WebGL capture; on automatically for local renders, off in Docker; `--no-browser-gpu` to opt out. Maps to Metal (macOS) / D3D11 (Windows) / EGL (Linux), and is local-mode only.
Use --no-browser-gpu or Docker when cross-machine reproducibility matters more than local speed.
Batch and concurrent renders (ops)
Batch (one render per data row, for templated/variant output):

    npx hyperframes render --batch rows.json --output "renders/{name}.mp4" --strict-variables

`--output` is a template (`{index}` or any scalar row key). HyperFrames preflights the whole batch: malformed rows, missing placeholders, duplicate output paths, and strict-variable mismatches fail before the first video. A `manifest.json` records per-row status, path, render time, duration, and errors. Rows continue past failures by default; `--batch-fail-fast` stops after the first; `--json` streams progress.
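
The preflight idea (catch bad rows before any rendering starts) can be illustrated with a small standalone check. This is a sketch of the technique, not HyperFrames' code: `preflight_batch` is a hypothetical name, the rows are assumed to be a JSON array of objects, and `{index}` is assumed to be zero-based and to take precedence over a row key of the same name. Strict-variable checks are omitted because their rules are not described here.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def preflight_batch(rows, output_template):
    """Expand the template for every row; return (paths, errors) without rendering."""
    paths, errors = [], []
    first_seen = {}  # output path -> index of the first row that produced it
    keys = PLACEHOLDER.findall(output_template)
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            errors.append(f"row {index}: expected an object")
            continue
        values = {**row, "index": index}
        missing = [key for key in keys if key not in values]
        if missing:
            errors.append(f"row {index}: missing placeholder value(s) {missing}")
            continue
        non_scalar = [key for key in keys
                      if not isinstance(values[key], (str, int, float, bool))]
        if non_scalar:
            errors.append(f"row {index}: non-scalar value(s) for {non_scalar}")
            continue
        path = PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), output_template)
        if path in first_seen:
            errors.append(f"row {index}: output path {path!r} duplicates row {first_seen[path]}")
            continue
        first_seen[path] = index
        paths.append(path)
    return paths, errors
```

Running a check like this on `rows.json` before invoking the CLI gives an agent a cheap way to fix its data without waiting for the render command to reject the batch.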
Concurrency — the producer server gates simultaneous renders (each spawns its own Chrome workers) with a request-level semaphore: only --max-concurrent-renders (default 2) run at once, the rest wait FIFO.
- Set via `--max-concurrent-renders N` or env `PRODUCER_MAX_CONCURRENT_RENDERS`.
- `GET /render/queue` → `{maxConcurrentRenders, activeRenders, queuedRenders}`; agents poll it to decide submit-vs-wait.
- `POST /render/stream` emits a `queued` event (`{"type":"queued","position":N}`) so agents can report queue position instead of looking stuck.
- Recommended limit by box: 4c → 1, 8c → 2, 16c → 3-4, 32c → 5-6. When unsure, use 1 (renders queue and each gets full CPU).
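
The submit-vs-wait decision an agent makes from the queue endpoint might look like the following. This is a minimal sketch under assumptions: the base URL of the producer server is hypothetical, `activeRenders` and `queuedRenders` are assumed to be integer counts, and the policy (submit only when a slot is free and nothing is waiting) is one conservative choice, not something the docs prescribe.

```python
import json
from urllib.request import urlopen


def fetch_queue_status(base_url, timeout=5.0):
    """GET /render/queue and decode the JSON body (base_url is an assumption)."""
    with urlopen(f"{base_url.rstrip('/')}/render/queue", timeout=timeout) as response:
        return json.load(response)


def should_submit(status):
    """Submit only when a render slot is free and no earlier request is queued."""
    free_slots = status["maxConcurrentRenders"] - status["activeRenders"]
    return free_slots > 0 and status["queuedRenders"] == 0
```

Because the server already queues FIFO, submitting early is safe; polling first mainly lets an agent report "waiting for a slot" rather than appear stalled.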
Transparent video (overlays)
For lower thirds, subscribe cards, and elements composited over other footage:
- MOV / ProRes 4444 (`--format mov`): works in every major editor; large files (expected, same tradeoff as Remotion).
- WebM / VP9 alpha (`--format webm`): small, but only Chromium browsers decode the alpha; editors and Safari render it black. Browser playback only.
- PNG sequence (`--format png-sequence`): a directory of RGBA `frame_NNNNNN.png` (+ an `audio.aac` sidecar when the comp has audio); lossless for AE/Nuke/Fusion.
How alpha output works: HyperFrames captures each frame as a PNG with alpha (not JPEG), sets Chrome’s page background transparent via Emulation.setDefaultBackgroundColorOverride, then encodes with an alpha-capable codec (ProRes 4444 / VP9); png-sequence skips encoding. Your composition’s HTML must leave html/body background unset. Verify with a checkerboard page in Chrome, an editor track above other footage, or rotato.app/tools/transparent-video.
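
For PNG-sequence output, a quick scripted sanity check is to confirm that a frame actually declares an alpha channel. The sketch below reads the PNG header directly: per the PNG specification, the IHDR chunk follows the 8-byte signature and its color-type byte sits at offset 25, where 6 means truecolor with alpha and 4 means grayscale with alpha. The function name is hypothetical, and the check only proves an alpha channel exists, not that any pixel is transparent.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_alpha_channel(header):
    """True if the PNG's IHDR declares an alpha channel (color type 4 or 6).

    `header` must contain at least the first 26 bytes of the file.
    """
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file or header too short")
    color_type = header[25]  # byte after width (4), height (4), and bit depth (1)
    return color_type in (4, 6)
```

Reading the first 26 bytes of any frame in the output directory is enough; a result of `False` usually points back to an opaque `html`/`body` background in the composition.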
Audio
Audio handling is format-dependent at render time:
- GIF carries no audio (and only 1-bit transparency).
- PNG sequence has no container, so when the composition has audio HyperFrames writes a separate `audio.aac` sidecar next to the frame PNGs.
- No audio-mixing or level flags appear in the render options; audio is part of the composition, authored upstream, not a render-time control.
Cloud rendering
This guide covers local and Docker rendering. Its Next Steps reference a hosted option — “render on HeyGen’s hosted cloud — no local Chrome or FFmpeg” — but the cloud-render backend flags (HeyGen-hosted plus AWS Lambda deployment) are documented on the CLI page, not on this one. See HyperFrames Quickstart & CLI for the cloud and Lambda render surface.
Try It
- Run `npx hyperframes doctor` to confirm Node 22 + FFmpeg/FFprobe 7.x + bundled Chrome + Docker, then `npx hyperframes render --output out.mp4` for a baseline local render.
- Re-render the same composition with `--docker` and diff the two MP4s to feel the local-vs-deterministic difference before wiring renders into CI or an agent.
- Export a transparent lower third: `npx hyperframes render --format mov --output overlay.mov` (leave `html`/`body` background unset) and drop it on a track above footage in CapCut or Premiere.
- For a shorts pipeline, drive `--batch rows.json --json` and have your agent poll `GET /render/queue` to pace submissions.
- Run `npx hyperframes benchmark` to find the worker/quality sweet spot for your machine.