Source: Deploy, Cloud Rendering, AWS Lambda, GCP Cloud Run, Migrating to HyperFrames Lambda, Templates on Lambda — HeyGen HyperFrames docs

Once a HyperFrames composition renders cleanly on your laptop, the next question is where the renders run in production. HyperFrames offers four off-the-shelf paths beyond local rendering: HeyGen-hosted cloud rendering (zero infra, pay-per-credit), self-hosted AWS Lambda and GCP Cloud Run distributed rendering (bring your own cloud, chunked parallelism), and one-click preview+render-API templates for Vercel and Cloudflare. The render primitives are identical across all of them — only the storage, compute, and orchestration adapters differ — so the choice is about who owns the compute, not about output quality.

Key Takeaways

  • Four backends, one decision axis — who owns the compute. HeyGen Cloud (managed, per-credit), AWS Lambda / GCP Cloud Run (self-hosted distributed), and Vercel / Cloudflare templates (self-hosted single-endpoint behind a preview UI).
  • HeyGen Cloud = zero infra: hyperframes auth login then hyperframes cloud render — the CLI zips the project, uploads to /v3/assets, renders on HeyGen’s Chromium+FFmpeg infra, polls, and downloads. No local Chrome/FFmpeg/AWS. Bills per credit; 4k is 1.5×.
  • AWS Lambda = bring-your-own-AWS distributed: three commands (lambda deploy / render / destroy). One Lambda function fronts a Step Functions Standard workflow (Plan → Map(N) RenderChunk → Assemble) with artifacts in S3. Billed in GB-seconds + per-state-transition; lambda progress shows a running cost tally.
  • GCP Cloud Run mirrors Lambda but is simpler: a container image means the Chrome story is one Dockerfile line — no 250 MB ZIP ceiling, no @sparticuz/chromium decompression. Cloud Run gen2 gives a 60-min request timeout and 32 GiB memory (vs Lambda’s 15-min cap). Provisioned via a Terraform module, driven via the SDK.
  • Templates (Vercel / Cloudflare) = one-click preview + /api/render MP4 endpoint. Apache-2.0, open source. Both pre-bake the renderer (snapshot or OCI image) to skip the 30–60s Chromium/FFmpeg install per cold render. Best for “a single render endpoint behind a preview UI,” not queues/multi-tenant.
  • Distributed rendering is deterministic-only. Both Lambda and Cloud Run refuse data-gpu-mode="hardware" (software SwiftShader GL only) and fail-closed on missing fonts at plan time — byte-level reproducibility is required for per-chunk concat-copy. No HDR in distributed mode (in-process only). webm uses closed-GOP VP9 (~10-25% larger files).
  • Templates with variables work on every backend — declare data-composition-variables, then personalise per render. Lambda adds render-batch (JSONL fan-out) for personalised-video-at-scale; the Step Functions execution input is capped at 256 KiB (512 KiB on Cloud Run), so pass media as URL references, never inlined base64.
  • In-process vs distributed crossover: for a single render under ~30s the in-process renderer (hyperframes render) wins on latency (no S3 round-trip per chunk); distributed pays off above ~60s or for personalised batches.

HeyGen Cloud (hosted, zero-infra)

The managed path: nothing to deploy, no Chrome/FFmpeg/AWS to manage, billed per credit.

hyperframes auth login            # one-time sign-in
hyperframes cloud render          # zip → upload → render → download

Auth stores a credential at ~/.heygen/credentials (mode 0600), shared with the standalone heygen CLI. Browser OAuth 2.0 + PKCE by default; --api-key (or stdin pipe) for CI/headless. Resolution order (first match wins): HEYGEN_API_KEY env → HYPERFRAMES_API_KEY env → ~/.heygen/credentials. HEYGEN_API_URL overrides the backend (default https://api.heygen.com).
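
The resolution order above can be pictured as a first-match-wins lookup. This is an illustrative Python sketch, not the CLI's implementation: resolve_credential_source is a hypothetical helper, it treats an empty environment variable as unset (an assumption), and it only reports where a credential would come from, because the format of ~/.heygen/credentials is not described here.

```python
from pathlib import Path
from typing import Mapping, Optional, Tuple

# Precedence taken from the text above: the first match wins.
ENV_PRECEDENCE = ("HEYGEN_API_KEY", "HYPERFRAMES_API_KEY")
DEFAULT_CREDENTIALS_PATH = Path.home() / ".heygen" / "credentials"


def resolve_credential_source(
    env: Mapping[str, str],
    credentials_path: Path = DEFAULT_CREDENTIALS_PATH,
) -> Optional[Tuple[str, str]]:
    """Return (kind, location) for the first credential source that is present.

    Hypothetical helper: it does not read or parse the credential itself.
    """
    for name in ENV_PRECEDENCE:
        if env.get(name):  # assumption: an empty value counts as "not set"
            return ("env", name)
    if credentials_path.is_file():
        return ("file", str(credentials_path))
    return None
```

Calling resolve_credential_source(os.environ) in a CI job is one way to confirm which source would be picked up before a render is submitted.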

The pipeline (hyperframes cloud render): resolve the project (default ., or skip upload with --asset-id/--url) → auto-detect aspect ratio from data-width/data-height → zip (excludes .git, node_modules, dist) → POST /v3/assets (yields asset_id) → POST /v3/hyperframes/renders (yields render_id) → poll GET /v3/hyperframes/renders/{id} → download the signed URL.

Most-used flags:

hyperframes cloud render . \
  --composition compositions/intro.html \
  --output ./renders/intro.mp4
hyperframes cloud render --quality high --fps 60

| Flag | Default | Notes |
| --- | --- | --- |
| --fps | 30 | 1–240 |
| --quality | standard | draft / standard / high |
| --format | mp4 | mp4 / webm / mov (webm/mov carry alpha) |
| --resolution | 1080p | 1080p or 4k (4k billed 1.5×; cannot combine with webm/mov) |
| --aspect-ratio | auto | 16:9 / 9:16 / 1:1; defaults to 16:9 for --asset-id/--url |
| --composition / -c | index.html | Entry HTML inside the zip |
| --output / -o | renders/<id>.<ext> | Download destination |

Variables / templates: --variables '{...}', --variables-file, --strict-variables. The idiomatic flow is upload once, re-render many: render a local project to get its asset_id, then re-submit against that asset with new variables (skips zip+upload).

Fire-and-forget: --no-wait + --callback-url (HTTPS webhook on terminal state) + --callback-id. Tune polling with --poll-interval (default 10s) and --max-wait (default 60min).

Manage renders: cloud list, cloud get <id> (re-fetches short-lived presigned video_url/thumbnail_url), cloud delete <id>.

Safe retries: the zip upload is not idempotent on its own (a blind retry double-bills), so pass --idempotency-key "$(uuidgen)".
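
The --poll-interval and --max-wait knobs described above amount to a bounded polling loop. The sketch below shows that shape in Python under stated assumptions: fetch_status stands in for whatever performs GET /v3/hyperframes/renders/{id}, and the status names in TERMINAL_STATES are hypothetical, since the API's terminal states are not listed here.

```python
import time
from typing import Callable, Dict

# Hypothetical status names, used only for illustration.
TERMINAL_STATES = frozenset({"completed", "failed"})


def poll_until_terminal(
    fetch_status: Callable[[], Dict[str, str]],
    poll_interval_s: float = 10.0,   # mirrors the --poll-interval default
    max_wait_s: float = 60 * 60.0,   # mirrors the --max-wait default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, str]:
    """Call fetch_status until it reports a terminal state or the wait budget runs out."""
    deadline = clock() + max_wait_s
    while True:
        render = fetch_status()
        if render.get("status") in TERMINAL_STATES:
            return render
        # Give up rather than sleep past the deadline.
        if clock() + poll_interval_s > deadline:
            raise TimeoutError(f"render not finished after {max_wait_s:.0f}s")
        sleep(poll_interval_s)
```

Injecting sleep and clock keeps the loop testable without real waiting; a webhook via --callback-url avoids the loop entirely.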

AWS Lambda (self-host, distributed)

A first-class distributed deployment: one Lambda function fronts a Step Functions Standard workflow (Plan → Map(N) RenderChunk → Assemble) that fans renders across parallel chunk workers, with the plan tarball, per-chunk outputs, and final MP4 in S3. The handler is thin dispatch — it downloads inputs from S3 into /tmp, calls the OSS primitive in @hyperframes/producer/distributed, uploads back, returns small JSON. End-to-end is three commands:

hyperframes lambda deploy
hyperframes lambda render ./my-project --width 1920 --height 1080 --wait
hyperframes lambda destroy

Prerequisites: AWS credentials (any chain boto3 resolves), the AWS SAM CLI (lambda deploy/destroy shells out to sam deploy/sam delete), bun (builds handler.zip at deploy time), and a HyperFrames repo checkout (the ZIP is built from source — set HYPERFRAMES_REPO_ROOT to deploy outside one).

Three deployment paths:

  1. CLI (recommended) — thin wrapper over the SAM template + @hyperframes/aws-lambda SDK:

    hyperframes lambda deploy \
      --stack-name=hyperframes-prod \
      --region=us-east-1 \
      --concurrency=8 \
      --memory=10240

    The default --concurrency=8 is deliberately conservative — it caps worst-case spend on a runaway render at roughly 8 × (15 min × 10 GB × $0.0000167/GB-s) ≈ $1.20. Pre-stage a project to skip re-tar/re-upload on tight loops:

    hyperframes lambda sites create ./my-project   # → content-addressed Site ID (SHA-256 of the tree)
    hyperframes lambda render ./my-project --site-id=<id> --width 1920 --height 1080 --wait

    Re-running sites create on an unchanged tree skips the upload via a HeadObject short-circuit.

  2. Direct SAM — read/customise the CloudFormation at examples/aws-lambda/template.yaml:

    cd packages/aws-lambda && bun run build:zip        # produces dist/handler.zip
    cd ../../examples/aws-lambda
    sam deploy --stack-name=hyperframes-prod --region=us-east-1 \
      --resolve-s3 --capabilities CAPABILITY_IAM --no-confirm-changeset \
      --parameter-overrides ChromeSource=sparticuz ReservedConcurrency=8

    Warning: SAM’s own ReservedConcurrency default is -1 (unreserved); the CLI overrides it to 8. Set it explicitly here, or the function runs unreserved and can scale up to the account-level concurrency limit.

  3. CDK construct: @hyperframes/aws-lambda/cdk exports a HyperframesRenderStack L2 construct that emits the same topology (aws-cdk-lib + constructs are optional peer deps). It exposes .bucket, .renderFunction, and .stateMachine.
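
The worst-case spend figure quoted for the default --concurrency=8 under the CLI path can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers from that sentence; worst_case_spend_usd is a hypothetical helper, and the bound assumes each reserved worker runs once to the 15-minute timeout.

```python
LAMBDA_USD_PER_GB_SECOND = 0.0000167  # rate quoted in the text above
LAMBDA_TIMEOUT_SECONDS = 15 * 60      # Lambda's 15-minute cap


def worst_case_spend_usd(
    concurrency: int,
    memory_gb: float,
    timeout_seconds: int = LAMBDA_TIMEOUT_SECONDS,
    usd_per_gb_second: float = LAMBDA_USD_PER_GB_SECOND,
) -> float:
    """Upper bound if every reserved worker runs to the timeout once."""
    return concurrency * timeout_seconds * memory_gb * usd_per_gb_second


print(f"${worst_case_spend_usd(concurrency=8, memory_gb=10):.2f}")  # $1.20
```

Raising --concurrency scales this bound linearly, which is the trade-off behind the conservative default.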

CloudFormation outputs you need to invoke renders: RenderBucketName, RenderStateMachineArn, RenderFunctionArn.

IAM bootstrap (avoids the iam:CreateRole first-deploy trap):

hyperframes lambda policies user                          # inline policy for the CLI's IAM user
hyperframes lambda policies role --principal=cloudformation  # { TrustRelationship, InlinePolicy }
hyperframes lambda policies validate ./infra/iam/hf.json  # exit non-zero on missing perms (CI gate)

Generated docs grant Resource: "*"; narrow to the deployed ARNs after the first successful deploy.

Cost shape: GB-seconds (billed duration × memory) + a tiny per-state-transition fee for SFN Standard. lambda progress exposes the tally:

hyperframes lambda progress my-render-id
# Status:    SUCCEEDED
# Cost:      $0.0214 (Lambda $0.0210 + SFN $0.0004)
# Output:    s3://hyperframes-renders/.../output.mp4

The cost figure is best-effort (Lambda billed duration from the handler’s own DurationMs; S3 transfer excluded) but matches AWS Billing within rounding noise — the math is in packages/aws-lambda/src/sdk/costAccounting.ts.
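
The cost shape described above (GB-seconds plus state transitions) can be sketched as follows. This is not the code in costAccounting.ts: estimate_render_cost is a hypothetical helper, the Lambda rate is the one quoted earlier in this document, the Step Functions rate assumes $0.025 per 1,000 Standard state transitions, and both rates vary by region and architecture.

```python
from typing import Dict, Iterable

LAMBDA_USD_PER_GB_SECOND = 0.0000167  # assumption: rate quoted earlier in this document
SFN_USD_PER_TRANSITION = 0.000025     # assumption: $0.025 per 1,000 Standard transitions


def estimate_render_cost(
    billed_durations_ms: Iterable[float],
    memory_mb: int,
    state_transitions: int,
) -> Dict[str, float]:
    """Sum Lambda GB-seconds across invocations and add the per-transition SFN fee."""
    gb_seconds = sum(billed_durations_ms) / 1000 * (memory_mb / 1024)
    lambda_usd = gb_seconds * LAMBDA_USD_PER_GB_SECOND
    sfn_usd = state_transitions * SFN_USD_PER_TRANSITION
    return {
        "lambda_usd": lambda_usd,
        "sfn_usd": sfn_usd,
        "total_usd": lambda_usd + sfn_usd,
    }


# Made-up inputs: two invocations at 10240 MB and 16 state transitions.
cost = estimate_render_cost([60_000, 65_700], memory_mb=10240, state_transitions=16)
print(f"${cost['total_usd']:.4f} (Lambda ${cost['lambda_usd']:.4f} + SFN ${cost['sfn_usd']:.4f})")
```

As in the CLI's tally, S3 transfer is left out, so treat the result as a floor rather than a bill.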

Troubleshooting (typed errors short-circuit the state machine):

  • Stack already exists → reuse the same --stack-name (SAM is idempotent).
  • User is not authorized to perform iam:CreateRole → attach the policies user output.
  • PLAN_HASH_MISMATCH → producer version drift between local plan() and the deployed ZIP; re-run lambda deploy (always rebuilds) and re-render.
  • BROWSER_GPU_NOT_SOFTWARE → a non-SwiftShader GL backend was detected; rebuild the ZIP (bun run --cwd packages/aws-lambda build:zip) and redeploy. The pipeline pins @sparticuz/chromium + --use-gl=swiftshader --use-angle=swiftshader.
  • Stuck at RUNNING → usually a cold-start chain; reserved concurrency batches the chunks. Check the SFN execution if no progress for >10 min.
  • Teardown leaves S3 storage → the bucket is created with CloudFormation Retain (intentional, to protect finished MP4s); empty and delete it manually, e.g. aws s3 rb s3://<bucket> --force (plain aws s3 rb fails on a non-empty bucket).

Not in v1: completion webhooks (poll instead), a compositions discovery verb, multi-region failover (each --region is an independent stack), and HDR (SDR-only).

Templates on Lambda (personalised-video-at-scale)

Templates render on the standard Lambda stack — no template-only deployment. A template is a composition whose root <html> declares data-composition-variables; the composition reads runtime values via the window.__hyperframes.getVariables() global (use a plain <script>, not type="module", so the runtime is initialised). Declarable types: string, number, color, boolean, enum — the parser silently drops anything else (e.g. "object"), so serialise structured data into a string and parse it back inside the composition.
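
Because undeclared types such as "object" are dropped, structured data has to travel as a string. The Python sketch below shows one way to prepare a variables payload on the caller's side; to_template_variables is a hypothetical helper, and the composition is assumed to declare the affected variable as a string and parse the JSON itself.

```python
import json
from typing import Any, Dict


def to_template_variables(values: Dict[str, Any]) -> Dict[str, Any]:
    """JSON-encode dict and list values so every variable is a scalar."""
    flat: Dict[str, Any] = {}
    for key, value in values.items():
        if isinstance(value, (dict, list)):
            # Compact, key-sorted encoding keeps the payload small and stable.
            flat[key] = json.dumps(value, separators=(",", ":"), sort_keys=True)
        elif isinstance(value, (str, int, float, bool)):
            flat[key] = value
        else:
            raise TypeError(f"unsupported variable type for {key!r}: {type(value).__name__}")
    return flat


variables = to_template_variables({"title": "Hello Alice", "badges": ["gold", "early"]})
print(json.dumps(variables))
```

The printed JSON is the kind of value that can be passed to --variables; inside the composition the string is parsed back after reading it from getVariables().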

The full loop:

# 1. Iterate locally (no deploy needed)
hyperframes render --variables '{"title":"Hello Alice","accentColor":"#ff0000"}' --output renders/alice.mp4
# 2. Deploy once per account/region
hyperframes lambda deploy
# 3. Upload the site once → content-addressed siteId
hyperframes lambda sites create ./my-template
# 4a. Single personalised render
hyperframes lambda render ./my-template --site-id <id> --width 1920 --height 1080 \
  --variables '{"title":"Hello Alice"}' --output-key renders/alice.mp4 --wait
# 4b. Batch fan-out from JSONL (the headline)
hyperframes lambda render-batch ./my-template --batch ./users.jsonl \
  --width 1920 --height 1080 --max-concurrent 5

Each JSONL line is {"outputKey": "...", "variables": {...}}. render-batch takes no --variables-file (per-entry payloads are the point), caps simultaneous StartExecution calls at --max-concurrent (default 50), supports --json (pipe to jq) and --dry-run (status would-invoke, lints without paying). The same surface is available programmatically via deploySite + renderToLambda from @hyperframes/aws-lambda/sdk (HYPERFRAMES_BUCKET + HYPERFRAMES_SFN_ARN come from the stack outputs); wrap large batches in a semaphore.
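
A batch file in the shape described above can be generated from any user list. The sketch below is illustrative Python: the id and name fields, the renders/<id>.mp4 key layout, and the title variable are assumptions made for the example, not requirements of render-batch.

```python
import json
from typing import Dict, Iterable, List


def build_batch_lines(users: Iterable[Dict[str, str]]) -> List[str]:
    """One JSON object per line: {"outputKey": ..., "variables": {...}}."""
    lines = []
    for user in users:
        entry = {
            "outputKey": f"renders/{user['id']}.mp4",
            "variables": {"title": f"Hello {user['name']}"},
        }
        # json.dumps emits no raw newlines, so each entry stays on one line.
        lines.append(json.dumps(entry, ensure_ascii=False))
    return lines


def write_batch_file(path: str, users: Iterable[Dict[str, str]]) -> None:
    """Write the entries as newline-delimited JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(build_batch_lines(users)) + "\n")
```

After write_batch_file("users.jsonl", users), running render-batch with --dry-run lints the file before any execution is started.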

Large-variables convention: the SFN Standard execution input is capped at 256 KiB for the entire payload (Express caps at 32 KiB; HyperFrames uses Standard for execution-history visibility). The SDK validates size client-side and rejects oversize inputs before any AWS call. Variables are for typed data; media assets are URL references the composition resolves at render time, so never inline base64.

Cost knobs:

  • --max-parallel-chunks (default 16): per-render fan-out.
  • --target-chunk-frames: optional per-chunk frame ceiling so one chunk can’t run past Lambda’s 15-min cap; ignored when --chunk-size is set.
  • lambda deploy --concurrency: reserved Lambda concurrency.
  • render-batch --max-concurrent: orchestrator-side cap.
  • lambda deploy --memory: default and maximum 10240 MB.

Note: if deploy concurrency is 8 but maxParallelChunks stays at 16, even a single render throttles, so raise deploy concurrency before large batches.
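
A pre-flight check for the 256 KiB execution-input cap can be sketched as below. This is an assumption-laden illustration, not the SDK's validator: check_execution_input is a hypothetical helper, and the real SDK may serialise the payload differently, so leave headroom rather than sizing inputs right up to the limit.

```python
import json
from typing import Any

SFN_STANDARD_INPUT_LIMIT_BYTES = 256 * 1024  # cap quoted in the text above


def payload_size_bytes(payload: Any) -> int:
    """Size of the compact JSON encoding in UTF-8 bytes."""
    encoded = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def check_execution_input(payload: Any, limit: int = SFN_STANDARD_INPUT_LIMIT_BYTES) -> int:
    """Return the payload size, or raise before any network call if it is too large."""
    size = payload_size_bytes(payload)
    if size > limit:
        raise ValueError(f"execution input is {size} bytes; limit is {limit}")
    return size
```

The limit applies to the whole execution input, not only to the variables, which is why media belongs behind a URL.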

GCP Cloud Run (self-host, distributed)

Mirrors the Lambda deployment — a single Cloud Run service fronts a Cloud Workflows definition (Plan → parallel(for chunk) RenderChunk → Assemble) with artifacts in Google Cloud Storage. Each workflow step POSTs to the same Cloud Run URL with a different Action; OIDC-authenticated. The right choice for teams already on Google Cloud.

Why it’s simpler than Lambda: Cloud Run runs a container image, so the Chrome story is one Dockerfile line — no 250 MB ZIP ceiling, no @sparticuz/chromium runtime decompression, no packaging probe. It installs the same pinned chrome-headless-shell build as the production renderer. Cloud Run gen2 also gives more headroom: a 60-minute request timeout and 32 GiB memory (vs Lambda’s 15-min cap).

Deploy via the Terraform module at packages/gcp-cloud-run/terraform (provisions the GCS bucket, Cloud Run service, Cloud Workflows definition, two least-privilege service accounts, and a runaway-request alert):

# 1. Build + push the render image.
gcloud builds submit . \
  --tag us-central1-docker.pkg.dev/PROJECT/hyperframes/hyperframes-render:v1
# 2. Apply the module.
cd node_modules/@hyperframes/gcp-cloud-run/terraform
terraform init
terraform apply \
  -var project_id=PROJECT -var region=us-central1 \
  -var image=us-central1-docker.pkg.dev/PROJECT/hyperframes/hyperframes-render:v1

Terraform outputs render_bucket_name, service_url, workflow_name, region — feed those into the SDK. The GCP project must have billing enabled (Cloud Run, Cloud Workflows, Artifact Registry, Cloud Build are all billed).

Render via @hyperframes/gcp-cloud-run/sdk: renderToCloudRun({ projectDir, config, bucketName, projectId, location, workflowId, serviceUrl }) returns a handle; poll getRenderProgress({ executionName }) until status !== "running", then read progress.outputFile and progress.costs.displayCost. Variables work via config.variables; the Cloud Workflows execution argument is capped at 512 KiB, so pass media as URL references rather than inline data. An end-to-end smoke script at examples/gcp-cloud-run/scripts/smoke.sh builds, applies, renders a fixture at multiple chunk sizes, PSNR-compares against the in-process baseline, and tears down. Supported formats: mp4 (H.264/H.265), mov (ProRes), webm (VP9), png-sequence; no HDR in distributed mode.

One-click templates (Vercel & Cloudflare)

Two official, open-source (Apache-2.0) templates wrap a composition in a small web app: an in-browser preview (<hyperframes-player>) plus a POST /api/render endpoint that produces an MP4 server-side and uploads it to object storage.

| Template | Compute | Storage |
| --- | --- | --- |
| Vercel | Vercel Sandbox (Firecracker microVM, up to 5h / 8 vCPU) | Vercel Blob (BLOB_READ_WRITE_TOKEN injected) |
| Cloudflare | Cloudflare Containers (Workers + Durable Object, up to 4 vCPU / 12 GiB) | R2 (hyperframes-renders, free egress in-network) |

Both render on standard-4 and use --workers auto (three parallel Chrome workers). The key cost-saver is pre-baking the renderer — installing Chromium libs + chrome-headless-shell takes 30–60s, which would dominate every cold render. Vercel snapshots the sandbox at next build (restores in ~100 ms); Cloudflare bakes everything into the OCI image. Both deliberately avoid the hosted serverless primitives (Vercel Functions cap at 300s / 50 MB; Cloudflare Browser Rendering can’t install FFmpeg).

Choose Vercel if you already deploy on Vercel, want zero-config Blob, or reuse Vercel CI/preview envs (needs Vercel Pro). Choose Cloudflare if you’re on Workers, want R2’s free egress, or full control over the renderer image (needs Workers Paid). Container instances sleep after 10 min idle (cold-start on the next request).

Swap the composition: author locally (npx hyperframes init my-video && npx hyperframes preview), copy into public/compositions/<name>/, point the template at it (Vercel: PREVIEW_COMPOSITION_DIR in lib/preview.ts + dimensions in app/page.tsx; Cloudflare: PREVIEW_COMPOSITION_DIR env on npm run deploy or scripts/build.mjs), then vercel deploy / npm run deploy. Templates are best for a single render endpoint behind a preview UI — for render queues (retries/dedup/priorities), multi-tenant per-user uploads, or fully self-hosted, start from a template and extend, or run hyperframes render --docker on your own infra.

Migrating from another serverless renderer

The HyperFrames Lambda surface is shaped to match other one-command serverless renderers (notably @remotion/lambda), so the deploy/render/progress/destroy muscle memory transfers; spend the migration on what actually differs.

Concept mapping: one-shot deploy → hyperframes lambda deploy (builds the ZIP + sam deploy, idempotent); site upload → lambda sites create (content-addressed, skips unchanged trees); trigger render → lambda render ... --width --height (returns a renderId; --wait streams); poll → lambda progress <renderId> (cost included); teardown → lambda destroy (bucket Retain’d); IAM → lambda policies user/role/validate.
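
"Content-addressed" in the mapping above means the site ID is derived from the project's bytes, so an unchanged tree yields the same ID and the upload can be skipped. The Python sketch below illustrates the idea only: the text says the ID is a SHA-256 of the tree but not how files are ordered or framed, so site_digest and read_tree are hypothetical and will not reproduce real Site IDs.

```python
import hashlib
from pathlib import Path
from typing import Dict, Mapping


def read_tree(root: Path) -> Dict[str, bytes]:
    """Map each file's POSIX-style relative path to its contents."""
    return {
        path.relative_to(root).as_posix(): path.read_bytes()
        for path in root.rglob("*")
        if path.is_file()
    }


def site_digest(files: Mapping[str, bytes]) -> str:
    """SHA-256 over sorted (path, content) pairs; the framing is an assumption."""
    digest = hashlib.sha256()
    for rel_path in sorted(files):
        name = rel_path.encode("utf-8")
        data = files[rel_path]
        # Length prefixes keep adjacent fields from running into each other.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

site_digest(read_tree(Path("./my-project"))) changes only when a path or a file's bytes change, which is the property that makes a HeadObject check sufficient to skip a re-upload.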

Composition format is the big change: instead of JSX compiled at render time, compositions are plain HTML with data-duration/data-width/data-height/data-fps on the root — what you write is what the browser renders (animation via first-party GSAP / Anime.js / CSS keyframes / Lottie / Three.js / Web Animations adapters). Render config maps directly: --fps (24/30/60 only on Lambda; NTSC rationals are in-process-only), --width/--height (even integers ≤ 7680), --codec=h264/h265, --format=mp4/mov/webm/png-sequence, --quality, --chunk-size (240), --max-parallel-chunks (16), --target-chunk-frames, --bitrate/--crf (mutually exclusive).
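
To get a feel for what --chunk-size implies for fan-out, the sketch below splits a frame count into consecutive ranges. It assumes fixed-size chunks of consecutive frames, which this text does not spell out; plan_chunks is a hypothetical helper, and it does not model how the real planner applies --max-parallel-chunks or --target-chunk-frames.

```python
from typing import List, Tuple


def plan_chunks(total_frames: int, chunk_size: int = 240) -> List[Tuple[int, int]]:
    """Split [0, total_frames) into half-open (start, end) frame ranges."""
    if total_frames <= 0 or chunk_size <= 0:
        raise ValueError("total_frames and chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_frames))
        for start in range(0, total_frames, chunk_size)
    ]


# A 60-second composition at 30 fps has 1800 frames.
chunks = plan_chunks(total_frames=60 * 30)
print(len(chunks), chunks[-1])  # 8 (1680, 1800)
```

Under these assumptions a one-minute render at the default chunk size needs 8 chunk workers, which matches the default deploy concurrency of 8.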

Variables ≈ inputProps (isomorphic JSON render-time overrides): Composition.defaultProps → data-composition-variables; useCurrentFrame() + props.<x> → __hyperframes.getVariables().<x> (read once on DOMContentLoaded); renderMediaOnLambda({ inputProps }) → renderToLambda({ config: { variables } }). The 256 KiB cap and the URL-your-assets convention are the same, so a working inputProps pipeline is a CLI/SDK swap, not a payload reshape.

What HyperFrames does differently: hardware GPU mode is refused in distributed mode (BROWSER_GPU_NOT_SOFTWARE); font fetching fails closed (FONT_FETCH_FAILED at plan time — declare fonts via <link>/@fontsource/*); no HDR yet (in-process only); webm uses closed-GOP VP9 (alt-ref disabled so concat-copy round-trips losslessly, yuva420p for alpha, Opus audio, ~10-25% larger files; speed via PRODUCER_VP9_CPU_USED); stack state files are local (<repo>/.hyperframes/lambda-stack-<name>.json — symlink or pass --stack-name for shared CI); IAM is print-then-narrow.

Migration checklist: (1) inventory compositions, filter out HDR; (2) translate each to plain HTML (npx skills add heygen-com/hyperframes teaches Claude/Cursor/Codex the conventions); (3) wire into the build (no external bundler — npx hyperframes preview runs against HTML directly); (4) deploy to a staging stack (--stack-name=hyperframes-staging), real render with --wait, verify bytes; (5) add policies validate as a CI PR gate; (6) cut over, keep the old deployment for a release cycle, then lambda destroy staging.

Non-Lambda runtimes: the same @hyperframes/producer/distributed primitives run anywhere Node 22 + Chrome + ffmpeg + S3 are available. A reference Dockerfile lives at examples/k8s-jobs/Dockerfile.example for Cloud Run Jobs, Azure Container Apps Jobs, ECS Fargate, Kubernetes/Argo Jobs, or plain Docker on a VM — build it yourself (no published registry image).

Try It

  1. Fastest path, no infra: hyperframes auth login, then hyperframes cloud render in any composition project — HeyGen runs Chromium+FFmpeg and downloads the MP4. Add --idempotency-key "$(uuidgen)" for safe retries.
  2. One-click hosted endpoint: deploy the Vercel or Cloudflare template (one button), drop your composition into public/compositions/<name>/, point the config at it, and vercel deploy / npm run deploy to get a preview UI + /api/render.
  3. Self-host distributed on AWS: with AWS creds + SAM + bun + a repo checkout, run hyperframes lambda deploy then hyperframes lambda render ./my-project --width 1920 --height 1080 --wait; check lambda policies user first if iam:CreateRole fails.
  4. Personalised batch: declare data-composition-variables, hyperframes lambda sites create ./my-template, author a JSONL of {outputKey, variables} rows, then hyperframes lambda render-batch ./my-template --batch ./users.jsonl --dry-run to lint before paying.
  5. On Google Cloud: gcloud builds submit the render image, terraform apply the @hyperframes/gcp-cloud-run module, then drive renders with renderToCloudRun + getRenderProgress.

Open Questions

  • Exact per-credit pricing for HeyGen cloud rendering and the 1.5× 4k multiplier are not stated in the docs (HeyGen billing is external to these pages). ^[inferred]
  • The cloud-render --fps range (1–240) is wider than the Lambda surface’s documented 24/30/60-only constraint; the docs don’t reconcile whether arbitrary cloud FPS is fully supported end-to-end. ^[inferred]