Source: Digital Samba Claude Code Video Toolkit, 2026-04-17

Repo: https://github.com/digitalsamba/claude-code-video-toolkit Stars: 890 License: MIT Author: Conal Mullan (Digital Samba)

An open-source AI-native video production workspace for Claude Code. Ships 10 Claude Code skills, 13 slash commands, templates, brand profiles, a transitions library, a project-management system, and 8 cloud-GPU tools — all designed to let Claude Code orchestrate end-to-end video production. Built around open-source models (Qwen3-TTS, FLUX.2, LTX-2, ACE-Step, SadTalker) deployed on the user’s own Modal or RunPod account. Typical cost: $1–2/month.

Key Takeaways

  • Claude Code-native, not just Claude-adjacent. .claude/skills/ and .claude/commands/ are first-class. The toolkit IS a Claude Code project you open and work inside.
  • Open-source model stack. Qwen3-TTS (voice), FLUX.2 (image), LTX-2.3 22B (video), ACE-Step (music), SadTalker (talking head). Deployed to your own cloud account — cost stays with you, not Digital Samba.
  • Modal is the recommended GPU layer at ~$1–2/month. RunPod is supported as an alternative.
  • 10 Claude Code skills preloaded: remotion, elevenlabs, ffmpeg, playwright-recording, frontend-design, qwen-edit, acestep, ltx2, moviepy, runpod. This matches the “bundled skills in a repo” distribution pattern from plugins-and-marketplaces and Building Agents with Skills.
  • 13 slash commands orchestrate the workflow — /setup, /video, /scene-review, /design, /brand, /template, /record-demo, /generate-voiceover, /redub, /voice-clone, /versions, /skills, /contribute. Each is a high-level operator-visible action.
  • Multi-session project lifecycle is a first-class primitive. Projects move through planning → assets → review → audio → editing → rendering → complete. project.json tracks scenes, audio, sessions, and phase. A per-project CLAUDE.md gives instant context on resume.
  • Filesystem reconciliation. The system reconciles intent (planned scenes) with reality (files on disk). This is the pattern skill authors should steal — explicit plan vs. discovered state.
  • Brand profiles as distinct layer. brands/my-brand/ with brand.json (colors/fonts), voice.json (ElevenLabs voice), assets/ (logo/backgrounds). /video auto-applies brand to new projects.
  • Pre-built templates. sprint-review, sprint-review-v2 (composable scene-based), product-demo (dark tech, stats, CTA).
  • Custom transitions library — 7 custom effects (glitch, rgbSplit, zoomBlur, lightLeak, clockWipe, pixelate, checkerboard) + 4 official Remotion transitions. Preview via showcase/transitions.
  • Cost profile per cloud-GPU tool run: qwen3_tts ~$0.01, image_edit ~$0.03, music_gen free (hosted) or ~$0.05 (self-hosted), ltx2 ~$0.23. Transparent, not hidden behind SaaS pricing.
  • 890 stars and active as of April 2026; the author states a plan to keep iterating.

Positioning vs HeyGen Studio Automation

HeyGen Studio Automation and Digital Samba’s Claude Code Video Toolkit follow the same architectural pattern (Claude Code orchestrating video production) with different cost profiles:

| Dimension | HeyGen Studio Automation | Claude Code Video Toolkit |
| --- | --- | --- |
| Primary models | HeyGen Avatar V (commercial) | Qwen3-TTS, FLUX.2, LTX-2, ACE-Step (OSS) |
| Infrastructure | HeyGen SaaS + ElevenLabs | Your Modal/RunPod account |
| Avatar approach | Avatar V (video-reference) | SadTalker (portrait + audio) |
| Voice approach | ElevenLabs | Qwen3-TTS or ElevenLabs |
| Primary cost | HeyGen subscription | Cloud GPU ($1–2/month typical) |
| Lock-in | HeyGen-dependent | Swap models freely |
| Skill bundle | Custom | 10 Claude Code skills preloaded |

Both build on Remotion for composition. Both use Claude Code as the orchestrator. The toolkit’s main leverage is OSS-model flexibility and very low cost ceiling.

The 10-skill bundle decoded

These are the Claude Code skills preloaded in .claude/skills/:

| Skill | What Claude Code gains |
| --- | --- |
| remotion | Deep knowledge of the React video framework: compositions, animations, rendering |
| elevenlabs | TTS + voice cloning + music + SFX workflows |
| ffmpeg | Format conversion, compression, resizing |
| playwright-recording | Browser automation — record demos as video |
| frontend-design | Distinctive, production-grade aesthetic refinement (matches Frontend Design Deep Dive) |
| qwen-edit | AI image editing prompting patterns |
| acestep | AI music generation (prompts, lyrics, scene presets, video integration) |
| ltx2 | AI video generation (text-to-video, image-to-video, prompting guide) |
| moviepy | Python video composition (overlay text on LTX-2/SadTalker output, build.py projects) |
| runpod | Cloud GPU setup, Docker images, endpoint management, costs |

Per Building Agents with Skills: these are a good-sized “bundle of skills for a vertical” example — mix of foundational (ffmpeg, moviepy), partner-style (elevenlabs, runpod), and purpose-built (acestep, ltx2, qwen-edit). Follow the progressive-disclosure pattern — metadata always loaded, body loaded only when triggered.
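
The progressive-disclosure pattern is visible in the skill file format itself. A minimal SKILL.md shape following the Agent Skills convention — the ffmpeg wording here is illustrative, not the toolkit's actual file:

```markdown
---
name: ffmpeg
description: Convert, compress, and resize media with ffmpeg. Use when video or audio needs transcoding or optimization.
---

# ffmpeg

Everything below the frontmatter is the skill body. Only the
name/description metadata stays in context at all times; the body
(flags, recipes, gotchas) is loaded only when the skill triggers.
```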

The 13 command set as a workflow

Rough ordering for a first project:

  1. /setup — one-time cloud GPU + storage + voice configuration
  2. /brand — define visual identity (optional; ships with default and digital-samba)
  3. /video — start or resume a project with a chosen template
  4. /record-demo — capture browser interactions
  5. /scene-review — visualize in Remotion Studio
  6. /design — iterate on visuals with the frontend-design skill
  7. /generate-voiceover — TTS pass
  8. npm run studio — live preview
  9. npm run render — final MP4 out

Utility commands outside the core flow:

  • /redub — rerun voiceover with different voice on existing video
  • /voice-clone — capture + persist a cloned voice into a brand
  • /versions — dependency/update check
  • /template — create/edit templates
  • /skills — inspect/create Claude Code skills
  • /contribute — issue / PR / example contribution path

Cloud GPU economics

| Tool | Typical cost per run |
| --- | --- |
| qwen3_tts | $0.01 |
| flux2 | $0.02 |
| image_edit | $0.03 |
| upscale | $0.01 |
| music_gen (acemusic) | Free |
| music_gen (self-hosted) | $0.05 |
| sadtalker | $0.10 |
| ltx2 | $0.23 |
| dewatermark | $0.10 |

For a 5-minute explainer, a voiceover pass (~$0.08) plus music and video generation comes to roughly $0.69 all-in — well under a dollar, versus the $30–$100/month per seat typical of SaaS video tools.
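
The arithmetic is easy to reproduce. A minimal sketch using the per-run prices from the table above; the run counts for a sample project are illustrative, not prescribed by the toolkit:

```python
# Per-run prices copied from the toolkit's cost table (hosted music is free).
COST_PER_RUN = {
    "qwen3_tts": 0.01,
    "flux2": 0.02,
    "image_edit": 0.03,
    "upscale": 0.01,
    "music_gen_hosted": 0.00,       # acemusic, free
    "music_gen_self_hosted": 0.05,
    "sadtalker": 0.10,
    "ltx2": 0.23,
    "dewatermark": 0.10,
}

def project_cost(runs: dict[str, int]) -> float:
    """Total cloud-GPU spend for a project, given run counts per tool."""
    return round(sum(COST_PER_RUN[tool] * n for tool, n in runs.items()), 2)

# Illustrative 5-minute explainer: 8 voiceover segments, 2 self-hosted
# music tracks, 2 LTX-2 clips, 5 upscales.
print(project_cost({"qwen3_tts": 8, "music_gen_self_hosted": 2,
                    "ltx2": 2, "upscale": 5}))  # → 0.69
```

Even generous run counts stay in cents, which is the whole OSS-stack argument.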

Example videos shipped

Toolkit evolution timeline (visible at demos.digitalsamba.com):

| Date | Video | What’s demonstrated |
| --- | --- | --- |
| 2025-12-05 | sprint-review-cho-oyu | iOS sprint review with demos |
| 2025-12-10 | digital-samba-skill-demo | Product demo showcasing Claude Code skill |
| 2026-01-22 | ds-remote-mcp | Remote MCP server demo |
| 2026-01-25 | schlumbergera | Android sprint review |
| 2026-02-23 | cortina | Mobile platforms sprint review |
| 2026-03-15 | the-space-between | flux2 avatar + Qwen3-TTS + SadTalker essay |
| 2026-04-08 | q2-townhall-longarm-ad | Super Bowl–style launch with LTX-2 animated cameo |
| 2026-04-08 | q2-townhall-stars | GitHub star history time-lapse |

Implementation

  • Tool/Service: Claude Code Video Toolkit (Digital Samba)
  • Setup:
    1. git clone https://github.com/digitalsamba/claude-code-video-toolkit.git && cd claude-code-video-toolkit
    2. python3 -m pip install -r tools/requirements.txt
    3. claude (opens Claude Code in the toolkit)
    4. In Claude Code: /setup (cloud GPU + storage + voice, ~5 min)
    5. /video to create your first project
  • Requirements: Node.js 18+, Claude Code, Python 3.9+ recommended, FFmpeg optional
  • Cost: Free (MIT license). Cloud GPU via Modal (recommended), typically $1–2/month.
  • Integration notes:
    • Ships with complete .claude/ directory — skills and commands are installed by cloning the repo.
    • Brand profiles plug into /video; new projects auto-apply selected brand.
    • project.json and per-project CLAUDE.md enable instant context resume across Claude Code sessions.
    • Can be used as a reference architecture for other Claude Code-native vertical toolkits. See Cross-Topic Connections for the broader agent-primitive landscape this fits into.
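
For a feel of the context-resume mechanism, here is a guess at what such an auto-generated per-project CLAUDE.md might contain, based on the lifecycle fields described above (project name, scene names, and status lines are invented):

```markdown
# Project: q2-launch (hypothetical)

- Phase: audio (planning → assets → review → audio → editing → rendering → complete)
- Template: sprint-review-v2 · Brand: default
- Scenes: 01-intro (rendered), 02-demo (rendered), 03-outro (awaiting voiceover)
- Next: /generate-voiceover for 03-outro, then /scene-review
```

The value is that a fresh Claude Code session reads this file first and knows the project phase and next step without replaying prior transcripts.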

Open Questions

  • LTX-2 vs alternatives. LTX-2.3 22B at $0.23/generation is the toolkit’s built-in video generator. How does quality compare against Higgsfield’s Kling v2.1 Pro or Bytedance Seedance? Both are newer; drop-in swap appears possible.
  • Plugin-registry distribution. Would this work as a single installable plugin bundle per Plugins and Marketplaces, or is its file layout too opinionated for the plugin format?
  • Skill portability. The 10 skills are in .claude/skills/. Could they be extracted and published as a standalone skill pack usable outside the toolkit’s project structure?
  • Agent Skills open-standard fit. Per Building Agents with Skills, Agent Skills is becoming an open standard. Are these skills compliant with the emerging spec, or do they rely on Claude Code-specific conventions?
  • Commercial avatar fallback. SadTalker is the toolkit’s avatar path. For polished customer-facing output, is there a recommended path to swap in HeyGen Avatar V?
  • Voice-cloning legality. /voice-clone captures and stores a cloned voice. What’s the toolkit’s stated posture on consent and use restrictions? Not explicitly covered in the README.

Try It

  1. Zero-setup smoke test. cd examples/hello-world && npm install && npm run render. No API keys needed — produces an MP4 immediately. Confirms Remotion rendering works on your machine.
  2. Full /setup pass. Clone the repo, open in Claude Code, run /setup. Deploy the cloud GPU tools to Modal. Measure total setup time and compute budget used on your Modal account.
  3. Clone + customize one template. Copy templates/sprint-review-v2/ to your own folder, run /video against it, generate a 60-second sample using the default brand. Teaches the template pattern.
  4. Author one brand profile. Build brands/my-org/ with your real colors, fonts, and an ElevenLabs voice ID. Re-run /video — the new brand should apply automatically. This teaches the brand-layer pattern other tools don’t have.
  5. Swap one OSS model for a commercial one. Replace the SadTalker talking-head step with HeyGen Avatar V calls in a fork. Measures how portable the architecture is vs. locked to the bundled stack.
  6. Read the per-project CLAUDE.md. Inspect the auto-generated CLAUDE.md in a running project. It’s a good reference for how to structure your own project-level Claude context.
  7. Compare cost per minute of finished video against HeyGen Studio Automation for the same content. Hard data for the OSS-vs-SaaS decision.