Source: Digital Samba Claude Code Video Toolkit (2026-04-17)
Repo: https://github.com/digitalsamba/claude-code-video-toolkit Stars: 890 License: MIT Author: Conal Mullan (Digital Samba)
An open-source AI-native video production workspace for Claude Code. Ships 10 Claude Code skills, 13 slash commands, templates, brand profiles, a transitions library, a project-management system, and 8 cloud-GPU tools — all designed to let Claude Code orchestrate end-to-end video production. Built around open-source models (Qwen3-TTS, FLUX.2, LTX-2, ACE-Step, SadTalker) deployed on the user’s own Modal or RunPod account. Typical cost: $1–2/month.
Key Takeaways
- Claude Code-native, not just Claude-adjacent. `.claude/skills/` and `.claude/commands/` are first-class. The toolkit IS a Claude Code project you open and work inside.
- Open-source model stack. Qwen3-TTS (voice), FLUX.2 (image), LTX-2.3 22B (video), ACE-Step (music), SadTalker (talking head). Deployed to your own cloud account — cost stays with you, not Digital Samba.
- Modal is the recommended GPU layer. Typical spend: $1–2/month. RunPod is supported as an alternative.
- 10 Claude Code skills preloaded: `remotion`, `elevenlabs`, `ffmpeg`, `playwright-recording`, `frontend-design`, `qwen-edit`, `acestep`, `ltx2`, `moviepy`, `runpod`. This matches the "bundled skills in a repo" distribution pattern from Plugins and Marketplaces and Building Agents with Skills.
- 13 slash commands orchestrate the workflow: `/setup`, `/video`, `/scene-review`, `/design`, `/brand`, `/template`, `/record-demo`, `/generate-voiceover`, `/redub`, `/voice-clone`, `/versions`, `/skills`, `/contribute`. Each is a high-level, operator-visible action.
- Multi-session project lifecycle is a first-class primitive. Projects move through planning → assets → review → audio → editing → rendering → complete. `project.json` tracks scenes, audio, sessions, and phase. A per-project `CLAUDE.md` gives instant context on resume.
- Filesystem reconciliation. The system reconciles intent (planned scenes) with reality (files on disk). This is the pattern skill authors should steal — explicit plan vs. discovered state.
- Brand profiles as a distinct layer. `brands/my-brand/` with `brand.json` (colors/fonts), `voice.json` (ElevenLabs voice), and `assets/` (logo/backgrounds). `/video` auto-applies the brand to new projects.
- Pre-built templates. `sprint-review`, `sprint-review-v2` (composable, scene-based), `product-demo` (dark tech, stats, CTA).
- Custom transitions library. 7 custom effects (`glitch`, `rgbSplit`, `zoomBlur`, `lightLeak`, `clockWipe`, `pixelate`, `checkerboard`) plus 4 official Remotion transitions. Preview via `showcase/transitions`.
- Cost profile (per cloud-GPU tool run): qwen3_tts ~$0.01, image_edit ~$0.03, music_gen free to ~$0.05, ltx2 ~$0.23. Transparent, not hidden behind SaaS pricing.
- 890 stars, active as of April 2026. Not dead; the author states a "plan to keep iterating."
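The filesystem-reconciliation takeaway is worth sketching in code. The following is a minimal illustration, not the toolkit's actual implementation: the `project.json` fields (`scenes`, `id`) and the `assets/*.mp4` layout are assumptions made for the example.

```python
import json
from pathlib import Path

def reconcile(project_dir: Path) -> dict:
    """Compare planned scenes (intent) against generated files on disk (reality)."""
    plan = json.loads((project_dir / "project.json").read_text())
    planned = {scene["id"] for scene in plan["scenes"]}                  # intent
    on_disk = {p.stem for p in (project_dir / "assets").glob("*.mp4")}   # reality
    return {
        "missing": sorted(planned - on_disk),    # planned but not yet generated
        "orphaned": sorted(on_disk - planned),   # on disk but no longer planned
        "ready": sorted(planned & on_disk),      # plan and disk agree
    }
```

On resume, an agent that reads this diff knows exactly what to regenerate, instead of trusting stale state from a previous session.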
Positioning vs HeyGen Studio Automation
HeyGen Studio Automation and Digital Samba's Claude Code Video Toolkit implement the same architectural pattern (Claude Code orchestrating video production) with different cost profiles:
| Dimension | HeyGen Studio Automation | Claude Code Video Toolkit |
|---|---|---|
| Primary models | HeyGen Avatar V (commercial) | Qwen3-TTS, FLUX.2, LTX-2, ACE-Step (OSS) |
| Infrastructure | HeyGen SaaS + ElevenLabs | Your Modal/RunPod account |
| Avatar approach | Avatar V (video-reference) | SadTalker (portrait + audio) |
| Voice approach | ElevenLabs | Qwen3-TTS or ElevenLabs |
| Primary cost | HeyGen subscription | Cloud GPU ($1–2/month typical) |
| Lock-in | HeyGen-dependent | Swap models freely |
| Skill bundle | Custom | 10 Claude Code skills preloaded |
Both build on Remotion for composition. Both use Claude Code as the orchestrator. The toolkit’s main leverage is OSS-model flexibility and very low cost ceiling.
The 10-skill bundle decoded
These are the Claude Code skills preloaded in .claude/skills/:
| Skill | What Claude Code gains |
|---|---|
| `remotion` | Deep knowledge of the React video framework: compositions, animations, rendering |
| `elevenlabs` | TTS + voice cloning + music + SFX workflows |
| `ffmpeg` | Format conversion, compression, resizing |
| `playwright-recording` | Browser automation — record demos as video |
| `frontend-design` | Distinctive, production-grade aesthetic refinement (matches Frontend Design Deep Dive) |
| `qwen-edit` | AI image-editing prompting patterns |
| `acestep` | AI music generation (prompts, lyrics, scene presets, video integration) |
| `ltx2` | AI video generation (text-to-video, image-to-video, prompting guide) |
| `moviepy` | Python video composition (overlay text on LTX-2/SadTalker output, `build.py` projects) |
| `runpod` | Cloud GPU setup, Docker images, endpoint management, costs |
Per Building Agents with Skills: these are a good-sized “bundle of skills for a vertical” example — mix of foundational (ffmpeg, moviepy), partner-style (elevenlabs, runpod), and purpose-built (acestep, ltx2, qwen-edit). Follow the progressive-disclosure pattern — metadata always loaded, body loaded only when triggered.
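Per that progressive-disclosure pattern, each bundled skill is a folder whose `SKILL.md` carries lightweight YAML frontmatter (always in context) above a detailed body (loaded only on trigger). Below is a hypothetical sketch of what the `acestep` skill file might look like; the wording and body are invented for illustration, not copied from the repo.

```markdown
---
name: acestep
description: Generate background music with ACE-Step (prompts, lyrics, scene presets) and integrate the result into a video timeline.
---

# ACE-Step music generation

Detailed prompting guidance, scene presets, and muxing steps live here in
the body, which Claude Code loads into context only when a task triggers it.
```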
The 13 command set as a workflow
Rough ordering for a first project:
1. `/setup` — one-time cloud GPU + storage + voice configuration
2. `/brand` — define visual identity (optional; ships with `default` and `digital-samba`)
3. `/video` — start or resume a project with a chosen template
4. `/record-demo` — capture browser interactions
5. `/scene-review` — visualize in Remotion Studio
6. `/design` — iterate on visuals with the `frontend-design` skill
7. `/generate-voiceover` — TTS pass
8. `npm run studio` — live preview
9. `npm run render` — final MP4 out
Utility commands outside the core flow:
- `/redub` — rerun the voiceover with a different voice on an existing video
- `/voice-clone` — capture and persist a cloned voice into a brand
- `/versions` — dependency/update check
- `/template` — create/edit templates
- `/skills` — inspect/create Claude Code skills
- `/contribute` — issue / PR / example contribution path
Cloud GPU economics
| Tool | Typical cost per run |
|---|---|
| qwen3_tts | $0.01 |
| flux2 | $0.02 |
| image_edit | $0.03 |
| upscale | $0.01 |
| music_gen (acemusic) | Free |
| music_gen (self-hosted) | $0.05 |
| sadtalker | $0.10 |
| ltx2 | $0.23 |
| dewatermark | $0.10 |
For a 5-minute explainer: voiceover (~$0.08) + 2 music tracks (~$0.69) ≈ **under $1** — versus $30–$100/month per seat for typical SaaS video tooling.
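The table above converts directly into a back-of-envelope budget. Here is a minimal sketch in Python; the per-run prices are copied from the table, while the tool run counts in the example are hypothetical.

```python
# Per-run prices from the cost table above (USD).
COST_PER_RUN = {
    "qwen3_tts": 0.01,
    "flux2": 0.02,
    "image_edit": 0.03,
    "upscale": 0.01,
    "music_gen_self_hosted": 0.05,
    "sadtalker": 0.10,
    "ltx2": 0.23,
    "dewatermark": 0.10,
}

def estimate(runs: dict) -> float:
    """Total cloud-GPU cost for a project, given how many times each tool runs."""
    return round(sum(COST_PER_RUN[tool] * n for tool, n in runs.items()), 2)

# Example: a short explainer with 8 TTS passes, 4 images, 2 video generations.
print(estimate({"qwen3_tts": 8, "flux2": 4, "ltx2": 2}))  # 0.62
```

Even generation-heavy projects stay well under a dollar, which is where the toolkit's "$1–2/month" typical cost comes from.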
Example videos shipped
Toolkit evolution timeline (visible at demos.digitalsamba.com):
| Date | Video | What’s demonstrated |
|---|---|---|
| 2025-12-05 | sprint-review-cho-oyu | iOS sprint review with demos |
| 2025-12-10 | digital-samba-skill-demo | Product demo showcasing Claude Code skill |
| 2026-01-22 | ds-remote-mcp | Remote MCP server demo |
| 2026-01-25 | schlumbergera | Android sprint review |
| 2026-02-23 | cortina | Mobile platforms sprint review |
| 2026-03-15 | the-space-between | flux2 avatar + Qwen3-TTS + SadTalker essay |
| 2026-04-08 | q2-townhall-longarm-ad | Super Bowl–style launch with LTX-2 animated cameo |
| 2026-04-08 | q2-townhall-stars | GitHub star history time-lapse |
Implementation
- Tool/Service: Claude Code Video Toolkit (Digital Samba)
- Setup:
  1. `git clone https://github.com/digitalsamba/claude-code-video-toolkit.git && cd claude-code-video-toolkit`
  2. `python3 -m pip install -r tools/requirements.txt`
  3. `claude` (opens Claude Code in the toolkit)
  4. In Claude Code: `/setup` (cloud GPU + storage + voice, ~5 min), then `/video` to create your first project
- Requirements: Node.js 18+, Claude Code, Python 3.9+ recommended, FFmpeg optional
- Cost: Free (MIT license). Cloud GPU via Modal (recommended), typically $1–2/month.
- Integration notes:
  - Ships with a complete `.claude/` directory — skills and commands are installed by cloning the repo.
  - Brand profiles plug into `/video`; new projects auto-apply the selected brand.
  - `project.json` and a per-project `CLAUDE.md` enable instant context resume across Claude Code sessions.
  - Can be used as a reference architecture for other Claude Code-native vertical toolkits. See Cross-Topic Connections for the broader agent-primitive landscape this fits into.
Related
- HeyGen Studio Automation — the commercial-stack analog of this toolkit’s architectural pattern
- video-use (browser-use) — narrower scope (editing-by-conversation only, hosted ASR as the only paid dep) vs. this toolkit’s full OSS model stack. Overlapping goal: Claude Code owns video production.
- Remotion Motion Graphics — underlying video framework both toolkits share
- HeyGen Hyperframes — alternative HTML-composition framework with its own skill bundle
- HeyGen Avatar V — commercial alternative to SadTalker in this toolkit
- Higgsfield Overview — another API-first video-generation layer; could slot into this toolkit as an LTX-2 alternative
- Higgsfield Image-to-Video — Higgsfield’s image-to-video models parallel LTX-2 here
- Claude Code Routines — schedule toolkit commands to run unattended
- Building Agents with Skills — the 10-skill bundle here is a concrete instance of the pattern
- Plugins and Marketplaces — this toolkit is a candidate for plugin-registry distribution
- Frontend Design Deep Dive — the `frontend-design` skill bundled here
- awesome-design-md — DESIGN.md files could plug into the `brands/` system to extend brand styling beyond colors/fonts
- Banned AI Patterns — quality bar applicable to output from this toolkit
- Cross-Topic Connections — fits the broader Claude Code automation landscape
Open Questions
- LTX-2 vs alternatives. LTX-2.3 22B at $0.23/generation is the toolkit’s built-in video generator. How does quality compare against Higgsfield’s Kling v2.1 Pro or Bytedance Seedance? Both are newer; drop-in swap appears possible.
- Plugin-registry distribution. Would this work as a single installable plugin bundle per Plugins and Marketplaces, or is its file layout too opinionated for the plugin format?
- Skill portability. The 10 skills are in
.claude/skills/. Could they be extracted and published as a standalone skill pack usable outside the toolkit’s project structure? - Agent Skills open-standard fit. Per Building Agents with Skills, Agent Skills is becoming an open standard. Are these skills compliant with the emerging spec, or do they rely on Claude Code-specific conventions?
- Commercial avatar fallback. SadTalker is the toolkit’s avatar path. For polished customer-facing output, is there a recommended path to swap in HeyGen Avatar V?
- Voice-cloning legality. `/voice-clone` captures and stores a cloned voice. What is the toolkit's stated posture on consent and use restrictions? Not explicitly covered in the README.
Try It
- Zero-setup smoke test. `cd examples/hello-world && npm install && npm run render`. No API keys needed — produces an MP4 immediately. Confirms Remotion rendering works on your machine.
- Full `/setup` pass. Clone the repo, open in Claude Code, run `/setup`. Deploy the cloud GPU tools to Modal. Measure total setup time and compute budget used on your Modal account.
- Clone + customize one template. Copy `templates/sprint-review-v2/` to your own folder, run `/video` against it, and generate a 60-second sample using the `default` brand. Teaches the template pattern.
- Author one brand profile. Build `brands/my-org/` with your real colors, fonts, and an ElevenLabs voice ID. Re-run `/video` — the new brand should apply automatically. This teaches the brand-layer pattern other tools don't have.
- Swap one OSS model for a commercial one. Replace the SadTalker talking-head step with HeyGen Avatar V calls in a fork. Measures how portable the architecture is vs. locked to the bundled stack.
- Read the per-project `CLAUDE.md`. Inspect the auto-generated `CLAUDE.md` in a running project. It's a good reference for how to structure your own project-level Claude context.
- Compare cost per minute of finished video against HeyGen Studio Automation for the same content. Hard data for the OSS-vs-SaaS decision.
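For the brand-profile exercise, a plausible `brands/my-org/brand.json` might look like the following. The source only states that `brand.json` holds colors and fonts, so every key and value below is an assumption for illustration, not the toolkit's actual schema.

```json
{
  "colors": {
    "primary": "#0f62fe",
    "background": "#161616",
    "accent": "#ff7eb6"
  },
  "fonts": {
    "heading": "Inter",
    "body": "Inter"
  }
}
```

Per the brand-layer takeaway above, a sibling `voice.json` would carry the ElevenLabs voice ID, and `assets/` the logo and backgrounds.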