Source: Digital Samba Claude Code Video Toolkit (2026-04-17)
Repo: https://github.com/digitalsamba/claude-code-video-toolkit Stars: 890 License: MIT Author: Conal Mullan (Digital Samba)
An open-source AI-native video production workspace for Claude Code. Ships 10 Claude Code skills, 13 slash commands, templates, brand profiles, a transitions library, a project-management system, and 8 cloud-GPU tools — all designed to let Claude Code orchestrate end-to-end video production. Built around open-source models (Qwen3-TTS, FLUX.2, LTX-2, ACE-Step, SadTalker) deployed on the user’s own Modal or RunPod account. Typical cost: $1–2/month.
Key Takeaways
- Claude Code-native, not just Claude-adjacent. `.claude/skills/` and `.claude/commands/` are first-class. The toolkit IS a Claude Code project you open and work inside.
- Open-source model stack. Qwen3-TTS (voice), FLUX.2 (image), LTX-2.3 22B (video), ACE-Step (music), SadTalker (talking head). Deployed to your own cloud account — cost stays with you, not Digital Samba.
- Modal is the recommended GPU layer. Typical spend: $1–2/month. RunPod is supported as an alternative.
- 10 Claude Code skills preloaded: `remotion`, `elevenlabs`, `ffmpeg`, `playwright-recording`, `frontend-design`, `qwen-edit`, `acestep`, `ltx2`, `moviepy`, `runpod`. This matches the "bundled skills in a repo" distribution pattern from Plugins and Marketplaces and Building Agents with Skills.
- 13 slash commands orchestrate the workflow: `/setup`, `/video`, `/scene-review`, `/design`, `/brand`, `/template`, `/record-demo`, `/generate-voiceover`, `/redub`, `/voice-clone`, `/versions`, `/skills`, `/contribute`. Each is a high-level, operator-visible action.
- Multi-session project lifecycle is a first-class primitive. Projects move through planning → assets → review → audio → editing → rendering → complete. `project.json` tracks scenes, audio, sessions, and phase. A per-project `CLAUDE.md` gives instant context on resume.
- Filesystem reconciliation. The system reconciles intent (planned scenes) with reality (files on disk). This is the pattern skill authors should steal — explicit plan vs. discovered state.
- Brand profiles as a distinct layer. `brands/my-brand/` with `brand.json` (colors/fonts), `voice.json` (ElevenLabs voice), and `assets/` (logo/backgrounds). `/video` auto-applies the brand to new projects.
- Pre-built templates. `sprint-review`, `sprint-review-v2` (composable, scene-based), `product-demo` (dark tech, stats, CTA).
- Custom transitions library. 7 custom effects (`glitch`, `rgbSplit`, `zoomBlur`, `lightLeak`, `clockWipe`, `pixelate`, `checkerboard`) plus 4 official Remotion transitions. Preview via `showcase/transitions`.
- Cost profile (per cloud-GPU tool run): qwen3_tts ~$0.01, image_edit ~$0.03, music_gen free to ~$0.05, ltx2 ~$0.23. Transparent, not hidden behind SaaS pricing.
- 890 stars, active as of April 2026. Not dead; the author states a "plan to keep iterating."
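The filesystem-reconciliation takeaway is worth sketching in code. The following is a minimal illustration, not the toolkit's actual implementation: the `project.json` fields (`scenes`, `id`) and the `assets/*.mp4` layout are assumptions made for the example.

```python
import json
from pathlib import Path

def reconcile(project_dir: Path) -> dict:
    """Compare planned scenes (intent) against generated files on disk (reality)."""
    plan = json.loads((project_dir / "project.json").read_text())
    planned = {scene["id"] for scene in plan["scenes"]}                  # intent
    on_disk = {p.stem for p in (project_dir / "assets").glob("*.mp4")}   # reality
    return {
        "missing": sorted(planned - on_disk),    # planned but not yet generated
        "orphaned": sorted(on_disk - planned),   # on disk but no longer planned
        "ready": sorted(planned & on_disk),      # plan and disk agree
    }
```

On resume, an agent that reads this diff knows exactly what to regenerate, instead of trusting stale state from a previous session.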
Positioning vs HeyGen Studio Automation
HeyGen Studio Automation and Digital Samba's Claude Code Video Toolkit implement the same architectural pattern (Claude Code orchestrating video production) with different cost profiles:
| Dimension | HeyGen Studio Automation | Claude Code Video Toolkit |
|---|---|---|
| Primary models | HeyGen Avatar V (commercial) | Qwen3-TTS, FLUX.2, LTX-2, ACE-Step (OSS) |
| Infrastructure | HeyGen SaaS + ElevenLabs | Your Modal/RunPod account |
| Avatar approach | Avatar V (video-reference) | SadTalker (portrait + audio) |
| Voice approach | ElevenLabs | Qwen3-TTS or ElevenLabs |
| Primary cost | HeyGen subscription | Cloud GPU ($1–2/month typical) |
| Lock-in | HeyGen-dependent | Swap models freely |
| Skill bundle | Custom | 10 Claude Code skills preloaded |
Both build on Remotion for composition. Both use Claude Code as the orchestrator. The toolkit’s main leverage is OSS-model flexibility and very low cost ceiling.
The 10-skill bundle decoded
These are the Claude Code skills preloaded in .claude/skills/:
| Skill | What Claude Code gains |
|---|---|
| `remotion` | Deep knowledge of the React video framework: compositions, animations, rendering |
| `elevenlabs` | TTS + voice cloning + music + SFX workflows |
| `ffmpeg` | Format conversion, compression, resizing |
| `playwright-recording` | Browser automation — record demos as video |
| `frontend-design` | Distinctive, production-grade aesthetic refinement (matches Frontend Design Deep Dive) |
| `qwen-edit` | AI image-editing prompting patterns |
| `acestep` | AI music generation (prompts, lyrics, scene presets, video integration) |
| `ltx2` | AI video generation (text-to-video, image-to-video, prompting guide) |
| `moviepy` | Python video composition (overlay text on LTX-2/SadTalker output, `build.py` projects) |
| `runpod` | Cloud GPU setup, Docker images, endpoint management, costs |
Per Building Agents with Skills: these are a good-sized “bundle of skills for a vertical” example — mix of foundational (ffmpeg, moviepy), partner-style (elevenlabs, runpod), and purpose-built (acestep, ltx2, qwen-edit). Follow the progressive-disclosure pattern — metadata always loaded, body loaded only when triggered.
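Per that progressive-disclosure pattern, each bundled skill is a folder whose `SKILL.md` carries lightweight YAML frontmatter (always in context) above a detailed body (loaded only on trigger). Below is a hypothetical sketch of what the `acestep` skill file might look like; the wording and body are invented for illustration, not copied from the repo.

```markdown
---
name: acestep
description: Generate background music with ACE-Step (prompts, lyrics, scene presets) and integrate the result into a video timeline.
---

# ACE-Step music generation

Detailed prompting guidance, scene presets, and muxing steps live here in
the body, which Claude Code loads into context only when a task triggers it.
```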
The 13 command set as a workflow
Rough ordering for a first project:
1. `/setup` — one-time cloud GPU + storage + voice configuration
2. `/brand` — define visual identity (optional; ships with `default` and `digital-samba`)
3. `/video` — start or resume a project with a chosen template
4. `/record-demo` — capture browser interactions
5. `/scene-review` — visualize in Remotion Studio
6. `/design` — iterate on visuals with the `frontend-design` skill
7. `/generate-voiceover` — TTS pass
8. `npm run studio` — live preview
9. `npm run render` — final MP4 out
Utility commands outside the core flow:
- `/redub` — rerun the voiceover with a different voice on an existing video
- `/voice-clone` — capture and persist a cloned voice into a brand
- `/versions` — dependency/update check
- `/template` — create/edit templates
- `/skills` — inspect/create Claude Code skills
- `/contribute` — issue / PR / example contribution path
Cloud GPU economics
| Tool | Typical cost per run |
|---|---|
| qwen3_tts | $0.01 |
| flux2 | $0.02 |
| image_edit | $0.03 |
| upscale | $0.01 |
| music_gen (acemusic) | Free |
| music_gen (self-hosted) | $0.05 |
| sadtalker | $0.10 |
| ltx2 | $0.23 |
| dewatermark | $0.10 |
For a 5-minute explainer: voiceover (~$0.08) + 2 music tracks (~$0.69) ≈ **under $1** — versus $30–$100/month per seat for typical SaaS video tooling.
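The table above converts directly into a back-of-envelope budget. Here is a minimal sketch in Python; the per-run prices are copied from the table, while the tool run counts in the example are hypothetical.

```python
# Per-run prices from the cost table above (USD).
COST_PER_RUN = {
    "qwen3_tts": 0.01,
    "flux2": 0.02,
    "image_edit": 0.03,
    "upscale": 0.01,
    "music_gen_self_hosted": 0.05,
    "sadtalker": 0.10,
    "ltx2": 0.23,
    "dewatermark": 0.10,
}

def estimate(runs: dict) -> float:
    """Total cloud-GPU cost for a project, given how many times each tool runs."""
    return round(sum(COST_PER_RUN[tool] * n for tool, n in runs.items()), 2)

# Example: a short explainer with 8 TTS passes, 4 images, 2 video generations.
print(estimate({"qwen3_tts": 8, "flux2": 4, "ltx2": 2}))  # 0.62
```

Even generation-heavy projects stay well under a dollar, which is where the toolkit's "$1–2/month" typical cost comes from.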
Example videos shipped
Toolkit evolution timeline (visible at demos.digitalsamba.com):
| Date | Video | What’s demonstrated |
|---|---|---|
| 2025-12-05 | sprint-review-cho-oyu | iOS sprint review with demos |
| 2025-12-10 | digital-samba-skill-demo | Product demo showcasing Claude Code skill |
| 2026-01-22 | ds-remote-mcp | Remote MCP server demo |
| 2026-01-25 | schlumbergera | Android sprint review |
| 2026-02-23 | cortina | Mobile platforms sprint review |
| 2026-03-15 | the-space-between | flux2 avatar + Qwen3-TTS + SadTalker essay |
| 2026-04-08 | q2-townhall-longarm-ad | Super Bowl–style launch with LTX-2 animated cameo |
| 2026-04-08 | q2-townhall-stars | GitHub star history time-lapse |
Implementation
- Tool/Service: Claude Code Video Toolkit (Digital Samba)
- Setup:
  1. `git clone https://github.com/digitalsamba/claude-code-video-toolkit.git && cd claude-code-video-toolkit`
  2. `python3 -m pip install -r tools/requirements.txt`
  3. `claude` (opens Claude Code in the toolkit)
  4. In Claude Code: `/setup` (cloud GPU + storage + voice, ~5 min), then `/video` to create your first project
- Requirements: Node.js 18+, Claude Code, Python 3.9+ recommended, FFmpeg optional
- Cost: Free (MIT license). Cloud GPU via Modal (recommended), typically $1–2/month.
- Integration notes:
  - Ships with a complete `.claude/` directory — skills and commands are installed by cloning the repo.
  - Brand profiles plug into `/video`; new projects auto-apply the selected brand.
  - `project.json` and a per-project `CLAUDE.md` enable instant context resume across Claude Code sessions.
  - Can be used as a reference architecture for other Claude Code-native vertical toolkits. See Cross-Topic Connections for the broader agent-primitive landscape this fits into.
Related
- HeyGen Studio Automation — the commercial-stack analog of this toolkit’s architectural pattern
- video-use (browser-use) — narrower scope (editing-by-conversation only, hosted ASR as the only paid dep) vs. this toolkit’s full OSS model stack. Overlapping goal: Claude Code owns video production.
- Remotion Motion Graphics — underlying video framework both toolkits share
- HeyGen Hyperframes — alternative HTML-composition framework with its own skill bundle
- HeyGen Avatar V — commercial alternative to SadTalker in this toolkit
- Higgsfield Overview — another API-first video-generation layer; could slot into this toolkit as an LTX-2 alternative
- Higgsfield Image-to-Video — Higgsfield’s image-to-video models parallel LTX-2 here
- Claude Code Routines — schedule toolkit commands to run unattended
- Building Agents with Skills — the 10-skill bundle here is a concrete instance of the pattern
- Plugins and Marketplaces — this toolkit is a candidate for plugin-registry distribution
- Frontend Design Deep Dive — the `frontend-design` skill bundled here
- awesome-design-md — DESIGN.md files could plug into the `brands/` system to extend brand styling beyond colors/fonts
- Banned AI Patterns — quality bar applicable to output from this toolkit
- Cross-Topic Connections — fits the broader Claude Code automation landscape
Open Questions
- LTX-2 vs alternatives. LTX-2.3 22B at $0.23/generation is the toolkit’s built-in video generator. How does quality compare against Higgsfield’s Kling v2.1 Pro or Bytedance Seedance? Both are newer; drop-in swap appears possible.
- Plugin-registry distribution. Would this work as a single installable plugin bundle per Plugins and Marketplaces, or is its file layout too opinionated for the plugin format?
- Skill portability. The 10 skills are in
.claude/skills/. Could they be extracted and published as a standalone skill pack usable outside the toolkit’s project structure? - Agent Skills open-standard fit. Per Building Agents with Skills, Agent Skills is becoming an open standard. Are these skills compliant with the emerging spec, or do they rely on Claude Code-specific conventions?
- Commercial avatar fallback. SadTalker is the toolkit’s avatar path. For polished customer-facing output, is there a recommended path to swap in HeyGen Avatar V?
- Voice-cloning legality. `/voice-clone` captures and stores a cloned voice. What is the toolkit's stated posture on consent and use restrictions? Not explicitly covered in the README.
Try It
- Zero-setup smoke test. `cd examples/hello-world && npm install && npm run render`. No API keys needed — produces an MP4 immediately. Confirms Remotion rendering works on your machine.
- Full `/setup` pass. Clone the repo, open in Claude Code, run `/setup`. Deploy the cloud GPU tools to Modal. Measure total setup time and compute budget used on your Modal account.
- Clone + customize one template. Copy `templates/sprint-review-v2/` to your own folder, run `/video` against it, and generate a 60-second sample using the `default` brand. Teaches the template pattern.
- Author one brand profile. Build `brands/my-org/` with your real colors, fonts, and an ElevenLabs voice ID. Re-run `/video` — the new brand should apply automatically. This teaches the brand-layer pattern other tools don't have.
- Swap one OSS model for a commercial one. Replace the SadTalker talking-head step with HeyGen Avatar V calls in a fork. Measures how portable the architecture is vs. locked to the bundled stack.
- Read the per-project `CLAUDE.md`. Inspect the auto-generated `CLAUDE.md` in a running project. It's a good reference for how to structure your own project-level Claude context.
- Compare cost per minute of finished video against HeyGen Studio Automation for the same content. Hard data for the OSS-vs-SaaS decision.
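For the brand-profile exercise, a plausible `brands/my-org/brand.json` might look like the following. The source only states that `brand.json` holds colors and fonts, so every key and value below is an assumption for illustration, not the toolkit's actual schema.

```json
{
  "colors": {
    "primary": "#0f62fe",
    "background": "#161616",
    "accent": "#ff7eb6"
  },
  "fonts": {
    "heading": "Inter",
    "body": "Inter"
  }
}
```

Per the brand-layer takeaway above, a sibling `voice.json` would carry the ElevenLabs voice ID, and `assets/` the logo and backgrounds.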