Source: Bradautomates Claude Video 2026 04 29
bradautomates/claude-video is a Claude skill that closes the gap between Claude’s reading/synthesis capabilities and video input. It registers a /watch command that accepts a video URL or local file path, downloads via yt-dlp (300+ sources — YouTube, TikTok, Vimeo, Instagram, X), extracts duration-aware frame samples with ffmpeg, and transcribes using either native captions (free) or a Whisper API fallback (Groq preferred, OpenAI as alternate). MIT, 117 stars / 35 forks, latest release v0.1.2 (April 24, 2026). Multi-runtime: ships as a Claude Code plugin, a .skill upload for claude.ai, a Codex skill, and a manual clone for any agent that reads ~/.claude/skills/.
Key Takeaways
- Solves the “Claude can’t watch videos” problem. Without this skill, Claude has no native ability to ingest video. With it installed, `/watch <url-or-path>` returns timestamped frames + transcript ready for multimodal reasoning.
- Adaptive frame budget. Videos ≤30 s get ~30 frames; 1–3 min get ~60; 3–10 min get ~80; >10 min triggers a sparse-scan warning recommending focused re-runs over a time range. Hard caps at 2 fps minimum and 100 frames maximum keep token spend predictable.
- Dual transcription path. Pulls native captions first (free, no API key required). Falls back to Whisper via Groq (preferred) or OpenAI when no captions exist. `--no-whisper` disables audio entirely; `--whisper groq|openai` forces a backend.
- Multi-runtime install. The same repo ships four ways: `/plugin marketplace add bradautomates/claude-video` for Claude Code, a `watch.skill` download for claude.ai web, `git clone` to `~/.codex/skills/watch` for Codex, and a manual clone to `~/.claude/skills/watch` for anything else. All follow the same `commands/` + `SKILL.md` pattern referenced in the project’s skill bundle build notes.
- Stdlib-only API clients. `whisper.py` talks to Groq and OpenAI without third-party SDKs; no `pip install openai` etc. required beyond yt-dlp + ffmpeg.
- 300+ sources via yt-dlp. YouTube is the headline case, but TikTok, Vimeo, Instagram, X, and the rest of the yt-dlp supported-sites list work the same way. No private/authenticated-platform support.
- Time-range slicing. `--start MM:SS --end MM:SS` clips the analysis window: the canonical workaround for >10-minute videos and the way to keep token budgets manageable on long-form content.
- Optimal under 10 minutes. Beyond that, the skill warns that frame coverage gets sparse. Practical pattern: identify the section of interest with a transcript-only pass, then re-run with `--start`/`--end` for visual analysis.
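The adaptive frame-budget tiers above can be sketched as a small lookup. The tier values come from these notes; the behavior between 30 s and 1 min and at exact tier boundaries is an assumption, not the skill's actual code:

```python
def frame_budget(duration_s: float) -> int:
    """Approximate claude-video's duration-aware frame budget (sketch)."""
    if duration_s <= 30:
        return 30    # short clips: ~30 frames
    if duration_s <= 180:
        return 60    # 1-3 min: ~60 frames
    if duration_s <= 600:
        return 80    # 3-10 min: ~80 frames
    return 100       # >10 min: hard cap; sparse-scan warning applies
```

Keeping the budget capped at 100 frames regardless of length is what makes token spend predictable; the >10-minute warning exists because 100 frames spread over a long runtime leaves large unsampled gaps.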
Implementation
Tool/Service: bradautomates/claude-video (MIT)
Setup (Claude Code):
/plugin marketplace add bradautomates/claude-video
/plugin install watch@claude-video
Setup (claude.ai web): Download watch.skill from the GitHub releases page, then in claude.ai go to Settings → Capabilities → Skills and upload it.
Setup (Codex CLI):
git clone https://github.com/bradautomates/claude-video ~/.codex/skills/watch
Setup (manual / any agent that reads ~/.claude/skills/):
git clone https://github.com/bradautomates/claude-video ~/.claude/skills/watch
Required dependencies: ffmpeg, yt-dlp. The skill includes a setup.py preflight that auto-installs on macOS via brew; on Linux it prints apt/dnf/pipx guidance; on Windows it prints winget/pip guidance.
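A minimal sketch of what such a preflight amounts to (the real setup.py also offers per-platform install guidance; this only checks `$PATH`):

```python
import shutil
import sys


def preflight(tools=("ffmpeg", "yt-dlp")) -> bool:
    """Return True if every required binary is on $PATH; report the rest."""
    missing = [t for t in tools if shutil.which(t) is None]
    for t in missing:
        print(f"missing dependency: {t}", file=sys.stderr)
    return not missing
```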
Optional dependencies: Groq API key (preferred for Whisper) or OpenAI API key. Without either, the skill uses native captions only — most YouTube videos have them, so this is often fine.
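In the spirit of the skill's stdlib-only `whisper.py`, here is a hedged sketch of a no-SDK Whisper call. The Groq/OpenAI endpoint URLs and model names reflect their public OpenAI-compatible transcription APIs, but treat the details (multipart field names, response shape) as assumptions rather than the skill's actual code:

```python
import json
import os
import urllib.request
import uuid


def multipart_body(model: str, audio: bytes, boundary: str) -> bytes:
    """Hand-roll a multipart/form-data body using only the stdlib."""
    model_part = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f'{model}\r\n'
    ).encode()
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="audio.mp3"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'
    ).encode()
    closing = f'\r\n--{boundary}--\r\n'.encode()
    return model_part + file_header + audio + closing


def transcribe(path: str, backend: str = "groq") -> str:
    """POST an audio file to a Whisper-compatible endpoint (network call)."""
    url, key_env, model = {
        "groq": ("https://api.groq.com/openai/v1/audio/transcriptions",
                 "GROQ_API_KEY", "whisper-large-v3"),
        "openai": ("https://api.openai.com/v1/audio/transcriptions",
                   "OPENAI_API_KEY", "whisper-1"),
    }[backend]
    boundary = uuid.uuid4().hex
    with open(path, "rb") as f:
        body = multipart_body(model, f.read(), boundary)
    req = urllib.request.Request(url, data=body, headers={
        "Authorization": f"Bearer {os.environ[key_env]}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    })
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["text"]
```

The point is that `urllib.request` plus a hand-rolled multipart body covers the whole job; no third-party HTTP client is needed.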
Cost: Free (MIT). Whisper transcription billed per Groq/OpenAI pricing; native captions free.
Configuration flags:
- `--max-frames N` — cap the frame budget for token control
- `--resolution W` — bump to 1024 px for text-heavy videos (slides, screen recordings)
- `--fps F` — override auto-fps (2 fps minimum)
- `--whisper groq|openai` — pick the transcription backend
- `--no-whisper` — captions only, skip audio transcription
- `--out-dir DIR` — control where downloads + frames land
- `--start MM:SS --end MM:SS` — time-range slicing for long content
Integration notes:
- Claude Code’s auto-triggered slash-command surface picks up `/watch` once the plugin is installed. Pair with scheduled tasks to auto-summarize incoming videos on a cron.
- For video-side research workflows (AI Video & Content Production), this is the input-side counterpart to production-side tools like HeyGen Studio Automation and HeyGen Hyperframes: claude-video reads source material in, those write polished output back out.
- For competitor research workflows, `/watch <competitor-video> identify hooks and creative patterns` extracts structure from ad creative and content pieces. Pairs naturally with the last30days-skill research-aggregator surface.
- Whisper has a 25 MB upload limit (roughly 50 minutes of mono 16 kHz audio). For long talks, transcribe in chunks via `--start`/`--end` or rely on native captions.
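The chunking arithmetic for that 25 MB limit can be sketched as follows. The 64 kbps default bitrate is an assumption, roughly what the “25 MB ≈ 50 minutes” figure implies for compressed mono audio; adjust it to your actual encode:

```python
def chunk_windows(duration_s: float, bitrate_kbps: float = 64.0,
                  limit_bytes: int = 25 * 1024 * 1024) -> list[tuple[str, str]]:
    """Split a recording into --start/--end windows that each stay
    under the Whisper upload limit (sketch; bitrate is assumed)."""
    def mmss(t: float) -> str:
        return f"{int(t) // 60:02d}:{int(t) % 60:02d}"

    secs_per_chunk = limit_bytes * 8 / (bitrate_kbps * 1000)  # seconds/chunk
    windows, t = [], 0.0
    while t < duration_s:
        end = min(t + secs_per_chunk, duration_s)
        windows.append((mmss(t), mmss(end)))
        t = end
    return windows
```

Each returned pair maps directly onto one `--start MM:SS --end MM:SS` re-run; at 64 kbps a chunk works out to roughly 54 minutes of audio.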
Try It
- Install via `/plugin marketplace add bradautomates/claude-video`, then `/plugin install watch@claude-video`. Verify the preflight check passes (`ffmpeg` + `yt-dlp` on `$PATH`).
- Run `/watch https://youtu.be/<id> what is the hook in the first 10 seconds?` against a recent ad to learn the structure-analysis pattern.
- For a screen-recording bug report: `/watch ~/screen-recording.mp4 when does the UI break and what's on screen at that moment?` The timestamped frames plus transcript narrow down the failure.
- For long-form content: do a transcript-only pass (`--no-whisper` plus native captions, or Whisper-only with `--max-frames 0` if supported, otherwise the lowest frame count), identify the segment of interest, then re-run with `--start MM:SS --end MM:SS` for visual analysis.
- Pair with yt-dlp’s nightly channel if you analyze TikTok/Instagram regularly; those platforms change frequently and the stable yt-dlp release can lag.
Related
- yt-dlp — required dependency; the underlying CLI that downloads from 300+ video platforms
- Claude Code Plugins and Marketplaces — the distribution surface for `/plugin marketplace add bradautomates/claude-video`
- Agent Skills Overview — the open standard `claude-video` ships against (multi-runtime install pattern)
- last30days-skill — sibling research skill that also depends on yt-dlp; complementary surface for video-source research
- AI Video & Content Production — adjacent topic for production-side video tools (HeyGen, Higgsfield, Remotion)
- HeyGen Hyperframes — production-side counterpart (writes video out; claude-video reads video in)
- Claude Code Scheduled Tasks — natural pairing for batch / cron-triggered video summarization