Source: ai-research/ssrajadh-sentrysearch-2026-05-28.md
Repo: https://github.com/ssrajadh/sentrysearch Language: Python (requires 3.11 or 3.12 — PyTorch wheels don’t yet support 3.13+) License: ^[inferred] — not surfaced explicitly in the extracted README sections
Local-first CLI that indexes video footage and searches it by natural language using Gemini Embedding 2 (cloud) or Qwen3-VL (local or via DashScope cloud). Originally shaped around Tesla Sentry / dashcam workflows, with sister CLIs SentryMerge (cam-config + reruns) and SentryBlur (face / license-plate / natural-language redaction) composing into a search → trim → redact pipeline. Distinct from the wiki’s existing video tooling (HeyGen avatar/Studio, Higgsfield, OpenCut, OpenMediaTools, video-use) because it occupies the search slot — operator-side discovery and retrieval inside an existing archive, not generation or assembly.
Key Takeaways
- Two embedding backends, operator picks the trade.
--backend qwen-cloudruns Qwen3-VL via DashScope (DASHSCOPE_API_KEY);--backend localruns Qwen3-VL locally. The Gemini Embedding 2 path uses anaistudio.google.com/apikeyAPI key (set spending limit at aistudio.google.com/billing) and is validated bysentrysearch initwith a test embedding. - Chunked-and-overlapped indexing model. Defaults: 30s chunks with 5s overlap, target resolution 480, target FPS 5, still frames skipped (
--no-skip-stillto keep them). The combination keeps the indexed unit search-friendly without losing transition moments between chunks. - Confidence threshold is operator-tunable. Default 0.41 best-result similarity. Below threshold, the CLI prompts (
No confident match found (best score: 0.28). Show results anyway? [y/N]:) before trimming.--no-trimflips the behavior to “show with a note instead of prompting.” ~/.sentrysearch/last_clip.jsonis the cross-tool handoff cache. SentryBlur reads it via--lastso search-then-redact is two commands and no manual path-passing. SentryMerge follows the same--lastconvention for re-running a query.- Tesla Sentry default, but cam-agnostic via SentryMerge. SentryMerge ships a “modular cam-config system for non-Tesla dashcams” — the toolchain leaves the Tesla origin behind without breaking the operator workflow.
- ffmpeg is required but bootstraps gracefully. Bundled
imageio-ffmpegis used automatically if no system-wide ffmpeg is present. - Python 3.11 / 3.12 pin is load-bearing. PyTorch wheels don’t yet support 3.13+; if the operator’s default Python is newer, the README walks installing a managed 3.12 and pinning the tool to it.
How it fits the wiki’s video tools
Where the existing topic articles sit:
| Slot | Article | What it does |
|---|---|---|
| Generation (avatar) | HeyGen Avatar V | Webcam clip → unlimited duration twin |
| Generation (video) | Higgsfield MCP | Conversational image/video generation across 30+ models |
| Composition | Hyperframes | HTML-based video composition framework |
| Composition (programmatic) | Remotion | React-based motion graphics |
| Assembly (agent-native) | OpenCut | OSS NLE with MCP + headless mode |
| Assembly (browser tools) | OpenMediaTools | WASM ffmpeg, in-browser conversion |
| Editing (conversational) | video-use | Transcript-driven cuts via Claude Code skill |
| Search / Discovery | SentrySearch (this article) | Semantic search over an existing video archive |
The search slot wasn’t covered before this ingest; the topic gained a new structural cell.
Related
- AI Video & Content Production — topic index
- OpenCut — agent-native NLE; pairs with SentrySearch for “search-then-edit” pipelines on dashcam or archive footage
- OpenMediaTools — browser-native ffmpeg suite; companion utility-tier preprocessing
- video-use — closest existing tool by spirit (transcript-driven retrieval inside an existing video); SentrySearch is the visual-embedding analogue
- inbox-refresh — adjacent pattern: search-and-stage workflow against an inbox of sources
- FLUQs framework — adjacent semantic-retrieval framing for content
- HeyGen Instant Highlights V2 — same long-form → relevant clip shape; HeyGen does it via highlight-detection heuristics, SentrySearch does it via semantic query
Try It
# Python 3.12 pin if your default is newer
brew install python@3.12 # or equivalent on Linux/Windows
# Install (path documented in the repo)
sentrysearch init # prompts for Gemini API key, validates with test embedding
# Index a folder of dashcam clips
sentrysearch index ./footage \
--chunk-duration 30 --overlap 5 \
--target-resolution 480 --target-fps 5 \
--backend qwen-cloud
# Search
sentrysearch search "white sedan pulling out of driveway"
# Redact (sibling tool — picks up ~/.sentrysearch/last_clip.json via --last)
sentryblur --lastPre-flight: cap spending at https://aistudio.google.com/billing before bulk indexing — Gemini Embedding 2 calls are metered.
Open Questions
- License not visible in the extracted README. GitHub repo page extract showed no LICENSE callout; verify before commercial use.
- DashScope rate / pricing for Qwen3-VL cloud backend. The DashScope key path is documented but cost/quota isn’t surfaced in the README excerpt.
- Embedding portability. Does the index survive a backend swap (Gemini ↔ Qwen) or does swapping invalidate the index? Affects re-indexing cost when the operator switches providers.
- Non-Tesla cam-config rough edges. SentryMerge claims a modular cam-config system; coverage of common dashcam vendors (Garmin, Nextbase, Viofo, BlackVue) would inform whether the abstraction holds for the obvious adjacent cases.
- Whose archive is realistic? Default 30s chunks at 480p / 5 FPS reads as Sentry-archive-sized. Operator-scale guidance for hour-or-day-long indexed bodies (storage, query latency, embedding cost) would help calibrate use beyond the dashcam origin.