Source: ai-research/ssrajadh-sentrysearch-2026-05-28.md

Repo: https://github.com/ssrajadh/sentrysearch Language: Python (requires 3.11 or 3.12 — PyTorch wheels don’t yet support 3.13+) License: ^[inferred] — not surfaced explicitly in the extracted README sections

Local-first CLI that indexes video footage and searches it by natural language using Gemini Embedding 2 (cloud) or Qwen3-VL (local or via DashScope cloud). Originally shaped around Tesla Sentry / dashcam workflows, with sister CLIs SentryMerge (cam-config + reruns) and SentryBlur (face / license-plate / natural-language redaction) composing into a search → trim → redact pipeline. Distinct from the wiki’s existing video tooling (HeyGen avatar/Studio, Higgsfield, OpenCut, OpenMediaTools, video-use) because it occupies the search slot — operator-side discovery and retrieval inside an existing archive, not generation or assembly.

Key Takeaways

  • Two embedding backends, operator picks the trade. --backend qwen-cloud runs Qwen3-VL via DashScope (DASHSCOPE_API_KEY); --backend local runs Qwen3-VL locally. The Gemini Embedding 2 path uses an aistudio.google.com/apikey API key (set spending limit at aistudio.google.com/billing) and is validated by sentrysearch init with a test embedding.
  • Chunked-and-overlapped indexing model. Defaults: 30s chunks with 5s overlap, target resolution 480, target FPS 5, still frames skipped (--no-skip-still to keep them). The combination keeps the indexed unit search-friendly without losing transition moments between chunks.
  • Confidence threshold is operator-tunable. Default 0.41 best-result similarity. Below threshold, the CLI prompts (No confident match found (best score: 0.28). Show results anyway? [y/N]:) before trimming. --no-trim flips the behavior to “show with a note instead of prompting.”
  • ~/.sentrysearch/last_clip.json is the cross-tool handoff cache. SentryBlur reads it via --last so search-then-redact is two commands and no manual path-passing. SentryMerge follows the same --last convention for re-running a query.
  • Tesla Sentry default, but cam-agnostic via SentryMerge. SentryMerge ships a “modular cam-config system for non-Tesla dashcams” — the toolchain leaves the Tesla origin behind without breaking the operator workflow.
  • ffmpeg is required but bootstraps gracefully. Bundled imageio-ffmpeg is used automatically if no system-wide ffmpeg is present.
  • Python 3.11 / 3.12 pin is load-bearing. PyTorch wheels don’t yet support 3.13+; if the operator’s default Python is newer, the README walks installing a managed 3.12 and pinning the tool to it.

How it fits the wiki’s video tools

Where the existing topic articles sit:

SlotArticleWhat it does
Generation (avatar)HeyGen Avatar VWebcam clip → unlimited duration twin
Generation (video)Higgsfield MCPConversational image/video generation across 30+ models
CompositionHyperframesHTML-based video composition framework
Composition (programmatic)RemotionReact-based motion graphics
Assembly (agent-native)OpenCutOSS NLE with MCP + headless mode
Assembly (browser tools)OpenMediaToolsWASM ffmpeg, in-browser conversion
Editing (conversational)video-useTranscript-driven cuts via Claude Code skill
Search / DiscoverySentrySearch (this article)Semantic search over an existing video archive

The search slot wasn’t covered before this ingest; the topic gained a new structural cell.

  • AI Video & Content Production — topic index
  • OpenCut — agent-native NLE; pairs with SentrySearch for “search-then-edit” pipelines on dashcam or archive footage
  • OpenMediaTools — browser-native ffmpeg suite; companion utility-tier preprocessing
  • video-use — closest existing tool by spirit (transcript-driven retrieval inside an existing video); SentrySearch is the visual-embedding analogue
  • inbox-refresh — adjacent pattern: search-and-stage workflow against an inbox of sources
  • FLUQs framework — adjacent semantic-retrieval framing for content
  • HeyGen Instant Highlights V2 — same long-form → relevant clip shape; HeyGen does it via highlight-detection heuristics, SentrySearch does it via semantic query

Try It

# Python 3.12 pin if your default is newer
brew install python@3.12   # or equivalent on Linux/Windows
 
# Install (path documented in the repo)
sentrysearch init           # prompts for Gemini API key, validates with test embedding
 
# Index a folder of dashcam clips
sentrysearch index ./footage \
  --chunk-duration 30 --overlap 5 \
  --target-resolution 480 --target-fps 5 \
  --backend qwen-cloud
 
# Search
sentrysearch search "white sedan pulling out of driveway"
 
# Redact (sibling tool — picks up ~/.sentrysearch/last_clip.json via --last)
sentryblur --last

Pre-flight: cap spending at https://aistudio.google.com/billing before bulk indexing — Gemini Embedding 2 calls are metered.

Open Questions

  • License not visible in the extracted README. GitHub repo page extract showed no LICENSE callout; verify before commercial use.
  • DashScope rate / pricing for Qwen3-VL cloud backend. The DashScope key path is documented but cost/quota isn’t surfaced in the README excerpt.
  • Embedding portability. Does the index survive a backend swap (Gemini ↔ Qwen) or does swapping invalidate the index? Affects re-indexing cost when the operator switches providers.
  • Non-Tesla cam-config rough edges. SentryMerge claims a modular cam-config system; coverage of common dashcam vendors (Garmin, Nextbase, Viofo, BlackVue) would inform whether the abstraction holds for the obvious adjacent cases.
  • Whose archive is realistic? Default 30s chunks at 480p / 5 FPS reads as Sentry-archive-sized. Operator-scale guidance for hour-or-day-long indexed bodies (storage, query latency, embedding cost) would help calibrate use beyond the dashcam origin.