SentrySearch — Semantic Search Over Videos (Gemini + Qwen3-VL)

Source: ai-research/ssrajadh-sentrysearch-2026-05-28.md

Repo: https://github.com/ssrajadh/sentrysearch
Language: Python (requires 3.11 or 3.12; PyTorch wheels don’t yet support 3.13+)
License: ^[inferred] not surfaced explicitly in the extracted README sections

Local-first CLI that indexes video footage and searches it by natural language using Gemini Embedding 2 (cloud) or Qwen3-VL (local or via DashScope cloud). Originally shaped around Tesla Sentry / dashcam workflows, with sister CLIs SentryMerge (cam-config + reruns) and SentryBlur (face / license-plate / natural-language redaction) composing into a search → trim → redact pipeline. Distinct from the wiki’s existing video tooling (HeyGen avatar/Studio, Higgsfield, OpenCut, OpenMediaTools, video-use) because it occupies the search slot — operator-side discovery and retrieval inside an existing archive, not generation or assembly.

Key Takeaways

Two embedding models, three ways to run them; the operator picks the trade-off. Gemini Embedding 2 runs in the cloud and uses an API key from aistudio.google.com/apikey (set a spending limit at aistudio.google.com/billing); sentrysearch init validates the key with a test embedding. --backend qwen-cloud runs Qwen3-VL via DashScope (DASHSCOPE_API_KEY); --backend local runs Qwen3-VL locally.
Chunked-and-overlapped indexing model. Defaults: 30s chunks with 5s overlap, target resolution 480, target FPS 5, still frames skipped (--no-skip-still to keep them). The combination keeps the indexed unit search-friendly without losing transition moments between chunks.
Confidence threshold is operator-tunable. Default 0.41 best-result similarity. Below threshold, the CLI prompts (No confident match found (best score: 0.28). Show results anyway? [y/N]:) before trimming. --no-trim flips the behavior to “show with a note instead of prompting.”
~/.sentrysearch/last_clip.json is the cross-tool handoff cache. SentryBlur reads it via --last so search-then-redact is two commands and no manual path-passing. SentryMerge follows the same --last convention for re-running a query.
Tesla Sentry default, but cam-agnostic via SentryMerge. SentryMerge ships a “modular cam-config system for non-Tesla dashcams” — the toolchain leaves the Tesla origin behind without breaking the operator workflow.
ffmpeg is required but bootstraps gracefully. Bundled imageio-ffmpeg is used automatically if no system-wide ffmpeg is present.
Python 3.11 / 3.12 pin is load-bearing. PyTorch wheels don’t yet support 3.13+; if the operator’s default Python is newer, the README walks through installing a managed 3.12 and pinning the tool to it.
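
The chunk-and-overlap defaults in the list above can be illustrated with a little arithmetic. The following is a minimal sketch, not SentrySearch's actual implementation: the function name `chunk_windows`, the stride of chunk length minus overlap, and the handling of the final partial chunk are all assumptions made for illustration.

```python
def chunk_windows(duration_s, chunk_s=30.0, overlap_s=5.0):
    """Return (start, end) windows covering a clip of duration_s seconds.

    Hypothetical helper: each window starts (chunk_s - overlap_s) seconds
    after the previous one, so neighbouring windows share overlap_s seconds.
    """
    if overlap_s >= chunk_s:
        raise ValueError("overlap must be shorter than the chunk")
    if duration_s <= 0:
        return []
    stride = chunk_s - overlap_s
    windows = []
    start = 0.0
    while True:
        # The last window is clipped to the end of the footage.
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break
        start += stride
    return windows


if __name__ == "__main__":
    for start, end in chunk_windows(70):
        print(f"{start:5.1f}s -> {end:5.1f}s")
```

Under these assumptions a 70-second clip yields three windows: 0 to 30, 25 to 55, and 50 to 70. An event spanning 28s to 33s is cut by the first window but falls entirely inside the second, which is the point of the overlap.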

How it fits the wiki’s video tools

Where the existing topic articles sit:

Slot	Article	What it does
Generation (avatar)	[[ai-video-content/heygen-avatar-v|HeyGen Avatar V]]	
Generation (video)	[[ai-video-content/higgsfield-mcp|Higgsfield MCP]]	
Composition	[[ai-video-content/hyperframes|Hyperframes]]	
Composition (programmatic)	[[ai-video-content/remotion-motion-graphics|Remotion]]	
Assembly (agent-native)	[[ai-video-content/opencut|OpenCut]]	
Assembly (browser tools)	[[ai-video-content/openmediatools|OpenMediaTools]]	
Editing (conversational)	[[ai-video-content/video-use|video-use]]	
Search / Discovery	SentrySearch (this article)	Semantic search over an existing video archive

The search slot wasn’t covered before this ingest; the topic gained a new structural cell.

Related

AI Video & Content Production — topic index
OpenCut — agent-native NLE; pairs with SentrySearch for “search-then-edit” pipelines on dashcam or archive footage
OpenMediaTools — browser-native ffmpeg suite; companion utility-tier preprocessing
video-use — closest existing tool by spirit (transcript-driven retrieval inside an existing video); SentrySearch is the visual-embedding analogue
inbox-refresh — adjacent pattern: search-and-stage workflow against an inbox of sources
FLUQs framework — adjacent semantic-retrieval framing for content
HeyGen Instant Highlights V2 — same long-form → relevant clip shape; HeyGen does it via highlight-detection heuristics, SentrySearch does it via semantic query

Try It

# Python 3.12 pin if your default is newer
brew install python@3.12   # macOS (Homebrew); use an equivalent on Linux/Windows

# Install SentrySearch as documented in the repo README (command not reproduced here), then:
sentrysearch init           # prompts for Gemini API key, validates with test embedding

# Index a folder of dashcam clips (the flag values shown are the defaults)
# --backend qwen-cloud needs DASHSCOPE_API_KEY
sentrysearch index ./footage \
  --chunk-duration 30 --overlap 5 \
  --target-resolution 480 --target-fps 5 \
  --backend qwen-cloud

# Search
sentrysearch search "white sedan pulling out of driveway"

# Redact (sibling tool; picks up ~/.sentrysearch/last_clip.json via --last)
sentryblur --last
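
To see what the handoff cache holds before running a sibling tool, the file can be read directly. This is a minimal sketch: the path comes from this article, but the file's schema is not documented here, so the code makes no assumption about its keys and simply pretty-prints whatever JSON it finds. The function name `read_last_clip` is hypothetical.

```python
import json
from pathlib import Path

# Path named in this article; the JSON schema itself is not documented here.
LAST_CLIP = Path.home() / ".sentrysearch" / "last_clip.json"


def read_last_clip(path=LAST_CLIP):
    """Return the parsed handoff cache, or None if the file is absent."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    data = read_last_clip()
    if data is None:
        print(f"No cache at {LAST_CLIP}; run a search first.")
    else:
        # Print the content as-is rather than guessing at field names.
        print(json.dumps(data, indent=2))
```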

Pre-flight: when indexing through Gemini rather than the qwen-cloud backend shown above, cap spending at https://aistudio.google.com/billing before bulk indexing, because Gemini Embedding 2 calls are metered.
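
The confidence gate that runs on search results before trimming (default 0.41, per Key Takeaways) can be sketched as a small decision function. This illustrates the described behavior and is not the tool's code: `gate_result`, its return labels, the parameter names, and the choice of an inclusive comparison are all hypothetical.

```python
DEFAULT_THRESHOLD = 0.41  # default best-result similarity cited in this article


def gate_result(best_score, threshold=DEFAULT_THRESHOLD, no_trim=False):
    """Decide what to do with the top search hit (labels are hypothetical).

    "trim"   -> score clears the threshold; go ahead and trim the clip
    "show"   -> score clears the threshold, but trimming is disabled
    "prompt" -> low score; ask the operator before trimming
    "note"   -> low score with trimming disabled; show results with a note
    """
    if best_score >= threshold:  # assumption: the comparison is inclusive
        return "show" if no_trim else "trim"
    return "note" if no_trim else "prompt"


if __name__ == "__main__":
    for score, no_trim in [(0.28, False), (0.28, True), (0.55, False)]:
        print(score, no_trim, gate_result(score, no_trim=no_trim))
```

A best score of 0.28, as in the prompt quoted in Key Takeaways, lands in the "prompt" branch by default and in the "note" branch when trimming is disabled.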

Open Questions

License not visible in the extracted README. GitHub repo page extract showed no LICENSE callout; verify before commercial use.
DashScope rate / pricing for Qwen3-VL cloud backend. The DashScope key path is documented but cost/quota isn’t surfaced in the README excerpt.
Embedding portability. Does the index survive a backend swap (Gemini ↔ Qwen) or does swapping invalidate the index? Affects re-indexing cost when the operator switches providers.
Non-Tesla cam-config rough edges. SentryMerge claims a modular cam-config system; coverage of common dashcam vendors (Garmin, Nextbase, Viofo, BlackVue) would inform whether the abstraction holds for the obvious adjacent cases.
Whose archive is realistic? Default 30s chunks at 480p / 5 FPS reads as Sentry-archive-sized. Operator-scale guidance for hour-or-day-long indexed bodies (storage, query latency, embedding cost) would help calibrate use beyond the dashcam origin.
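
As a starting point for the scale question, the chunk count follows directly from the defaults. This is a back-of-envelope sketch that assumes one continuous recording, a stride of chunk length minus overlap, and no still-frame skipping (which would lower the real numbers). `estimate_index_size` is a hypothetical name, and embedding cost per chunk is deliberately left out because the article does not state it.

```python
import math


def estimate_index_size(hours, chunk_s=30, overlap_s=5, fps=5):
    """Rough chunk count for continuous footage (illustrative only)."""
    duration_s = hours * 3600
    stride = chunk_s - overlap_s
    if duration_s <= 0:
        chunks = 0
    elif duration_s <= chunk_s:
        chunks = 1
    else:
        # One full first chunk, then one more per stride of remaining footage.
        chunks = 1 + math.ceil((duration_s - chunk_s) / stride)
    return {
        "chunks": chunks,
        "frames_per_full_chunk": chunk_s * fps,
    }


if __name__ == "__main__":
    for hours in (1, 24):
        print(hours, "h:", estimate_index_size(hours))
```

Under these assumptions one hour of footage yields 144 chunks and a full day 3,456, each full chunk sampling 150 frames at 5 FPS. Multiplying the chunk count by a measured per-chunk embedding cost and vector size would give the storage and spend figures the question asks for.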
