Downloads: 995
Stars: 4
Active Installs: 6
Versions: 3
Install in OpenClaw
/install video-understanding
Description
Analyze and summarize videos from 1000+ sites using Google Gemini AI, providing transcripts, descriptions, summaries, and answers to questions.
Usage Guidance
This skill appears to do what it says: it downloads videos (via yt-dlp), may remux/merge them with ffmpeg, and uploads the content to Google Gemini for analysis, authenticating with GEMINI_API_KEY. Before installing, consider:
1) Privacy and copyright: uploaded videos are sent to Google's servers, so avoid private, confidential, or copyrighted content unless you have the rights to it.
2) Billing and limits: the Gemini File API may incur costs and has size limits; confirm your API plan.
3) Trusting yt-dlp downloads: yt-dlp performs network downloads and may write files locally; make sure you trust the video sources.
4) Runtime dependencies: brew will install system binaries, and the Python google-genai package will be installed; review these if you have strict policy controls.
If these behaviors are acceptable, the skill is internally consistent.
Capability Analysis
Type: OpenClaw Skill
Name: video-understanding
Version: 1.1.0
The skill bundle is classified as benign. It transparently describes its purpose to analyze videos using Google Gemini and `yt-dlp`. The `SKILL.md` contains no prompt injection attempts against the OpenClaw agent. The `analyze_video.py` script correctly uses `subprocess.run` with a list of arguments to invoke `yt-dlp`, preventing shell injection vulnerabilities from user-provided URLs. It handles temporary files and API keys (GEMINI_API_KEY) as expected for its stated functionality, without evidence of data exfiltration, persistence, or other malicious behaviors.
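The list-argument pattern mentioned above can be illustrated with a minimal sketch. This is not the code from `analyze_video.py`; the helper names `build_ytdlp_command` and `download_video`, the output template, and the choice of yt-dlp options are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def build_ytdlp_command(url: str, output_dir: Path) -> list[str]:
    """Build the yt-dlp argument list (hypothetical helper, not taken from the skill)."""
    return [
        "yt-dlp",
        "--no-playlist",                           # download a single video only
        "--merge-output-format", "mp4",            # let ffmpeg merge streams into mp4
        "-o", str(output_dir / "%(id)s.%(ext)s"),  # output filename template
        url,                                       # the URL stays one argv element
    ]


def download_video(url: str, output_dir: Path) -> None:
    # No shell=True: the list is handed to the OS as-is, so characters such as
    # ';' or '$(...)' inside the URL are never interpreted by a shell.
    subprocess.run(build_ytdlp_command(url, output_dir), check=True)
```

Because the URL is passed as a single list element, a value such as `https://example.com/v?id=1; rm -rf ~` reaches yt-dlp as one (invalid) URL instead of being split into a second command.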
Capability Assessment
Purpose & Capability
Name/description (video analysis with Gemini) aligns with required binaries (yt-dlp, ffmpeg), the GEMINI_API_KEY credential, and the included Python script which downloads/uploads videos and calls the Google GenAI client. The brew install entries for yt-dlp and ffmpeg are proportional and expected.
Instruction Scope
SKILL.md and the script instruct the agent to download non-YouTube videos locally via yt-dlp and upload them to the Gemini File API (YouTube URLs are passed directly). This is within scope, but it does mean the skill will transmit video content to Google's servers — users should be aware of privacy/copyright implications. The script does not read other system files or additional environment variables.
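The routing described above (YouTube URLs passed through, everything else downloaded and uploaded) might look roughly like the following sketch. The host list and the function names `is_youtube_url` and `choose_route` are assumptions; the skill's actual detection logic is not shown in this listing.

```python
from urllib.parse import urlparse

# Assumed set of YouTube hostnames; the real script may recognise more or fewer.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True when the URL's hostname is one of the known YouTube hosts."""
    host = (urlparse(url).hostname or "").lower()
    return host in YOUTUBE_HOSTS


def choose_route(url: str) -> str:
    """Pick how the video reaches Gemini: direct URL or local download plus upload."""
    return "passthrough" if is_youtube_url(url) else "download_and_upload"
```

Matching on the parsed hostname instead of searching the raw string avoids treating a URL like `https://notyoutube.com.evil.example/watch` as YouTube.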
Install Mechanism
Install spec only lists brew formulas for yt-dlp and ffmpeg (well-known packages). The Python dependency (google-genai) is declared in the script metadata and is expected to be installed by the runtime tooling; no arbitrary URL downloads or unknown hosts are used in the install spec.
Credentials
Only GEMINI_API_KEY is required and is used as the API key for the Google GenAI client. No unrelated secrets, tokens, or config paths are requested or read.
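As a rough illustration of this narrow credential handling, the sketch below reads only GEMINI_API_KEY and fails early when it is missing. The helper name `load_api_key` and the error message are hypothetical; the returned value would then be handed to the Google GenAI client, which is omitted here to keep the example self-contained.

```python
import os
from collections.abc import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Gemini API key, reading no other environment variable."""
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        # Fail before any download or upload is attempted.
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key
```

Accepting the environment as a parameter keeps the function easy to check in isolation, for example `load_api_key({"GEMINI_API_KEY": "abc"})`.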
Persistence & Privilege
`always` is false, and the skill does not attempt to persist or to modify other skills or system-wide settings. By default it deletes the uploaded Gemini files and the local downloads; this cleanup is skipped only when keep is used.
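The default cleanup of local downloads can be sketched with a temporary directory that is removed on exit unless a keep option is set. The context manager `working_dir`, its `keep` parameter, and the directory prefix are assumptions for illustration; deleting the uploaded file on the Gemini side is not shown.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_dir(keep: bool = False):
    """Yield a scratch directory; delete it afterwards unless keep is True."""
    path = Path(tempfile.mkdtemp(prefix="video-understanding-"))
    try:
        yield path
    finally:
        if not keep:
            # Runs even if the download or analysis raised an exception.
            shutil.rmtree(path, ignore_errors=True)
```

Putting the removal in a `finally` block means a failed download or API call still leaves no video files behind.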
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install video-understanding
- After installation, invoke the skill by name or use /video-understanding
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.1.0
Added proper metadata: declared yt-dlp, ffmpeg, and GEMINI_API_KEY requirements in frontmatter.
v1.0.1
Gemini-powered video analysis. Transcript, description, summary from any URL. 1000+ sites via yt-dlp.
v1.0.0
Initial release: Gemini-powered video analysis with transcript, description, summary. Supports 1000+ sites via yt-dlp. YouTube URL passthrough. Custom questions via -q flag.
Frequently Asked Questions
What is Video Understanding?
Analyze and summarize videos from 1000+ sites using Google Gemini AI, providing transcripts, descriptions, summaries, and answers to questions. It is an AI Agent Skill for Claude Code / OpenClaw, with 995 downloads so far.
How do I install Video Understanding?
Run "/install video-understanding" in the OpenClaw or Claude Code chat to install it in one step. The skill also requires the yt-dlp and ffmpeg binaries and a GEMINI_API_KEY.
Is Video Understanding free?
Yes, the skill itself is free and open-source: you can download, install, and use it at no cost. Use of the Gemini File API may incur costs, depending on your API plan.
Which platforms does Video Understanding support?
Video Understanding is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Video Understanding?
It is built and maintained by bill492 (@bill492); the current version is v1.1.0.