
Video Understanding

Name: Video Understanding
Author: bill492

by bill492 · v1.1.0

cross-platform ✓ Security Clean

Downloads: 995

Install in OpenClaw

/install video-understanding

Description

Analyze and summarize videos from 1000+ sites using Google Gemini AI, providing transcripts, descriptions, summaries, and answers to questions.

Usage Guidance

This skill appears to do what it says: it downloads videos (via yt-dlp), may remux or merge them with ffmpeg, and uploads the content to Google Gemini for analysis, authenticating with GEMINI_API_KEY. Before installing, consider:

1) Privacy and copyright: uploaded videos are sent to Google's servers, so avoid private, confidential, or copyrighted content unless you have the rights to it.
2) Billing and limits: using the Gemini File API may incur costs and is subject to size limits; confirm your API plan.
3) Trusting yt-dlp downloads: yt-dlp performs network downloads and may write files locally; make sure you trust the video sources.
4) Runtime dependencies: brew installs system binaries, and the Python google-genai package is installed as well; review these if you have strict policy controls.

If these behaviors are acceptable, the skill is internally consistent.

Capability Analysis

Type: OpenClaw Skill
Name: video-understanding
Version: 1.1.0

The skill bundle is classified as benign. It transparently describes its purpose: analyzing videos using Google Gemini and `yt-dlp`. The `SKILL.md` contains no prompt injection attempts against the OpenClaw agent. The `analyze_video.py` script correctly uses `subprocess.run` with a list of arguments to invoke `yt-dlp`, preventing shell injection vulnerabilities from user-provided URLs. It handles temporary files and API keys (GEMINI_API_KEY) as expected for its stated functionality, without evidence of data exfiltration, persistence, or other malicious behaviors.
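To illustrate why a list of arguments prevents shell injection, here is a minimal sketch. It is not the skill's actual `analyze_video.py`; the function names `build_ytdlp_command` and `download`, the output template, and the choice of flags are assumptions made for this example.

```python
import subprocess
from pathlib import Path


def build_ytdlp_command(url: str, out_dir: Path) -> list[str]:
    """Build a yt-dlp argument list (hypothetical helper, not from the skill).

    The URL is a single argv element, so it is never parsed by a shell.
    """
    return [
        "yt-dlp",
        "--no-playlist",
        "-o", str(out_dir / "%(id)s.%(ext)s"),
        "--",  # end of options: a URL starting with "-" is not read as a flag
        url,
    ]


def download(url: str, out_dir: Path) -> None:
    # No shell=True here: the list goes straight to the process, so characters
    # such as ";" or "$(...)" inside the URL carry no special meaning.
    subprocess.run(build_ytdlp_command(url, out_dir), check=True)
```

With this shape, a hostile string such as `https://example.com/v; rm -rf ~` reaches yt-dlp as one (invalid) URL rather than as two shell commands.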

Capability Assessment

✓ Purpose & Capability

Name/description (video analysis with Gemini) aligns with required binaries (yt-dlp, ffmpeg), the GEMINI_API_KEY credential, and the included Python script which downloads/uploads videos and calls the Google GenAI client. The brew install entries for yt-dlp and ffmpeg are proportional and expected.

ℹ Instruction Scope

SKILL.md and the script instruct the agent to download non-YouTube videos locally via yt-dlp and upload them to the Gemini File API (YouTube URLs are passed directly). This is within scope, but it does mean the skill will transmit video content to Google's servers — users should be aware of privacy/copyright implications. The script does not read other system files or additional environment variables.
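The routing described above could look roughly like the following sketch. The host list and the names `is_youtube_url` and `plan_route` are assumptions for illustration; the skill's own detection logic is not shown in this listing and may differ.

```python
from urllib.parse import urlparse

# Assumed set of YouTube hostnames; the real script may recognize others.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True when the URL's hostname is one of the assumed YouTube hosts."""
    host = (urlparse(url).hostname or "").lower()
    return host in YOUTUBE_HOSTS


def plan_route(url: str) -> str:
    """Choose between passing the URL through and downloading it first."""
    return "passthrough" if is_youtube_url(url) else "download_and_upload"
```

Matching on the parsed hostname, rather than searching the whole string for "youtube", avoids treating look-alike domains as YouTube.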

✓ Install Mechanism

Install spec only lists brew formulas for yt-dlp and ffmpeg (well-known packages). The Python dependency (google-genai) is declared in the script metadata and is expected to be installed by the runtime tooling; no arbitrary URL downloads or unknown hosts are used in the install spec.

✓ Credentials

Only GEMINI_API_KEY is required and is used as the API key for the Google GenAI client. No unrelated secrets, tokens, or config paths are requested or read.
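A minimal sketch of reading only this one credential, assuming a hypothetical helper named `load_gemini_api_key`. The actual script may read the key differently; the client construction is shown only as a comment so the example runs without the google-genai package installed.

```python
import os


def load_gemini_api_key(env=None) -> str:
    """Return GEMINI_API_KEY from the environment, failing early if it is missing."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key


# With the google-genai package installed, the key would typically be used as:
#   from google import genai
#   client = genai.Client(api_key=load_gemini_api_key())
```

Failing early with a clear message avoids downloading a video only to discover afterwards that the upload cannot be authenticated.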

✓ Persistence & Privilege

The `always` setting is false, and the skill does not attempt to persist itself or to modify other skills or system-wide settings. By default it cleans up both the uploaded Gemini files and the local downloads (unless keep is used).
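The cleanup-by-default behavior for local downloads can be sketched with a context manager. The name `working_dir`, the `keep` parameter, and the directory prefix are assumptions for this example; deleting the uploaded Gemini file is a separate API call and is not shown.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_dir(keep: bool = False):
    """Yield a temporary download directory and remove it on exit unless keep is set."""
    path = Path(tempfile.mkdtemp(prefix="video-understanding-"))
    try:
        yield path
    finally:
        # Runs even when the analysis raises, so failed runs leave nothing behind.
        if not keep:
            shutil.rmtree(path, ignore_errors=True)
```

Putting the removal in a `finally` block means the download is deleted on both success and failure, which matches the "clean up by default" behavior described above.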

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install video-understanding
3) After installation, invoke the skill by name or use /video-understanding
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.1.0

Added proper metadata: declared yt-dlp, ffmpeg, and GEMINI_API_KEY requirements in frontmatter.

v1.0.1

Gemini-powered video analysis. Transcript, description, summary from any URL. 1000+ sites via yt-dlp.

v1.0.0

Initial release: Gemini-powered video analysis with transcript, description, summary. Supports 1000+ sites via yt-dlp. YouTube URL passthrough. Custom questions via -q flag.

Metadata

Slug video-understanding

Version 1.1.0

License —

All-time Installs 6

Active Installs 6

Total Versions 3

Frequently Asked Questions

What is Video Understanding?

Analyze and summarize videos from 1000+ sites using Google Gemini AI, providing transcripts, descriptions, summaries, and answers to questions. It is an AI Agent Skill for Claude Code / OpenClaw, with 995 downloads so far.

How do I install Video Understanding?

Run "/install video-understanding" in the OpenClaw or Claude Code chat to install it in one step. The skill also requires a GEMINI_API_KEY and the yt-dlp and ffmpeg binaries.

Is Video Understanding free?

The skill itself is free to download, install, and use. However, it relies on the Gemini File API, which may incur costs depending on your API plan. The listing does not specify a license.

Which platforms does Video Understanding support?

Video Understanding is listed as cross-platform and runs wherever OpenClaw or Claude Code is available. Its install spec uses brew to install yt-dlp and ffmpeg.

Who created Video Understanding?

It is built and maintained by bill492 (@bill492); the current version is v1.1.0.
