video-understand

Name: video-understand
Author: sifr42

by sifr42 · v1.0.1

cross-platform ⚠ suspicious

Downloads 795

Install in OpenClaw

/install video-understand

Description

Analyze and understand video content using AI. Upload local files, YouTube URLs, or HTTP video URLs for detailed analysis, Q&A, and timestamped breakdowns.

Usage Guidance

Before installing or using this skill, consider the following:

- Metadata mismatch: the registry claims no required env vars or binaries, but the docs require GEMINI_API_KEY or MOONSHOT_API_KEY, Node.js/npm, and optionally yt-dlp. Treat that as a red flag and verify the package source.
- Verify the npm package: inspect the video-understand package on npm/GitHub (source code, maintainer, recent publishes) before running npm install -g; prefer installing from a source you trust.
- Credential safety: the skill stores API keys (or reads env vars) and will upload video content to third-party providers. Only provide keys for providers you trust and avoid uploading sensitive video content.
- yt-dlp dependency: if you plan to use Kimi with YouTube, the skill relies on an external downloader (yt-dlp) that is not declared in the registry metadata. Install it from an official package source and be cautious when running downloads.
- Local files and cache: the skill creates ~/.video-understand/config.json and an uploads cache; review and remove cached files if they contain sensitive material.
- If you need higher assurance: request the skill's source code or the npm package tarball to review what it does locally (especially any code that would upload files or persist keys).

If you cannot verify the package source, treat the skill as untrusted. If you plan to proceed, limit the API key scope (if supported), avoid uploading sensitive videos, and inspect the installed CLI's code before giving it credentials.

Capability Analysis

Type: OpenClaw Skill
Name: video-understand
Version: 1.0.1

The skill is classified as suspicious due to its reliance on external command execution and network requests, which introduce potential vulnerabilities. Specifically, the `SKILL.md` and `rules/install.md` files indicate that the Kimi provider downloads YouTube videos using `yt-dlp` and other HTTP videos via `fetch`. While `yt-dlp` is a legitimate tool, passing user-controlled URLs to an external command like `yt-dlp` without robust sanitization could lead to shell injection vulnerabilities. Although the `SKILL.md` includes a commendable warning against prompt injection from video content, the underlying mechanism of invoking external tools with potentially untrusted input remains a risk.
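As an illustration of the injection concern, here is a minimal sketch of how a caller could hand a user-supplied URL to `yt-dlp` without going through a shell. It is not taken from the skill's source, which this page does not show; the function name `build_ytdlp_argv`, the scheme allow-list, and the output template are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Assumption: only plain web URLs are acceptable inputs.
ALLOWED_SCHEMES = {"http", "https"}


def build_ytdlp_argv(url: str, out_template: str = "%(id)s.%(ext)s") -> list[str]:
    """Return an argument list for yt-dlp; never a single shell string."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.netloc:
        raise ValueError(f"refusing non-HTTP(S) URL: {url!r}")
    if any(ch.isspace() for ch in url):
        raise ValueError("URL must not contain whitespace")
    # "--" ends option parsing, so the URL can never be read as a flag.
    return ["yt-dlp", "--no-playlist", "-o", out_template, "--", url]


# Hypothetical usage: pass the list directly, leaving shell=False (the default).
#   import subprocess
#   subprocess.run(build_ytdlp_argv(user_url), check=True)
```

Because the URL travels as a single list element, shell metacharacters in it are never interpreted; the validation step only narrows what reaches the downloader.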

Capability Assessment

⚠ Purpose & Capability

The skill's stated purpose (analyzing videos via Gemini and Kimi) legitimately requires provider API keys and may need yt-dlp for YouTube downloads. However, the registry metadata lists no required env vars or binaries, while SKILL.md and rules/install.md explicitly reference GEMINI_API_KEY, MOONSHOT_API_KEY, Node.js/npm, and yt-dlp. The registry therefore understates what the skill actually needs.
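Since the registry does not declare these requirements, a user may want to check them by hand before installing. The following is a rough preflight sketch built only from the names mentioned above; the function `preflight` and its return shape are hypothetical, and the binary names `node` and `npm` are assumed to be how Node.js/npm appear on PATH.

```python
import os
import shutil

# Names taken from the skill's documentation as quoted on this page.
API_KEY_VARS = ("GEMINI_API_KEY", "MOONSHOT_API_KEY")
REQUIRED_BINARIES = ("node", "npm")   # assumed executable names
OPTIONAL_BINARIES = ("yt-dlp",)       # only needed for Kimi with YouTube


def preflight(env=None, which=shutil.which) -> dict:
    """Report which documented requirements are present on this machine."""
    env = os.environ if env is None else env
    return {
        "api_keys_set": [v for v in API_KEY_VARS if env.get(v)],
        "missing_required": [b for b in REQUIRED_BINARIES if which(b) is None],
        "missing_optional": [b for b in OPTIONAL_BINARIES if which(b) is None],
    }


if __name__ == "__main__":
    print(preflight())
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.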

ℹ Instruction Scope

SKILL.md stays on-topic (upload local files or URLs, analyze, ask follow-ups, list/delete uploads) and explicitly warns that third‑party video content is untrusted. It documents caching (~/.video-understand) and provider behavior. Nothing in the instructions attempts to read unrelated system files or exfiltrate secrets, but it does instruct uploads of potentially sensitive video content to external providers (privacy risk) and to run or rely on external tools (yt-dlp) that the registry did not declare as required.

ℹ Install Mechanism

There is no formal install spec in the registry (instruction-only), but rules/install.md directs users to install an npm package globally (npm install -g video-understand) and requires Node.js 18+. Installing an unvetted npm package has inherent risk—verify the package on npm and check its source. The install instructions for yt-dlp point to system package managers (winget/brew/apt/uv), which is expected for that tool but again is not declared in the registry metadata.

⚠ Credentials

The skill uses GEMINI_API_KEY and MOONSHOT_API_KEY (and suggests storing keys in ~/.video-understand/config.json), which are proportional to its function. The concern is that the registry declared no required env vars while the docs require API keys and may save them to disk — the metadata and the runtime instructions are out of sync, which could mislead users into granting credentials without realizing it.
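If keys do end up in ~/.video-understand/config.json, one simple precaution is to confirm the file is not accessible to other local users. The sketch below assumes a POSIX system and is not something the skill is known to do itself; `has_group_or_other_access` is a hypothetical helper name.

```python
import os
import stat
from pathlib import Path

# Path as documented on this page.
CONFIG_PATH = Path.home() / ".video-understand" / "config.json"


def has_group_or_other_access(path: Path) -> bool:
    """True if any group/other permission bit is set (POSIX semantics)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


if __name__ == "__main__":
    if CONFIG_PATH.exists() and has_group_or_other_access(CONFIG_PATH):
        # Restrict the file to the owning user only.
        os.chmod(CONFIG_PATH, 0o600)
        print(f"tightened permissions on {CONFIG_PATH}")
```

On Windows the permission bits reported by `os.stat` do not carry the same meaning, so this check would not be reliable there.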

ℹ Persistence & Privilege

The skill does not request elevated platform privileges and is not always-enabled. It stores config and upload caches under ~/.video-understand and may retain uploaded files (Kimi: persists until deleted; Gemini: ~48h). This is expected behavior but has privacy implications — users should be aware files are uploaded and cached locally and remotely.
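To see what is actually retained locally, a user can list everything under ~/.video-understand. This is a minimal sketch; the internal layout of that directory is not documented here, so the code makes no assumption beyond the root path, and `list_cached_files` is a hypothetical helper.

```python
from pathlib import Path

# Root directory as documented on this page.
CACHE_ROOT = Path.home() / ".video-understand"


def list_cached_files(root: Path = CACHE_ROOT) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for each file under root, sorted by path."""
    if not root.is_dir():
        return []
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    for rel_path, size in list_cached_files():
        print(f"{size:>12}  {rel_path}")
```

This only covers the local side; files held by the providers (Kimi until deleted, Gemini for roughly 48 hours) have to be managed through the skill's list and delete commands.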

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install video-understand
After installation, invoke the skill by name or use /video-understand
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

- Added a third-party content warning to the documentation, advising users to treat analysis results from YouTube and arbitrary URLs as untrusted data.
- No functional changes. Documentation update only.

v1.0.0

Initial release of video-understand, a CLI skill that gives AI agents the ability to analyze and understand video content, even with non-multimodal LLMs.

- Supports local files, YouTube URLs, and HTTP video URLs
- Core commands: analyze, ask (follow-up Q&A), upload, list, delete
- Timestamped breakdowns, structured JSON output, and file export
- Deduplicates uploads via file hash, so there are no redundant re-uploads
- Multiple providers: Google Gemini and Moonshot AI (Kimi)
- Supports MP4, MOV, WebM, AVI, and more
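The hash-based deduplication mentioned in the release notes can be pictured roughly as follows. This is a sketch of the general technique, not the skill's actual code: the choice of SHA-256, the in-memory `cache` dict, and the `upload` callback are all assumptions.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_once(path: Path, cache: dict, upload) -> str:
    """Upload only if this content hash is new; otherwise reuse the stored remote id."""
    key = file_sha256(path)
    if key not in cache:
        cache[key] = upload(path)  # hypothetical provider upload callback
    return cache[key]
```

Keying on the content hash rather than the file name means a renamed or copied video still maps to the same upload.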

Metadata

Slug video-understand

Version 1.0.1

License —

All-time Installs 3

Active Installs 3

Total Versions 2

Frequently Asked Questions

What is video-understand?

Analyze and understand video content using AI. Upload local files, YouTube URLs, or HTTP video URLs for detailed analysis, Q&A, and timestamped breakdowns. It is an AI Agent Skill for Claude Code / OpenClaw, with 795 downloads so far.

How do I install video-understand?

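Run "/install video-understand" in the OpenClaw or Claude Code chat. The skill's own documentation additionally requires Node.js 18+, the video-understand npm package (npm install -g video-understand), and a GEMINI_API_KEY or MOONSHOT_API_KEY; yt-dlp is optional and only needed for Kimi with YouTube.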

Is video-understand free?

video-understand is free to download and install. The registry lists no license, so check the source before assuming open-source terms, and note that the skill needs your own Gemini or Moonshot API key.

Which platforms does video-understand support?

video-understand is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created video-understand?

It is built and maintained by sifr42 (@sifr42); the current version is v1.0.1.
