
Video Understanding

Name: Video Understanding
Author: jackeven02

by Jackeven02 · v1.0.0

cross-platform ⚠ suspicious

2108 Downloads

Install in OpenClaw

/install video-learn

Description

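Video understanding and analysis: lets the AI understand video content and extract key information. This skill is triggered when the user asks to analyze a video, understand its content, summarize it, or extract its key points.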

Usage Guidance

This skill appears to do what it says: fetch video metadata and summarize content. Things to consider before installing or using it:

- If you want full transcription/speech-to-text, the skill suggests downloading audio/subtitles and using extra tools (e.g., yt-dlp plus an STT service). Installing yt-dlp (pip) or using an external STT service can download/process user content and may send it to third-party servers; only proceed if you trust those tools/services.
- The YouTube Data API example needs an API_KEY; supply a key only if you trust the skill with it, and limit the key's scope. The skill itself does not request any credentials, but you may be prompted to supply them if you want API-based metadata rather than simple web scraping.
- Downloading videos can touch copyrighted or private material. Ensure you have permission before downloading or transcribing content.
- If you require stricter safety, ask for an explicit statement of where transcriptions or downloaded media will be sent (local-only processing vs. remote transcription API) and whether the agent will persist files on disk.

Overall this skill is coherent and proportional to its purpose, but be mindful of optional external installs and transcription service choices.

Capability Analysis

Type: OpenClaw Skill
Name: video-learn
Version: 1.0.0

The skill aims to understand and analyze video content. The `references/resources.md` file documents the `yt-dlp` command-line tool, providing examples like `yt-dlp "URL"` and `yt-dlp --write-subs "URL"`. Given that `SKILL.md` mentions 'speech to text (needs extra tools)', it is highly probable the AI agent will be instructed to use `yt-dlp` with user-provided video URLs. If the agent builds a shell command from an unsanitized user-supplied URL, this creates a critical shell injection vulnerability, leading to potential Remote Code Execution (RCE). This significant security risk is why the skill is classified as suspicious.
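
The injection risk described above depends on how the command is built. The following is a minimal sketch, not code from the skill, of one way an agent could call `yt-dlp` without letting a shell interpret the URL. The function names `build_ytdlp_command` and `fetch_subtitles` are hypothetical; `--write-subs` and `--skip-download` are existing `yt-dlp` options.

```python
import subprocess
from urllib.parse import urlparse


def build_ytdlp_command(url: str) -> list[str]:
    """Return an argv list that asks yt-dlp for subtitles only (hypothetical helper)."""
    parsed = urlparse(url)
    # Accept only plain web URLs; inputs such as "--exec=..." or "; rm -rf ~"
    # have no http(s) scheme and are rejected here.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URL: {url!r}")
    # "--" marks the end of options, so the URL is never read as a flag.
    return ["yt-dlp", "--write-subs", "--skip-download", "--", url]


def fetch_subtitles(url: str) -> None:
    """Run yt-dlp on a validated URL (requires yt-dlp to be installed)."""
    # Passing a list and leaving shell=False means no shell ever parses the URL.
    subprocess.run(build_ytdlp_command(url), check=True)
```

Because the URL travels as a single argv element, shell metacharacters inside it reach `yt-dlp` as literal text rather than being executed.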

Capability Assessment

✓ Purpose & Capability

The name/description (video understanding, summarization, extracting titles/descriptions/duration) matches the instructions and resources which reference YouTube/Bilibili/Tencent APIs and optional download tools. Nothing requested or described is unrelated to video analysis.

ℹ Instruction Scope

Instructions remain within the stated purpose (identify platform, fetch metadata, extract subtitles/audio, summarize). They mention using third-party APIs and downloading audio/subtitles for speech-to-text, but do not instruct reading unrelated local files or exfiltrating data. Note: using download + STT may cause the agent to fetch and store video/audio data — a privacy consideration rather than an incoherence.
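
The first step listed above, identifying the platform, can be sketched as a hostname lookup. This is an illustrative assumption rather than the skill's actual logic: the host table and the name `identify_platform` are hypothetical, and `v.qq.com` is assumed to be the Tencent Video host.

```python
from urllib.parse import urlparse

# Hypothetical mapping from host suffix to platform label.
PLATFORM_HOSTS = {
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "bilibili.com": "bilibili",
    "v.qq.com": "tencent",
}


def identify_platform(url: str) -> str:
    """Return a platform label for a video URL, or "unknown"."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, platform in PLATFORM_HOSTS.items():
        # Match the host itself or any subdomain of it (e.g. www.youtube.com),
        # but not look-alikes such as notyoutube.com.
        if host == suffix or host.endswith("." + suffix):
            return platform
    return "unknown"
```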

ℹ Install Mechanism

The skill is instruction-only and includes no install spec (low static risk). The resources advise installing yt-dlp via pip for downloads; that is an optional external package installation the user/agent would perform and is not bundled. Installing third-party packages (pip/PyPI) is a normal choice but increases attack surface if done automatically. The bundle itself does not perform that install.

✓ Credentials

The skill declares no required env vars or credentials. The references correctly show where a YouTube Data API key would be used if the agent opts to call that API. No unrelated secrets or config paths are requested.
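
For orientation, here is a hedged sketch of where such a key would appear when requesting metadata from the YouTube Data API v3 `videos` endpoint. It only builds the request URL and converts the simple `PT#H#M#S` duration format to seconds; it makes no network call. The environment variable name `YOUTUBE_API_KEY` and all function names are assumptions, not something the skill defines.

```python
import os
import re
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def api_key_from_env() -> str:
    """Read the key from a hypothetical YOUTUBE_API_KEY environment variable."""
    key = os.environ.get("YOUTUBE_API_KEY")
    if not key:
        raise RuntimeError("YOUTUBE_API_KEY is not set")
    return key


def build_videos_request(video_id: str, api_key: str) -> str:
    """Build a videos.list URL asking for title/description and duration."""
    params = {"part": "snippet,contentDetails", "id": video_id, "key": api_key}
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Simplification: handles hours/minutes/seconds only, not day components.
_DURATION = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def duration_seconds(iso: str) -> int:
    """Convert an ISO 8601 duration such as "PT1H2M3S" to seconds."""
    match = _DURATION.fullmatch(iso)
    if match is None:
        raise ValueError(f"unsupported duration: {iso!r}")
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds
```

Keeping the key in the environment rather than in the prompt or in a file on disk makes it easier to scope and rotate.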

✓ Persistence & Privilege

The skill does not request persistent presence (always:false), does not modify other skills/config, and does not declare privileged system access. Autonomous invocation is allowed by platform default but is not combined with other high-risk factors here.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install video-learn
After installation, invoke the skill by name or use /video-learn
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release of video-understanding skill.

- Enables AI to analyze and summarize video content from popular platforms (YouTube, Bilibili, etc.)
- Extracts video title, description, duration, and key points
- Supports video content summarization and highlights extraction
- Detects user intent based on requests such as "analyze," "summarize," or "extract" video content
- Cannot play videos or process raw visuals; relies on metadata and available APIs

Metadata

Slug video-learn

Version 1.0.0

License —

All-time Installs 19

Active Installs 19

Total Versions 1

Frequently Asked Questions

What is Video Understanding?

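Video understanding and analysis: it lets the AI understand video content and extract key information, and is triggered when the user asks to analyze a video, understand its content, summarize it, or extract its key points. It is an AI Agent Skill for Claude Code / OpenClaw, with 2108 downloads so far.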

How do I install Video Understanding?

Run "/install video-learn" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Video Understanding free?

Yes, Video Understanding is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Video Understanding support?

Video Understanding is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Video Understanding?

It is built and maintained by Jackeven02 (@jackeven02); the current version is v1.0.0.
