← Back to Skills Marketplace
Video Skill
by
Michael Gold
· GitHub ↗
· v0.1.2
360
Downloads
1
Stars
2
Active Installs
3
Versions
Install in OpenClaw
/install video-skill
Description
Run the video-skill pipeline to convert narrated videos into structured step data and enriched timeline-ready outputs. Use when a user asks to process a vide...
Usage Guidance
This package appears coherent for its stated purpose, but review these before running: 1) Provider trust: the enrichment steps will send transcript text and base64-encoded frame images to whichever base_url you configure for 'reasoning' and 'vlm' — only point those at services you control or trust. 2) Model downloads & docker: bootstrap_models.sh will download large model files (requires HF CLI and an authenticated account) and the docker-compose file pulls images from GHCR — verify sources and run in a machine with sufficient disk/GPU or use an isolated VM/container. 3) Local commands & subprocesses: the tool invokes ffmpeg and subprocess.run (clip extraction); run on files you trust and consider limiting permissions/using a sandbox. 4) Config review: config.example.json leaves api_key_env null; if you populate api_key_env make sure env vars are set appropriately and contain only credentials for intended providers. 5) Minor docs inconsistency: SKILL.md says 'no repo clone required' but many commands expect a local repo — follow the README/INSTRUCTIONS for correct setup. If you need higher assurance, review the remaining truncated source files (settings and any network code) and run the pipeline in a disposable container before using with sensitive data.
Capability Analysis
Type: OpenClaw Skill
Name: video-skill
Version: 0.1.2
The OpenClaw AgentSkills bundle 'video-skill' appears benign. Its primary function is to process videos using AI models, involving file I/O, external API calls to configured endpoints, and local execution of `ffmpeg` for video/frame extraction. API keys are handled securely via environment variables. The `SKILL.md` and other documentation provide instructions for using the skill's CLI, without any evidence of prompt injection attempts designed to manipulate the AI agent into unauthorized actions. The use of `subprocess.run` with a list of arguments for `ffmpeg` commands in `src/video_skill_extractor/clips.py` and `src/video_skill_extractor/frames.py` mitigates common shell injection vulnerabilities, as arguments are passed directly to the executable rather than interpreted by a shell. No signs of data exfiltration, persistence, or other malicious behaviors were found.
Capability Assessment
Purpose & Capability
Name/description (convert narrated videos into steps and enriched outputs) aligns with the code and CLI commands. The required binaries (uv, ffmpeg, python3) are reasonable for a Python CLI that uses ffmpeg and the uv packaging/runtime. The included scripts, docker-compose, and model-bootstrapping are appropriate for a self-hosted model-backed pipeline.
Instruction Scope
SKILL.md and the CLI instruct the agent/operator to run transcription, chunking, extraction, frame sampling, enrichment, and markdown rendering. The instructions direct the tool to call configured model provider endpoints and to read and base64-encode image files (frames) and include them in model requests — expected for VLM-based enrichment but important to note: large binary image payloads will be sent to whatever provider URL is configured. SKILL.md contains a minor contradiction (claims 'no repo clone required' while showing commands that assume a local repo).
Install Mechanism
The registry lists no automated install spec (instruction-only). The bundle nevertheless contains many source files and helper scripts. This is not itself dangerous, but the provided scripts (scripts/bootstrap_models.sh) will download large model binaries from Hugging Face via the HF CLI and docker-compose references GHCR images — both reasonable for a local/self-hosted setup but require trust in those sources and will write substantial data to disk.
Credentials
The skill does not require any environment variables by default (config.example.json uses api_key_env:null). The code supports provider API keys if configured, which is appropriate for calling model endpoints. There are no unrelated credentials requested in the manifest. Note: if you set provider.api_key_env to point to an env var, that env var will be used to authenticate requests to the configured model endpoints — so only set keys for providers you trust.
Persistence & Privilege
always:false and no install spec means the skill does not request forced persistent inclusion or elevated platform privileges. It performs normal file I/O (reads/writes JSONL, writes frames/clips, runs ffmpeg subprocesses) and spawns subprocesses; this is expected behavior for a CLI tool of this scope.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install video-skill - After installation, invoke the skill by name or use
/video-skill - Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.2
fix: don't mention github
v0.1.1
fix: requirements
v0.1.0
Initial public release
Metadata
Frequently Asked Questions
What is Video Skill?
Run the video-skill pipeline to convert narrated videos into structured step data and enriched timeline-ready outputs. Use when a user asks to process a vide... It is an AI Agent Skill for Claude Code / OpenClaw, with 360 downloads so far.
How do I install Video Skill?
Run "/install video-skill" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Video Skill free?
Yes, Video Skill is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Video Skill support?
Video Skill is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).
Who created Video Skill?
It is built and maintained by Michael Gold (@michaelgold); the current version is v0.1.2.
More Skills