
Video To Text

by Lkyyyy320 · v0.1.0

cross-platform ⚠ suspicious

Downloads 496

Install in OpenClaw

/install video-to-text-2

Description

Video to text converter. Downloads videos from Bilibili using bilibili-api, from other sites using yt-dlp, then transcribes audio using faster-whisper. Use w...

Usage Guidance

This skill appears to do what it says: it downloads video or audio and transcribes it locally. Before using it: (1) Review the script locally (it is short and readable) and run it in an isolated environment such as a virtualenv or container. (2) Be cautious when supplying Bilibili credentials: copy SESSDATA, bili_jct, and buvid3 only from your own browser, never paste them into shared repos or logs, and prefer passing them on the command line for one-off use rather than saving them in the script. (3) Install yt-dlp and ffmpeg from trusted sources, because the script invokes yt-dlp as a subprocess. (4) Expect large model downloads and heavy disk usage for the medium and large faster-whisper models. For higher assurance, run the script on a machine or account where leaked cookies would have limited impact.

Capability Analysis

Type: OpenClaw Skill
Name: video-to-text-2
Version: 0.1.0

The skill is classified as suspicious for two reasons. First, it uses `subprocess.run` to execute an external command (`yt-dlp`). This is necessary for its stated purpose, but it opens the door to shell or argument injection if the OpenClaw agent does not properly sanitize user-provided URLs or file paths before execution. Second, the script handles sensitive Bilibili authentication credentials (SESSDATA, bili_jct, buvid3) by accepting them as command-line arguments or storing them in a global variable within `scripts/video_to_text.py`, a weakness in credential management that could lead to exposure. The `SKILL.md` itself contains no malicious prompt injection, but it documents a `bash` command that an agent could be led to misuse if user input is interpolated directly into it.
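The injection concern above can be illustrated with a minimal sketch. This is not the skill's actual code: `build_ytdlp_command` and `run_download` are hypothetical helpers, and the sketch assumes only that `yt-dlp` is on the PATH. It validates the URL, passes arguments as a list so no shell is involved, and places `--` before the URL so it cannot be read as an option.

```python
import subprocess
from urllib.parse import urlparse


def build_ytdlp_command(url: str, output_template: str = "%(id)s.%(ext)s") -> list[str]:
    """Return an argument list for yt-dlp (hypothetical helper, not from the skill)."""
    parsed = urlparse(url)
    # Accept only http(s) URLs with a host, so values such as "--exec=..."
    # or "file:///etc/passwd" are rejected before they reach yt-dlp.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"refusing non-http(s) URL: {url!r}")
    return [
        "yt-dlp",
        "--no-playlist",
        "--extract-audio",
        "--audio-format", "mp3",
        "--output", output_template,
        "--",  # end of options: everything after this is treated as a URL
        url,
    ]


def run_download(url: str) -> None:
    """Run yt-dlp without a shell; the list form keeps the URL a single argument."""
    subprocess.run(build_ytdlp_command(url), check=True)
```

Because the command is a list and `shell=True` is never set, shell metacharacters in the URL (`;`, `|`, backticks) reach yt-dlp as literal characters instead of being interpreted.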

Capability Assessment

✓ Purpose & Capability

Name/description match the code and README: script downloads videos (bilibili-api or yt-dlp) and transcribes with faster-whisper. Required tools and libraries (yt-dlp, bilibili-api, ffmpeg, faster-whisper) are appropriate for the stated task.
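For orientation, the transcription half of such a pipeline might look like the sketch below. It is an illustration under assumptions, not the skill's script: `transcribe`, `segments_to_text`, and `format_timestamp` are hypothetical names, and the model size, device, and compute type are example values. The `WhisperModel` constructor and `transcribe` call follow the faster-whisper README.

```python
from typing import Iterable, Optional, Protocol


class Segment(Protocol):
    """Shape of the segment objects yielded by faster-whisper."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def segments_to_text(segments: Iterable[Segment]) -> str:
    """Join segments into one timestamped line per segment."""
    return "\n".join(
        f"[{format_timestamp(s.start)} - {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    )


def transcribe(audio_path: str, model_size: str = "small",
               language: Optional[str] = None) -> str:
    """Transcribe an audio file on CPU; all parameter defaults are assumptions."""
    # Imported lazily so the formatting helpers work without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus an info object.
    segments, _info = model.transcribe(audio_path, language=language)
    return segments_to_text(segments)
```

The first call to `transcribe` with a given `model_size` downloads that model, which is where the disk usage mentioned under Usage Guidance comes from.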

ℹ Instruction Scope

SKILL.md and the script limit actions to downloading audio and transcribing. The README instructs users to extract Bilibili cookies via browser DevTools — this is necessary for authenticated Bilibili downloads but is sensitive; the skill does not attempt to transmit credentials anywhere else. The script runs yt-dlp via subprocess and performs HTTP GETs for audio URLs (expected).

✓ Install Mechanism

No automated install spec; instructions tell user to pip install listed packages and ensure ffmpeg. This is low-risk and transparent.

ℹ Credentials

The skill does not request environment variables or external credentials by default. It does require Bilibili session cookies for downloading private/age-restricted content; those are provided via CLI or editing the script. Requesting these specific cookies is proportionate to Bilibili access, but storing credentials in the script or exposing them carelessly is a security risk.
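One way to reduce the exposure described above is to read the cookies from environment variables instead of hard-coding them in the script, and to mask them before anything is logged. The sketch below is an assumption-laden illustration: the variable names (`BILI_SESSDATA`, `BILI_JCT`, `BILI_BUVID3`) and both helper functions are hypothetical and are not defined by the skill.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable names; the skill itself does not define them.
ENV_NAMES = {
    "sessdata": "BILI_SESSDATA",
    "bili_jct": "BILI_JCT",
    "buvid3": "BILI_BUVID3",
}


def load_bilibili_cookies(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Read the three cookies from the environment, failing loudly if any is unset."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: env[name] for key, name in ENV_NAMES.items()}


def redact(value: str, keep: int = 4) -> str:
    """Mask a secret for logging, keeping only the first few characters."""
    if len(value) <= keep:
        return "*" * len(value)
    return value[:keep] + "*" * (len(value) - keep)
```

Compared with editing the script, this keeps the cookies out of files that might be committed or shared. Compared with command-line arguments, it keeps them out of shell history and process listings.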

✓ Persistence & Privilege

Skill is not always-enabled, does not request special platform privileges, and does not modify other skills or system-wide settings.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install video-to-text-2
After installation, invoke the skill by name or use /video-to-text-2
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.1.0

Initial release of video-to-text: convert online or local videos to text.
- Download videos from Bilibili (with auth) or any yt-dlp supported site, or use local files.
- Transcribe video audio to text using faster-whisper with multiple model options.
- Supports language selection, output to file, and optional keeping of media files.
- Includes usage examples, auth setup details for Bilibili, and installation instructions.

Metadata

Slug video-to-text-2

Version 0.1.0

License —

All-time Installs 2

Active Installs 2

Total Versions 1

Frequently Asked Questions

What is Video To Text?

Video To Text is an AI Agent Skill for Claude Code / OpenClaw. It downloads videos from Bilibili using bilibili-api and from other sites using yt-dlp, then transcribes the audio using faster-whisper. It has 496 downloads so far.

How do I install Video To Text?

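Run "/install video-to-text-2" in the OpenClaw or Claude Code chat to install the skill in one step. The skill has no automated install spec, so its dependencies (yt-dlp, bilibili-api, faster-whisper, and ffmpeg) still need to be installed separately.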

Is Video To Text free?

Yes, Video To Text is free to download, install, and use. Note that the listing does not specify a license.

Which platforms does Video To Text support?

Video To Text is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Video To Text?

It is built and maintained by Lkyyyy320 (@lkyyyy320); the current version is v0.1.0.
