
Video To Text

Name: Video To Text
Author: sxliuyu

By SxLiuYu · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 213

Install in OpenClaw

/install video-transcribe-pro

Description

Convert video or audio files from URLs into text or subtitle formats using a free API with automatic language detection and no local downloads required.

Safe Usage Recommendations

This skill will download the media you provide to a temp file and upload it to an external service (https://api.myshell.ai). That behavior is necessary for remote transcription but contradicts the 'no local downloads' claim in the description and exposes your media to a third party whose privacy policy and trustworthiness are unknown. Before installing or using: (1) avoid sending sensitive or private media to this skill, (2) verify the myshell.ai endpoint, its operator, and their privacy/retention policies, (3) if you need local-only processing, prefer using the included Python script with a vetted local Whisper/ffmpeg installation (be aware that this requires installing software and models), and (4) run the skill in a restricted/sandboxed environment if you must test it. If the misleading description (no local download/no key) is a concern, contact the author or prefer a transcription skill that clearly documents its data flow and required credentials.

Functional Analysis

Type: OpenClaw Skill
Name: video-transcribe-pro
Version: 1.0.0

The skill contains a critical command injection vulnerability in tool.js, where the 'url' parameter is concatenated into a shell command string and executed via execSync without sanitization. While the core logic in index.js and SKILL.md aligns with the stated purpose of transcribing media via the MyShell API (api.myshell.ai), the insecure handling of user input allows for arbitrary remote code execution (RCE) on the host system.
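The injection risk described above comes from letting a shell parse a string that contains user input. The following is a minimal sketch of the usual Node.js mitigation, not the skill's actual code: `validateMediaUrl`, `runWithUrl`, and `ECHO_SCRIPT` are hypothetical names, and the child process is simply Node echoing its last argument so the effect can be observed. Note that URL validation alone does not remove shell metacharacters such as `$(...)`; the protection comes from `execFileSync` receiving its arguments as an array, which by default does not spawn a shell.

```javascript
const { execFileSync } = require("child_process");

// Accept only absolute http(s) URLs; reject everything else before use.
function validateMediaUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    throw new Error("Not a valid absolute URL");
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href;
}

// Hypothetical stand-in for a helper script: it prints its last argument.
const ECHO_SCRIPT =
  "process.stdout.write(process.argv[process.argv.length - 1])";

function runWithUrl(input) {
  const url = validateMediaUrl(input);
  // The URL is one element of the argument array, so no shell ever
  // interprets characters such as ;, $( ), or backticks inside it.
  return execFileSync(process.execPath, ["-e", ECHO_SCRIPT, url], {
    encoding: "utf8",
  });
}
```

Calling `runWithUrl("https://example.com/a.mp4?x=$(id)")` hands the whole string to the child as a single argument, so `$(id)` is never executed. The same input concatenated into an `execSync` command string would be expanded by the shell.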

Capability Assessment

⚠ Purpose & Capability

The SKILL.md and metadata emphasize a free, no-local-download, no-API-key Whisper API. The actual code (index.js/tool.js) downloads the provided URL into a temp file and then uploads the file to a third-party endpoint (CONFIG.primaryApi = https://api.myshell.ai/...). The repository also includes a Python script that supports local Whisper/ffmpeg and AssemblyAI (which requires an API key). Downloading to a temp file contradicts the 'no local downloads required' claim, and the presence of multiple fallback mechanisms (some requiring keys) is not explained in the description.

⚠ Instruction Scope

Runtime instructions and code will: fetch user-provided URLs, write the content to a temp file, and transmit the file contents to an external service (myshell.ai). That network upload is expected for a transcription skill, but the SKILL.md's phrasing ('no local downloads required') is misleading. The skill will therefore exfiltrate the media to an external third party; SKILL.md does not make clear the privacy/security implications or ownership of that third party. The Python script supports local processing, ffmpeg, and other APIs but these are not required or documented as alternate flows in the top-level description.
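The download step described above means a local copy of the media exists on disk, however briefly. As an illustration of how such a temp-file step could be contained, here is a minimal sketch; it is not the skill's code, and `withTempMedia` and its parameters are hypothetical. It writes the bytes into a freshly created private temp directory, passes the path to a synchronous handler, and removes the directory afterwards even if the handler throws.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Write media bytes to a private temp directory, hand the file path to
// `handler`, and always delete the local copy afterwards.
function withTempMedia(bytes, fileName, handler) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "media-"));
  // basename() prevents a crafted name from escaping the temp directory.
  const filePath = path.join(dir, path.basename(fileName));
  try {
    fs.writeFileSync(filePath, bytes);
    return handler(filePath);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

In a real flow the handler would be whatever step consumes the file (an upload or a local transcription). Because the handler here is synchronous, an asynchronous upload would need an `async` variant that awaits the handler before cleanup.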

ℹ Install Mechanism

No install spec (instruction-only) is present, so nothing is installed automatically, which lowers install risk. However, the package includes runnable code (Node scripts and a Python script) that invokes external binaries (ffmpeg) and makes network requests; if the user or agent runs the included scripts, they need Node and possibly Python/ffmpeg/Whisper. There is no download-from-suspicious-URL install step, which is good.

ℹ Credentials

The skill does not require environment variables or credentials to run the primary path. However the code contains optional branches that reference external services requiring API keys (OpenAI, AssemblyAI) and a local whisper flow which requires Python packages and ffmpeg; those are optional but not clearly documented in SKILL.md as alternative modes. Primary API (myshell.ai) is used without a key — you should verify and trust that endpoint before sending sensitive media.

✓ Persistence & Privilege

The skill does not request persistent/always-on privileges, does not modify other skills, and does not request system-level configuration. It runs as a tool via child process (execSync) which is normal for wrappers, but executing bundled scripts means the agent will run code on the host when invoked.

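How to Use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install video-transcribe-pro
Once installation finishes, invoke the Skill by name or trigger it with /video-transcribe-pro
Provide the inputs described in the Skill's parameter documentation to get structured output.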

Version History

v1.0.0

video-transcribe-pro 1.0.0 – Initial Release

- Provides free API for converting video/audio files to text or subtitles, no download required.
- Activates on a range of trigger keywords, including both Chinese and English terms relating to transcription and subtitles.
- Main tool: video_to_text, supporting various formats (mp4, wav, mp3, etc.), language auto-detection, and output as plain text or SRT.
- Uses Whisper API for speech recognition, with fallback to alternate endpoints if needed.
- Supports files up to 25MB; usage examples and best practices provided in documentation.
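For readers unfamiliar with the SRT output mentioned in the release notes, the sketch below shows how timed transcript segments are conventionally laid out in that format: a 1-based index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, the text, and a blank line between cues. The `{ start, end, text }` segment shape and the function names are assumptions made for illustration; the listing does not say how the skill represents segments internally.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// segments: [{ start, end, text }] with times in seconds (hypothetical shape).
function toSrt(segments) {
  return segments
    .map((seg, i) => {
      const timing = `${toSrtTimestamp(seg.start)} --> ${toSrtTimestamp(seg.end)}`;
      return `${i + 1}\n${timing}\n${seg.text.trim()}\n`;
    })
    .join("\n");
}
```

For example, `toSrtTimestamp(3661.5)` yields `01:01:01,500`.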

Metadata

Slug video-transcribe-pro

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Version count 1

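FAQ

What is Video To Text?

Convert video or audio files from URLs into text or subtitle formats using a free API with automatic language detection and no local downloads required. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 213 total downloads to date.

How do I install Video To Text?

Run the command "/install video-transcribe-pro" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Video To Text free?

Yes. Video To Text is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Video To Text support?

Video To Text is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Video To Text?

Developed and maintained by SxLiuYu (@sxliuyu); the current version is v1.0.0.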