theplasmak

Faster Whisper

Author: Sarah Mak · GitHub · v1.5.1
cross-platform ✓ Security check passed
Total downloads: 7592
Favorites: 5
Current installs: 44
Versions: 20
Install in OpenClaw
/install faster-whisper
Description
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...
Safe usage advice
Install only if you are comfortable with a local ML tool that installs Python dependencies and may download models. Use URL/RSS transcription only for media you intend to fetch, avoid pasting Hugging Face tokens into shared logs, choose output paths carefully because files can be overwritten, and avoid opening HTML transcript reports generated from untrusted audio or filenames until the HTML escaping issue is fixed.
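The HTML escaping concern above can be illustrated with Python's standard `html.escape`. This is a minimal sketch under stated assumptions, not the skill's actual report code: `render_transcript_row` is a hypothetical helper, and the table-row layout is invented for illustration.

```python
import html


def render_transcript_row(filename: str, text: str) -> str:
    """Hypothetical helper: build one HTML table row from untrusted strings."""
    # html.escape turns &, < and > (and quotes, with quote=True) into entities,
    # so markup inside a filename or transcript is displayed as text.
    safe_name = html.escape(filename, quote=True)
    safe_text = html.escape(text, quote=True)
    return f"<tr><td>{safe_name}</td><td>{safe_text}</td></tr>"


row = render_transcript_row("<script>alert(1)</script>.mp3", "Tom & Jerry")
print(row)
```

Escaping at the point where strings enter the markup keeps the rest of a report template unchanged.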
Capability analysis
Type: OpenClaw Skill
Name: faster-whisper
Version: 1.5.1
This skill is classified as suspicious due to its broad capabilities, which include downloading content from arbitrary URLs via `yt-dlp`, executing `ffmpeg` for audio/video processing and subtitle burning, and performing self-updates of its core dependency. While these actions are plausibly aligned with the stated purpose of a comprehensive transcription tool, they grant significant access to the network and local file system. The `SKILL.md` agent guidance does not contain any malicious prompt injection attempts, and the `setup.sh` and `scripts/transcribe.py` files implement these powerful features using `subprocess.run()` with argument lists, which mitigates direct shell injection vulnerabilities. However, the inherent power of these operations, even when used for legitimate purposes, elevates the risk profile beyond a purely benign classification.
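The point about `subprocess.run()` with argument lists can be shown with a small, self-contained sketch. It does not reproduce the skill's own code: `echo_argument` is a hypothetical helper, and the child process is simply the current Python interpreter printing back the argument it received.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass an untrusted string to a child process as a single argv entry."""
    # With an argument list and no shell=True, no shell parses the string,
    # so characters such as ';' or '$(...)' are not treated as commands.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", value]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\r\n")


untrusted = "clip.mp3; echo injected $(whoami)"
print(echo_argument(untrusted))
```

The child receives the whole string as one argument. This guards against shell injection only; the invoked program can still act on whatever URL or path it is given.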
Capability assessment
Purpose & Capability
The advertised transcription, subtitles, diarization, URL/RSS download, batch processing, and export features match the included setup script and Python CLI.
Instruction Scope
The agent guidance generally tells agents to add higher-impact flags only when the user asks, though the trigger list is broad and HTML output is not documented as needing sanitization caution.
Install Mechanism
Setup creates a local virtual environment and installs Python ML packages, and update flags can upgrade faster-whisper in that environment; this is disclosed and user-invoked, not hidden or automatic.
Credentials
Network access through yt-dlp/RSS, ffmpeg processing, local file reads/writes, model cache use, and optional Hugging Face token access are proportionate to transcription and diarization.
Persistence & Privilege
Persistence is limited to the skill virtual environment, model/dependency caches, temporary downloads, and user-specified output files; no background service, privilege escalation, or unrelated persistence was found.
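How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install faster-whisper
  3. Once installation finishes, invoke the Skill by name or trigger it with /faster-whisper.
  4. Provide the required inputs described in the Skill's parameter documentation to get structured output.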
Version history
v1.5.1
- Fixed --skip-existing in multi-format mode to check ALL format outputs before skipping
- Fixed --no-timestamps conflict check missing lrc, ass, ttml formats
- Fixed --speaker-names silently doing nothing without --diarize; now prints a warning
- Batch summary now shows skipped file count when --skip-existing is active
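The --skip-existing fix in v1.5.1 amounts to skipping a file only when every requested output already exists. The sketch below illustrates that rule and is not the skill's implementation: `should_skip` is a hypothetical function, and the `<stem>.<format>` naming scheme is an assumption.

```python
from pathlib import Path
import tempfile


def should_skip(audio: Path, out_dir: Path, formats: list[str]) -> bool:
    """Hypothetical check: skip only when every requested output exists."""
    # Assumed naming scheme: <audio stem>.<format> inside out_dir.
    return all((out_dir / f"{audio.stem}.{fmt}").exists() for fmt in formats)


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    # Only the SRT output exists, so the file must still be processed.
    (out_dir / "talk.srt").write_text("", encoding="utf-8")
    partial = should_skip(Path("talk.mp3"), out_dir, ["srt", "vtt"])
    # Once the VTT output exists as well, the file can be skipped.
    (out_dir / "talk.vtt").write_text("", encoding="utf-8")
    complete = should_skip(Path("talk.mp3"), out_dir, ["srt", "vtt"])

print(partial, complete)
```

Using `all` rather than `any` is the key choice here: a partially finished multi-format run is then reprocessed instead of silently skipped.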
v1.5.0
- docs: update default model from distil-large-v3 to distil-large-v3.5
- fix: setup.sh --check hangfix + skill.json ffmpeg optional
- fix(transcribe): clean-filler word list, fuzzy search tokens, URL temp cleanup
- fix(multi-format): create output dir in single-file mode
- feat: add CSV output, language-map, batch ETA estimate
- feat: add TTML output, transcript search, chapter detection, speaker audio export
- feat: distil auto-condition, log-level, ffmpeg clarification
- fix: rename --without-timestamps to --no-timestamps
- feat: add 17 new features: upstream params + LRC/detect-language/merge-sentences/stats/stdin/template
v1.4.5
- Fix author field to match GitHub username (ThePlasmak)
v1.4.4
- Declare yt-dlp and HuggingFace token as optional dependencies in skill.json
- Sync SKILL.md frontmatter version and author with skill.json
v1.4.3
- Auto-run wav2vec2 alignment whenever word timestamps are computed
- Remove --precise flag (alignment is automatic, flag kept as hidden compat alias)
- Alignment triggers for --word-timestamps, --diarize, --min-confidence
- No overhead for basic transcription (fast path unchanged)
v1.4.1
- Add --precise flag for wav2vec2 forced alignment (~10ms word accuracy)
- Uses torchaudio MMS model (multilingual, cached for batch processing)
- Runs before diarization when combined (improves speaker assignment)
- Install torchaudio alongside torch in setup.sh
v1.3.0
- Add SRT and VTT subtitle output formats (--format srt/vtt)
- Add speaker diarization via pyannote.audio (--diarize) with word-level accuracy
- Add URL/YouTube input with auto yt-dlp download
- Add batch processing with glob patterns, directories, and --skip-existing
- Add initial prompt support for domain terminology (--initial-prompt)
- Add confidence-based segment filtering (--min-confidence)
- Add performance stats after each transcription (duration, realtime factor)
- Unify output under --format flag (text/json/srt/vtt), keep --json for backward compat
- Add agent guidance for minimal invocation (don't load unused features)
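For readers unfamiliar with the SRT format added in v1.3.0, the sketch below shows how a segment's start and end times (in seconds) map to an SRT cue. It is illustrative only: `srt_timestamp` and `srt_cue` are hypothetical names, not functions from the skill.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    # SRT uses a comma before the milliseconds (WebVTT uses a period).
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one cue: a 1-based index, a time range line, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


print(srt_cue(1, 0.0, 2.5, "Hello world"))
```

In a full SRT file, consecutive cues are separated by a blank line.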
v1.2.0
- Default model changed to distil-large-v3.5 (lower WER: 7.08 vs 7.53, same speed as v3)
- Trained on 4x more data (98k hours) with improved robustness
v1.1.0
- Use BatchedInferencePipeline by default (~3x faster; 69s → 23s on 21-min file with distil-large-v3)
- VAD enabled by default in batched mode
- Add --batch-size option (default: 8; reduce if OOM)
- Add --no-batch flag to fall back to standard WhisperModel
- Add --hotwords support for boosting recognition of specific terms
- Bump tested version: faster-whisper 1.2.1
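The v1.1.0 figures can be checked with simple arithmetic. This sketch assumes that "realtime factor" means audio duration divided by processing time, and that the "21-min file" is exactly 21 minutes long; neither is stated in the changelog.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Assumed definition: seconds of audio handled per second of compute."""
    return audio_seconds / processing_seconds


audio_seconds = 21 * 60  # the 21-minute file, treated as exactly 1260 s

standard = realtime_factor(audio_seconds, 69)  # standard run took 69 s
batched = realtime_factor(audio_seconds, 23)   # batched run took 23 s
speedup = 69 / 23                              # ratio of the two run times

print(f"standard: {standard:.1f}x, batched: {batched:.1f}x, speedup: {speedup:.1f}x")
```

Under those assumptions, the batched run works out to roughly 55x realtime against roughly 18x for the standard run, in line with the "~3x faster" claim.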
v1.0.12
- Fix skill title display on ClawdHub
v1.0.11
- Prefer distil-large-v3 over large-v3-turbo as the recommended model
v1.0.9
- docs: rebrand from Moltbot/MoltHub to OpenClaw/ClawHub
v1.0.7
- Removed Windows-native references from SKILL.md (setup.ps1, transcribe.cmd, winget) since ClawHub cannot distribute .ps1/.cmd files
- Windows users should use WSL2 or get Windows scripts from the GitHub repo directly
v1.0.6
- Added .clawdhubignore to exclude README.md, CHANGELOG.md, LICENSE from published package
- Fixed requires.bins in skill.json (python3, ffmpeg)
- Added platforms field to skill.json
- Updated metadata key from moltbot to openclaw in SKILL.md
v1.0.5
Fix metadata: add requires.bins to skill.json, add platforms, update moltbot to openclaw in SKILL.md
v1.0.4
- Fixed skill title and metadata
- Removed development files from published package
v1.0.3
Fix skill title (was 'Faster Whisper Clean' due to temp file naming)
v1.0.2
- Improve skill discovery, error handling, some copyediting
- Add skill.json
- Edit README to reduce confusion as Moltbot may refer to the service
v1.0.1
Remove install metadata (as ClawdHub's install section is confusing); add python3 to required binaries
v1.0.0
Initial public release of faster-whisper.
- Local speech-to-text using faster-whisper (CTranslate2 backend), ~4-6x faster than OpenAI Whisper, with identical accuracy.
- Supports GPU acceleration for ~20x realtime transcription; automatic hardware detection and setup for Windows.
- Offers both standard and distilled models, with selectable accuracy/speed tradeoffs and word-level timestamps.
- Cross-platform: Windows (including WSL2), Linux, and macOS (Apple Silicon supported).
- Setup scripts provided for all platforms, including automatic installation of dependencies and GPU support where possible.
- Includes extensive usage documentation, quick-start commands, model selection guide, and troubleshooting tips.
Metadata
Slug faster-whisper
Version 1.5.1
License
Cumulative installs 254
Current installs 44
Historical versions 20
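FAQ

What is Faster Whisper?

Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 7592 total downloads to date.

How do I install Faster Whisper?

Run the command "/install faster-whisper" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is Faster Whisper free?

Yes. Faster Whisper is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Faster Whisper support?

Faster Whisper is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Faster Whisper?

It is developed and maintained by Sarah Mak (@theplasmak); the current version is v1.5.1.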
