Downloads: 7592
Stars: 5
Active Installs: 44
Versions: 20
Install in OpenClaw
/install faster-whisper
Description
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...
Usage Guidance
Install only if you are comfortable with a local ML tool that installs Python dependencies and may download models. Use URL/RSS transcription only for media you intend to fetch, avoid pasting Hugging Face tokens into shared logs, choose output paths carefully because files can be overwritten, and avoid opening HTML transcript reports generated from untrusted audio or filenames until the HTML escaping issue is fixed.
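Until the escaping issue is fixed, untrusted strings can be escaped before they reach a report. The following is a minimal sketch using Python's standard `html.escape`; the helper name `render_segment_row` and the table-row layout are hypothetical and are not taken from the skill's code.

```python
import html


def render_segment_row(filename: str, text: str) -> str:
    """Return one HTML table row with both fields escaped.

    Hypothetical helper for illustration only; the skill's real report
    generator may be structured differently.
    """
    # html.escape replaces &, < and >, and by default also " and '.
    safe_name = html.escape(filename)
    safe_text = html.escape(text)
    return f"<tr><td>{safe_name}</td><td>{safe_text}</td></tr>"


# A hostile filename and a transcript line containing markup characters.
row = render_segment_row("<img src=x onerror=alert(1)>.mp3", 'Tom & Jerry said "hi"')
print(row)
```

With escaping in place, the hostile filename is rendered as literal text instead of being parsed as an element.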
Capability Analysis
Type: OpenClaw Skill
Name: faster-whisper
Version: 1.5.1
This skill is classified as suspicious due to its broad capabilities, which include downloading content from arbitrary URLs via `yt-dlp`, executing `ffmpeg` for audio/video processing and subtitle burning, and performing self-updates of its core dependency. While these actions are plausibly aligned with the stated purpose of a comprehensive transcription tool, they grant significant access to the network and local file system. The `SKILL.md` agent guidance does not contain any malicious prompt injection attempts, and the `setup.sh` and `scripts/transcribe.py` files implement these powerful features using `subprocess.run()` with argument lists, which mitigates direct shell injection vulnerabilities. However, the inherent power of these operations, even when used for legitimate purposes, elevates the risk profile beyond a purely benign classification.
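The difference between an argument list and a shell string can be seen in a small sketch. It does not reproduce the skill's code: `run_tool` is a hypothetical wrapper, and the current Python interpreter stands in for `ffmpeg` or `yt-dlp` so that the example runs anywhere.

```python
import subprocess
import sys


def run_tool(args: list[str]) -> subprocess.CompletedProcess:
    """Run a command given as an argument list (hypothetical wrapper)."""
    # With a list and the default shell=False, each element reaches the
    # program as one literal argument; no shell parses metacharacters.
    return subprocess.run(args, capture_output=True, text=True, check=True)


# A filename that would be dangerous if pasted into a shell command string.
hostile_name = "clip.mp3; echo INJECTED"

# The child process just prints the arguments it received.
result = run_tool(
    [sys.executable, "-c", "import sys; print(sys.argv[1:])", hostile_name]
)
print(result.stdout.strip())
```

The child receives the whole string as a single argument, so the `; echo INJECTED` part is never executed. Argument lists do not stop a value that begins with `-` from being read as an option, so input paths still deserve validation.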
Capability Assessment
Purpose & Capability
The advertised transcription, subtitles, diarization, URL/RSS download, batch processing, and export features match the included setup script and Python CLI.
Instruction Scope
The agent guidance generally tells agents to add higher-impact flags only when the user asks, though the trigger list is broad and HTML output is not documented as needing sanitization caution.
Install Mechanism
Setup creates a local virtual environment and installs Python ML packages, and update flags can upgrade faster-whisper in that environment; this is disclosed and user-invoked, not hidden or automatic.
Credentials
Network access through yt-dlp/RSS, ffmpeg processing, local file reads/writes, model cache use, and optional Hugging Face token access are proportionate to transcription and diarization.
Persistence & Privilege
Persistence is limited to the skill virtual environment, model/dependency caches, temporary downloads, and user-specified output files; no background service, privilege escalation, or unrelated persistence was found.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install faster-whisper
- After installation, invoke the skill by name or use /faster-whisper
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.5.1
- Fixed --skip-existing in multi-format mode to check ALL format outputs before skipping
- Fixed --no-timestamps conflict check missing lrc, ass, ttml formats
- Fixed --speaker-names silently doing nothing without --diarize; now prints a warning
- Batch summary now shows skipped file count when --skip-existing is active
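The `--skip-existing` fix in multi-format mode amounts to requiring every requested output to exist before a file is skipped. A minimal sketch of that check follows; the function name `should_skip` and the `<stem>.<format>` output naming are assumptions made for illustration and may not match the skill's actual file layout.

```python
from pathlib import Path
import tempfile


def should_skip(audio: Path, out_dir: Path, formats: list[str]) -> bool:
    """Skip only when every requested format already has an output file.

    Assumes outputs are named '<audio stem>.<format>' (hypothetical).
    """
    return all((out_dir / f"{audio.stem}.{fmt}").exists() for fmt in formats)


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    # Only the SRT exists, so the file must still be processed.
    (out_dir / "talk.srt").write_text("")
    partial = should_skip(Path("talk.mp3"), out_dir, ["srt", "vtt"])
    # Once the VTT exists as well, the file can be skipped.
    (out_dir / "talk.vtt").write_text("")
    complete = should_skip(Path("talk.mp3"), out_dir, ["srt", "vtt"])

print(partial, complete)
```

Checking only the first format would skip `talk.mp3` in the partial case and leave the VTT output missing.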
v1.5.0
- docs: update default model from distil-large-v3 to distil-large-v3.5
- fix: setup.sh --check hangfix + skill.json ffmpeg optional
- fix(transcribe): clean-filler word list, fuzzy search tokens, URL temp cleanup
- fix(multi-format): create output dir in single-file mode
- feat: add CSV output, language-map, batch ETA estimate
- feat: add TTML output, transcript search, chapter detection, speaker audio export
- feat: distil auto-condition, log-level, ffmpeg clarification
- fix: rename --without-timestamps to --no-timestamps
- feat: add 17 new features — upstream params + LRC/detect-language/merge-sentences/stats/stdin/template
v1.4.5
- Fix author field to match GitHub username (ThePlasmak)
v1.4.4
- Declare yt-dlp and HuggingFace token as optional dependencies in skill.json
- Sync SKILL.md frontmatter version and author with skill.json
v1.4.3
- Auto-run wav2vec2 alignment whenever word timestamps are computed
- Remove --precise flag (alignment is automatic, flag kept as hidden compat alias)
- Alignment triggers for --word-timestamps, --diarize, --min-confidence
- No overhead for basic transcription (fast path unchanged)
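The v1.4.3 trigger rule can be written as a single predicate. This is a sketch under stated assumptions: the `Options` dataclass and `needs_alignment` are hypothetical names that mirror the three flags listed above, not the skill's internal structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Options:
    """Hypothetical subset of CLI options relevant to alignment."""
    word_timestamps: bool = False            # --word-timestamps
    diarize: bool = False                    # --diarize
    min_confidence: Optional[float] = None   # --min-confidence


def needs_alignment(opts: Options) -> bool:
    """Return True when any listed flag is set.

    Plain transcription sets none of them, so it stays on the fast path.
    """
    return opts.word_timestamps or opts.diarize or opts.min_confidence is not None


print(needs_alignment(Options()))
print(needs_alignment(Options(diarize=True)))
print(needs_alignment(Options(min_confidence=0.5)))
```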
v1.4.1
- Add --precise flag for wav2vec2 forced alignment (~10ms word accuracy)
- Uses torchaudio MMS model (multilingual, cached for batch processing)
- Runs before diarization when combined (improves speaker assignment)
- Install torchaudio alongside torch in setup.sh
v1.3.0
- Add SRT and VTT subtitle output formats (--format srt/vtt)
- Add speaker diarization via pyannote.audio (--diarize) with word-level accuracy
- Add URL/YouTube input with auto yt-dlp download
- Add batch processing with glob patterns, directories, and --skip-existing
- Add initial prompt support for domain terminology (--initial-prompt)
- Add confidence-based segment filtering (--min-confidence)
- Add performance stats after each transcription (duration, realtime factor)
- Unify output under --format flag (text/json/srt/vtt), keep --json for backward compat
- Add agent guidance for minimal invocation (don't load unused features)
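For readers unfamiliar with the SRT format added in this release, the sketch below shows how timed segments map to SRT cues: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, and a blank separator. The helper names and the `(start, end, text)` tuple shape are assumptions for illustration and are not the skill's implementation.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(cues)


print(to_srt([(0.0, 2.5, " Hello."), (2.5, 4.0, "World.")]))
```

VTT cues are similar but use a `.` before the milliseconds and start with a `WEBVTT` header line.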
v1.2.0
- Default model changed to distil-large-v3.5 (lower WER: 7.08 vs 7.53, same speed as v3)
- Trained on 4x more data (98k hours) with improved robustness
v1.1.0
- Use BatchedInferencePipeline by default (~3x faster; 69s → 23s on 21-min file with distil-large-v3)
- VAD enabled by default in batched mode
- Add --batch-size option (default: 8; reduce if OOM)
- Add --no-batch flag to fall back to standard WhisperModel
- Add --hotwords support for boosting recognition of specific terms
- Bump tested version: faster-whisper 1.2.1
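The batched mode described in v1.1.0 corresponds to wrapping a `WhisperModel` in the faster-whisper library's `BatchedInferencePipeline`. The sketch below is not the skill's code: `transcribe_batched`, `speedup`, and `realtime_factor` are hypothetical helpers, the realtime factor is assumed to mean audio duration divided by processing time, and the 21-minute duration is the rounded figure from the changelog.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Ratio of processing times before and after a change."""
    return before_s / after_s


def realtime_factor(audio_s: float, processing_s: float) -> float:
    """Seconds of audio handled per second of processing (assumed definition)."""
    return audio_s / processing_s


def transcribe_batched(path: str, model_name: str = "distil-large-v3", batch_size: int = 8):
    """Transcribe one file with batched inference; reduce batch_size on OOM."""
    # Imported lazily so the arithmetic helpers work without the package.
    from faster_whisper import BatchedInferencePipeline, WhisperModel

    model = WhisperModel(model_name, device="auto")
    pipeline = BatchedInferencePipeline(model=model)
    segments, info = pipeline.transcribe(path, batch_size=batch_size)
    # segments is a lazy iterable; consuming it runs the transcription.
    return [(seg.start, seg.end, seg.text) for seg in segments]


# Changelog figures: 69 s -> 23 s on a roughly 21-minute file.
print(round(speedup(69, 23), 1))               # 3.0
print(round(realtime_factor(21 * 60, 23), 1))  # 54.8
```

The arithmetic matches the "~3x faster" claim above. Calling `transcribe_batched` requires the `faster-whisper` package and downloads the model on first use.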
v1.0.12
- Fix skill title display on ClawdHub
v1.0.11
- Prefer distil-large-v3 over large-v3-turbo as the recommended model
v1.0.9
- docs: rebrand from Moltbot/MoltHub to OpenClaw/ClawHub
v1.0.7
- Removed Windows-native references from SKILL.md (setup.ps1, transcribe.cmd, winget) since ClawHub cannot distribute .ps1/.cmd files
- Windows users should use WSL2 or get Windows scripts from the GitHub repo directly
v1.0.6
- Added .clawdhubignore to exclude README.md, CHANGELOG.md, LICENSE from published package
- Fixed requires.bins in skill.json (python3, ffmpeg)
- Added platforms field to skill.json
- Updated metadata key from moltbot to openclaw in SKILL.md
v1.0.5
- Fix metadata: add requires.bins to skill.json, add platforms, update moltbot to openclaw in SKILL.md
v1.0.4
- Fixed skill title and metadata
- Removed development files from published package
v1.0.3
- Fix skill title (was 'Faster Whisper Clean' due to temp file naming)
v1.0.2
- Improve skill discovery, error handling, some copyediting
- Add skill.json
- Edit README to reduce confusion as Moltbot may refer to the service
v1.0.1
- Remove install metadata (as ClawdHub's install section is confusing); add python3 to required binaries
v1.0.0
Initial public release of faster-whisper.
- Local speech-to-text using faster-whisper (CTranslate2 backend), ~4-6x faster than OpenAI Whisper, with identical accuracy.
- Supports GPU acceleration for ~20x realtime transcription; automatic hardware detection and setup for Windows.
- Offers both standard and distilled models, with selectable accuracy/speed tradeoffs and word-level timestamps.
- Cross-platform: Windows (including WSL2), Linux, and macOS (Apple Silicon supported).
- Setup scripts provided for all platforms, including automatic installation of dependencies and GPU support where possible.
- Includes extensive usage documentation, quick-start commands, model selection guide, and troubleshooting tips.
Frequently Asked Questions
What is Faster Whisper?
Faster Whisper is an AI Agent Skill for Claude Code / OpenClaw that performs local speech-to-text using faster-whisper. It is 4-6x faster than OpenAI Whisper with identical accuracy, and GPU acceleration enables ~20x realtime transcription. It has 7592 downloads so far.
How do I install Faster Whisper?
Run "/install faster-whisper" in the OpenClaw or Claude Code chat to install the skill. Its setup script, which you run yourself, then creates a local virtual environment and installs the Python ML packages.
Is Faster Whisper free?
Yes, Faster Whisper is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Faster Whisper support?
Faster Whisper is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Faster Whisper?
It is built and maintained by Sarah Mak (@theplasmak); the current version is v1.5.1.