/install faster-whisper-local-service
Faster Whisper Local Service
Provision a local STT backend used by voice skills.
What this sets up
- Python venv for faster-whisper
- transcribe-server.py HTTP endpoint at http://127.0.0.1:18790/transcribe
- systemd user service: openclaw-transcribe.service
Important: Model download on first run
On first startup, faster-whisper downloads model weights from Hugging Face (~1.5 GB for medium). This requires internet access and disk space. After the initial download, models are cached locally and the service runs fully offline.
| Model | Download size | RAM usage |
|---|---|---|
| tiny | ~75 MB | ~400 MB |
| base | ~150 MB | ~500 MB |
| small | ~500 MB | ~800 MB |
| medium | ~1.5 GB | ~1.4 GB |
| large-v3 | ~3.0 GB | ~3.5 GB |
To pre-download models for use in an air-gapped environment, see the faster-whisper docs.
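As a rough aid for choosing a value for WHISPER_MODEL_SIZE, the table above can be turned into a small lookup. This is only a sketch: the numbers are the approximate figures from the table, and the helper name `largest_model_that_fits` is hypothetical, not part of the skill.

```python
from typing import Optional

# (name, approx. download MB, approx. RAM MB), copied from the table above.
MODELS = [
    ("tiny", 75, 400),
    ("base", 150, 500),
    ("small", 500, 800),
    ("medium", 1500, 1400),
    ("large-v3", 3000, 3500),
]


def largest_model_that_fits(ram_budget_mb: int) -> Optional[str]:
    """Return the largest model whose approximate RAM usage fits the budget."""
    best = None
    for name, _download_mb, ram_mb in MODELS:  # ordered smallest to largest
        if ram_mb <= ram_budget_mb:
            best = name
    return best


print(largest_model_that_fits(1000))  # small
```

Real memory use also depends on compute type and audio length, so treat the result as a starting point rather than a guarantee.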
Security notes
Network isolation
- Binds to 127.0.0.1 only; not reachable from the network.
- CORS restricted to a single origin (https://127.0.0.1:8443 by default).
- No credentials, API keys, or secrets are used or stored.
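A single-origin CORS policy like the one listed above can be sketched as follows. The function name `cors_headers` and the exact header set are assumptions for illustration; the shipped transcribe-server.py may implement the restriction differently.

```python
from typing import Dict, Optional

BIND_ADDRESS = ("127.0.0.1", 18790)  # loopback only, as described above
DEFAULT_ALLOWED_ORIGIN = "https://127.0.0.1:8443"  # default named above


def cors_headers(
    request_origin: Optional[str],
    allowed_origin: str = DEFAULT_ALLOWED_ORIGIN,
) -> Dict[str, str]:
    """Return CORS response headers only for the single allowed origin."""
    if request_origin != allowed_origin:
        # No CORS headers: the browser will block cross-origin reads.
        return {}
    return {
        "Access-Control-Allow-Origin": allowed_origin,
        "Vary": "Origin",
    }
```

Comparing the Origin header by exact string match avoids the prefix and substring pitfalls that looser matching can introduce.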
Input validation
- Upload size limit: Requests exceeding the configured limit are rejected before processing (HTTP 413). Default: 50 MB, configurable via MAX_UPLOAD_MB.
- Magic-byte check: Only files with recognized audio signatures (WAV, OGG, FLAC, MP3, WebM, M4A) are accepted. Unrecognized formats are rejected (HTTP 415) before reaching GStreamer.
- Subprocess safety: All arguments to gst-launch-1.0 are passed as a list, so no shell expansion or injection is possible.
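The size limit and magic-byte check described above might look roughly like this. It is a minimal sketch, not the server's actual code: the function names are hypothetical, the signatures are the commonly documented ones for each container, and 50 MB is interpreted here as 50 MiB, which is an assumption.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # default 50 MB; MiB interpretation is assumed


def sniff_audio_format(data: bytes) -> Optional[str]:
    """Guess the container from leading bytes; None if unrecognized."""
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"fLaC":
        return "flac"
    if data[:4] == b"\x1a\x45\xdf\xa3":  # EBML header (WebM/Matroska)
        return "webm"
    if len(data) >= 8 and data[4:8] == b"ftyp":  # ISO base media (M4A/MP4)
        return "m4a"
    if data[:3] == b"ID3":  # MP3 with an ID3v2 tag
        return "mp3"
    # MPEG audio frame sync: 11 set bits at the start of a frame.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return None


def validate_upload(data: bytes, max_bytes: int = MAX_UPLOAD_BYTES) -> int:
    """Return the HTTP status this sketch would answer with."""
    if len(data) > max_bytes:
        return 413  # payload too large
    if sniff_audio_format(data) is None:
        return 415  # unsupported media type
    return 200
```

A real server would normally check the Content-Length header first so an oversized body is refused before it is read; the sketch checks the buffered length only to stay self-contained.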
GStreamer dependency
The service uses GStreamer's decodebin for audio format conversion. Like any media library, GStreamer's parsers process binary data and should be kept up to date. Mitigation: install gst-launch-1.0 from your OS vendor's trusted packages and apply security updates regularly. The magic-byte pre-filter above reduces the attack surface by rejecting non-audio payloads before they reach GStreamer.
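To illustrate the list-based invocation of gst-launch-1.0 around decodebin, here is a hedged sketch. The pipeline (resampling to 16 kHz mono WAV) and the function names are assumptions; the source only states that decodebin is used and that arguments are passed as a list.

```python
import subprocess
from typing import List


def build_gst_command(src_path: str, dst_path: str) -> List[str]:
    """Build an argv list; each pipeline token is its own argument."""
    return [
        "gst-launch-1.0", "-q",
        "filesrc", f"location={src_path}",
        "!", "decodebin",
        "!", "audioconvert",
        "!", "audioresample",
        "!", "audio/x-raw,rate=16000,channels=1",  # assumed target format
        "!", "wavenc",
        "!", "filesink", f"location={dst_path}",
    ]


def convert_to_wav(src_path: str, dst_path: str, timeout_s: float = 60.0) -> None:
    """Run the conversion without a shell; raises on failure or timeout."""
    subprocess.run(
        build_gst_command(src_path, dst_path),
        check=True,
        timeout=timeout_s,
    )
```

Because `subprocess.run` receives a list and `shell=True` is not set, characters such as `;` or `$(...)` in a path are never interpreted by a shell.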
No data exfiltration
- No outbound network calls (after initial model download).
- No telemetry, analytics, or phone-home behavior.
- Temporary files are created in a per-request TemporaryDirectory and cleaned up immediately.
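The per-request temporary directory pattern mentioned above can be sketched with Python's `tempfile.TemporaryDirectory`. The function `handle_upload`, the file name, and the `process` callback are hypothetical stand-ins for whatever the server does with the upload.

```python
import tempfile
from pathlib import Path
from typing import Callable


def handle_upload(data: bytes, process: Callable[[Path], str]) -> str:
    """Write the upload to a fresh temp dir, process it, then delete everything."""
    with tempfile.TemporaryDirectory(prefix="transcribe-") as tmp:
        src = Path(tmp) / "upload.bin"
        src.write_bytes(data)
        # The directory and its contents are removed when the block exits,
        # including when process() raises.
        return process(src)
```

Using the context manager ties cleanup to the request's lifetime, so no audio is left on disk after a response is produced.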
Reproducibility defaults
- Pinned package: faster-whisper==1.1.1 (override via env)
- Explicit dependency check for gst-launch-1.0
- CORS restricted to one origin by default
- Configurable workspace/service paths (no hardcoded user path)
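An explicit dependency check of the kind listed above could be expressed in Python as below. This is an illustration only: the deploy script is a bash script and presumably performs its own check, and `require_binary` is a hypothetical name.

```python
import shutil


def require_binary(name: str = "gst-launch-1.0") -> str:
    """Return the resolved path of a required executable or fail loudly."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"{name} not found on PATH; install it from your OS packages"
        )
    return path
```

Failing early with a clear message is preferable to a confusing error on the first transcription request.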
Deploy
bash scripts/deploy.sh
With custom settings:
WORKSPACE=~/.openclaw/workspace \
TRANSCRIBE_PORT=18790 \
WHISPER_MODEL_SIZE=medium \
WHISPER_LANGUAGE=auto \
TRANSCRIBE_ALLOWED_ORIGIN=https://10.0.0.42:8443 \
bash scripts/deploy.sh
Language setting
Default: auto (auto-detect language). Set WHISPER_LANGUAGE=de for German only, WHISPER_LANGUAGE=en for English only, and so on. A fixed language is faster and more accurate if you only use one language.
Idempotent: safe to run repeatedly.
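The environment variables used above could be read on the server side roughly as follows. This is a sketch under assumptions: the `Settings` class and `load_settings` are hypothetical, the default model size of medium is inferred from the download note, and mapping auto to `None` relies on faster-whisper treating a missing language as a request for auto-detection.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    port: int
    model_size: str
    language: Optional[str]  # None means auto-detect
    allowed_origin: str
    max_upload_mb: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, falling back to documented defaults."""
    lang = env.get("WHISPER_LANGUAGE", "auto").strip().lower()
    return Settings(
        port=int(env.get("TRANSCRIBE_PORT", "18790")),
        model_size=env.get("WHISPER_MODEL_SIZE", "medium"),  # assumed default
        language=None if lang == "auto" else lang,
        allowed_origin=env.get(
            "TRANSCRIBE_ALLOWED_ORIGIN", "https://127.0.0.1:8443"
        ),
        max_upload_mb=int(env.get("MAX_UPLOAD_MB", "50")),
    )
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dictionary, for example `load_settings({"WHISPER_LANGUAGE": "de"})`.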
What this skill modifies
| What | Path | Action |
|---|---|---|
| Python venv | $WORKSPACE/.venv-faster-whisper/ | Creates venv, installs faster-whisper via pip |
| Transcribe server | $WORKSPACE/voice-input/transcribe-server.py | Writes server script |
| Systemd service | ~/.config/systemd/user/openclaw-transcribe.service | Creates + enables persistent service |
| Model cache | ~/.cache/huggingface/ | Downloads model weights on first run |
Uninstall
systemctl --user stop openclaw-transcribe.service
systemctl --user disable openclaw-transcribe.service
rm -f ~/.config/systemd/user/openclaw-transcribe.service
systemctl --user daemon-reload
Optional full cleanup:
rm -rf ~/.openclaw/workspace/.venv-faster-whisper
rm -f ~/.openclaw/workspace/voice-input/transcribe-server.py
Verify
bash scripts/status.sh
Expected:
- service active
- endpoint responds (HTTP 200/500 acceptable for invalid sample payload)
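The endpoint half of this check can also be approximated from Python. The sketch below is not what scripts/status.sh does (its contents are not shown here); `probe` is a hypothetical helper, and sending a raw request body is an assumption about the upload format.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe(
    url: str = "http://127.0.0.1:18790/transcribe",
    payload: bytes = b"not-audio",
    timeout: float = 2.0,
) -> Optional[int]:
    """POST a dummy payload; return the HTTP status, or None if unreachable."""
    # Ignore proxy environment variables: the target is loopback.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    req = urllib.request.Request(url, data=payload, method="POST")
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, just not with 2xx
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, etc.
```

Any integer result means the service is listening; `None` means nothing answered on that address.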
Notes
- This skill provides backend transcription only.
- Pair with webchat-voice-proxy for browser mic + HTTPS/WSS integration.
- For one-step install, use webchat-voice-full-stack (deploys backend + proxy in order).
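- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install faster-whisper-local-service
- Once installed, invoke the Skill by name or trigger it with /faster-whisper-local-service
- Provide the required inputs described in the Skill's parameter documentation to get structured output.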
What is Faster Whisper Local Service?
OpenClaw local speech-to-text backend using faster-whisper over HTTP on 127.0.0.1:18790. Use when you want voice transcription without external APIs, without... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1397 downloads to date.
How do I install Faster Whisper Local Service?
Run the command "/install faster-whisper-local-service" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.
Is Faster Whisper Local Service free?
Yes. Faster Whisper Local Service is completely free (free and open source) and can be freely downloaded, installed, and used.
Which platforms does Faster Whisper Local Service support?
Faster Whisper Local Service is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Faster Whisper Local Service?
It is developed and maintained by neldar (@neldar); the current version is v0.2.0.