sfkiwi

Local Vosk STT

Author: Mike Sutherland · GitHub ↗ · v1.0.1
cross-platform ⚠ suspicious
Total downloads: 1073
Favorites: 0
Current installs: 2
Versions: 2
Install in OpenClaw
/install local-vosk
Description
Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs.
Usage (SKILL.md)

Local Vosk STT

Lightweight local speech-to-text using Vosk. Fully offline after model download.

Use Cases

  • Telegram voice messages — transcribe .ogg voice notes automatically
  • Audio files — any format ffmpeg supports
  • Offline transcription — no API keys, no cloud, no costs

Quick Start

# Transcribe Telegram voice message
./skills/local-vosk/scripts/transcribe voice_message.ogg

# Transcribe any audio
./skills/local-vosk/scripts/transcribe audio.mp3

# With language (default: en-us)
./skills/local-vosk/scripts/transcribe audio.wav --lang en-us
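The transcribe script itself is not shown on this page, so the following is only a minimal sketch of what such a script might do once the audio is already a mono 16-bit PCM WAV file. The function names `join_segments` and `transcribe_wav` are hypothetical; `Model`, `KaldiRecognizer`, `AcceptWaveform`, `Result`, and `FinalResult` come from the Vosk Python package, which is imported lazily so the helper can be read and tested without it.

```python
import json
import wave


def join_segments(results):
    """Join the "text" field of each Vosk JSON result into one transcript."""
    texts = (json.loads(r).get("text", "") for r in results)
    return " ".join(t for t in texts if t)


def transcribe_wav(wav_path, model_dir):
    """Transcribe a mono 16-bit PCM WAV file (hypothetical helper)."""
    # Imported here so the module loads even when vosk is not installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        results = []
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True once an utterance is complete.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        # Flush whatever audio is still buffered in the recognizer.
        results.append(rec.FinalResult())
    return join_segments(results)
```

With the default model unpacked as described under Setup, a call might look like `transcribe_wav("audio.wav", "/home/me/vosk-models/vosk-model-small-en-us-0.15")`, where the path is an example only.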

Supported Formats

Any format ffmpeg can decode: ogg (Telegram), mp3, wav, m4a, webm, flac, etc.
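Decoding through ffmpeg could be sketched as below. This assumes the script converts every input to mono 16-bit PCM WAV before recognition and that 16 kHz is a suitable sample rate for the small model; neither detail is confirmed by this page, and `ffmpeg_decode_cmd` and `decode_to_wav` are hypothetical names.

```python
import subprocess


def ffmpeg_decode_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that converts src to mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",            # overwrite dst if it already exists
        "-i", src,                 # any input format ffmpeg can decode
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample (16 kHz is an assumption)
        "-c:a", "pcm_s16le",       # 16-bit little-endian PCM
        dst,
    ]


def decode_to_wav(src, dst):
    """Run ffmpeg; raises CalledProcessError if decoding fails."""
    subprocess.run(ffmpeg_decode_cmd(src, dst), check=True, capture_output=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces. If ffmpeg is not installed, `subprocess.run` raises `FileNotFoundError`.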

Models

Default model: vosk-model-small-en-us-0.15 (~40MB)

Other models available at https://alphacephei.com/vosk/models
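How a `--lang` value maps to a model directory is not documented here. One possible approach, shown purely as an illustration, is a lookup table rooted at the `~/vosk-models` directory used in the setup commands; the `MODEL_NAMES` table and the `model_dir` function are hypothetical.

```python
from pathlib import Path

# Hypothetical mapping from --lang values to unpacked model directory names.
# Only the default model named in this document is listed.
MODEL_NAMES = {"en-us": "vosk-model-small-en-us-0.15"}


def model_dir(lang="en-us", root=None):
    """Return the expected directory of the model for a language code."""
    if root is None:
        root = Path.home() / "vosk-models"
    try:
        name = MODEL_NAMES[lang]
    except KeyError:
        raise ValueError(f"no model configured for language {lang!r}") from None
    return Path(root) / name
```

Other languages would need their own entries, using model names taken from the models page linked above.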

Setup (if not installed)

pip3 install vosk --user --break-system-packages

# Download model
mkdir -p ~/vosk-models && cd ~/vosk-models
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip

Notes

  • Quality is good for conversational speech
  • For higher accuracy, use larger models or faster-whisper
  • Processes audio at ~10x realtime on typical hardware
  • Telegram voice messages are .ogg format — works out of the box
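To make the "~10x realtime" figure concrete: at that speed, processing time is roughly the audio duration divided by ten. The helper below is a back-of-the-envelope sketch; the function name is hypothetical, and the factor of 10 is the approximate value quoted in the notes, not a measurement.

```python
def estimated_seconds(audio_seconds, realtime_factor=10.0):
    """Estimate processing time for audio transcribed at N x realtime."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor
```

Under this estimate, a 60-second voice message would take about 6 seconds to transcribe.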
Security Recommendations
Don't install or run this skill as-is. SKILL.md expects a local script at ./skills/local-vosk/scripts/transcribe, but the package contains no code files — ask the publisher for the missing scripts or a corrected package. If you plan to run the provided setup commands yourself: ensure ffmpeg is installed (the README mentions it but the skill doesn't declare it), verify the model download source and checksums, and avoid running pip with unexplained flags like --break-system-packages unless you know what they do. Prefer a packaged release (includes the transcribe script) or run Vosk in an isolated environment/container until the skill's files and provenance are confirmed.
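Checksum verification of a downloaded model archive could look like the sketch below. It assumes you have obtained a SHA-256 digest from a source you trust; this page does not provide one, and none is invented here. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True only if the file's digest matches the expected value."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Running `verify_download` on the zip file before unpacking it means a corrupted or tampered archive is rejected before any of its contents are used.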
Analysis
Type: OpenClaw Skill
Name: local-vosk
Version: 1.0.1

The skill is classified as suspicious due to the use of `wget` to download external content from `https://alphacephei.com/vosk/models` and `pip3 install vosk --user --break-system-packages` for system modification, both found in SKILL.md. While the stated purpose is benign (local speech-to-text) and the sources appear legitimate, these actions involve external network calls and system-level package management with a flag (`--break-system-packages`) that allows potentially disruptive modifications. These capabilities, if exploited or if the external source were compromised, could pose a supply chain risk or system integrity issues, thus exceeding the 'benign' threshold for a security review.
Capability Assessment
Purpose & Capability
The description (local offline STT) matches the instructions (use vosk, download models). However SKILL.md instructs running ./skills/local-vosk/scripts/transcribe which implies bundled scripts/code that are not present in the package. Also the doc expects ffmpeg for decoding audio but the skill declares no required binaries. These gaps are disproportionate to the stated purpose.
Instruction Scope
Instructions tell the agent/user to run a local script path and to pip-install vosk and download models. Because there are no code files, an agent following these instructions would fail or attempt to run non-existent scripts. The instructions reference system actions (pip install, wget, unzip, writing to ~/vosk-models) that are reasonable for setup but include the unusual pip flag --break-system-packages without explanation.
Install Mechanism
There is no formal install spec (instruction-only), which is lower risk. The manual install commands point to a legitimate upstream site (alphacephei.com) for models and use pip/wget/unzip. Those sources are expected for Vosk models; no high-risk download URLs or shorteners are used. Still, because the skill lacks bundled code, it's unclear what the referenced scripts would do when present.
Credentials
The skill requests no environment variables or credentials, which is appropriate for an offline STT tool. No unrelated secrets are requested.
Persistence & Privilege
The skill does not request always:true and does not claim to modify other skills or system settings. It appears to be an on-demand instruction-only skill.
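How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install local-vosk
  3. Once installation finishes, invoke the Skill by name or trigger it with /local-vosk
  4. Provide the required inputs according to the Skill's parameter description to get structured output.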
Version History
v1.0.1
Added Telegram voice message use case + improved docs
v1.0.0
Initial release - lightweight offline speech-to-text using Vosk
Metadata
Slug local-vosk
Version 1.0.1
License
Total installs 2
Current installs 2
Versions 2
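FAQ

What is Local Vosk STT?

Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1073 total downloads to date.

How do I install Local Vosk STT?

Run the command "/install local-vosk" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Local Vosk STT free?

Yes. Local Vosk STT is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does Local Vosk STT support?

Local Vosk STT is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Local Vosk STT?

It is developed and maintained by Mike Sutherland (@sfkiwi). The current version is v1.0.1.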
