sfkiwi

Local Vosk STT

by Mike Sutherland · GitHub ↗ · v1.0.1
Platform: cross-platform · Security review: ⚠ suspicious
Downloads: 1073 · Stars: 0 · Active Installs: 2 · Versions: 2
Install in OpenClaw
/install local-vosk
Description
Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs.
README (SKILL.md)

Local Vosk STT

Lightweight local speech-to-text using Vosk. Fully offline after model download.

Use Cases

  • Telegram voice messages — transcribe .ogg voice notes automatically
  • Audio files — any format ffmpeg supports
  • Offline transcription — no API keys, no cloud, no costs

Quick Start

# Transcribe Telegram voice message
./skills/local-vosk/scripts/transcribe voice_message.ogg

# Transcribe any audio
./skills/local-vosk/scripts/transcribe audio.mp3

# With language (default: en-us)
./skills/local-vosk/scripts/transcribe audio.wav --lang en-us
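
The listing ships no transcribe script, so what it does internally is not known. As a rough sketch only, the core loop of such a script might look like the following, assuming the Vosk Python bindings' KaldiRecognizer interface (AcceptWaveform, Result, FinalResult). The helper name collect_transcript is hypothetical and not part of the skill.

```python
import json
from typing import Iterable


def collect_transcript(recognizer, chunks: Iterable[bytes]) -> str:
    """Feed PCM chunks to a Vosk-style recognizer and join the recognized text.

    `recognizer` is assumed to behave like vosk.KaldiRecognizer:
    AcceptWaveform(bytes) returns a truthy value at the end of an utterance,
    and Result()/FinalResult() return JSON strings with a "text" field.
    """
    pieces = []
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                pieces.append(text)
    # Flush whatever audio is still buffered inside the recognizer.
    tail = json.loads(recognizer.FinalResult()).get("text", "")
    if tail:
        pieces.append(tail)
    return " ".join(pieces)
```

With vosk installed, the recognizer would typically be created as KaldiRecognizer(Model(model_dir), 16000) and fed 16 kHz mono 16-bit PCM audio. Taking the recognizer as a parameter keeps the loop easy to exercise without a model on disk.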

Supported Formats

Any format ffmpeg can decode: ogg (Telegram), mp3, wav, m4a, webm, flac, etc.
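
Vosk recognizers consume raw PCM audio, so a script supporting "any format ffmpeg can decode" would most likely let ffmpeg do the decoding first. The sketch below builds an ffmpeg argument list that writes mono signed 16-bit PCM to stdout. The function names and the 16 kHz default are assumptions; the listing does not show how the real script calls ffmpeg.

```python
import subprocess
from typing import List


def build_ffmpeg_command(input_path: str, sample_rate: int = 16000) -> List[str]:
    """Return an ffmpeg command that decodes input_path to raw PCM on stdout."""
    return [
        "ffmpeg",
        "-nostdin",              # never wait for keyboard input
        "-loglevel", "quiet",    # keep stderr clean
        "-i", input_path,
        "-ar", str(sample_rate), # resample to the recognizer's rate
        "-ac", "1",              # downmix to mono
        "-f", "s16le",           # signed 16-bit little-endian PCM
        "-",                     # write to stdout
    ]


def decode_to_pcm(input_path: str, sample_rate: int = 16000) -> bytes:
    """Run ffmpeg and return the decoded PCM bytes (requires ffmpeg on PATH)."""
    result = subprocess.run(
        build_ffmpeg_command(input_path, sample_rate),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Passing the command as a list rather than a shell string means file names with spaces or shell metacharacters are handled safely.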

Models

Default model: vosk-model-small-en-us-0.15 (~40MB)

Other models available at https://alphacephei.com/vosk/models
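
One way a --lang value could be mapped to a model directory is sketched below. Only the default model named in this README is listed. The function name, the lookup table, and the ~/vosk-models default (taken from the Setup commands) are illustrative assumptions, not the skill's actual behavior.

```python
from pathlib import Path
from typing import Optional

# Only the default model from this README; add entries for other models
# after downloading them.
MODEL_NAMES = {"en-us": "vosk-model-small-en-us-0.15"}


def resolve_model_dir(lang: str = "en-us", base_dir: Optional[Path] = None) -> Path:
    """Return the directory where the model for `lang` is expected to live."""
    key = lang.lower()
    if key not in MODEL_NAMES:
        raise ValueError(f"No model configured for language {lang!r}")
    if base_dir is None:
        base_dir = Path.home() / "vosk-models"
    return base_dir / MODEL_NAMES[key]
```

The function only computes a path. Whether the model has actually been downloaded there is a separate check.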

Setup (if not installed)

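# Install the Vosk Python bindings for the current user.
# Note: --break-system-packages overrides pip's externally-managed-environment
# protection (PEP 668); installing into a virtual environment avoids the flag.
pip3 install vosk --user --break-system-packages

# Download and unpack the default model
mkdir -p ~/vosk-models && cd ~/vosk-models
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip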
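
Because the setup involves three separate pieces (the vosk package, ffmpeg, and an unpacked model), a small pre-flight check can report what is missing before any transcription is attempted. This is a minimal sketch; the function name and the wording of the messages are hypothetical.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import List


def missing_prerequisites(model_dir: Path) -> List[str]:
    """Return a human-readable list of prerequisites that were not found."""
    missing = []
    # The vosk package must be importable by this Python interpreter.
    if importlib.util.find_spec("vosk") is None:
        missing.append("vosk Python package")
    # ffmpeg must be discoverable on PATH for audio decoding.
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg executable")
    # The unzipped model directory must exist.
    if not model_dir.is_dir():
        missing.append(f"model directory {model_dir}")
    return missing
```

For the default setup, the call would be missing_prerequisites(Path.home() / "vosk-models" / "vosk-model-small-en-us-0.15"). An empty list means all three were found.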

Notes

  • Quality is good for conversational speech
  • For higher accuracy, use larger models or faster-whisper
  • Processes audio at ~10x realtime on typical hardware
  • Telegram voice messages are .ogg format — works out of the box
Usage Guidance
Don't install or run this skill as-is. SKILL.md expects a local script at ./skills/local-vosk/scripts/transcribe, but the package contains no code files. Ask the publisher for the missing scripts or a corrected package. If you plan to run the provided setup commands yourself, ensure ffmpeg is installed (the README mentions it, but the skill doesn't declare it as a requirement), verify the model download source and checksums, and avoid running pip with unexplained flags like --break-system-packages unless you know what they do. Prefer a packaged release that includes the transcribe script, or run Vosk in an isolated environment or container until the skill's files and provenance are confirmed.
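
The checksum verification recommended above can be sketched as follows. The expected digest has to come from a source you trust; this listing does not provide one, so none is shown here. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file without loading it all at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value obtained out of band."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch means the archive should be discarded rather than unzipped.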
Capability Analysis
Type: OpenClaw Skill
Name: local-vosk
Version: 1.0.1

The skill is classified as suspicious because SKILL.md uses `wget` to download external content from `https://alphacephei.com/vosk/models` and runs `pip3 install vosk --user --break-system-packages`, which modifies the system's Python packages. The stated purpose (local speech-to-text) is benign and the sources appear legitimate, but these actions involve external network calls and system-level package management with a flag (`--break-system-packages`) that permits potentially disruptive modifications. If exploited, or if the external source were compromised, these capabilities could pose a supply-chain risk or compromise system integrity, which exceeds the 'benign' threshold for a security review.
Capability Assessment
Purpose & Capability
The description (local offline STT) matches the instructions (use vosk, download models). However SKILL.md instructs running ./skills/local-vosk/scripts/transcribe which implies bundled scripts/code that are not present in the package. Also the doc expects ffmpeg for decoding audio but the skill declares no required binaries. These gaps are disproportionate to the stated purpose.
Instruction Scope
Instructions tell the agent/user to run a local script path and to pip-install vosk and download models. Because there are no code files, an agent following these instructions would fail or attempt to run non-existent scripts. The instructions reference system actions (pip install, wget, unzip, writing to ~/vosk-models) that are reasonable for setup but include the unusual pip flag --break-system-packages without explanation.
Install Mechanism
There is no formal install spec (instruction-only), which is lower risk. The manual install commands point to a legitimate upstream site (alphacephei.com) for models and use pip/wget/unzip. Those sources are expected for Vosk models; no high-risk download URLs or shorteners are used. Still, because the skill lacks bundled code, it's unclear what the referenced scripts would do when present.
Credentials
The skill requests no environment variables or credentials, which is appropriate for an offline STT tool. No unrelated secrets are requested.
Persistence & Privilege
The skill does not request always:true and does not claim to modify other skills or system settings. It appears to be an on-demand instruction-only skill.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install local-vosk
  3. After installation, invoke the skill by name or use /local-vosk
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
Added Telegram voice message use case + improved docs
v1.0.0
Initial release - lightweight offline speech-to-text using Vosk
Metadata
Slug local-vosk
Version 1.0.1
License
All-time Installs 2
Active Installs 2
Total Versions 2
Frequently Asked Questions

What is Local Vosk STT?

Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs. It is an AI Agent Skill for Claude Code / OpenClaw, with 1073 downloads so far.

How do I install Local Vosk STT?

Run "/install local-vosk" in the OpenClaw or Claude Code chat. Per the README's Setup section, the skill also needs the vosk Python package, a downloaded model, and ffmpeg on the system.

Is Local Vosk STT free?

Yes, Local Vosk STT is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Local Vosk STT support?

Local Vosk STT is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Local Vosk STT?

It is built and maintained by Mike Sutherland (@sfkiwi); the current version is v1.0.1.
