
STT Recognizer | STT 识别器

Author: moroiser

by Morois · v1.0.8 · MIT-0

cross-platform ✓ Security Clean

Downloads: 218

Install in OpenClaw

/install stt-recognizer

Description

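A speech-to-text (STT) tool. It records from the microphone and transcribes speech locally with Whisper (faster-whisper), or in the cloud through an OpenAI-compatible API. Trigger words: 录音 (record audio), 语音转文字 (speech to text), STT, 语音识别 (speech recognition), 转写 (transcribe), 录音转文字 (recording to text). Supported platforms: Linux / Windows...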

Usage Guidance

This skill appears to do what it says: record from your microphone and transcribe locally or via an OpenAI-compatible API. Before installing and running it:

- If you plan to use API mode, set STT_API_URL/STT_API_KEY only for a trusted provider, because audio will be uploaded to that endpoint. Keep keys secret.
- The Python requirements include torch and Whisper implementations. Install them in a virtualenv/conda environment rather than system Python to avoid altering system packages (the quickstart suggests --break-system-packages, which can be disruptive).
- Model downloads are large (hundreds of MB to multiple GB) and are stored under ~/.cache/huggingface/modules/stt-recognizer. Make sure you have the disk space and bandwidth.
- The scripts access your microphone and save recordings under ~/.openclaw/workspace/projects/stt-recognizer/recordings, which is a privacy consideration. To avoid saving raw audio, inspect and modify the scripts.
- If you do not fully trust the source, run the code in an isolated environment (virtualenv, container) and review the included scripts (they are small and readable) before supplying credentials or running downloads.
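The disk-space advice above can be checked before any model is fetched. The following is a minimal sketch, not part of the skill: the helper names free_bytes and has_room and the 2 GiB budget are assumptions chosen for illustration, and only the cache path comes from the guidance itself.

```python
import shutil
from pathlib import Path

# Cache location quoted in the usage guidance.
MODEL_CACHE = Path.home() / ".cache" / "huggingface" / "modules" / "stt-recognizer"


def free_bytes(path: Path) -> int:
    """Free bytes on the filesystem holding path (or its nearest existing parent)."""
    probe = path
    while not probe.exists():
        probe = probe.parent  # walk up until something exists; the root always does
    return shutil.disk_usage(probe).free


def has_room(path: Path, required_bytes: int) -> bool:
    """True when at least required_bytes are free where path will live."""
    return free_bytes(path) >= required_bytes


# Assumed budget of 2 GiB; adjust to the model size you intend to download.
ASSUMED_BUDGET = 2 * 1024**3
print("enough space:", has_room(MODEL_CACHE, ASSUMED_BUDGET))
```

Walking up to the nearest existing parent lets the check run before the cache directory has been created.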

Capability Analysis

Type: OpenClaw Skill
Name: stt-recognizer
Version: 1.0.8

The skill bundle provides legitimate Speech-to-Text (STT) functionality using the Whisper model (local or API-based). The scripts (record_audio.py, transcribe.py, and record_and_transcribe.py) perform their stated functions using standard libraries like PyAudio and faster-whisper, with no evidence of data exfiltration, malicious execution, or prompt injection.

Capability Tags

requires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

Name/description describe an STT tool. Included scripts (record_audio, transcribe, download_models, record_and_transcribe) and requirements (faster-whisper/whisper/openai, audio libraries, torch) are consistent with local transcription and optional API-based transcription.

ℹ Instruction Scope

SKILL.md and scripts instruct recording from the microphone, saving recordings under the workspace, downloading Whisper models into ~/.cache/huggingface/modules/stt-recognizer, and optionally sending audio to an OpenAI-compatible API when the user provides STT_API_URL/STT_API_KEY. These behaviors are expected for an STT skill, but note that enabling API mode transmits audio externally and the quick-start uses a system-wide pip install flag (--break-system-packages) which may modify system packages.
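To see where recordings would land, the workspace path can be resolved as follows. This is a sketch under assumptions: the function name recordings_dir is hypothetical, and it assumes OPENCLAW_WORKSPACE, when set, replaces the default ~/.openclaw/workspace root. The projects/stt-recognizer/recordings suffix is the one quoted in the Usage Guidance; the skill's own scripts may resolve it differently.

```python
import os
from pathlib import Path
from typing import Mapping


def recordings_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the directory where raw recordings are expected to be written."""
    # Assumption: OPENCLAW_WORKSPACE overrides the default workspace root.
    root = env.get("OPENCLAW_WORKSPACE")
    workspace = Path(root).expanduser() if root else Path.home() / ".openclaw" / "workspace"
    return workspace / "projects" / "stt-recognizer" / "recordings"


print(recordings_dir({}))
print(recordings_dir({"OPENCLAW_WORKSPACE": "/tmp/ws"}))
```

Knowing this directory makes it straightforward to audit or purge saved audio after a session.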

ℹ Install Mechanism

There is no packaged installer; the skill is instruction- and script-based. The provided download_models.sh calls faster_whisper.download_model to fetch model weights (expected behavior). This will download large model files (hundreds of MB to >1GB) into the user's cache directory and write them to disk — expected but resource-intensive. No suspicious external shorteners or unknown install URLs are used.
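The download step described above might look roughly like this in Python. It is a sketch, not the contents of download_models.sh: the helper names model_cache_dir and fetch_model are hypothetical, and passing the cache path through the cache_dir argument of faster_whisper.download_model is an assumption about how the script directs the download. The "small" default follows the v1.0.4 note in the version history.

```python
from pathlib import Path


def model_cache_dir() -> Path:
    """Cache directory named in the skill's documentation."""
    return Path.home() / ".cache" / "huggingface" / "modules" / "stt-recognizer"


def fetch_model(size: str = "small") -> str:
    """Download a faster-whisper model and return the local model directory.

    Triggers a large network download when the model is not already cached.
    """
    # Imported lazily so the path helper works without faster-whisper installed.
    from faster_whisper import download_model

    # Assumption: the download is steered into the skill cache via cache_dir.
    return download_model(size, cache_dir=str(model_cache_dir()))


print(model_cache_dir())
```

Calling fetch_model() performs the actual download, so run it only after confirming disk space and bandwidth.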

✓ Credentials

No required credentials are declared in registry metadata. The skill documents optional environment variables (OPENCLAW_WORKSPACE, STT_MODEL_PATH, STT_API_URL, STT_API_KEY) that are reasonable for an STT tool. Requesting an API key only makes sense when the user opts into API mode; there are no unrelated secret requests.
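The optional variables listed above could be read along these lines. This is an illustrative sketch only: SttSettings and load_settings are hypothetical names, and the rule that API mode requires both STT_API_URL and STT_API_KEY is an assumption rather than confirmed behavior of the skill. The example URL is a placeholder.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SttSettings:
    api_url: Optional[str]
    api_key: Optional[str]
    model_path: Optional[str]

    @property
    def mode(self) -> str:
        # Assumption: API mode needs both the URL and the key; otherwise stay local.
        return "api" if self.api_url and self.api_key else "local"


def load_settings(env: Mapping[str, str] = os.environ) -> SttSettings:
    """Read the optional STT_* variables, treating blank values as unset."""

    def read(name: str) -> Optional[str]:
        value = env.get(name, "").strip()
        return value or None

    return SttSettings(read("STT_API_URL"), read("STT_API_KEY"), read("STT_MODEL_PATH"))


print(load_settings({}).mode)
print(load_settings({"STT_API_URL": "https://stt.example.invalid/v1",
                     "STT_API_KEY": "placeholder"}).mode)
```

Defaulting to local mode keeps audio on the machine unless the user has explicitly supplied both values.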

✓ Persistence & Privilege

The always setting is false, and the skill does not request elevated or global agent privileges. It writes models and outputs to user-local cache and workspace directories (normal for ML workloads) and does not modify other skills or system-wide agent config.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install stt-recognizer
After installation, invoke the skill by name or use /stt-recognizer
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.8

1. Fix all internal path references and display names to match the renamed stt-recognizer slug.

v1.0.7

1. Rename the skill from speech-transcriber to stt-recognizer and update the bilingual display name.

v1.0.6

Fix repeated model downloads: add the correct cache path to the model search paths and fix find_model() to locate model directories instead of files.
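A directory-based lookup of the kind this fix describes could be sketched as below. This is not the skill's find_model(): the signature, the search order, and the use of model.bin as the marker file (the weights file name in faster-whisper model directories) are assumptions for illustration.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def find_model(search_paths: Iterable[Path], marker: str = "model.bin") -> Optional[Path]:
    """Return the first directory under search_paths that contains marker."""
    for root in search_paths:
        if not root.is_dir():
            continue  # skip search paths that do not exist
        if (root / marker).is_file():
            return root
        # Hub-style caches nest the weights a few levels down.
        for candidate in sorted(root.rglob(marker)):
            if candidate.is_file():  # broken symlinks fail this check
                return candidate.parent
    return None


# Demonstration with a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "snapshots" / "abc123"
    model_dir.mkdir(parents=True)
    (model_dir / "model.bin").write_bytes(b"")
    print(find_model([Path(tmp) / "missing", Path(tmp)]) == model_dir)
```

Returning the containing directory rather than the weights file means the result can be handed directly to a loader that expects a model directory, which avoids a redundant download.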

v1.0.5

Fix output paths from stt/ to speech-transcriber/ to match documentation

v1.0.4

1. Restructure directories: project and model cache now use the skill-name convention.
2. Update SKILL.md: paths unified to ~/.cache/huggingface/modules/speech-transcriber/.
3. Download script updated: default to the small model.
4. Clean up residual files and models directories.

v1.0.3

Fix display name to Speech Transcriber | 语音转录器. Remove broken symlink and fix shell=True security issue.

v1.0.2

Remove the broken symlink models/model.bin that caused embedding failures. Fix subprocess.run(shell=True) in record_audio.py for security compliance.

v1.0.1

Fix subprocess.run(shell=True) in record_audio.py for security compliance.
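The general shape of this kind of fix is to pass the command as an argument list so that no shell interprets it. The sketch below illustrates the pattern only: the actual command in record_audio.py is not shown in this listing, so run_without_shell is a hypothetical helper and a Python child process stands in for the recorder.

```python
import subprocess
import sys


def run_without_shell(args):
    """Run a command given as an argument list; no shell parses the arguments."""
    return subprocess.run(args, check=True, capture_output=True, text=True)


# A file name containing shell metacharacters is passed through verbatim.
tricky_name = "note; echo injected.wav"
result = run_without_shell(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", tricky_name]
)
print(result.stdout.strip())
```

With shell=True and string interpolation, the same file name would have been split at the semicolon and the second half executed as a separate command.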

v1.0.0

Initial release: Local speech-to-text with Whisper/faster-whisper and OpenAI API support.

Metadata

Slug stt-recognizer

Version 1.0.8

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 9

Frequently Asked Questions

What is STT Recognizer | STT 识别器?

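A speech-to-text (STT) tool. It records from the microphone and transcribes speech locally with Whisper (faster-whisper), or in the cloud through an OpenAI-compatible API. Trigger words: 录音 (record audio), 语音转文字 (speech to text), STT, 语音识别 (speech recognition), 转写 (transcribe), 录音转文字 (recording to text). Supported platforms: Linux / Windows... It is an AI Agent Skill for Claude Code / OpenClaw, with 218 downloads so far.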

How do I install STT Recognizer | STT 识别器?

Run "/install stt-recognizer" in the OpenClaw or Claude Code chat to install it in one step. Local transcription additionally needs the Python requirements and a Whisper model download, as noted under Usage Guidance.

Is STT Recognizer | STT 识别器 free?

Yes, STT Recognizer | STT 识别器 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does STT Recognizer | STT 识别器 support?

STT Recognizer | STT 识别器 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created STT Recognizer | STT 识别器?

It is built and maintained by Morois (@moroiser); the current version is v1.0.8.
