
Whisper Stt

Name: Whisper Stt
Author: qiaotucodes

by 魏然 · GitHub ↗ · v0.1.0

cross-platform ⚠ suspicious

Downloads 441

Install in OpenClaw

/install openclaw-skill-whisper-stt

Description

Speech-to-text: uses OpenAI Whisper to transcribe audio files into text.

Usage Guidance

This package is a simple, local Whisper-based transcription script. It is safe in the sense that it only transcribes local files, but the documentation claims automatic processing and sending of results to Feishu even though the code does not implement that. Before installing or enabling automated runs:
1) Confirm whether you want automatic network posting; if so, require the author to implement it and declare which environment variables or tokens will be needed.
2) Expect large model downloads (1–10GB) and significant CPU/RAM or GPU needs; ensure sufficient disk space and bandwidth.
3) If running in a hosted agent, limit network and file permissions (run in an isolated environment) until the Feishu integration behavior is clarified.
4) Note that the README/author metadata in the script includes an obfuscated-looking identity string. This is not proof of malice, but it is worth verifying the source if you require provenance.

Capability Analysis

Type: OpenClaw Skill
Name: openclaw-skill-whisper-stt
Version: 0.1.0

The OpenClaw skill 'whisper-stt' is designed for speech-to-text conversion using OpenAI's Whisper model. The `SKILL.md` provides clear, non-malicious instructions for the AI agent to process audio and send the transcribed text to Feishu, which is a legitimate function for such a skill. The `transcribe.py` script implements the transcription step, handling audio input and text output without any signs of data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts. All dependencies and operations are standard for an STT tool.

Capability Assessment

⚠ Purpose & Capability

The README and SKILL.md describe automatic handling of incoming audio and sending transcriptions to Feishu, but the included code (transcribe.py) is a simple CLI that only loads Whisper and writes output locally. No Feishu integration, webhooks, or required credentials are declared, so the claims are inconsistent with the implementation.
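For orientation, a local-only Whisper CLI of the kind described above might look roughly like the sketch below. This is not the skill's actual transcribe.py: the function names, the `--model` and `--output` flags, and the extension check are assumptions made for illustration. Only `whisper.load_model` and `model.transcribe` come from the openai-whisper package.

```python
import argparse
from pathlib import Path

# Extensions corresponding to the formats named in the release notes.
SUPPORTED = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Transcribe a local audio file")
    parser.add_argument("audio", type=Path, help="path to the audio file")
    parser.add_argument("--model", default="base", help="Whisper model name")
    parser.add_argument("--output", type=Path, help="text file to write (default: stdout)")
    return parser.parse_args(argv)


def transcribe(audio, model_name):
    # Imported lazily so argument parsing works without the heavy dependency.
    import whisper

    model = whisper.load_model(model_name)  # fetches the weights on first use
    result = model.transcribe(str(audio))
    return result["text"].strip()


def main(argv=None):
    args = parse_args(argv)
    if args.audio.suffix.lower() not in SUPPORTED:
        print(f"unsupported format: {args.audio.suffix}")
        return 2
    text = transcribe(args.audio, args.model)
    if args.output:
        args.output.write_text(text + "\n", encoding="utf-8")
    else:
        print(text)
    return 0
```

Calling `main(["meeting.m4a", "--model", "small"])` would transcribe the file and print the text. Nothing in this shape of script contacts Feishu or any other service beyond the initial model download.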

⚠ Instruction Scope

SKILL.md recommends 'automatic processing' and explicitly says transcriptions are sent to Feishu. The runtime instructions do not document how the agent will receive audio, authenticate to Feishu, or where credentials would come from. The actual code only processes local files and does not reference external endpoints.
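One way to check the "no external endpoints" observation yourself is to scan the script's imports. The sketch below is a rough heuristic under stated assumptions: the module list is illustrative rather than exhaustive, and `network_imports` is a hypothetical helper, not part of OpenClaw or the skill.

```python
import ast

# Modules that usually indicate outbound network use. This list is an
# assumption for illustration, not a complete inventory.
NETWORK_MODULES = {"requests", "urllib", "http", "socket", "httpx", "aiohttp"}


def network_imports(source):
    """Return the top-level network-related modules imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in NETWORK_MODULES:
                found.add(root)
    return found
```

Running `network_imports(open("transcribe.py").read())` and getting an empty set would be consistent with the assessment above. It would not rule out dynamic imports, subprocess calls, or network access made by dependencies such as the Whisper model download.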

✓ Install Mechanism

No install spec is provided (instruction-only plus a small Python script). Dependencies are typical (openai-whisper, PyTorch, ffmpeg). Be aware model downloads (1–10GB) happen at first run and require network and disk space.
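Because no install spec is provided, it can help to confirm the prerequisites before the first run. The following is a minimal preflight sketch; the `preflight` helper and the 10 GB default threshold are assumptions chosen to match the upper end of the download estimate above.

```python
import importlib.util
import shutil


def preflight(path=".", min_free_gb=10.0):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # Whisper shells out to ffmpeg to decode audio.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # openai-whisper is imported as `whisper`; PyTorch as `torch`.
    for module in ("torch", "whisper"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"Python module '{module}' is not installed")
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(f"only {free_gb:.1f} GB free, want at least {min_free_gb:g} GB")
    return problems
```

Printing each entry of `preflight()` gives a quick checklist; pass the directory where models will be cached as `path` if it lives on a different disk.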

⚠ Credentials

The skill declares no required environment variables or credentials, yet SKILL.md describes sending results to Feishu (which would require tokens). The absence of declared env vars for any external service is inconsistent with the described automatic-sending behavior.
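If Feishu posting were implemented, a declared credential could gate it explicitly. The sketch below shows one possible shape, not something the skill currently does: `FEISHU_WEBHOOK_URL` is a hypothetical variable name, and the JSON body follows the text-message shape used by Feishu custom-bot webhooks as an assumption that should be checked against Feishu's documentation.

```python
import json
import os
import urllib.request

# Hypothetical variable name; the skill itself declares none.
WEBHOOK_ENV = "FEISHU_WEBHOOK_URL"


def build_payload(text):
    # Assumed text-message body for a Feishu custom-bot webhook.
    body = {"msg_type": "text", "content": {"text": text}}
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


def send_transcript(text, env=os.environ):
    """Post `text` if a webhook is configured; otherwise stay local."""
    url = env.get(WEBHOOK_ENV)
    if not url:
        return False  # no credential configured, so nothing leaves the machine
    request = urllib.request.Request(
        url,
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

With this design the default is local-only: sending happens only when the operator has set the variable, which also makes the required secret visible to anyone reviewing the skill.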

✓ Persistence & Privilege

The skill does not request always:true and does not attempt to modify other skills or system settings. It runs locally as a CLI script; autonomous invocation is allowed by default but not unusual here.

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install openclaw-skill-whisper-stt
3) After installation, invoke the skill by name or use /openclaw-skill-whisper-stt
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v0.1.0

Initial release of whisper-stt skill:
- Converts audio/voice files to text using OpenAI Whisper.
- Automatically processes audio files sent by users and sends recognized text to Feishu.
- Supports a wide range of audio formats, including MP3, WAV, M4A, OGG, FLAC, and WebM.
- Offers multiple Whisper models for flexible trade-offs between speed, size, and accuracy.
- Requires Python 3.8+, PyTorch, openai-whisper, and ffmpeg.

Metadata

Slug openclaw-skill-whisper-stt

Version 0.1.0

License —

All-time Installs 1

Active Installs 1

Total Versions 1

Frequently Asked Questions

What is Whisper Stt?

Whisper Stt is a speech-to-text skill that uses OpenAI Whisper to transcribe audio files into text. It is an AI Agent Skill for Claude Code / OpenClaw, with 441 downloads so far.

How do I install Whisper Stt?

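Run "/install openclaw-skill-whisper-stt" in the OpenClaw or Claude Code chat to install the skill in one step. The skill's own dependencies (Python 3.8+, PyTorch, openai-whisper, and ffmpeg) must already be available, and the Whisper model is downloaded on first run.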

Is Whisper Stt free?

Yes, Whisper Stt is free to download, install, and use. Note that the listing does not specify a license.

Which platforms does Whisper Stt support?

Whisper Stt is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Whisper Stt?

It is built and maintained by 魏然 (@qiaotucodes); the current version is v0.1.0.
