Downloads: 5469
Stars: 3
Active Installs: 26
Versions: 2
Install in OpenClaw
/install speech-recognition
Description
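General-purpose speech recognition Skill. Supports multiple audio formats (ogg/mp3/wav/m4a) and uses the SiliconFlow SenseVoice API for speech-to-text. Triggered when the user sends a voice message or audio file, or needs audio transcribed.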
Usage Guidance
Install this only if you want SiliconFlow-based transcription. Use a dedicated or revocable API key, and only transcribe audio you are comfortable sending to SiliconFlow, especially voice messages or meeting recordings.
Capability Analysis
Type: OpenClaw Skill
Name: speech-recognition
Version: 1.0.1
The skill is designed for general speech recognition using the SiliconFlow SenseVoice API. All code snippets and instructions in SKILL.md, including `ffmpeg` commands for audio conversion and Python scripts for API calls, are directly related to this stated purpose. API keys are handled via standard OpenClaw configuration or environment variables. There is no evidence of prompt injection, unauthorized data exfiltration, malicious execution, persistence mechanisms, or other harmful intent. The disclosure that audio is uploaded to a third-party server is transparent and expected for this type of service.
Capability Assessment
Purpose & Capability
The stated purpose is general speech recognition with SiliconFlow SenseVoice, and the documented ffmpeg conversion plus transcription API calls directly match that purpose.
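The listing does not reproduce the ffmpeg commands from SKILL.md, so the following is only a minimal sketch of what such a conversion step might look like. The helper names `build_ffmpeg_command` and `convert_to_wav` are hypothetical, and the 16 kHz mono target is an assumption rather than a documented requirement of the skill.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str) -> list:
    """Return an ffmpeg argv that converts src to 16 kHz mono audio at dst."""
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file (ogg, mp3, m4a, ...)
        "-ar", "16000",  # resample to 16 kHz (assumed target rate)
        "-ac", "1",      # downmix to a single channel
        dst,
    ]


def convert_to_wav(src: str) -> str:
    """Convert src to a sibling .wav file and return the path to upload."""
    if Path(src).suffix.lower() == ".wav":
        return src  # already wav; ffmpeg cannot write onto its own input
    dst = str(Path(src).with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Keeping the argument list separate from the `subprocess.run` call makes the command easy to inspect or log before any audio is touched.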
Instruction Scope
The activation language covers voice messages, audio files, and audio transcription requests broadly; this is purpose-aligned, but agents should confirm intent before processing ambiguous attachments.
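One way an agent could act on this advice is a small triage step before any upload. This is an illustrative sketch, not part of the skill: `triage_attachment` and its return values are hypothetical, and the extension set is taken from the skill description (the v1.0.0 notes additionally mention flac).

```python
from pathlib import Path

# Extensions listed in the skill description; flac appears only in the v1.0.0 notes.
SUPPORTED_EXTENSIONS = {".ogg", ".mp3", ".wav", ".m4a"}


def triage_attachment(filename: str, explicit_request: bool) -> str:
    """Return 'transcribe', 'confirm', or 'skip' for an incoming attachment.

    explicit_request should be True for a voice message or when the user
    asked for a transcription; False for an attachment sent without comment.
    """
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        return "skip"        # not an audio format the skill advertises
    if explicit_request:
        return "transcribe"  # intent is clear, proceed
    return "confirm"         # ambiguous attachment: ask the user first
```

For example, `triage_attachment("meeting.m4a", False)` yields `"confirm"`, so the agent would ask before sending a meeting recording to a third party.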
Install Mechanism
The artifact contains only SKILL.md and skill.json, with examples and configuration guidance; there are no bundled executables, install hooks, or automatic setup scripts.
Credentials
Uploading audio to api.siliconflow.cn is necessary for the advertised transcription function, and the skill explicitly notes that audio is uploaded to SiliconFlow servers.
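To make concrete what "uploading audio" involves, here is a hedged sketch that builds, but does not send, a multipart upload request using only the Python standard library. The listing names only the host api.siliconflow.cn; the endpoint path, the model identifier, the form field names, and the helper `build_transcription_request` are assumptions that should be checked against SKILL.md and the SiliconFlow documentation.

```python
import urllib.request
import uuid

# Assumed values: only the host api.siliconflow.cn is named in the listing.
API_URL = "https://api.siliconflow.cn/v1/audio/transcriptions"
MODEL = "FunAudioLLM/SenseVoiceSmall"


def build_transcription_request(audio: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a multipart/form-data POST carrying the audio."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Text field selecting the model (assumed field name "model").
        f'--{boundary}\r\nContent-Disposition: form-data; name="model"\r\n\r\n{MODEL}\r\n'.encode(),
        # File field header (assumed field name "file").
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'
        ).encode(),
        audio,  # raw audio bytes: this is the data that leaves the machine
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload. Because the request is built separately, the audio bytes and the bearer token are the only user data in it, which is easy to verify before sending.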
Persistence & Privilege
No persistence, background worker, privilege escalation, or destructive behavior is shown; it does require a SiliconFlow API key from OpenClaw config or an environment variable.
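As a rough illustration of the environment-variable route, the sketch below reads the key from `SILICONFLOW_API_KEY` (the variable named in the v1.0.1 release notes) and fails loudly when it is missing. The function name `load_api_key` is hypothetical, and the OpenClaw configuration route is not shown because the listing does not describe its format.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the SiliconFlow key from SILICONFLOW_API_KEY, or raise if unset."""
    env = os.environ if env is None else env
    key = env.get("SILICONFLOW_API_KEY", "").strip()
    if not key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("SILICONFLOW_API_KEY is not set")
    return key
```

Reading the key at call time, rather than embedding it in a script, keeps it out of the skill files and makes a dedicated key easy to rotate or revoke.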
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install speech-recognition
- After installation, invoke the skill by name or use /speech-recognition
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
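- Updated the bash example for voice message handling; the API key is now read from the SILICONFLOW_API_KEY environment variable, improving security
- Removed the hard-coded API key from the workflow
- No other changes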
v1.0.0
- Initial release of the speech-recognition skill.
- Provides general-purpose speech-to-text using the SenseVoice API.
- Supports multiple audio formats: ogg, mp3, wav, m4a, flac.
- Can be triggered by user voice messages or audio transcription requests.
- Includes API configuration instructions and format conversion guidance.
- Error handling and related skills documented for easy integration.
Frequently Asked Questions
What is speech-recognition?
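A general-purpose speech recognition Skill. It supports multiple audio formats (ogg/mp3/wav/m4a) and uses the SiliconFlow SenseVoice API for speech-to-text. It is triggered when the user sends a voice message or audio file, or needs audio transcribed. It is an AI Agent Skill for Claude Code / OpenClaw, with 5469 downloads so far.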
How do I install speech-recognition?
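Run "/install speech-recognition" in the OpenClaw or Claude Code chat to install it in one step. Transcription additionally requires a SiliconFlow API key, supplied through OpenClaw configuration or an environment variable.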
Is speech-recognition free?
Yes, speech-recognition is completely free (open-source). You can download, install and use it at no cost.
Which platforms does speech-recognition support?
speech-recognition is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created speech-recognition?
It is built and maintained by demo112 (@demo112); the current version is v1.0.1.