
speech-recognition

Name: speech-recognition
Author: demo112

by demo112 · GitHub ↗ · v1.0.1

cross-platform ✓ Security Clean

Downloads 5469

Install in OpenClaw

/install speech-recognition

Description

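General-purpose speech recognition Skill. Supports multiple audio formats (ogg/mp3/wav/m4a) and uses the SiliconFlow SenseVoice API for speech-to-text. Triggered when the user sends a voice message or an audio file, or needs audio transcribed.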

Usage Guidance

Install this only if you want SiliconFlow-based transcription. Use a dedicated or revocable API key, and only transcribe audio you are comfortable sending to SiliconFlow, especially voice messages or meeting recordings.

Capability Analysis

Type: OpenClaw Skill

Name: speech-recognition

Version: 1.0.1

The skill is designed for general speech recognition using the SiliconFlow SenseVoice API. All code snippets and instructions in SKILL.md, including `ffmpeg` commands for audio conversion and Python scripts for API calls, are directly related to this stated purpose. API keys are handled via standard OpenClaw configuration or environment variables. There is no evidence of prompt injection, unauthorized data exfiltration, malicious execution, persistence mechanisms, or other harmful intent. The disclosure that audio is uploaded to a third-party server is transparent and expected for this type of service.

Capability Assessment

ℹ Purpose & Capability

The stated purpose is general speech recognition with SiliconFlow SenseVoice, and the documented ffmpeg conversion plus transcription API calls directly match that purpose.
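The conversion step mentioned here can be sketched as follows. This is a minimal illustration, not the contents of SKILL.md: the helper names `build_ffmpeg_command` and `prepare_audio`, the `DIRECT_UPLOAD` set, and the 16 kHz mono WAV target are all assumptions chosen for the example.

```python
import subprocess
from pathlib import Path

# Assumption: these formats are uploaded as-is; anything else is converted first.
DIRECT_UPLOAD = {".mp3", ".wav"}


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that re-encodes src as 16 kHz mono audio at dst."""
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file
        "-ar", "16000",  # resample to 16 kHz (assumed target rate)
        "-ac", "1",      # downmix to a single channel
        dst,
    ]


def prepare_audio(src: str) -> str:
    """Return a path ready for upload, converting with ffmpeg when needed."""
    path = Path(src)
    if path.suffix.lower() in DIRECT_UPLOAD:
        return src
    dst = str(path.with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

Keeping the command builder separate from the `subprocess.run` call makes the ffmpeg arguments easy to inspect or log before anything is executed.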

ℹ Instruction Scope

The activation language covers voice messages, audio files, and audio transcription requests broadly; this is purpose-aligned, but agents should confirm intent before processing ambiguous attachments.
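One way an agent might apply this guidance is a small triage step before any upload. The sketch below is hypothetical: `triage_attachment` and its two flags are illustrative names, and the extension list is taken from the formats named on this page.

```python
from pathlib import Path

# Formats named on this page; flac appears only in the v1.0.0 release notes.
AUDIO_EXTENSIONS = {".ogg", ".mp3", ".wav", ".m4a", ".flac"}


def triage_attachment(filename: str,
                      is_voice_message: bool = False,
                      user_asked_to_transcribe: bool = False) -> str:
    """Return "transcribe", "confirm", or "skip" for a single attachment."""
    if Path(filename).suffix.lower() not in AUDIO_EXTENSIONS:
        return "skip"  # not a format this skill claims to handle
    if is_voice_message or user_asked_to_transcribe:
        return "transcribe"  # intent is clear
    return "confirm"  # audio file with no stated intent: ask the user first
```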

✓ Install Mechanism

The artifact contains only SKILL.md and skill.json, with examples and configuration guidance; there are no bundled executables, install hooks, or automatic setup scripts.

ℹ Credentials

Uploading audio to api.siliconflow.cn is necessary for the advertised transcription function, and the skill explicitly notes that audio is uploaded to SiliconFlow servers.
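To make the upload concrete, here is a hedged sketch of what such a request could look like using only the Python standard library. The endpoint path, the model identifier, the Bearer authorization scheme, and the `"text"` response field are assumptions that do not appear on this page; `build_request` and `transcribe` are hypothetical helper names.

```python
import json
import urllib.request
import uuid

# Assumptions: the path and model name below are not stated on this page.
API_URL = "https://api.siliconflow.cn/v1/audio/transcriptions"
MODEL = "FunAudioLLM/SenseVoiceSmall"


def build_request(audio: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the model name and the audio bytes."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{MODEL}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + audio + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(audio: bytes, filename: str, api_key: str) -> str:
    """Send the audio and return the transcript (assumes a JSON body with a "text" field)."""
    request = build_request(audio, filename, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["text"]
```

Nothing leaves the machine until `transcribe` is called, so `build_request` can be used to review exactly what would be sent to the third-party server.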

ℹ Persistence & Privilege

No persistence, background worker, privilege escalation, or destructive behavior is shown; it does require a SiliconFlow API key from OpenClaw config or an environment variable.
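A minimal sketch of the key lookup described here, assuming the environment variable is `SILICONFLOW_API_KEY` (the name given in the v1.0.1 notes). The config field name `siliconflow_api_key` and the helper `load_api_key` are hypothetical; the real OpenClaw config layout is not shown on this page.

```python
import os
from typing import Mapping, Optional


def load_api_key(config: Optional[Mapping[str, str]] = None,
                 env: Mapping[str, str] = os.environ) -> str:
    """Return the SiliconFlow key, preferring config over the environment."""
    # "siliconflow_api_key" is a hypothetical config field name.
    key = (config or {}).get("siliconflow_api_key") or env.get("SILICONFLOW_API_KEY")
    if not key:
        raise RuntimeError(
            "No SiliconFlow API key found: set SILICONFLOW_API_KEY or configure it in OpenClaw"
        )
    return key.strip()
```

Failing early with a clear error avoids sending an unauthenticated request, and reading the key at call time keeps it out of the skill's files.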

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install speech-recognition
After installation, invoke the skill by name or use /speech-recognition
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

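- Updated the bash example for voice message handling; the API key is now read from the environment variable SILICONFLOW_API_KEY, improving security.
- Removed the hard-coded API key from the workflow.
- No other changes.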

v1.0.0

- Initial release of the speech-recognition skill.
- Provides general-purpose speech-to-text using the SenseVoice API.
- Supports multiple audio formats: ogg, mp3, wav, m4a, flac.
- Can be triggered by user voice messages or audio transcription requests.
- Includes API configuration instructions and format conversion guidance.
- Error handling and related skills documented for easy integration.

Metadata

Slug speech-recognition

Version 1.0.1

License —

All-time Installs 206

Active Installs 26

Total Versions 2

Frequently Asked Questions

What is speech-recognition?

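A general-purpose speech recognition Skill. It supports multiple audio formats (ogg/mp3/wav/m4a) and uses the SiliconFlow SenseVoice API for speech-to-text. It is triggered when the user sends a voice message or an audio file, or needs audio transcribed. It is an AI Agent Skill for Claude Code / OpenClaw, with 5469 downloads so far.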

How do I install speech-recognition?

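Run "/install speech-recognition" in the OpenClaw or Claude Code chat to install it in one step. The skill ships no install hooks or setup scripts, but it does require a SiliconFlow API key from OpenClaw config or an environment variable.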

Is speech-recognition free?

Yes, speech-recognition is completely free (open-source). You can download, install and use it at no cost.

Which platforms does speech-recognition support?

speech-recognition is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created speech-recognition?

It is built and maintained by demo112 (@demo112); the current version is v1.0.1.
