
speech-recognition

Name: speech-recognition
Author: demo112

Author: demo112 · GitHub · v1.0.1

cross-platform ✓ Security check passed

Total downloads: 5469 · Current installs: 26 · Versions: 2

Install in OpenClaw

/install speech-recognition

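Description

A general-purpose speech recognition Skill. It supports multiple audio formats (ogg/mp3/wav/m4a) and uses the SiliconFlow SenseVoice API for speech-to-text. It triggers when a user sends a voice message or audio file, or asks for audio to be transcribed.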
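A rough sketch of what the transcription call might look like, using only the Python standard library. The endpoint path, the model name `FunAudioLLM/SenseVoiceSmall`, the form field names, and the `text` field in the response are assumptions that are not stated on this page and should be checked against SiliconFlow's API documentation; `build_multipart` and `transcribe` are hypothetical helper names, not part of the Skill.

```python
import json
import os
import urllib.request
import uuid

# Assumed endpoint and model name; verify both against SiliconFlow's API docs.
API_URL = "https://api.siliconflow.cn/v1/audio/transcriptions"
MODEL = "FunAudioLLM/SenseVoiceSmall"


def build_multipart(filename: str, audio: bytes, model: str = MODEL):
    """Encode the model name and audio bytes as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return head + audio + tail, content_type


def transcribe(path: str, api_key: str) -> str:
    """Upload one audio file and return the transcribed text."""
    with open(path, "rb") as f:
        body, content_type = build_multipart(os.path.basename(path), f.read())
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    # Assumes the response JSON carries the transcript under "text".
    return payload["text"]
```

Calling `transcribe("voice.wav", api_key)` uploads the file to SiliconFlow, so it should only be run on audio you are comfortable sending to a third party.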

Security recommendations

Install this only if you want SiliconFlow-based transcription. Use a dedicated or revocable API key, and only transcribe audio you are comfortable sending to SiliconFlow, especially voice messages or meeting recordings.

Functional analysis

Type: OpenClaw Skill
Name: speech-recognition
Version: 1.0.1

The skill is designed for general speech recognition using the SiliconFlow SenseVoice API. All code snippets and instructions in SKILL.md, including `ffmpeg` commands for audio conversion and Python scripts for API calls, are directly related to this stated purpose. API keys are handled via standard OpenClaw configuration or environment variables. There is no evidence of prompt injection, unauthorized data exfiltration, malicious execution, persistence mechanisms, or other harmful intent. The disclosure that audio is uploaded to a third-party server is transparent and expected for this type of service.

Capability assessment

ℹ Purpose & Capability

The stated purpose is general speech recognition with SiliconFlow SenseVoice, and the documented ffmpeg conversion plus transcription API calls directly match that purpose.
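The exact ffmpeg commands live in SKILL.md and are not reproduced on this page. As an illustration only, a conversion step could look like the sketch below, which assumes a mono 16 kHz WAV target (a common choice for speech models, not a documented requirement of SenseVoice). The helper names `wav_target`, `ffmpeg_command`, and `convert_to_wav` are hypothetical.

```python
import subprocess
from pathlib import Path


def wav_target(src: str) -> str:
    """Derive an output path next to the source, e.g. voice.ogg -> voice_16k.wav."""
    path = Path(src)
    return str(path.with_name(path.stem + "_16k.wav"))


def ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list:
    """Build the ffmpeg argument list: overwrite output, mono, fixed sample rate."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", str(sample_rate), dst]


def convert_to_wav(src: str) -> str:
    """Run ffmpeg and return the path of the converted file."""
    dst = wav_target(src)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Passing the arguments as a list rather than a shell string avoids quoting problems when file names contain spaces.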

ℹ Instruction Scope

The activation language covers voice messages, audio files, and audio transcription requests broadly; this is purpose-aligned, but agents should confirm intent before processing ambiguous attachments.
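One possible way for an agent to apply that advice is a small triage step before any upload. The policy below is a hypothetical sketch, not the Skill's actual behavior: it uses the four extensions from the description and asks for confirmation whenever an audio file arrives without an explicit transcription request.

```python
from pathlib import Path

# Formats listed in the Skill description.
SUPPORTED_EXTENSIONS = {".ogg", ".mp3", ".wav", ".m4a"}


def triage_attachment(filename: str, user_asked_for_transcript: bool) -> str:
    """Return "transcribe", "confirm", or "skip" for an incoming attachment."""
    is_audio = Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
    if not is_audio:
        return "skip"
    # Audio without an explicit request is ambiguous: ask before uploading.
    return "transcribe" if user_asked_for_transcript else "confirm"
```

For example, `triage_attachment("meeting.m4a", False)` yields `"confirm"`, so the agent would ask the user before sending the recording to SiliconFlow.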

✓ Install Mechanism

The artifact contains only SKILL.md and skill.json, with examples and configuration guidance; there are no bundled executables, install hooks, or automatic setup scripts.

ℹ Credentials

Uploading audio to api.siliconflow.cn is necessary for the advertised transcription function, and the skill explicitly notes that audio is uploaded to SiliconFlow servers.

ℹ Persistence & Privilege

No persistence, background worker, privilege escalation, or destructive behavior is shown; it does require a SiliconFlow API key from OpenClaw config or an environment variable.
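A minimal sketch of the key lookup, assuming the environment variable name `SILICONFLOW_API_KEY` from the version history. The layout of the OpenClaw configuration is not described on this page, so the `api_key` dictionary entry and the `load_api_key` helper are hypothetical.

```python
import os

ENV_VAR = "SILICONFLOW_API_KEY"  # name taken from the v1.0.1 release notes


def load_api_key(config=None) -> str:
    """Prefer a key from config (hypothetical "api_key" entry), else the environment."""
    key = (config or {}).get("api_key") or os.environ.get(ENV_VAR, "")
    key = key.strip()
    if not key:
        raise RuntimeError(f"No SiliconFlow API key found; set {ENV_VAR}")
    return key
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque authentication error, and reading from the environment keeps the key out of scripts and chat logs.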

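How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install speech-recognition
After installation, invoke the Skill by name or trigger it with /speech-recognition.
Provide the required inputs as described in the Skill's parameter documentation to receive structured output.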

Version history

v1.0.1

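- Updated the bash example for voice-message handling: the API key is now read from the environment variable SILICONFLOW_API_KEY, which improves security.
- Removed the hard-coded API key from the workflow.
- No other changes.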

v1.0.0

- Initial release of the speech-recognition skill.
- Provides general-purpose speech-to-text using the SenseVoice API.
- Supports multiple audio formats: ogg, mp3, wav, m4a, flac.
- Can be triggered by user voice messages or audio transcription requests.
- Includes API configuration instructions and format conversion guidance.
- Error handling and related skills documented for easy integration.

Metadata

Slug: speech-recognition

Version: 1.0.1

License: not specified

Total installs: 206

Current installs: 26

Version count: 2

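FAQ

What is speech-recognition?

A general-purpose speech recognition Skill. It supports multiple audio formats (ogg/mp3/wav/m4a) and uses the SiliconFlow SenseVoice API for speech-to-text. It triggers when a user sends a voice message or audio file, or asks for audio to be transcribed. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 5469 total downloads to date.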

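How do I install speech-recognition?

Run the command "/install speech-recognition" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra setup, but transcription requires a SiliconFlow API key supplied through OpenClaw config or an environment variable.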

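Is speech-recognition free?

Yes. The listing describes speech-recognition as free and open source, so it can be downloaded, installed, and used at no charge. Note that the metadata does not specify a license.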

Which platforms does speech-recognition support?

speech-recognition is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed speech-recognition?

It is developed and maintained by demo112 (@demo112); the current version is v1.0.1.