jixsonwang

Aliyun Asr

by Jixson · GitHub ↗ · v1.0.10
cross-platform ⚠ suspicious
2234 Downloads · 2 Stars · 7 Active Installs · 10 Versions
Install in OpenClaw
/install aliyun-asr
Description
Pure Aliyun ASR skill for voice message transcription, supports multiple channels including Feishu
README (SKILL.md)

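Aliyun Speech Recognition (Aliyun ASR) Skill

Speech recognition only, no speech synthesis. This is a lightweight Aliyun speech recognition skill designed for OpenClaw that does one thing: convert voice messages to text.

🎯 Core Features

  • ✅ ASR only: converts speech to text and never generates voice replies
  • ✅ Multi-channel support: works with Feishu, Telegram, WhatsApp, and every other voice-message channel that OpenClaw supports
  • ✅ Automatic integration: no extra configuration needed; voice messages are recognized automatically and handled as text messages

⚙️ Quick Setup

1. Aliyun Preparation

  • Activate the Intelligent Speech Interaction (NLS) service
  • In the RAM console, create a sub-user and grant it the AliyunNLSFullAccess permission
  • In the NLS console, create an application and obtain its AppKey

2. Configuration File

Create the configuration file /root/.openclaw/aliyun-asr-config.json: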

{
  "access_key_id": "your-access-key-id",
  "access_key_secret": "your-access-key-secret",
  "app_key": "your-app-key",
  "region": "cn-shanghai"
}
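
The configuration above could be read and checked at startup roughly as follows. This is a minimal sketch, not the skill's actual loader: the function name `load_config` and the choice to treat `region` as optional with a `cn-shanghai` default are assumptions based on the fields shown above.

```python
import json
from pathlib import Path

# Keys taken from the example config above; "region" is treated as optional.
REQUIRED_KEYS = ("access_key_id", "access_key_secret", "app_key")
DEFAULT_REGION = "cn-shanghai"


def load_config(path):
    """Load the JSON config and fail early if a required key is missing or empty.

    Hypothetical helper for illustration; the skill's real loader may differ.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("config must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if not data.get(key)]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    data.setdefault("region", DEFAULT_REGION)
    return data
```

Failing with a message that names the missing keys is easier to debug than a later authentication error from the API.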

3. Security Settings

chmod 600 /root/.openclaw/aliyun-asr-config.json
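
A program reading the file can also verify that the permissions were actually tightened. The check below is a sketch under the assumption of a POSIX system; `is_private` is a hypothetical helper name, not part of the skill.

```python
import os
import stat


def is_private(path):
    """Return True if neither group nor others have any access to the file.

    A file set with `chmod 600` passes; a file with mode 644 does not.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A caller might refuse to load credentials, or at least log a warning, when `is_private` returns False.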

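🚀 Usage

Automatic mode (recommended)

  1. A user sends a voice message to any supported channel
  2. OpenClaw automatically calls this skill to recognize the speech
  3. The recognized text is passed to the AI as the user's message
  4. The AI generates a plain-text reply (not voice)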

🔧 Technical Details

  • Dependency: requests (Python package)
  • Supported formats: MP3, WAV, OGG, FLAC, AMR, OPUS
  • API region: cn-shanghai by default (configurable)
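
To illustrate how the format list and the configurable region might be used together, here is a small sketch. The URL pattern is an assumption modeled on Aliyun's publicly documented NLS RESTful gateway and should be checked against the current Aliyun documentation; `asr_url` is a hypothetical helper, and other query parameters the API accepts are left out.

```python
from pathlib import Path
from urllib.parse import urlencode

# Formats listed in the README above.
SUPPORTED_FORMATS = {"mp3", "wav", "ogg", "flac", "amr", "opus"}


def asr_url(region, app_key, audio_path):
    """Build a region-specific ASR URL after checking the file extension.

    Assumption: the gateway host follows the nls-gateway-<region> pattern.
    """
    fmt = Path(audio_path).suffix.lstrip(".").lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError("unsupported audio format: " + (fmt or "(none)"))
    query = urlencode({"appkey": app_key})
    return f"https://nls-gateway-{region}.aliyuncs.com/stream/v1/asr?{query}"
```

Rejecting unsupported extensions before any network call avoids sending data the service would refuse anyway.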

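🛡️ Security and Compliance

  • No data storage: voice data is not stored locally
  • Least privilege: uses a RAM sub-account and avoids primary-account keys
  • Separated configuration: sensitive information is kept entirely separate from the code

💡 Development Guidelines

This skill strictly follows these development guidelines:

  1. ✅ Fully meets the configuration requirements for open-source skills
  2. ✅ Fully complies with local laws and regulations
  3. ✅ Undeveloped or unimplemented features are not included in the source code
  4. ✅ Local test code and test cases are not included in the source code
  5. ✅ Keys and private authentication information are not included in the source code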
Usage Guidance
This skill's code implements Aliyun ASR and calls official Aliyun endpoints, but there are important inconsistencies you should consider before installing:
  • The README's "no extra configuration" claim is false: you must create /root/.openclaw/aliyun-asr-config.json containing your Aliyun access_key_id/access_key_secret and app_key. The registry metadata did not declare this config path. Confirm you are comfortable storing credentials on disk at that location and that the agent process has permission to read it.
  • The code invokes ffmpeg for OGG→WAV conversion but ffmpeg is not listed as a required binary. Ensure ffmpeg is available and that calling subprocesses is acceptable in your environment.
  • The skill posts raw audio bytes to Aliyun NLS endpoints (expected for ASR). There are no hidden external endpoints in the code, which is good, but review the code yourself if you don't fully trust the author.
  • Prefer creating a least-privilege RAM subuser as recommended, and set strict file permissions (chmod 600) on the config file. Consider running the agent under a non-root account and placing the config in a non-root path, or update the code to allow a configurable config path.
If you need this functionality and are comfortable with the above, the implementation is plausible. If you cannot or will not store cloud credentials on disk at /root or cannot allow subprocess calls, do not install. If uncertain, ask the author to (1) declare the config path in metadata, (2) allow config path override via env var, and (3) declare ffmpeg as a required binary.
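The config path override suggested in point (2) could look roughly like this. It is a sketch of the recommendation, not code from the skill: the environment variable name `ALIYUN_ASR_CONFIG` and the function `resolve_config_path` are hypothetical.

```python
import os
from pathlib import Path

# Default location named in the README.
DEFAULT_CONFIG_PATH = Path("/root/.openclaw/aliyun-asr-config.json")
# Hypothetical variable name; the skill does not currently define one.
CONFIG_ENV_VAR = "ALIYUN_ASR_CONFIG"


def resolve_config_path(environ=None):
    """Return the config path, preferring the env var override when it is set."""
    env = os.environ if environ is None else environ
    override = env.get(CONFIG_ENV_VAR, "").strip()
    return Path(override).expanduser() if override else DEFAULT_CONFIG_PATH
```

With an override like this, an agent running under a non-root account could keep the credentials in its own home directory.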
Capability Analysis
Type: OpenClaw Skill
Name: aliyun-asr
Version: 1.0.10
The skill is classified as suspicious because `aliyun_pure_asr.py` uses `subprocess.run()` to execute `ffmpeg`. The call is intended for legitimate audio format conversion, and passing the arguments as a list (without `shell=True`) keeps a shell from interpreting them. A potentially user-controlled `audio_file` path is still a risk, though: a crafted value could be interpreted by ffmpeg in unintended ways (for example as an option or a protocol URL), and if the command were ever run through a shell it could lead to arbitrary command execution within the OpenClaw agent's environment. There is no evidence of intentional malicious behavior such as data exfiltration to unauthorized endpoints or prompt injection attempts in SKILL.md.
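One way to reduce the risk described above is to validate the path before it reaches ffmpeg. The sketch below is an illustration, not the code in `aliyun_pure_asr.py`: the helper names, the allowed suffixes, and the 16 kHz mono output settings are assumptions.

```python
import subprocess
from pathlib import Path

# Assumed set of inputs that need conversion; adjust to the real handler.
ALLOWED_SUFFIXES = {".ogg", ".opus"}


def build_ffmpeg_cmd(audio_file, out_dir):
    """Build an ffmpeg argument list from a validated, absolute input path."""
    src = Path(audio_file).resolve()  # absolute, so it cannot start with "-"
    if not src.is_file():
        raise ValueError("not a regular file: " + str(src))
    if src.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError("unexpected input type: " + src.suffix)
    dst = Path(out_dir).resolve() / (src.stem + ".wav")
    # List form, no shell=True: no shell ever parses these arguments.
    return ["ffmpeg", "-nostdin", "-y", "-i", str(src),
            "-ar", "16000", "-ac", "1", str(dst)]


def convert_to_wav(audio_file, out_dir):
    """Run the conversion with a timeout and return the output path."""
    cmd = build_ffmpeg_cmd(audio_file, out_dir)
    subprocess.run(cmd, check=True, capture_output=True, timeout=60)
    return Path(cmd[-1])
```

Separating command construction from execution also makes the validation logic testable without ffmpeg installed.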
Capability Assessment
Purpose & Capability
Name/description match the code: the Python code calls Aliyun NLS endpoints to convert audio to text. However, the metadata claimed no required config paths or credentials while the implementation requires a settings file at /root/.openclaw/aliyun-asr-config.json containing AccessKeyId/Secret and app_key. The use of ffmpeg for format conversion is present in code but not declared in required binaries. These gaps are inconsistent with the published metadata/README.
Instruction Scope
SKILL.md asserts "automatic integration, no additional configuration" and "no data storage," yet runtime instructions and code require creating a config file with credentials under /root/.openclaw and advise chmod 600. The code will read that file and exit if missing. The handler also invokes ffmpeg via subprocess to convert OGG→WAV, and posts raw audio bytes to Aliyun endpoints. The README's automatic/zero-config claim is therefore misleading and grants the skill implicit access to a sensitive on-disk config path.
Install Mechanism
No install spec (instruction-only installer) — lower risk because nothing is auto-downloaded. The package includes Python code and declares dependency on the requests Python package in the README. However, ffmpeg is invoked at runtime but not listed as a required binary. There is also an empty index.js/package.json present (benign but unnecessary).
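Because ffmpeg is not declared, a preflight check before first use would surface the gap early. This is a minimal sketch using only the standard library; `missing_requirements` is a hypothetical helper, and the default names come from the dependencies mentioned in this assessment.

```python
import importlib.util
import shutil


def missing_requirements(binaries=("ffmpeg",), modules=("requests",)):
    """Return labels for required binaries and Python modules that are absent."""
    missing = ["binary:" + b for b in binaries if shutil.which(b) is None]
    missing += ["module:" + m for m in modules
                if importlib.util.find_spec(m) is None]
    return missing
```

An empty list means both the ffmpeg binary and the requests package were found on the current system.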
Credentials
The skill does not request environment variables but requires permanent credentials stored in a local JSON config file (access_key_id and access_key_secret). Those credentials are appropriate for calling Aliyun ASR, but storing them in /root/.openclaw implies the skill expects root-level file access. The number/type of secrets (Aliyun keys) is proportionate to the stated purpose, but the mismatch between declared/actual config requirements and use of a root path is concerning.
Persistence & Privilege
Skill does not request always:true and does not modify other skills or system-wide settings. It runs as an on-demand handler and prints or returns recognized text. No indications of privileged persistence beyond reading the expected config file.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install aliyun-asr
  3. After installation, invoke the skill by name or use /aliyun-asr
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.10
Security fix: removed potentially sensitive files, added index.js, cleaned up cache directories, and ensured there is no risk of key leakage
v1.0.9
  • Updated skill metadata: changed "name" field from "Aliyun ASR" to "aliyun-asr" for consistency.
  • No functional changes; documentation tweaks and formatting only.
v1.0.8
Security fix: removed hardcoded credentials and updated the configuration
v1.0.7
Security fix: removed all debug files and hardcoded credentials to ensure compliance with security standards
v1.0.6
Fixed ASR recognition issues: added audio format conversion and parameter tuning; supports Feishu OGG/Opus voice messages
v1.0.5
Security fix: Removed accidental inclusion of clawhub-config.json with sensitive token
v1.0.4
Updated skill name to English 'Aliyun ASR' and cleaned up directory
v1.0.3
Improved the SKILL.md description and added a detailed configuration guide and usage instructions
v1.0.2
Improved documentation with detailed usage instructions and configuration guide
v1.0.0
Initial release: pure ASR speech recognition skill that handles voice messages on Feishu and other channels
Metadata
Slug aliyun-asr
Version 1.0.10
License
All-time Installs 10
Active Installs 7
Total Versions 10
Frequently Asked Questions

What is Aliyun Asr?

Pure Aliyun ASR skill for voice message transcription, supports multiple channels including Feishu. It is an AI Agent Skill for Claude Code / OpenClaw, with 2234 downloads so far.

How do I install Aliyun Asr?

Run "/install aliyun-asr" in the OpenClaw or Claude Code chat to install it. The skill then still needs the Aliyun config file described in the README, and ffmpeg for OGG→WAV conversion.

Is Aliyun Asr free?

Yes, Aliyun Asr is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Aliyun Asr support?

Aliyun Asr is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Aliyun Asr?

It is built and maintained by Jixson (@jixsonwang); the current version is v1.0.10.
