yszheda

Qwen Asr Skill

by Shuai YUAN · GitHub ↗ · v1.3.0 · MIT-0
cross-platform ✓ Security Clean
355 Downloads
0 Stars
2 Active Installs
5 Versions
Install in OpenClaw
/install qwen-asr-skill
Description
Provides high-accuracy speech-to-text conversion supporting 22 Chinese dialects and 30 languages with automatic language detection, running on CPU.
Usage Guidance
This skill appears coherent with its stated purpose, but review the following points and take precautions before installing:
  1. The skill downloads ~6GB of model weights from Hugging Face on first run, so make sure you have enough disk space, CPU, and bandwidth.
  2. It runs a local web server that accepts file uploads. Uploaded audio is stored temporarily in an uploads/ directory and then deleted. Run it in an isolated environment or container if you are cautious.
  3. The code uses dotenv and standard environment variables. Check any local .env before starting to avoid unintentionally exposing secrets or changing behavior, and be aware that a private HF model would require an HF token that SKILL.md does not declare.
  4. The repository origin is listed as a placeholder in SKILL.md. Verify the source URL and review the code before deployment.
If you want higher assurance, run the skill inside a sandbox (container/VM) and inspect network traffic during the first model download.
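As a rough pre-flight check for the disk requirement mentioned above, a minimal Python sketch might look like the following. It is not part of the skill; the function name `has_room_for_model` and the 6 GiB threshold are assumptions for illustration, based only on the approximate download size quoted in this guidance.

```python
import shutil

# Approximate size of the model download quoted in the guidance (~6GB).
# This threshold is an assumption; the real download may differ.
REQUIRED_BYTES = 6 * 1024**3


def has_room_for_model(path=".", required=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has at least `required` bytes free."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    # Check the current directory; pass your model cache location instead if it differs.
    print("enough free space:", has_room_for_model("."))
```

Point `path` at whatever directory the model cache will actually live in, since that may be on a different volume than the skill's code.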
Capability Analysis
Type: OpenClaw Skill. Name: qwen-asr-skill. Version: 1.3.0. The skill bundle is a legitimate implementation of an Automatic Speech Recognition (ASR) service using the Qwen3-ASR-0.6B model. The code consists of an Express.js server (index.js) that interfaces with a Python inference script (asr.py) via the python-shell library. It includes standard features such as file upload handling via multer, CPU performance optimizations (cpu-optimization.py), and dialect mapping (dialect-map.js). No evidence of data exfiltration, malicious execution, backdoors, or prompt injection was found; the logic is consistent with the stated purpose of providing local speech-to-text capabilities.
Capability Assessment
Purpose & Capability
Name/description (Qwen ASR, CPU-side dialect support) matches the code: a Node.js HTTP wrapper that invokes a Python asr.py using qwen-asr and torch. Declared package dependencies and Python requirements align with running a local ASR model. No unrelated cloud credentials or surprising binaries are requested.
Instruction Scope
SKILL.md instructions (git clone, npm install, pip install, npm start) map to the provided Node + Python code. The skill runs a local web server, accepts uploaded audio or base64, invokes the Python script, and deletes uploaded files. The SKILL.md claims model weights are downloaded from Hugging Face on first run — code uses from_pretrained which will perform that download. There is no instruction to read unrelated user files, but the code does load environment variables (via dotenv) if present.
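The store-then-delete handling of uploaded audio described above can be illustrated with a short Python sketch. This is not the skill's code (the real server is Express.js with multer); `with_temp_audio`, the `handler` callback, and the `.wav` suffix are hypothetical names chosen for the example. It decodes base64 audio into a temporary file under an uploads directory, passes the path to a handler, and removes the file even if the handler raises.

```python
import base64
import os
import tempfile


def with_temp_audio(b64_audio, handler, upload_dir="uploads", suffix=".wav"):
    """Write base64 audio to a temp file, run `handler(path)`, then delete the file.

    `handler` stands in for whatever performs transcription (hypothetical).
    """
    os.makedirs(upload_dir, exist_ok=True)
    fd, path = tempfile.mkstemp(suffix=suffix, dir=upload_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(base64.b64decode(b64_audio))
        return handler(path)
    finally:
        # Clean up whether the handler succeeded or raised.
        if os.path.exists(path):
            os.remove(path)
```

Doing the cleanup in a `finally` block is what keeps failed requests from leaving audio behind; when reviewing the skill's own code, it is worth checking that its deletion path also runs on errors.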
Install Mechanism
There is no automatic download/install spec in the registry; the README asks the user to run npm and pip installs. Model weights are retrieved at runtime via Hugging Face from_pretrained (a common, known source). No obscure download URLs or archive extraction from untrusted hosts were found in the code.
Credentials
The skill declares no required env vars or credentials. It reads standard process.env values (MODEL_NAME, DEVICE, CACHE_DIR, PYTHON_PATH, etc.) and uses dotenv if a .env file exists — reasonable for configuration but means a local .env could alter behavior. If the chosen model is private, Hugging Face authentication (HF_TOKEN) might be needed even though it isn't declared. No requests for unrelated service credentials were found.
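Because a local .env can change the skill's behavior, one way to review it before starting is to list the keys it defines and flag any that are not among the variables named above. The sketch below is an assumption-laden helper, not part of the skill: `KNOWN_KEYS` contains only the four names listed in this assessment (the original list ends in "etc.", so it is incomplete), and the parser handles only simple `KEY=VALUE` lines.

```python
# Variables named in the assessment above; the real list is longer ("etc."),
# so treat anything flagged here as "worth a look", not as proof of a problem.
KNOWN_KEYS = {"MODEL_NAME", "DEVICE", "CACHE_DIR", "PYTHON_PATH"}


def env_keys(text):
    """Return the variable names defined in .env-style text, in order."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        keys.append(key)
    return keys


def unexpected_keys(text, known=KNOWN_KEYS):
    """Return keys present in the text that are not in the `known` set."""
    return [k for k in env_keys(text) if k not in known]
```

For example, calling `unexpected_keys(open(".env").read())` in the skill's directory would surface unrelated secrets that happen to sit in the same file.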
Persistence & Privilege
always:false and user-invocable:true. The skill runs as a local web service and does not request persistent platform-wide privileges or modify other skills. It executes a local Python script (asr.py) via python-shell, which is expected for this design.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install qwen-asr-skill
  3. After installation, invoke the skill by name or use /qwen-asr-skill
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.3.0
v1.3.0: Minimal release - 0.6B model only, no forced-alignment feature, reduced memory usage and dependencies
v1.2.1
Fixed dependency version: changed qwen-asr from 0.1.0 to 0.0.6 (the latest version on PyPI)
v1.2.0
v1.2.0: Fixed code errors - removed duplicate code, corrected logic errors, improved the privacy statement
v1.1.0
v1.1.0: Fixed security issue - removed hard-coded credentials
v1.0.0
Initial release of Qwen Dialect Speech Recognition (Qwen 方言语音识别):
- Provides speech-to-text conversion using the Qwen3-ASR-0.6B model.
- Supports 22 major Chinese dialects and 30 international languages.
- Runs on CPU without GPU requirements.
- Features automatic language detection and low-latency performance.
- Offers API endpoint for audio transcription with support for language selection and timestamps.
Metadata
Slug qwen-asr-skill
Version 1.3.0
License MIT-0
All-time Installs 2
Active Installs 2
Total Versions 5
Frequently Asked Questions

What is Qwen Asr Skill?

Provides high-accuracy speech-to-text conversion supporting 22 Chinese dialects and 30 languages with automatic language detection, running on CPU. It is an AI Agent Skill for Claude Code / OpenClaw, with 355 downloads so far.

How do I install Qwen Asr Skill?

Run "/install qwen-asr-skill" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Qwen Asr Skill free?

Yes, Qwen Asr Skill is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Qwen Asr Skill support?

Qwen Asr Skill is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Qwen Asr Skill?

It is built and maintained by Shuai YUAN (@yszheda); the current version is v1.3.0.
