huanglizhuo

Qwen ASR

by lizhuo · GitHub ↗ · v0.1.2 · MIT-0
darwin · linux · ⚠ suspicious
292 Downloads · 0 Stars · 0 Active Installs · 3 Versions
Install in OpenClaw
/install qwen-asr-local
Description
Local speech-to-text using Qwen3-ASR (CPU-only, no API key, no cloud). Use when: (1) a voice message or audio file needs transcription, (2) user asks to tran...
Usage Guidance
This skill appears consistent with its stated purpose. Before installing: (1) review the install script and the GitHub release it downloads from and only run it if you trust that repo (prebuilt native binaries execute with your user privileges); (2) expect ~1.5 GB model download and network access to GitHub/HuggingFace; (3) ffmpeg is required for non‑WAV inputs; (4) install writes to ~/.local/bin and ~/.openclaw/tools/qwen-asr — you may need to add ~/.local/bin to your PATH; (5) if you need fully air-gapped/local operation, verify the model is cached locally or that the model download will not require a HuggingFace token. Otherwise the skill is internally coherent and proportionate.
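The checks in the guidance above can be scripted before running the installer. The following is a minimal sketch and not part of the skill: the function names are hypothetical, and the 2 GiB free-space threshold is an assumed margin over the ~1.5 GB model size.

```python
import os
import shutil
from pathlib import Path


def local_bin_on_path(path_env: str, home: str) -> bool:
    """Return True if <home>/.local/bin is one of the entries in a PATH string."""
    target = os.path.normpath(os.path.join(home, ".local", "bin"))
    entries = [os.path.normpath(p) for p in path_env.split(os.pathsep) if p]
    return target in entries


def preinstall_report(min_free_bytes: int = 2 * 1024**3) -> dict:
    """Collect the pre-install facts named in the guidance (keys are illustrative)."""
    home = str(Path.home())
    return {
        # ffmpeg is only needed for non-WAV inputs
        "ffmpeg_found": shutil.which("ffmpeg") is not None,
        # the installer is said to place the binary in ~/.local/bin
        "local_bin_on_path": local_bin_on_path(os.environ.get("PATH", ""), home),
        # assumed headroom for the model download
        "enough_disk": shutil.disk_usage(home).free >= min_free_bytes,
    }
```

Calling `preinstall_report()` returns a dictionary of booleans; any `False` entry points at one of the items to fix before installing.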
Capability Analysis
Type: OpenClaw Skill
Name: qwen-asr-local
Version: 0.1.2
The skill provides local speech-to-text by downloading a pre-compiled binary from a third-party GitHub repository (huanglizhuo/QwenASR) and a 1.5 GB model from HuggingFace. While these actions are aligned with the stated purpose of the tool, downloading and executing unverified binaries from external sources is a high-risk capability. The scripts (install.sh and transcribe.sh) are functionally sound and do not show signs of intentional malice, data exfiltration, or prompt injection.
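One way to reduce the risk noted above is to compare a downloaded release archive against a digest obtained from a source you trust. The sketch below is illustrative only: the listing does not say whether the project publishes checksums, so the expected digest and the file path are hypothetical inputs you would have to supply yourself.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a user-supplied expected value (hypothetical)."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A mismatch means the file differs from the one the digest was computed for; a match only shows the file is the expected one, not that its contents are safe.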
Capability Assessment
Purpose & Capability
Name/description (local Qwen3-ASR CPU transcription) match the declared requirement (qwen-asr binary) and the included scripts. No unrelated environment variables, credentials, or unusual binaries are requested.
Instruction Scope
SKILL.md and transcribe.sh only describe running the qwen-asr binary (locally) and converting audio via ffmpeg when needed. The install script downloads a release and the model and writes them under the user's home; scripts do not read unrelated system files or exfiltrate data. The only optional env var is QWEN_ASR_MODEL_DIR to override the model path.
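The convert-when-needed behaviour described above can be sketched as follows. This is not the actual transcribe.sh. It assumes ffmpeg is on PATH, and the 16 kHz mono output is a common choice for speech models rather than a documented requirement of qwen-asr. All function names are hypothetical.

```python
import subprocess
from pathlib import Path


def needs_conversion(audio_path: str) -> bool:
    """Treat anything without a .wav extension as needing conversion."""
    return Path(audio_path).suffix.lower() != ".wav"


def ffmpeg_to_wav_command(src: str, dst: str, sample_rate: int = 16000) -> list:
    """Build an ffmpeg command that writes mono PCM WAV (sample rate is an assumption)."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input in any format ffmpeg can decode
        "-ac", "1",               # downmix to one channel
        "-ar", str(sample_rate),  # resample to the requested rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        dst,
    ]


def ensure_wav(src: str, dst: str) -> str:
    """Return a WAV path for src, invoking ffmpeg only when conversion is required."""
    if not needs_conversion(src):
        return src
    subprocess.run(ffmpeg_to_wav_command(src, dst), check=True)
    return dst
```

Building the command as a list and passing it to `subprocess.run` without a shell avoids quoting problems with file names that contain spaces.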
Install Mechanism
Install script fetches a prebuilt release from the project's GitHub Releases and extracts it to ~/.local/bin; model download is performed by qwen-asr (presumably fetching from HuggingFace). Using GitHub Releases and the model download command is expected for this purpose; no obscure/shortened URLs or third-party personal servers are used.
Credentials
No secrets or extra environment variables are required. The only environment interaction is an optional QWEN_ASR_MODEL_DIR and use of PATH/ffmpeg. Note: model download requires network access and some HuggingFace-hosted models may require authentication in other contexts, but no credential is requested by this skill.
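How a wrapper might honour the optional QWEN_ASR_MODEL_DIR override can be sketched like this. Only the variable name comes from the listing. The default directory below is an assumption based on the install location mentioned in the usage guidance, and the real scripts may resolve the path differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Assumed default: the listing says the installer writes under this directory,
# but it does not state exactly where the model files live.
ASSUMED_DEFAULT_MODEL_DIR = Path(os.path.expanduser("~/.openclaw/tools/qwen-asr"))


def resolve_model_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Prefer QWEN_ASR_MODEL_DIR when set and non-empty, else the assumed default."""
    env = os.environ if env is None else env
    override = env.get("QWEN_ASR_MODEL_DIR", "").strip()
    if override:
        return Path(os.path.expanduser(override))
    return ASSUMED_DEFAULT_MODEL_DIR
```

Passing the environment as a parameter keeps the function easy to exercise without modifying the real process environment.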
Persistence & Privilege
The skill is not force-enabled (always:false) and does not modify other skills or system-wide agent settings. It writes binaries and models into the user's home directories (standard for local tools) but does not request elevated privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install qwen-asr-local
  3. After installation, invoke the skill by name or use /qwen-asr-local
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.2
- Added source code and original implementation links for QwenASR and qwen-asr in documentation.
- No changes to functionality or usage.
v0.1.1
- Added support for automatic audio format conversion via ffmpeg when using the transcribe.sh script.
- Now accepts a wider range of audio formats (wav, mp3, m4a, ogg, flac, opus, webm, aac, etc.) for transcription.
- Clarified that direct qwen-asr command works with WAV files only.
- Updated usage instructions to reflect these changes.
v0.1.0
Initial release of qwen-asr-local: local, offline speech-to-text using Qwen3-ASR.
- Runs entirely on CPU with no API key or cloud required.
- Supports transcription of audio files, voice messages, and input from stdin.
- Provides segmented and real-time streaming transcription modes.
- Compatible with macOS and Linux only.
- Simple installation via provided script; downloads pre-built binary and model files.
Metadata
Slug qwen-asr-local
Version 0.1.2
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 3
Frequently Asked Questions

What is Qwen ASR?

Local speech-to-text using Qwen3-ASR (CPU-only, no API key, no cloud). Use when: (1) a voice message or audio file needs transcription, (2) user asks to tran... It is an AI Agent Skill for Claude Code / OpenClaw, with 292 downloads so far.

How do I install Qwen ASR?

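Run "/install qwen-asr-local" in the OpenClaw or Claude Code chat. The install script downloads a prebuilt binary and a ~1.5 GB model, so network access is needed; ffmpeg is required for non-WAV inputs, and you may need to add ~/.local/bin to your PATH.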

Is Qwen ASR free?

Yes, Qwen ASR is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Qwen ASR support?

Qwen ASR supports macOS (darwin) and Linux only, wherever OpenClaw / Claude Code is available on those platforms.

Who created Qwen ASR?

It is built and maintained by lizhuo (@huanglizhuo); the current version is v0.1.2.
