
Qwen Audio

Name: Qwen Audio
Author: darknoah

by noah · GitHub ↗ · v0.0.6

cross-platform ⚠ suspicious

Downloads: 432

Install in OpenClaw

/install qwen-audio

Description

High-performance audio library with text-to-speech (TTS) and speech-to-text (STT).

Usage Guidance

This skill implements TTS/STT and largely does what it says, but take these precautions before installing or letting an agent run it:

- Run it in an isolated environment (VM/container) because it will download and install heavy ML packages and models (torch, qwen-tts/asr, etc.), which use significant disk, memory, and network.
- Ensure you have the 'uv' CLI and Python 3.10+ available; the SKILL.md uses 'uv run' but the registry metadata does not list 'uv' as a required binary.
- Expect network access to Hugging Face and other endpoints (the code probes HF_ENDPOINT and can download models). If you need to avoid external network traffic, do not install or run the skill.
- The script may auto-install missing Python packages via os.system('uv add ...'). This is a legitimate convenience but increases runtime privilege and attack surface. Review the pyproject.toml and the packages it will pull before proceeding.
- Voices and other files are stored under ./voices/ and the skill will write to the skill folder; consider filesystem permissions and where you run it.
- No credentials are requested, but environment variables (QWEN_AUDIO_DEVICE, QWEN_AUDIO_DTYPE, HF_ENDPOINT) influence behavior; these are not declared in the metadata and should be documented or locked down.

If you need lower risk, ask the author to (1) declare required binaries and env vars explicitly, (2) remove runtime auto-installs or make them opt-in, and (3) document model download endpoints and disk requirements. Review the full scripts/qwen-audio.py before granting the skill autonomous invocation.
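The 'uv' and Python 3.10+ prerequisites noted above could be verified before the skill is run. The following is a minimal sketch, not taken from the skill's code; preflight is a hypothetical helper name, and the only facts assumed from the listing are the binary name 'uv' and the minimum Python version.

```python
import shutil
import sys


def preflight(min_python=(3, 10), required_binaries=("uv",)):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: the skill itself does not ship this function.
    """
    problems = []
    # Compare only (major, minor) against the required minimum.
    if sys.version_info[:2] < tuple(min_python):
        problems.append(
            "Python %d.%d+ required, found %d.%d"
            % (min_python[0], min_python[1], sys.version_info[0], sys.version_info[1])
        )
    # shutil.which returns None when the executable is not on PATH.
    for name in required_binaries:
        if shutil.which(name) is None:
            problems.append("required binary not on PATH: " + name)
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Running the check inside the VM or container where the skill will live gives a quick answer without installing anything.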

Capability Analysis

Type: OpenClaw Skill
Name: qwen-audio
Version: 0.0.6

The skill bundle provides a legitimate implementation of audio processing capabilities (TTS, STT, and voice cloning) using Qwen-Audio models. The Python script `scripts/qwen-audio.py` acts as a wrapper for ML libraries, including a fallback mechanism to install dependencies via `os.system` and connectivity checks for Hugging Face. The `SKILL.md` and `env-check-list.md` files provide clear, functional instructions for the AI agent to manage the environment and interact with the user safely. No evidence of malicious intent, data exfiltration, or unauthorized execution was found.

Capability Assessment

ℹ Purpose & Capability

Name/description (TTS/STT) matches the included code and pyproject dependencies (qwen-asr, qwen-tts, mlx-audio, torch). However the SKILL.md and registry metadata claim no required binaries/env vars while the instructions and code rely on the 'uv' CLI, Python >=3.10, and may require network access to download large models. The overall capability is coherent with its stated purpose but some required runtime pieces are not declared in the metadata.

ℹ Instruction Scope

Runtime instructions tell the agent to run 'uv run ...' and to manipulate a local ./voices/ directory; the code will read and write these local voice files. Instructions require the user to run env-checks and to explicitly confirm voice selection before TTS, which limits accidental use. The SKILL.md does not explicitly warn that model downloads and package installs will occur, but the code will contact Hugging Face and other endpoints and can operate in online/offline modes.
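The online/offline behavior described above can be pictured with a small probe. This is a sketch under stated assumptions, not the skill's actual logic: endpoint_reachable and choose_hub_mode are hypothetical names, and the 3-second timeout is arbitrary. HF_ENDPOINT and HF_HUB_OFFLINE are the environment variables named in this listing; the default endpoint https://huggingface.co is the Hugging Face Hub default.

```python
import urllib.error
import urllib.request


def endpoint_reachable(url, timeout=3.0):
    """Return True if an HTTP HEAD request to url gets any HTTP response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, OSError, ValueError):
        return False  # DNS failure, timeout, refused connection, bad URL


def choose_hub_mode(env, probe=endpoint_reachable):
    """Decide between 'online' and 'offline' and record it in env.

    env is a mutable mapping such as os.environ. An existing
    HF_HUB_OFFLINE=1 is respected and no network probe is made.
    """
    if env.get("HF_HUB_OFFLINE") == "1":
        return "offline"
    endpoint = env.get("HF_ENDPOINT", "https://huggingface.co")
    if probe(endpoint):
        return "online"
    env["HF_HUB_OFFLINE"] = "1"
    return "offline"
```

Passing the probe as a parameter keeps the decision testable without network access. If this pattern is used with the real Hugging Face libraries, the variable most likely needs to be set before they are imported.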

⚠ Install Mechanism

There is no platform install spec (instruction-only), but the pyproject.toml lists heavy ML dependencies and a custom torch index. The script itself will run a shell command (os.system("uv add mlx-audio ...")) to install missing packages at runtime. Auto-install and model downloads introduce moderate risk (large network/disk operations and execution of runtime-installed packages).
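One way the runtime auto-install could be made opt-in is sketched below. It is an illustration only, not the author's code: ensure_package and the QWEN_AUDIO_ALLOW_AUTO_INSTALL variable are hypothetical, and the sketch swaps os.system for subprocess.run with an argument list so that the package spec is never interpreted by a shell.

```python
import importlib
import importlib.util
import os
import subprocess


def ensure_package(module_name, package_spec, env=None, runner=subprocess.run):
    """Install package_spec via `uv add` only if the user has opted in.

    module_name must be a top-level module name.
    Returns True if the module is importable afterwards, else False.
    """
    env = os.environ if env is None else env
    if importlib.util.find_spec(module_name) is not None:
        return True
    # Hypothetical opt-in flag; the real skill does not define it.
    if env.get("QWEN_AUDIO_ALLOW_AUTO_INSTALL") != "1":
        print(f"{module_name} is missing; run `uv add {package_spec}` to install it.")
        return False
    # Argument list, no shell involved.
    result = runner(["uv", "add", package_spec], check=False)
    if result.returncode != 0:
        return False
    importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None
```

With this shape, an agent running the skill without the flag gets a clear message instead of a silent install, and the user decides when new packages are pulled.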

ℹ Credentials

The skill declares no required environment variables, but the code reads/uses QWEN_AUDIO_DEVICE, QWEN_AUDIO_DTYPE, HF_ENDPOINT and may set HF_HUB_OFFLINE. No secret or credential env vars are requested. The mismatch between declared requirements and actual env usage reduces transparency and should be resolved before trusting the skill.
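Locking down the undeclared variables could look like the sketch below. The variable names come from this listing; everything else is an assumption for illustration: read_settings is a hypothetical helper, the allow-lists are guesses at plausible values (the real script may accept others), and requiring an https endpoint is a policy choice, not documented behavior.

```python
import os

# Assumed allow-lists; the real script may accept other values.
ALLOWED_DEVICES = {"auto", "cpu", "cuda", "mps"}
ALLOWED_DTYPES = {"auto", "float32", "float16", "bfloat16"}


def read_settings(env=None):
    """Read and validate the environment variables the skill is reported to use."""
    env = os.environ if env is None else env
    device = env.get("QWEN_AUDIO_DEVICE", "auto").strip().lower()
    dtype = env.get("QWEN_AUDIO_DTYPE", "auto").strip().lower()
    if device not in ALLOWED_DEVICES:
        raise ValueError(f"unsupported QWEN_AUDIO_DEVICE: {device!r}")
    if dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported QWEN_AUDIO_DTYPE: {dtype!r}")
    # https://huggingface.co is the Hugging Face Hub default endpoint.
    endpoint = env.get("HF_ENDPOINT", "https://huggingface.co").rstrip("/")
    if not endpoint.startswith("https://"):
        raise ValueError(f"HF_ENDPOINT must use https: {endpoint!r}")
    return {"device": device, "dtype": dtype, "hf_endpoint": endpoint}
```

Validating once at startup makes the effective configuration explicit, which is the transparency the assessment asks for.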

✓ Persistence & Privilege

The skill's `always` setting is false, and it does not request system-wide config changes or other skills' credentials. It will write voice profiles under its own ./voices/ directory and may create or update files such as references/env-check-list.md as instructed, which is normal for a local audio skill.
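Keeping writes inside ./voices/ can be enforced with a path check. The sketch below is an assumption-laden illustration: voice_path is a hypothetical helper, and the file name "alice.json" in the usage note is invented, since the listing does not describe how voice profiles are named or stored.

```python
from pathlib import Path


def voice_path(name, voices_dir="voices"):
    """Resolve name inside voices_dir, rejecting anything that escapes it."""
    root = Path(voices_dir).resolve()
    candidate = (root / name).resolve()
    # After resolution, the candidate must sit strictly below the root.
    if candidate == root or root not in candidate.parents:
        raise ValueError(f"voice name escapes {voices_dir!r}: {name!r}")
    return candidate
```

For example, voice_path("alice.json") returns a path under the voices directory, while voice_path("../escape.json") raises ValueError.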

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install qwen-audio
After installation, invoke the skill by name or use /qwen-audio
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.0.6

- No file changes detected for version 0.0.6.
- No user-facing updates, feature additions, or documentation changes in this release.
- Functionality and interface remain unchanged from the previous version.

v0.0.5

- Update to version 0.0.5
- Modified scripts/qwen-audio.py
- No user-facing documentation or feature changes noted in SKILL.md

v0.0.4

- Added detailed documentation for voice management, including creating, listing, and using custom voice profiles.
- Introduced clear prerequisites and environment check instructions.
- Provided step-by-step guidance and JSON response examples for text-to-speech (TTS) and speech-to-text (STT) functionalities.
- Explained the workflow for TTS voice selection and cloning, with emphasis on voice style and confirmation before generation.
- Described new STT output format options and included a test audio link.
- Improved clarity on usage and capabilities throughout the documentation.

Metadata

Slug qwen-audio

Version 0.0.6

License —

All-time Installs 1

Active Installs 1

Total Versions 3

Frequently Asked Questions

What is Qwen Audio?

High-performance audio library with text-to-speech (TTS) and speech-to-text (STT). It is an AI Agent Skill for Claude Code / OpenClaw, with 432 downloads so far.

How do I install Qwen Audio?

Run "/install qwen-audio" in the OpenClaw or Claude Code chat to install it in one step. At runtime the skill also expects the 'uv' CLI and Python 3.10+, and it may download ML packages and models on first use.

Is Qwen Audio free?

Yes, Qwen Audio is free to download, install, and use. The listing does not declare a license, so the exact open-source terms are unspecified.

Which platforms does Qwen Audio support?

Qwen Audio is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Qwen Audio?

It is built and maintained by noah (@darknoah); the current version is v0.0.6.
