
qwen-audio-lab

Name: qwen-audio-lab
Author: aliyx

by aliyx · GitHub ↗ · v0.0.1 · MIT-0

cross-platform ⚠ suspicious

221 Downloads

Install in OpenClaw

/install qwen-audio-lab

Description

Hybrid text-to-speech, reusable voice cloning, and narrated audio generation for macOS plus Aliyun Qwen. Use when the user wants to convert text into speech,...

Usage Guidance

What to consider before installing:
- The skill does what it claims (local macOS 'say' + remote Qwen/DashScope TTS and voice-clone). However, the package metadata did NOT declare the required DASHSCOPE_API_KEY even though SKILL.md and the script require it. Treat that as a red flag: metadata should match runtime requirements.
- The script will make network calls to DashScope endpoints (https://dashscope.aliyuncs.com and https://dashscope-intl.aliyuncs.com). Only provide an API key if you trust the endpoint and the skill source.
- The skill stores outputs and remembered-voice state under ~/.openclaw/data/qwen-audio-lab; verify you are comfortable with that directory being created and written to.
- Some operations (audio trimming) require ffmpeg, and local playback uses macOS 'say'. These are normal but will invoke subprocesses.
- Voice cloning can have legal and consent implications. The SKILL.md recommends asking for permission; you should enforce that policy yourself before cloning third-party voices.
- Because the skill source is 'unknown' and the registry metadata is inconsistent, prefer to inspect the full script locally (ensure the truncated portion contains only TTS/manage-voice logic) or obtain the skill from a trusted publisher before supplying credentials. If you proceed, limit the scope/permissions of the API key (if possible) and monitor network activity.
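The advice to inspect the script locally can be partly automated. The following is a minimal sketch, not part of the skill: it scans source text for literal URLs and reports any hostname other than the two DashScope endpoints named above. The helper name `find_unexpected_hosts` and the regular expression are illustrative assumptions, and the check only catches URLs written out as literals, so it complements a manual read rather than replacing one.

```python
import re
from urllib.parse import urlparse

# Hosts named in the guidance above; anything else deserves a closer look.
ALLOWED_HOSTS = {"dashscope.aliyuncs.com", "dashscope-intl.aliyuncs.com"}

# Rough pattern for literal http(s) URLs embedded in source code (an assumption,
# not an exhaustive URL grammar).
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")


def find_unexpected_hosts(source_text: str) -> set[str]:
    """Return hostnames found in source_text that are not in ALLOWED_HOSTS."""
    unexpected = set()
    for url in URL_PATTERN.findall(source_text):
        host = urlparse(url).hostname
        if host and host not in ALLOWED_HOSTS:
            unexpected.add(host)
    return unexpected


if __name__ == "__main__":
    import sys

    # Usage: python check_hosts.py path/to/scripts/qwen_audio.py
    with open(sys.argv[1], encoding="utf-8") as handle:
        hosts = find_unexpected_hosts(handle.read())
    print("unexpected hosts:", sorted(hosts) or "none")
```

An empty result does not prove the script is safe, since hostnames can be assembled at runtime; it only rules out the most obvious case.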

Capability Analysis

Type: OpenClaw Skill
Name: qwen-audio-lab
Version: 0.0.1

The skill bundle provides a legitimate interface for Aliyun Qwen's text-to-speech and voice cloning services, including macOS local speech integration. The script `scripts/qwen_audio.py` uses standard Python libraries (urllib, subprocess, zipfile) to interact with the DashScope API and process audio files. It includes proper input sanitization for filenames and uses list-based subprocess calls to prevent shell injection. No evidence of data exfiltration, malicious execution, or prompt injection was found.
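To illustrate what filename sanitization and list-based subprocess calls look like in practice, here is a short sketch. It is not taken from `scripts/qwen_audio.py`; the function names `safe_filename`, `build_trim_command`, and `trim_audio` are hypothetical, and the exact character whitelist is an assumption. The point is that arguments passed as a list are never interpreted by a shell, so characters such as `;` or `$` in a file name cannot inject commands.

```python
import re
import subprocess
from pathlib import Path


def safe_filename(name: str, default: str = "output") -> str:
    """Reduce an arbitrary string to a conservative file name (hypothetical helper)."""
    # Drop directory components, then replace anything outside a small whitelist.
    base = Path(name).name
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", base).strip("._")
    return cleaned or default


def build_trim_command(src: Path, dst: Path, start: float, duration: float) -> list[str]:
    """Build an ffmpeg argument list; each value is its own argv entry."""
    return [
        "ffmpeg", "-y",          # overwrite the destination if it exists
        "-ss", str(start),       # seek to the start offset (seconds)
        "-t", str(duration),     # keep this many seconds
        "-i", str(src),
        str(dst),
    ]


def trim_audio(src: Path, dst: Path, start: float, duration: float) -> None:
    """Run ffmpeg without a shell; requires ffmpeg to be on PATH."""
    subprocess.run(build_trim_command(src, dst, start, duration), check=True)
```

Separating command construction from execution also makes the argument list easy to inspect or log before anything is run.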

Capability Assessment

ℹ Purpose & Capability

The name/description (macOS + Aliyun Qwen TTS, voice cloning, narrated PPTs) matches what the code and SKILL.md implement: local 'say' playback, Qwen TTS calls, voice cloning/design endpoints, and local storage of outputs and remembered voices. However, the registry metadata lists no required environment variables or primary credential, while both SKILL.md and the code require DASHSCOPE_API_KEY. This omission is an inconsistency to be aware of.

✓ Instruction Scope

The SKILL.md instructions and the included script remain focused on TTS/voice workflows. They reference only task-relevant files/paths (user home ~/.openclaw/data/qwen-audio-lab for outputs/state), optional ffmpeg for trimming, and network calls to DashScope (Aliyun) APIs. There is no instruction to read unrelated system files, shell history, or to exfiltrate arbitrary data.

✓ Install Mechanism

This is an instruction-only skill with an included Python script and no install spec; nothing is downloaded from external URLs during install. Runtime will execute local scripts and may call external network endpoints. No archive downloads or remote installers were specified.

⚠ Credentials

The code and SKILL.md require DASHSCOPE_API_KEY (plus optional QWEN_AUDIO_REGION, QWEN_AUDIO_OUTPUT_DIR, QWEN_AUDIO_STATE_DIR), but the registry metadata declared no required env vars or primary credential. This mismatch is concerning because the skill needs an API key to access remote TTS/voice-cloning services; the package should declare that requirement explicitly. Aside from the missing declaration, the environment access requested by the script (API key + optional dirs) is proportionate to the stated purpose.
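As a rough sketch of how a script might consume these variables, the example below fails early when DASHSCOPE_API_KEY is missing and treats the other three as optional. The `AudioConfig` class and `load_config` function are hypothetical, not the skill's actual code. Defaulting both directories to ~/.openclaw/data/qwen-audio-lab is an assumption based on the path this review mentions, and valid values for QWEN_AUDIO_REGION are not stated here, so the value is passed through unchanged.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Assumed default, taken from the data directory mentioned in this review.
DEFAULT_DATA_DIR = Path.home() / ".openclaw" / "data" / "qwen-audio-lab"


@dataclass(frozen=True)
class AudioConfig:
    api_key: str
    region: Optional[str]
    output_dir: Path
    state_dir: Path


def load_config(env: Mapping[str, str] = os.environ) -> AudioConfig:
    """Read the skill's environment variables (hypothetical helper)."""
    api_key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not api_key:
        # Fail before any network call is attempted.
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    return AudioConfig(
        api_key=api_key,
        region=env.get("QWEN_AUDIO_REGION") or None,
        output_dir=Path(env.get("QWEN_AUDIO_OUTPUT_DIR") or DEFAULT_DATA_DIR),
        state_dir=Path(env.get("QWEN_AUDIO_STATE_DIR") or DEFAULT_DATA_DIR),
    )
```

Accepting the environment as a mapping keeps the function testable without touching the real process environment.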

✓ Persistence & Privilege

The skill does not request always:true and does not modify other skills or global configs. It writes state and outputs under ~/.openclaw/data/qwen-audio-lab (its own directory) which is normal for persistent skill state.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install qwen-audio-lab
After installation, invoke the skill by name or use /qwen-audio-lab
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.0.1

Initial release: Hybrid text-to-speech skill for macOS and Aliyun Qwen, with support for voice cloning and narrated file generation.
- Provides text-to-speech via both local macOS and Aliyun Qwen backends.
- Supports cloning and reusing voices from user-supplied audio samples.
- Generates narration audio from plain text, text files, or PPT speaker notes.
- Offers easy high-level commands for narration, as well as legacy commands for backward compatibility.
- Adds environment variables for API keys, output directories, and state management.

Metadata

Slug qwen-audio-lab

Version 0.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is qwen-audio-lab?

Hybrid text-to-speech, reusable voice cloning, and narrated audio generation for macOS plus Aliyun Qwen. Use when the user wants to convert text into speech,... It is an AI Agent Skill for Claude Code / OpenClaw, with 221 downloads so far.

How do I install qwen-audio-lab?

Run "/install qwen-audio-lab" in the OpenClaw or Claude Code chat to install it in one step. The remote Qwen TTS and voice-cloning features additionally require DASHSCOPE_API_KEY to be set at runtime.

Is qwen-audio-lab free?

Yes, the skill itself is free and licensed under MIT-0; you can download, install, and use it at no cost. The remote DashScope service it calls is separate and requires your own API key.

Which platforms does qwen-audio-lab support?

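qwen-audio-lab is listed as cross-platform and can be installed anywhere OpenClaw / Claude Code is available. Note that local playback relies on the macOS 'say' command, so the local speech backend is macOS-only.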

Who created qwen-audio-lab?

It is built and maintained by aliyx (@aliyx); the current version is v0.0.1.
