
Qwen3-TTS VoiceDesign

Name: Qwen3-TTS VoiceDesign
Author: xiaoyaner0201

by xiaoyaner0201 · v1.0.0

cross-platform ⚠ suspicious

Downloads: 653

Install in OpenClaw

/install qwen3-tts-voicedesign

Description

Text-to-speech with Qwen3-TTS VoiceDesign. Design custom voices via natural language descriptions + seed-based timbre fixation. Includes OpenAI-compatible AP...

Usage Guidance

This package appears to do what it says: set up a local TTS server, download a voice model, and provide client scripts. Before installing:

1) Expect large downloads (~3.5GB) and pip installing many packages (including torch/CUDA). Run in a controlled environment or VM/container if you don't want changes to your main system.
2) The server clears HTTP(S)_PROXY env vars at startup. If you are on a corporate network that requires a proxy for outbound connections, that may change routing; run behind a firewall or bind the server to 127.0.0.1 (TTS_HOST) if you only need local access.
3) The setup and server will download model data from ModelScope/HuggingFace and install packages from PyPI. Verify you trust those sources and the specified model repo.
4) The client shell scripts construct JSON via simple interpolation. Avoid passing untrusted/unsanitized text that could break the shell invocation.
5) If you plan to expose the server beyond localhost, secure it (firewall, reverse proxy, auth) because it exposes an HTTP API.

If you want more assurance, run setup in an isolated container, inspect the pip-installed packages and the model repo, and avoid enabling systemd/scheduled-task instructions unless you understand the implications.
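Item 4 of the guidance above concerns request bodies built by string interpolation. A minimal sketch of the safer alternative follows: the body is serialized with Python's `json` module, so quotes, backslashes, newlines, and shell metacharacters in the text are escaped rather than interpreted. The field names `input`, `voice`, and `response_format` follow OpenAI's speech API; whether this server accepts exactly these fields is an assumption, `seed` is an assumed field name, and `build_speech_payload` is a hypothetical helper that is not part of the skill.

```python
import json


def build_speech_payload(text, voice="default", response_format="wav", seed=None):
    """Serialize a TTS request body; json.dumps handles all escaping."""
    body = {"input": text, "voice": voice, "response_format": response_format}
    if seed is not None:
        body["seed"] = seed  # assumed field name, not confirmed by the listing
    return json.dumps(body, ensure_ascii=False)


if __name__ == "__main__":
    tricky = 'He said "hi"; $(rm -rf ~) `whoami`\nnext line'
    payload = build_speech_payload(tricky, seed=42)
    # The payload round-trips exactly, so a server would see the literal text.
    assert json.loads(payload)["input"] == tricky
    print(payload)
```

Sending the resulting string as the request body from an HTTP client, instead of splicing text into a `curl -d` argument, keeps the shell from ever parsing the user's text.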

Capability Analysis

Type: OpenClaw Skill
Name: qwen3-tts-voicedesign
Version: 1.0.0

The skill bundle is classified as suspicious due to a critical shell injection vulnerability in `scripts/batch_seeds.sh`, where the `TEXT` variable is unsafely interpolated into a `curl -d` argument, allowing arbitrary command execution if user input contains shell metacharacters. Additionally, the `SKILL.md` documentation instructs users to set up a Windows scheduled task with `highest privileges` for server auto-restart, which is a significant security risk and persistence mechanism. The `tts_server.py` also defaults to binding on `0.0.0.0`, exposing the service to the network by default.
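The `0.0.0.0` default noted above can be checked before launch. The sketch below resolves the bind host from `TTS_HOST` (the variable named in the usage guidance), falls back to loopback when it is unset, and reports whether the chosen host is reachable from other machines. `resolve_bind_host` and `is_loopback` are hypothetical helpers for illustration; they are not taken from `tts_server.py`, whose actual default is `0.0.0.0` according to this analysis.

```python
import ipaddress
import os


def resolve_bind_host(env=None, default="127.0.0.1"):
    """Return the host to bind, falling back to loopback when TTS_HOST is unset."""
    env = os.environ if env is None else env
    return env.get("TTS_HOST", "").strip() or default


def is_loopback(host):
    """True when the host is a loopback address or the name 'localhost'."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # other hostnames are treated as potentially exposed


if __name__ == "__main__":
    host = resolve_bind_host()
    if not is_loopback(host):
        print(f"warning: binding to {host} exposes the API beyond this machine")
    print(host)
```

Exporting `TTS_HOST=127.0.0.1` before starting the server achieves the same effect without any wrapper, assuming the server honors that variable as the guidance states.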

Capability Assessment

✓ Purpose & Capability

Name/description (Qwen3-TTS VoiceDesign TTS server + client tools) matches the included scripts: a FastAPI server, client helpers, setup script and seed-batching tooling. The declared behavior (model download, one-click setup, OpenAI-compatible API) is consistent with the code.

ℹ Instruction Scope

SKILL.md instructs running setup.sh which creates a venv, pip-installs dependencies, downloads the model (ModelScope or Hugging Face), and runs the server; the runtime scripts only reference their .env and local files. Notable scope items: the server code clears proxy environment variables at start (potentially bypassing a corporate proxy), and the docs show guidance to register scheduled tasks or systemd units (these are only instructions, not executed automatically). The client scripts build JSON bodies via shell interpolation (potential for malformed input/escaping issues if used with untrusted text).
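To make the proxy behavior above concrete, here is a small sketch of what clearing proxy variables amounts to and how to inspect what would be lost on a given host. It works on a copy of the environment and changes nothing. The variable list is the conventional set of names; which of them `tts_server.py` actually clears is not stated in this listing, so the list and both helper functions are assumptions for illustration.

```python
import os

# Conventional proxy variable names; the exact set the server clears is
# not stated in the listing, so this tuple is an assumption.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy")


def without_proxy(env):
    """Return a copy of env with the proxy variables removed."""
    return {k: v for k, v in env.items() if k not in PROXY_VARS}


def proxy_settings(env):
    """Return the proxy variables currently set, for inspection before launch."""
    return {k: env[k] for k in PROXY_VARS if k in env}


if __name__ == "__main__":
    current = proxy_settings(os.environ)
    if current:
        print("proxy variables that would be dropped:", sorted(current))
    else:
        print("no proxy variables set; clearing them changes nothing")
```

On a network where outbound traffic must go through a proxy, a non-empty result suggests that model downloads started by the server process may fail or take a different route once the variables are cleared.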

ℹ Install Mechanism

There is no platform install spec, but setup.sh will pip-install packages (qwen-tts, soundfile, pydub, uvicorn, fastapi, numpy and possibly modelscope and torch from the official PyTorch index). It downloads the ~3.5GB model via ModelScope or Hugging Face. These are expected for a local TTS runtime but do involve network access and large binary downloads; the sources used (ModelScope/HuggingFace, PyTorch wheel index) are standard release hosts rather than arbitrary shorteners.

✓ Credentials

The skill requests no credentials and exposes only environment variables relevant to running a local TTS server (seed, instruct, model path, host/port, format). The only surprising behavior is that the server explicitly clears HTTP(S) proxy environment variables at startup, which may affect network routing on hosts that rely on proxies; this is operational (not credential) behavior and not an attempt to read secrets.

✓ Persistence & Privilege

The skill is not always-enabled and does not attempt to change other skills' config. setup.sh suggests how to create systemd units or a Windows scheduled task, but it does not automatically create system-level services or elevate privileges. You must run setup/start manually, so persistence is user-controlled.

How to Use

1) Make sure OpenClaw is installed (local or Docker)
2) Run the install command in chat: /install qwen3-tts-voicedesign
3) After installation, invoke the skill by name or use /qwen3-tts-voicedesign
4) Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release: VoiceDesign voice design via natural language + seed fixation, OpenAI-compatible API server, one-click setup, batch seed exploration

Metadata

Slug qwen3-tts-voicedesign

Version 1.0.0

License —

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Qwen3-TTS VoiceDesign?

Text-to-speech with Qwen3-TTS VoiceDesign. Design custom voices via natural language descriptions + seed-based timbre fixation. Includes OpenAI-compatible AP... It is an AI Agent Skill for Claude Code / OpenClaw, with 653 downloads so far.

How do I install Qwen3-TTS VoiceDesign?

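Run "/install qwen3-tts-voicedesign" in the OpenClaw or Claude Code chat to install the skill. The local TTS server is not set up automatically: you still run setup.sh manually, which creates a venv, pip-installs the dependencies, and downloads the ~3.5GB model.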

Is Qwen3-TTS VoiceDesign free?

Yes, Qwen3-TTS VoiceDesign is free to download, install, and use. Note that the listing's metadata does not specify a license.

Which platforms does Qwen3-TTS VoiceDesign support?

Qwen3-TTS VoiceDesign is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Qwen3-TTS VoiceDesign?

It is built and maintained by xiaoyaner0201 (@xiaoyaner0201); the current version is v1.0.0.
