
Qwen3-TTS VoiceDesign

by xiaoyaner0201 · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
Downloads 653
Stars 0
Active Installs 0
Versions 1
Install in OpenClaw
/install qwen3-tts-voicedesign
Description
Text-to-speech with Qwen3-TTS VoiceDesign. Design custom voices via natural language descriptions + seed-based timbre fixation. Includes OpenAI-compatible AP...
Usage Guidance
This package appears to do what it says: set up a local TTS server, download a voice model, and provide client scripts. Before installing:
  1. Expect large downloads (~3.5GB) and pip installing many packages (including torch/CUDA). Run in a controlled environment or VM/container if you don't want changes to your main system.
  2. The server clears HTTP(S)_PROXY env vars at startup. If you are on a corporate network that requires a proxy for outbound connections, that may change routing; run behind a firewall or bind the server to 127.0.0.1 (TTS_HOST) if you only need local access.
  3. The setup and server will download model data from ModelScope/HuggingFace and install packages from PyPI. Verify you trust those sources and the specified model repo.
  4. The client shell scripts construct JSON via simple interpolation. Avoid passing untrusted/unsanitized text that could break the shell invocation.
  5. If you plan to expose the server beyond localhost, secure it (firewall, reverse proxy, auth) because it exposes an HTTP API.
If you want more assurance, run setup in an isolated container, inspect the pip-installed packages and the model repo, and avoid enabling systemd/scheduled-task instructions unless you understand the implications.
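As a rough illustration of point 2, the sketch below checks whether a configured bind address stays on the local machine. `TTS_HOST` is the variable named above; the helper names are hypothetical, and the loopback default here is a deliberate choice for the sketch, not a description of what `tts_server.py` does.

```python
import ipaddress
import os


def resolve_bind_host(env=None):
    """Return the host to bind to: TTS_HOST if set, else loopback.

    Assumption: defaulting to 127.0.0.1 is this sketch's choice,
    not the shipped server's behavior.
    """
    env = os.environ if env is None else env
    return env.get("TTS_HOST", "127.0.0.1")


def is_loopback_only(host):
    """True if the host is a loopback address or the name 'localhost'."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; treat unknown hostnames as potentially exposed.
        return False


if __name__ == "__main__":
    host = resolve_bind_host()
    if not is_loopback_only(host):
        print(f"warning: {host} is reachable from the network; add a firewall or auth")
```

Running the check before starting the server is a cheap way to notice that a value such as `0.0.0.0` would expose the HTTP API beyond localhost.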
Capability Analysis
Type: OpenClaw Skill
Name: qwen3-tts-voicedesign
Version: 1.0.0
The skill bundle is classified as suspicious due to a critical shell injection vulnerability in `scripts/batch_seeds.sh` where the `TEXT` variable is unsafely interpolated into a `curl -d` argument, allowing arbitrary command execution if user input contains shell metacharacters. Additionally, the `SKILL.md` documentation instructs users to set up a Windows scheduled task with `highest privileges` for server auto-restart, which is a significant security risk and persistence mechanism. The `tts_server.py` also defaults to binding on `0.0.0.0`, exposing the service to the network by default.
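One way to avoid the interpolation problem is to serialize the request body with a JSON library and hand it to the HTTP client without passing through a shell. The following is a minimal sketch, not the skill's code: the field names `input`, `instruct` and `seed`, and both helper functions, are assumptions made for illustration.

```python
import json


def build_speech_request(text, instruct=None, seed=None):
    """Serialize a request body; json.dumps escapes quotes and backslashes.

    Field names are hypothetical; check the server's actual schema.
    """
    payload = {"input": text}
    if instruct is not None:
        payload["instruct"] = instruct
    if seed is not None:
        payload["seed"] = seed
    return json.dumps(payload, ensure_ascii=False)


def curl_argv(url):
    """Build an argument list for curl that reads the body from stdin.

    Passing a list to subprocess.run (no shell=True) means the text is
    never parsed by a shell, so $(...) and backticks stay literal.
    """
    return ["curl", "-sS", "-X", "POST",
            "-H", "Content-Type: application/json",
            "--data-binary", "@-", url]


hostile = 'He said "hi"; $(rm -rf ~) `id`'
body = build_speech_request(hostile, seed=42)

# Naive string interpolation yields invalid JSON for the same input.
naive = '{"input": "%s"}' % hostile
try:
    json.loads(naive)
    naive_is_valid = True
except json.JSONDecodeError:
    naive_is_valid = False
```

The body would then be sent with something like `subprocess.run(curl_argv(url), input=body.encode("utf-8"))`, where `url` is whatever endpoint the server actually exposes.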
Capability Assessment
Purpose & Capability
Name/description (Qwen3-TTS VoiceDesign TTS server + client tools) matches the included scripts: a FastAPI server, client helpers, setup script and seed-batching tooling. The declared behavior (model download, one-click setup, OpenAI-compatible API) is consistent with the code.
Instruction Scope
SKILL.md instructs running setup.sh which creates a venv, pip-installs dependencies, downloads the model (ModelScope or Hugging Face), and runs the server; the runtime scripts only reference their .env and local files. Notable scope items: the server code clears proxy environment variables at start (potentially bypassing a corporate proxy), and the docs show guidance to register scheduled tasks or systemd units (these are only instructions, not executed automatically). The client scripts build JSON bodies via shell interpolation (potential for malformed input/escaping issues if used with untrusted text).
Install Mechanism
There is no platform install spec, but setup.sh will pip-install packages (qwen-tts, soundfile, pydub, uvicorn, fastapi, numpy and possibly modelscope and torch from the official PyTorch index). It downloads the ~3.5GB model via ModelScope or Hugging Face. These are expected for a local TTS runtime but do involve network access and large binary downloads; the sources used (ModelScope/HuggingFace, PyTorch wheel index) are standard release hosts rather than arbitrary URL shorteners.
Credentials
The skill requests no credentials and exposes only environment variables relevant to running a local TTS server (seed, instruct, model path, host/port, format). The only surprising behavior is that the server explicitly clears HTTP(S) proxy environment variables at startup, which may affect network routing on hosts that rely on proxies; this is operational (not credential) behavior and not an attempt to read secrets.
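To see what clearing proxy variables means in practice, the sketch below strips them from an environment mapping and reports what was removed. It is an illustration under assumptions: the exact variable names the server clears are not listed here, so `PROXY_VARS` is a guess at the common spellings, and `strip_proxy_vars` is a hypothetical helper.

```python
# Assumed set of names; the server's actual list may differ.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy")


def strip_proxy_vars(env):
    """Return (cleaned, removed) copies of env without mutating the input."""
    removed = {k: v for k, v in env.items() if k in PROXY_VARS}
    cleaned = {k: v for k, v in env.items() if k not in PROXY_VARS}
    return cleaned, removed


example_env = {
    "PATH": "/usr/bin",
    "HTTP_PROXY": "http://proxy.example:8080",
    "https_proxy": "http://proxy.example:8080",
}
cleaned, removed = strip_proxy_vars(example_env)
```

On a host that relies on a proxy, a non-empty `removed` mapping signals that outbound requests made by the process would attempt direct connections instead.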
Persistence & Privilege
The skill is not always-enabled and does not attempt to change other skills' config. setup.sh suggests how to create systemd units or a Windows scheduled task, but it does not automatically create system-level services or elevate privileges. You must run setup/start manually, so persistence is user-controlled.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install qwen3-tts-voicedesign
  3. After installation, invoke the skill by name or use /qwen3-tts-voicedesign
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: VoiceDesign voice design via natural language + seed fixation, OpenAI-compatible API server, one-click setup, batch seed exploration
Metadata
Slug qwen3-tts-voicedesign
Version 1.0.0
License
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Qwen3-TTS VoiceDesign?

Text-to-speech with Qwen3-TTS VoiceDesign. Design custom voices via natural language descriptions + seed-based timbre fixation. Includes OpenAI-compatible AP... It is an AI Agent Skill for Claude Code / OpenClaw, with 653 downloads so far.

How do I install Qwen3-TTS VoiceDesign?

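Run "/install qwen3-tts-voicedesign" in the OpenClaw or Claude Code chat to install the skill. Running the TTS server still requires the skill's setup.sh, which creates a venv, pip-installs dependencies, and downloads the ~3.5GB model.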

Is Qwen3-TTS VoiceDesign free?

Yes, Qwen3-TTS VoiceDesign is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Qwen3-TTS VoiceDesign support?

Qwen3-TTS VoiceDesign is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Qwen3-TTS VoiceDesign?

It is built and maintained by xiaoyaner0201 (@xiaoyaner0201); the current version is v1.0.0.
