Downloads: 332
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install qwen3-tts-mlx
Description
Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS.
Usage Guidance
This package is consistent with a local TTS tool, but take these precautions before installing:
- Expect network activity and large downloads unless you pre-download models: the scripts refer to model identifiers (mlx-community/...), which are normally fetched from remote model hubs. If you truly need offline-only operation, confirm how the models are downloaded and pre-stage the weights locally.
- Review and pin the 'mlx-audio' package (and its dependencies) before pip install to reduce supply-chain risk. Consider installing in a virtualenv or sandbox.
- Be prepared for significant disk usage (models are ~1–2+ GB each) and possible memory/GPU requirements on your Mac.
- Voice cloning processes user audio; consider privacy and consent implications before processing other people's voice samples.
- The batch script monkeypatches transformers.AutoTokenizer to set a flag — this mutates third-party behavior in-process. It's likely harmless but you may want to review or remove that patch if you prefer not to change library internals.
- If you need stronger assurance, inspect or run the code in an isolated environment, monitor network connections during first runs, and verify the model sources and licences (including that they permit your use case).
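The pre-staging advice above can be checked before a first run. The following is a minimal sketch, assuming the models are fetched through huggingface_hub and land in its standard cache layout (a `models--<org>--<name>/snapshots/` directory under the hub cache); that mlx-audio uses this cache is an assumption to verify. The repo id `mlx-community/EXAMPLE-MODEL` is a hypothetical placeholder, not a model named by the skill.

```python
import os
from pathlib import Path
from typing import Optional


def hf_cache_root() -> Path:
    """Return the Hugging Face hub cache directory, honoring the usual overrides."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def is_model_cached(repo_id: str, cache_root: Optional[Path] = None) -> bool:
    """True if at least one snapshot directory for repo_id exists locally."""
    root = cache_root if cache_root is not None else hf_cache_root()
    # Cache folders are named models--<org>--<name>.
    snapshots = root / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    return snapshots.is_dir() and any(p.is_dir() for p in snapshots.iterdir())


if __name__ == "__main__":
    # Hypothetical repo id; substitute the identifier the scripts actually use.
    repo = "mlx-community/EXAMPLE-MODEL"
    print(repo, "cached locally:", is_model_cached(repo))
```

If the check passes, setting the environment variable HF_HUB_OFFLINE=1 before running the scripts is the usual way to ask huggingface_hub not to contact the network. Whether every code path in mlx-audio honors it should still be confirmed by monitoring connections.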
Capability Analysis
Type: OpenClaw Skill
Name: qwen3-tts-mlx
Version: 2.1.0
The OpenClaw AgentSkills bundle for Qwen3-TTS MLX appears benign. The `SKILL.md` provides clear instructions for installation and usage, without any hidden commands or prompt injection attempts targeting the AI agent. The Python scripts (`scripts/batch_dubbing.py`, `scripts/run_tts.py`) perform text-to-speech generation, handling input text and output audio files as expected. They utilize standard libraries like `mlx-audio`, `soundfile`, and `numpy`, and download models from `mlx-community`, which is a legitimate source for MLX models. There is no evidence of data exfiltration, unauthorized command execution, persistence mechanisms, or obfuscation.
Capability Assessment
Purpose & Capability
Name/description and the scripts align with a local TTS tool using mlx-audio and MLX models; however the SKILL.md emphasizes 'offline' usage while the code calls generate_audio with model names (e.g. mlx-community/...) that will typically be fetched from remote model repositories (Hugging Face/MLX) unless pre-downloaded. The documentation does not explain model download, disk/storage needs (~1–2+ GB per model), or how to operate fully offline, which is a meaningful mismatch.
Instruction Scope
The SKILL.md instructs installing mlx-audio and ffmpeg and running the provided scripts; the scripts read user JSON and audio files and write outputs (expected). The runtime instructions do not disclose that generate_audio may download models or contact remote endpoints; that omission gives the agent broader network/IO behavior than the 'offline' claim implies. The batch script also monkeypatches transformers.AutoTokenizer to set a fix flag — a side-effect that mutates third-party library behavior in-process (benign but notable).
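For readers who prefer not to leave a third-party class mutated for the life of the process, one option is to scope the patch. The sketch below is not the skill's code: `FakeTokenizer` and `fix_flag` are hypothetical stand-ins, because the review does not name the attribute that the batch script sets on transformers.AutoTokenizer.

```python
from contextlib import contextmanager


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore the prior state."""
    had_own = name in vars(obj)        # was the attribute defined directly on obj?
    original = vars(obj).get(name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        if had_own:
            setattr(obj, name, original)
        else:
            delattr(obj, name)         # remove what the patch added


class FakeTokenizer:
    """Stand-in for a third-party class; not the real transformers.AutoTokenizer."""
    fix_flag = False                   # hypothetical attribute name


if __name__ == "__main__":
    with temporary_attr(FakeTokenizer, "fix_flag", True):
        print("inside:", FakeTokenizer.fix_flag)
    print("after:", FakeTokenizer.fix_flag)
```

Wrapping only the synthesis call in such a block keeps the change local, at the cost of re-applying it on every call.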
Install Mechanism
There is no formal install spec; SKILL.md recommends 'pip install mlx-audio' and 'brew install ffmpeg'. Using pip means pulling code from PyPI (or whichever index the environment uses). This is an expected pattern for Python scripts but carries typical supply-chain risks: the package 'mlx-audio' and its dependencies should be reviewed and/or pinned. No arbitrary download URLs or archived extracts are included in the skill files themselves.
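As a rough illustration of the pinning advice, the helper below flags requirement lines that lack an exact `==` version. It is a sketch that covers only simple one-line requirements; the function name and the version numbers in the example are illustrative, not taken from the skill.

```python
import re

# name, optional [extras], ==, an exact version (no wildcard), optional ; marker
_PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$"
)


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned to an exact version."""
    result = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            result.append(line)
    return result


if __name__ == "__main__":
    # Illustrative requirements; the versions shown are examples only.
    reqs = ["mlx-audio", "numpy==1.26.4", "# audio I/O", "soundfile>=0.12"]
    print(unpinned_requirements(reqs))
```

Pinning versions narrows what pip may install. For stronger guarantees, pip's hash-checking mode (`--require-hashes`) additionally verifies each downloaded archive.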
Credentials
The skill requests no environment variables, credentials, or config paths. The scripts operate only on user-provided files and local outputs, which is proportionate for a TTS tool.
Persistence & Privilege
The skill is not always-enabled and does not request persistent elevated privileges or attempt to modify other skills or system-wide agent settings. It runs as a user-invoked CLI tool — expected behavior.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install qwen3-tts-mlx
- After installation, invoke the skill by name or use /qwen3-tts-mlx
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.1.0
- Major update with expanded documentation and feature explanations for Qwen3-TTS MLX use on Apple Silicon.
- Details modes for built-in voices with style control, voice design from text, and voice cloning.
- Provides step-by-step quick start, install, and usage instructions for both CLI and Python API.
- Lists supported languages, built-in voices, model variants, and recommended scenarios.
- Adds guides for batch processing, troubleshooting, and performance benchmarks.
- Includes in-depth parameter explanations and practical command examples for all features.
Frequently Asked Questions
What is Qwen3 Tts Mlx?
Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS. It is an AI Agent Skill for Claude Code / OpenClaw, with 332 downloads so far.
How do I install Qwen3 Tts Mlx?
Run "/install qwen3-tts-mlx" in the OpenClaw or Claude Code chat to install the skill. To run the scripts, the SKILL.md also recommends 'pip install mlx-audio' and 'brew install ffmpeg'.
Is Qwen3 Tts Mlx free?
Yes, Qwen3 Tts Mlx is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Qwen3 Tts Mlx support?
Qwen3 Tts Mlx can be installed wherever OpenClaw / Claude Code is available, but the speech synthesis itself targets Apple Silicon Macs via MLX.
Who created Qwen3 Tts Mlx?
It is built and maintained by h1bomb (@h1bomb); the current version is v2.1.0.