
Qwen3 Tts Mlx

by h1bomb · GitHub ↗ · v2.1.0
cross-platform ⚠ suspicious
Downloads: 332
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install qwen3-tts-mlx
Description
Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS.
Usage Guidance
This package is coherent with a local TTS tool, but take these precautions before installing:
- Expect network activity and large downloads unless you pre-download models: the scripts refer to model identifiers (mlx-community/...), which are normally fetched from remote model hubs. If you truly need offline-only operation, confirm how models are downloaded and pre-stage the weights locally.
- Review and pin the 'mlx-audio' package (and its dependencies) before running pip install to reduce supply-chain risk. Consider installing in a virtualenv or sandbox.
- Be prepared for significant disk usage (models are ~1–2+ GB each) and possible memory/GPU requirements on your Mac.
- Voice cloning processes user audio; consider privacy and consent implications before processing other people's voice samples.
- The batch script monkeypatches transformers.AutoTokenizer to set a flag, which mutates third-party behavior in-process. It is likely harmless, but you may want to review or remove the patch if you prefer not to change library internals.
- If you need stronger assurance, inspect or run the code in an isolated environment, monitor network connections during first runs, and verify model sources and licenses (and that they are allowed for your use case).
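One way to act on the pre-staging advice is to check, before a first run, whether a model already sits in the local Hugging Face Hub cache. The sketch below is standard-library only and rests on an assumption: that the skill's model loading goes through the Hugging Face Hub cache with its usual `models--<org>--<name>/snapshots/<revision>` layout. The helper names `hub_cache_root` and `is_prestaged` are hypothetical, and the repository ID is a placeholder because the listing elides the real one.

```python
import os
from pathlib import Path


def hub_cache_root(env=None):
    """Resolve the Hugging Face Hub cache directory from environment variables."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    # HF_HOME defaults to ~/.cache/huggingface; the hub cache lives in its "hub" subfolder.
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    return Path(hf_home).expanduser() / "hub"


def is_prestaged(repo_id, cache_root):
    """Return True if at least one non-empty snapshot of repo_id exists in the cache."""
    repo_dir = Path(cache_root) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return False
    return any(p.is_dir() and any(p.iterdir()) for p in snapshots.iterdir())


if __name__ == "__main__":
    # Placeholder ID: substitute the identifier your scripts actually reference.
    repo_id = "example-org/example-model"
    print(repo_id, "pre-staged:", is_prestaged(repo_id, hub_cache_root()))
```

If the check passes, exporting `HF_HUB_OFFLINE=1` before running the scripts should make `huggingface_hub` refuse network calls, which gives a quick way to confirm that a run is genuinely offline. Whether every code path in the skill honors that variable is something to verify rather than assume.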
Capability Analysis
Type: OpenClaw Skill
Name: qwen3-tts-mlx
Version: 2.1.0
The OpenClaw AgentSkills bundle for Qwen3-TTS MLX appears benign. The `SKILL.md` provides clear instructions for installation and usage, without any hidden commands or prompt injection attempts targeting the AI agent. The Python scripts (`scripts/batch_dubbing.py`, `scripts/run_tts.py`) perform text-to-speech generation, handling input text and output audio files as expected. They utilize standard libraries like `mlx-audio`, `soundfile`, and `numpy`, and download models from `mlx-community`, which is a legitimate source for MLX models. There is no evidence of data exfiltration, unauthorized command execution, persistence mechanisms, or obfuscation.
Capability Assessment
Purpose & Capability
Name/description and the scripts align with a local TTS tool using mlx-audio and MLX models; however the SKILL.md emphasizes 'offline' usage while the code calls generate_audio with model names (e.g. mlx-community/...) that will typically be fetched from remote model repositories (Hugging Face/MLX) unless pre-downloaded. The documentation does not explain model download, disk/storage needs (~1–2+ GB per model), or how to operate fully offline, which is a meaningful mismatch.
Instruction Scope
The SKILL.md instructs installing mlx-audio and ffmpeg and running the provided scripts; the scripts read user JSON and audio files and write outputs (expected). The runtime instructions do not disclose that generate_audio may download models or contact remote endpoints; that omission gives the agent broader network/IO behavior than the 'offline' claim implies. The batch script also monkeypatches transformers.AutoTokenizer to set a fix flag — a side-effect that mutates third-party library behavior in-process (benign but notable).
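To make the monkeypatch concern concrete, here is a minimal sketch of a scoped patch that forces a keyword argument on a loader's `from_pretrained` and restores the original afterwards. It is not the skill's actual code: `StandInTokenizerLoader` is a hypothetical stand-in for the third-party class, and `hypothetical_fix_flag` is an invented name, since the listing does not say which flag the batch script sets.

```python
import functools
from contextlib import contextmanager


class StandInTokenizerLoader:
    """Hypothetical stand-in for a third-party loader class."""

    @classmethod
    def from_pretrained(cls, name, **kwargs):
        return {"name": name, "kwargs": kwargs}


@contextmanager
def force_kwarg(target_cls, method_name, **forced):
    """Temporarily make target_cls.method_name receive extra default kwargs."""
    # Keep the raw descriptor so the class can be restored exactly.
    # Assumes the method is defined directly on target_cls, not inherited.
    original = target_cls.__dict__[method_name]
    bound = getattr(target_cls, method_name)

    @functools.wraps(bound)
    def patched(*args, **kwargs):
        for key, value in forced.items():
            kwargs.setdefault(key, value)  # caller-supplied values still win
        return bound(*args, **kwargs)

    setattr(target_cls, method_name, staticmethod(patched))
    try:
        yield
    finally:
        setattr(target_cls, method_name, original)


if __name__ == "__main__":
    with force_kwarg(StandInTokenizerLoader, "from_pretrained", hypothetical_fix_flag=True):
        print(StandInTokenizerLoader.from_pretrained("demo"))
    print(StandInTokenizerLoader.from_pretrained("demo"))
```

The difference from an unscoped patch is the `finally` clause: the library is only altered inside the `with` block, so other code in the same process sees the unmodified class. A reviewer who is uneasy about the in-process mutation could check whether the batch script limits its patch in a comparable way.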
Install Mechanism
There is no formal install spec; SKILL.md recommends 'pip install mlx-audio' and 'brew install ffmpeg'. Using pip means pulling code from PyPI (or whichever index the environment uses). This is an expected pattern for Python scripts but carries typical supply-chain risks: the package 'mlx-audio' and its dependencies should be reviewed and/or pinned. No arbitrary download URLs or archived extracts are included in the skill files themselves.
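As a rough illustration of the pinning advice, the sketch below compares installed package versions against a reviewed set of pins using the standard library's `importlib.metadata`. The helper name `check_pins` is hypothetical, and the version string in the usage example is a placeholder, not a recommended release of `mlx-audio`.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins, lookup=version):
    """Return a dict of packages whose installed version differs from the pin."""
    problems = {}
    for name, expected in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if installed != expected:
            problems[name] = f"installed {installed}, expected {expected}"
    return problems


if __name__ == "__main__":
    # Placeholder pin: replace "0.0.0" with the version you have actually reviewed.
    reviewed_pins = {"mlx-audio": "0.0.0"}
    for package, issue in check_pins(reviewed_pins).items():
        print(f"{package}: {issue}")
```

This only detects drift after installation. Pinning exact versions in a requirements file, ideally with hashes, and installing inside a virtualenv addresses the risk at install time.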
Credentials
The skill requests no environment variables, credentials, or config paths. The scripts operate only on user-provided files and local outputs, which is proportionate for a TTS tool.
Persistence & Privilege
The skill is not always-enabled and does not request persistent elevated privileges or attempt to modify other skills or system-wide agent settings. It runs as a user-invoked CLI tool — expected behavior.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install qwen3-tts-mlx
  3. After installation, invoke the skill by name or use /qwen3-tts-mlx
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.1.0
- Major update with expanded documentation and feature explanations for Qwen3-TTS MLX use on Apple Silicon.
- Details modes for built-in voices with style control, voice design from text, and voice cloning.
- Provides step-by-step quick start, install, and usage instructions for both CLI and Python API.
- Lists supported languages, built-in voices, model variants, and recommended scenarios.
- Adds guides for batch processing, troubleshooting, and performance benchmarks.
- Includes in-depth parameter explanations and practical command examples for all features.
Metadata
Slug qwen3-tts-mlx
Version 2.1.0
License
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Qwen3 Tts Mlx?

Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS. It is an AI Agent Skill for Claude Code / OpenClaw, with 332 downloads so far.

How do I install Qwen3 Tts Mlx?

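Run "/install qwen3-tts-mlx" in the OpenClaw or Claude Code chat to install the skill. The skill's SKILL.md additionally recommends 'pip install mlx-audio' and 'brew install ffmpeg', and models are typically downloaded on first use unless pre-staged.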

Is Qwen3 Tts Mlx free?

Yes, Qwen3 Tts Mlx is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Qwen3 Tts Mlx support?

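The listing tags Qwen3 Tts Mlx as cross-platform, meaning the skill can be installed wherever OpenClaw / Claude Code is available. The speech synthesis itself targets Apple Silicon via MLX, per the skill description.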

Who created Qwen3 Tts Mlx?

It is built and maintained by h1bomb (@h1bomb); the current version is v2.1.0.
