h1bomb

Qwen3 Tts Mlx

Author: h1bomb · GitHub · v2.1.0
cross-platform ⚠ suspicious
Total downloads: 332
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install qwen3-tts-mlx
Description
Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS.
Security recommendations
This package is consistent with a local TTS tool, but take these precautions before installing:
- Expect network activity and large downloads unless you pre-download models: the scripts refer to model identifiers (mlx-community/...), which are normally fetched from remote model hubs. If you need offline-only operation, confirm how models are downloaded and pre-stage the weights locally.
- Review and pin the 'mlx-audio' package (and its dependencies) before running pip install to reduce supply-chain risk. Consider installing in a virtualenv or sandbox.
- Be prepared for significant disk usage (models are ~1–2+ GB each) and possible memory/GPU requirements on your Mac.
- Voice cloning processes user audio; consider privacy and consent implications before processing other people's voice samples.
- The batch script monkeypatches transformers.AutoTokenizer to set a flag, which mutates third-party behavior in-process. It is likely harmless, but you may want to review or remove that patch if you prefer not to change library internals.
- If you need stronger assurance, inspect or run the code in an isolated environment, monitor network connections during first runs, and verify model sources and licences (and that they are allowed for your use case).
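The pre-staging advice above can be checked before a first run. The following is a minimal sketch, assuming the models are resolved through the standard Hugging Face hub cache layout (`models--<org>--<name>/snapshots/<revision>/`); the listing does not confirm how mlx-audio locates weights, so treat that as an assumption. The helper names `hf_hub_cache_dir` and `is_model_cached` are hypothetical and exist only for illustration.

```python
import os
from pathlib import Path


def hf_hub_cache_dir(env=None):
    """Resolve the Hugging Face hub cache directory from environment variables."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"])
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]) / "hub"
    # Default location: $XDG_CACHE_HOME/huggingface or ~/.cache/huggingface
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "huggingface" / "hub"


def is_model_cached(repo_id, cache_dir):
    """Return True if at least one non-empty snapshot of repo_id exists locally.

    Assumption: the cache follows the hub layout
    <cache_dir>/models--<org>--<name>/snapshots/<revision>/...
    This only checks that files are present, not that the download is complete.
    """
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return False
    return any(rev.is_dir() and any(rev.iterdir()) for rev in snapshots.iterdir())
```

Call `is_model_cached(repo_id, hf_hub_cache_dir())` with the full model identifier used by the scripts. If the check passes, setting the environment variable `HF_HUB_OFFLINE=1` before running the scripts should make hub lookups fail instead of silently downloading, provided the library goes through `huggingface_hub`.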
Analysis
Type: OpenClaw Skill
Name: qwen3-tts-mlx
Version: 2.1.0
The OpenClaw AgentSkills bundle for Qwen3-TTS MLX appears benign. The `SKILL.md` provides clear instructions for installation and usage, without any hidden commands or prompt injection attempts targeting the AI agent. The Python scripts (`scripts/batch_dubbing.py`, `scripts/run_tts.py`) perform text-to-speech generation, handling input text and output audio files as expected. They utilize standard libraries like `mlx-audio`, `soundfile`, and `numpy`, and download models from `mlx-community`, which is a legitimate source for MLX models. There is no evidence of data exfiltration, unauthorized command execution, persistence mechanisms, or obfuscation.
Capability assessment
Purpose & Capability
Name/description and the scripts align with a local TTS tool using mlx-audio and MLX models; however the SKILL.md emphasizes 'offline' usage while the code calls generate_audio with model names (e.g. mlx-community/...) that will typically be fetched from remote model repositories (Hugging Face/MLX) unless pre-downloaded. The documentation does not explain model download, disk/storage needs (~1–2+ GB per model), or how to operate fully offline, which is a meaningful mismatch.
Instruction Scope
The SKILL.md instructs installing mlx-audio and ffmpeg and running the provided scripts; the scripts read user JSON and audio files and write outputs (expected). The runtime instructions do not disclose that generate_audio may download models or contact remote endpoints; that omission gives the agent broader network/IO behavior than the 'offline' claim implies. The batch script also monkeypatches transformers.AutoTokenizer to set a fix flag — a side-effect that mutates third-party library behavior in-process (benign but notable).
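To illustrate why an in-process monkeypatch is worth noting, here is a minimal sketch of the general technique together with a scoped alternative that restores the original method afterwards. It does not reproduce the skill's actual patch: `StandInTokenizer`, `forced_kwarg`, and the `example_flag` keyword are all hypothetical, and the stand-in class is not `transformers.AutoTokenizer`.

```python
from contextlib import contextmanager


class StandInTokenizer:
    """Hypothetical stand-in for a third-party class (NOT transformers.AutoTokenizer)."""

    @classmethod
    def from_pretrained(cls, name, **kwargs):
        return {"name": name, "kwargs": kwargs}


@contextmanager
def forced_kwarg(target_cls, method_name, key, value):
    """Temporarily wrap a classmethod so key=value is supplied by default.

    The method must be defined directly on target_cls (not inherited).
    The original classmethod is restored on exit, even if an error occurs.
    """
    original = target_cls.__dict__[method_name]  # raw classmethod object

    def patched(cls, *args, **kwargs):
        kwargs.setdefault(key, value)  # explicit caller values still win
        return original.__func__(cls, *args, **kwargs)

    setattr(target_cls, method_name, classmethod(patched))
    try:
        yield
    finally:
        setattr(target_cls, method_name, original)
```

An unscoped patch (assigning the wrapper and never restoring it) affects every later caller in the same process, which is the side effect described above. Wrapping the call site in `with forced_kwarg(...)` limits the change to that block.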
Install Mechanism
There is no formal install spec; SKILL.md recommends 'pip install mlx-audio' and 'brew install ffmpeg'. Using pip means pulling code from PyPI (or whichever index the environment uses). This is an expected pattern for Python scripts but carries typical supply-chain risks: the package 'mlx-audio' and its dependencies should be reviewed and/or pinned. No arbitrary download URLs or archived extracts are included in the skill files themselves.
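One way to act on the pinning advice is to verify, before running the scripts, that the installed package matches the version you reviewed. The sketch below uses the standard-library `importlib.metadata`; the helper name `check_pin` is hypothetical, and the expected version is whatever you audited (no specific version is given in this listing).

```python
from importlib import metadata


def check_pin(package, expected_version, get_version=metadata.version):
    """Compare the installed version of a package against a reviewed pin.

    get_version is injectable so the check can be exercised without
    installing anything; by default it queries the current environment.
    """
    try:
        installed = get_version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    if installed != expected_version:
        return f"{package}: installed {installed}, expected {expected_version}"
    return f"{package}: ok ({installed})"
```

For example, `check_pin("mlx-audio", reviewed_version)` reports a mismatch if the environment drifted from the reviewed release. For stronger guarantees, pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) refuses any artifact whose hash is not listed.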
Credentials
The skill requests no environment variables, credentials, or config paths. The scripts operate only on user-provided files and local outputs, which is proportionate for a TTS tool.
Persistence & Privilege
The skill is not always-enabled and does not request persistent elevated privileges or attempt to modify other skills or system-wide agent settings. It runs as a user-invoked CLI tool — expected behavior.
How to use
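  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install qwen3-tts-mlx
  3. Once installation completes, invoke the Skill by name or trigger it with /qwen3-tts-mlx
  4. Provide the inputs described in the Skill's parameter documentation to get structured output.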
Version history
v2.1.0
- Major update with expanded documentation and feature explanations for Qwen3-TTS MLX use on Apple Silicon.
- Details modes for built-in voices with style control, voice design from text, and voice cloning.
- Provides step-by-step quick start, install, and usage instructions for both CLI and Python API.
- Lists supported languages, built-in voices, model variants, and recommended scenarios.
- Adds guides for batch processing, troubleshooting, and performance benchmarks.
- Includes in-depth parameter explanations and practical command examples for all features.
Metadata
Slug: qwen3-tts-mlx
Version: 2.1.0
License:
Total installs: 0
Current installs: 0
Versions released: 1
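FAQ

What is Qwen3 Tts Mlx?

Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 332 total downloads to date.

How do I install Qwen3 Tts Mlx?

Run the command "/install qwen3-tts-mlx" in the OpenClaw or Claude Code chat box to install the skill in one step. The skill's SKILL.md additionally recommends 'pip install mlx-audio' and 'brew install ffmpeg' for the scripts' dependencies.

Is Qwen3 Tts Mlx free?

Yes. Qwen3 Tts Mlx is listed as completely free (open source) and can be freely downloaded, installed, and used.

Which platforms does Qwen3 Tts Mlx support?

The listing is tagged cross-platform, so the skill can be installed in any environment where OpenClaw / Claude Code is deployed. The speech synthesis itself targets Apple Silicon via MLX.

Who develops Qwen3 Tts Mlx?

It is developed and maintained by h1bomb (@h1bomb); the current version is v2.1.0.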
