qwenspeak
by Ciprian Mandache · v1.5.0
889 Downloads · 0 Stars · 3 Active Installs · 6 Versions
Install in OpenClaw
/install qwenspeak
Description
Text-to-speech generation via Qwen3-TTS over SSH. Preset voices, voice cloning, voice design. Use when the user wants to generate speech audio, clone voices,...
Usage Guidance
This skill is consistent with its stated purpose as an SSH-based TTS client, but it has several red flags you should address before using it:
- It requires network access and an SSH identity: the included wrapper will call ssh tts@QWENSPEAK_HOST and therefore needs access to your SSH private key or agent. Consider creating a dedicated SSH keypair for this service and restricting that key on the server (command restrictions, limited account).
- The registry metadata omits required env vars (QWENSPEAK_HOST, QWENSPEAK_PORT); treat those as required. Expect other QWENSPEAK_* settings on the server side.
- references/setup.md suggests running a remote install script via curl | sudo bash. Do NOT run that as-is without reviewing the script content. Prefer to inspect the repository, clone it locally, and run only the commands you understand. Avoid piping unknown scripts to sudo.
- The skill exposes put/get file operations. If you give an agent this skill plus file system access, it could upload local files to the remote host. Limit the agent's file-access scope and ensure the remote host is trusted and isolated.
If you decide to proceed: review the GitHub install script before running, use a dedicated SSH key with restricted server-side permissions, host the QWENSPEAK instance on infrastructure you control or trust, and confirm that the registry metadata is updated to declare required env vars and any needed config paths.
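The dedicated-key advice can be sketched roughly as follows. This is an illustration only: the helper name `restricted_key_entry`, the key path `~/.ssh/qwenspeak_ed25519`, and the forced-command path `/usr/local/bin/tts-wrapper` are assumptions, not names taken from the skill or its server image. The `restrict` and `command="..."` options are standard OpenSSH `authorized_keys` options.

```sh
# Hypothetical helper; the key path and forced command below are assumptions.

# Build an authorized_keys entry that pins a public key to one forced command
# and disables forwarding and PTY allocation (OpenSSH "restrict" option).
# The command string must not itself contain double quotes.
restricted_key_entry() {
    # $1 = public key line, $2 = command the server should always run
    printf 'restrict,command="%s" %s\n' "$2" "$1"
}

# Typical flow, shown as comments because it touches real keys and hosts:
#   ssh-keygen -t ed25519 -f "$HOME/.ssh/qwenspeak_ed25519" -C "qwenspeak-agent"
#   restricted_key_entry "$(cat "$HOME/.ssh/qwenspeak_ed25519.pub")" \
#       "/usr/local/bin/tts-wrapper"
#   ssh -i "$HOME/.ssh/qwenspeak_ed25519" -o IdentitiesOnly=yes \
#       -p "$QWENSPEAK_PORT" "tts@$QWENSPEAK_HOST"
```

Passing `-o IdentitiesOnly=yes` keeps ssh from offering other keys held by the agent, so only the dedicated key is ever presented to the TTS host.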
Capability Analysis
Type: OpenClaw Skill
Name: qwenspeak
Version: 1.5.0
The `scripts/qwenspeak.sh` file contains a critical shell injection vulnerability (Remote Code Execution). It uses `exec ssh ... "$*"`, which directly passes all arguments from the local shell to the remote SSH server for execution without proper sanitization, allowing arbitrary commands to be run on the remote host as the `tts` user. This flaw, combined with the file manipulation capabilities described in `SKILL.md` (e.g., `put`, `get`, `remove-file`, `search-files`), could be exploited by a malicious prompt or compromised agent to exfiltrate data, install backdoors, or perform other unauthorized actions on the remote server.
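One common way to close this class of bug is to quote every argument for the remote shell before handing the command to ssh, because ssh joins its arguments with spaces and the server side re-parses the result. The sketch below is a minimal POSIX shell illustration under that assumption, not the skill's actual wrapper; `quote_arg` and `build_remote_command` are hypothetical helper names.

```sh
# Hypothetical helpers; not part of the qwenspeak repository.

# Wrap one argument in single quotes, rewriting each embedded ' as '\''
# so a remote POSIX shell sees it as a single literal word.
# Limitation: trailing newlines in the argument are dropped.
quote_arg() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Join all arguments into one string that is safe to pass to ssh.
build_remote_command() {
    out=""
    for arg in "$@"; do
        out="${out}${out:+ }$(quote_arg "$arg")"
    done
    printf '%s\n' "$out"
}

# Usage (assumes QWENSPEAK_HOST and QWENSPEAK_PORT are set):
#   ssh -p "$QWENSPEAK_PORT" "tts@$QWENSPEAK_HOST" "$(build_remote_command "$@")"
```

Quoting on the client only narrows the attack surface. The server should still validate every command against its own whitelist.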
Capability Assessment
Purpose & Capability
The name/description (Qwen3-TTS over SSH) matches the included script and commands. However, the registry metadata claims no required env vars, while SKILL.md and the scripts clearly require QWENSPEAK_HOST and QWENSPEAK_PORT and rely on SSH keys. The metadata is inconsistent with the skill's actual requirements, which should have been declared.
Instruction Scope
Runtime instructions direct the agent to interact with a remote host over SSH (tts@host), upload/download arbitrary files (put/get), and create reference audio. That is coherent for a TTS client, but it implicitly requires access to the user's SSH private key(s) and network access to the target host. The setup instructions also advise appending your public key to the server's authorized_keys. The skill permits file transfers which, if misused or combined with an untrusted remote host, could exfiltrate local data.
Install Mechanism
There is no formal install spec for the skill, but references/setup.md recommends running a remote installer via curl -fsSL https://raw.githubusercontent.com/psyb0t/docker-qwenspeak/main/install.sh | sudo bash. Download-and-pipe-to-sudo is high-risk: it writes files, manages authorized_keys, and installs a system command. The URL is a raw GitHub URL (better than an unknown personal server) but running arbitrary remote scripts as root should be reviewed manually before execution.
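A safer alternative to piping the installer straight into sudo is to download it, read it, and only then run the reviewed copy. The following is a sketch of that flow. The helper name `fetch_for_review` is hypothetical, only the URL comes from the setup notes, and `sha256sum` assumes GNU coreutils is available.

```sh
# Hypothetical helper; only the installer URL comes from the setup notes.
INSTALLER_URL="https://raw.githubusercontent.com/psyb0t/docker-qwenspeak/main/install.sh"

# Download a script to a local file without executing it, then print its
# SHA-256 so the reviewed copy can be identified later.
fetch_for_review() {
    if [ "$#" -ne 2 ]; then
        echo "usage: fetch_for_review URL DEST" >&2
        return 2
    fi
    curl -fsSL -o "$2" "$1" || return 1
    sha256sum "$2"
}

# Suggested flow (run manually):
#   fetch_for_review "$INSTALLER_URL" ./install.sh
#   less ./install.sh            # read every line before going further
#   sudo bash ./install.sh       # only once the contents are understood
```

Running the local file rather than re-fetching the URL ensures that the script executed as root is the same one that was reviewed.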
Credentials
Although the registry lists no required env vars, the SKILL.md and scripts require QWENSPEAK_HOST and QWENSPEAK_PORT; the setup references many QWENSPEAK_* env vars and persistence to ~/.qwenspeak/. The skill also implicitly requires the user's SSH private key (e.g., ~/.ssh) to authenticate to the remote service. Sensitive accesses (private keys, potential SSH agent use) are not declared in the registry metadata.
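Because the registry does not declare these variables, an agent or user may want a pre-flight check before any SSH call is attempted. The sketch below assumes only the two variable names given above. The function name `check_qwenspeak_env` is hypothetical.

```sh
# Hypothetical pre-flight check; not part of the qwenspeak repository.
# Returns 0 when QWENSPEAK_HOST and QWENSPEAK_PORT look usable, 1 otherwise.
check_qwenspeak_env() {
    if [ -z "${QWENSPEAK_HOST:-}" ]; then
        echo "QWENSPEAK_HOST is not set" >&2
        return 1
    fi
    # Reject empty values, anything non-numeric, and more than five digits.
    case "${QWENSPEAK_PORT:-}" in
        ''|*[!0-9]*|??????*)
            echo "QWENSPEAK_PORT must be a number between 1 and 65535" >&2
            return 1
            ;;
    esac
    if [ "$QWENSPEAK_PORT" -lt 1 ] || [ "$QWENSPEAK_PORT" -gt 65535 ]; then
        echo "QWENSPEAK_PORT must be a number between 1 and 65535" >&2
        return 1
    fi
    return 0
}

# Usage:
#   check_qwenspeak_env || exit 1
```

Validating the port as digits only also prevents a value such as `22; id` from being passed on to a command line.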
Persistence & Privilege
The skill does not request always:true and does not appear to modify other skills or global agent configuration. It is user-invocable and allows autonomous invocation (the platform default), which is expected for skills.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install qwenspeak
- After installation, invoke the skill by name or use /qwenspeak
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.5.0
**This release introduces a setup guide and major documentation updates.**
- Added dedicated installation and deployment instructions in references/setup.md.
- Simplified and clarified the main documentation, removing redundant detail.
- Job handling is now explicitly sequential (one at a time, queued).
- Job status terminology updated: jobs start as "queued" instead of "pending".
- Documentation improved for file operations, YAML options, and parameters.
- Housekeeping details added: jobs now auto-cleaned (1 day for completed, 1 week for all).
v1.4.0
make flash attention work goddammit
v1.3.0
qwenspeak 1.2.0
- Introduced a new `scripts/qwenspeak.sh` wrapper that automates SSH connection, port, and host key handling.
- Switched TTS job handling to an asynchronous workflow: jobs return a UUID immediately, with new commands to monitor, log, list, and cancel jobs.
- Improved file management commands with new, more descriptive names (e.g., `list-files`, `remove-file`, `create-dir`).
- Updated documentation for the new workflow, commands, and usage examples.
- Added enhanced job and file management capabilities with detailed status, metadata, and control options.
v1.2.0
- Added a new `tts log` subcommand to view TTS logs, with support for following (`-f`) and line count (`-n N`).
- Documented `/var/log/tts/` logging and provided log viewing instructions via SSH.
- Updated YAML pipeline example: removed `device` field from example config.
- No breaking changes; all usage remains backward compatible.
v1.1.0
**Major update: Adds YAML-driven TTS pipelines and batch generation support.**
- Introduces a YAML-based workflow: generate speech by piping a YAML config into the `tts` command for batch or multi-model jobs.
- New commands: `print-yaml` to output a config template; improved `list-speakers` and `tokenize` remain.
- Stepwise generation: one YAML config can describe multiple models, speakers, and voices in batches.
- Old direct CLI options replaced by flexible YAML settings; comprehensive config guide included.
- All previous file operations and preset speakers remain supported.
v1.0.0
Initial release of qwenspeak: text-to-speech over SSH with secure, containerized Qwen3-TTS.
- Generate speech using preset voices, voice design, or voice cloning.
- File operations (upload, download, directory management) via SSH with strict path isolation.
- Required: Configure SSH access and environment variables (`QWENSPEAK_HOST`, `QWENSPEAK_PORT`).
- Secure command execution—no shell access, all operations whitelisted in a Python wrapper.
- Includes detailed usage examples for TTS generation and file management.
Frequently Asked Questions
What is qwenspeak?
Text-to-speech generation via Qwen3-TTS over SSH. Preset voices, voice cloning, voice design. Use when the user wants to generate speech audio, clone voices,... It is an AI Agent Skill for Claude Code / OpenClaw, with 889 downloads so far.
How do I install qwenspeak?
Run "/install qwenspeak" in the OpenClaw or Claude Code chat to install it. Before first use, set QWENSPEAK_HOST and QWENSPEAK_PORT and configure SSH key access to the TTS host.
Is qwenspeak free?
Yes, qwenspeak is completely free (open-source). You can download, install and use it at no cost.
Which platforms does qwenspeak support?
qwenspeak is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created qwenspeak?
It is built and maintained by Ciprian Mandache (@psyb0t); the current version is v1.5.0.