zhaov1976

Voice

by zhaov · GitHub ↗ · v1.0.1
cross-platform ⚠ suspicious
Downloads 2937
Stars 0
Active Installs 18
Versions 2
Install in OpenClaw
/install voice
Description
Convert text to speech using Microsoft Edge's TTS engine with customizable voices, direct playback, and automatic temporary file cleanup.
Usage Guidance
This skill appears to do what it says (edge-tts TTS + playback), but the implementation builds and executes shell command strings that include user-provided text. That creates a real command-injection risk: a maliciously crafted input could execute arbitrary shell commands on the host. Before installing or enabling this skill in sensitive environments, consider the following:
- Do not run it on production systems or hosts with sensitive data until it has been reviewed or sandboxed.
- Inspect and/or modify the code so that it no longer calls exec with a concatenated command string. Safer alternatives:
  - Use child_process.spawn or child_process.execFile with an argument array (no shell), so the text is passed as an argument rather than interpolated into a shell command.
  - Or call the Python API (edge-tts package) from a subprocess with structured arguments or via an RPC/worker, avoiding shell interpolation.
  - Properly escape or validate user text (but escaping is easy to get wrong; prefer avoiding the shell entirely).
- Consider changing the temp directory to a skill-local, non-shared path and ensure it cannot traverse outside the skill folder. The code currently writes to path.join(__dirname, '..', '..', 'temp'), which may be broader than expected.
- Avoid running the 'install' action automatically; perform dependency installation manually in a controlled environment.
If you are not able to patch the code, run the skill only in an isolated sandbox or container and avoid giving it access to sensitive files or credentials.
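As a rough illustration of the first safer alternative above, the sketch below passes the text to the edge-tts CLI as one element of an argument array through child_process.execFile, which does not start a shell. The helper names buildEdgeTtsArgs and synthesize are hypothetical and not taken from the skill's index.js. The flag names (--text, --voice, --write-media) are assumed from the edge-tts CLI and should be checked against edge-tts --help.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Hypothetical helper: builds the argument vector for the edge-tts CLI.
// The "--flag=value" form keeps text that starts with "-" from being
// parsed as a separate option.
function buildEdgeTtsArgs(text, voice, outFile) {
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text must be a non-empty string');
  }
  return [`--text=${text}`, `--voice=${voice}`, `--write-media=${outFile}`];
}

// Hypothetical wrapper: runs edge-tts without a shell, so $(...), backticks
// and other metacharacters in `text` are never interpreted as commands.
async function synthesize(text, voice, outFile) {
  await execFileAsync('edge-tts', buildEdgeTtsArgs(text, voice, outFile));
  return outFile;
}
```

Because no shell is involved, an input such as `hello $(touch /tmp/pwned)` reaches edge-tts as literal text to be spoken. This assumes the edge-tts executable is on PATH and can be launched directly.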
Capability Analysis
Type: OpenClaw Skill
Name: voice
Version: 1.0.1
The skill is classified as suspicious due to a critical command injection vulnerability in the `playAudio` function within `index.js`. Specifically, when playing audio on Windows, the `filePath` parameter is directly embedded into a PowerShell command string without sufficient sanitization, allowing for arbitrary command execution if a malicious `filePath` is provided. The `SKILL.md` and `README.md` explicitly document the `play` action with a user-controlled `filePath`, making this vulnerability easily exploitable by a malicious agent or user.
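One way to narrow the `filePath` exposure described above is to validate the path before it reaches any player command. The sketch below is a minimal example under stated assumptions: resolvePlayablePath, ALLOWED_EXTENSIONS and SAFE_SEGMENT are hypothetical names, and the extension allowlist is a guess, since the listing does not say which formats the skill produces. It accepts only paths that resolve inside an allowed directory and whose segments use a conservative character set (no quotes, semicolons or spaces).

```javascript
const path = require('node:path');

// Hypothetical allowlist; adjust to the formats the skill really produces.
const ALLOWED_EXTENSIONS = new Set(['.mp3', '.wav']);
// Conservative character set for each path segment below the allowed directory.
const SAFE_SEGMENT = /^[A-Za-z0-9._-]+$/;

// Returns the absolute path if filePath is acceptable, otherwise throws.
function resolvePlayablePath(filePath, allowedDir) {
  if (typeof filePath !== 'string' || filePath.length === 0) {
    throw new TypeError('filePath must be a non-empty string');
  }
  const base = path.resolve(allowedDir);
  const resolved = path.resolve(base, filePath);
  const relative = path.relative(base, resolved);

  // Reject the directory itself, anything above it, or another drive.
  const escapes =
    relative === '' ||
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error('filePath must point inside the allowed directory');
  }
  if (!relative.split(path.sep).every((segment) => SAFE_SEGMENT.test(segment))) {
    throw new Error('filePath contains unsupported characters');
  }
  if (!ALLOWED_EXTENSIONS.has(path.extname(resolved).toLowerCase())) {
    throw new Error('unsupported audio file extension');
  }
  return resolved;
}
```

Validation of this kind reduces the attack surface but does not replace passing the path as a separate argument. The validated path should still not be interpolated into a PowerShell or shell command string.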
Capability Assessment
Purpose & Capability
The name, SKILL.md, package.json and code all describe a TTS skill using edge-tts. Requested dependencies and behaviors (generate audio, play files, cleanup) are consistent with the stated purpose.
Instruction Scope
The runtime instructions and code run shell commands (execAsync) to call the edge-tts CLI and to install dependencies. The edge-tts invocation is built as a single shell command string that includes untrusted user text; because exec runs via a shell, constructs like $(...), `...`, or other shell metacharacters inside the text can result in arbitrary command execution (command injection). The skill also spawns system audio players and writes/cleans files under a temp directory two levels above the skill directory, which is surprising and should be reviewed.
Install Mechanism
There is no package install spec in the registry metadata, but the skill's code and SKILL.md instruct users (and provide an 'install' action) to run `pip3 install edge-tts`. Installing via pip is expected for this functionality, but runtime installation (exec of pip3) means the agent will perform network installs and execute whatever the installer does — acceptable for a TTS skill but worth noting.
Credentials
The skill requests no environment variables or credentials. No unrelated secrets are requested. The main risk is filesystem and shell invocation rather than excessive credential access.
Persistence & Privilege
The skill is not always-included and does not request elevated platform privileges. It doesn't modify other skills or global agent config. Its temporary file management and install action affect only local FS and pip.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install voice
  3. After installation, invoke the skill by name or use /voice
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
- Added CHANGELOG.md and README.md files for clearer documentation.
- Updated skill features and usage: introduced direct speaking ("speak" action), playback ("play" action), and voice listing ("voices" action).
- Enhanced control over file cleanup timing and playback behavior.
- Updated supported options and voices to provide more flexibility and broader language support.
- Revised and improved SKILL.md documentation to reflect these enhancements.
v1.0.0
Initial release of the Voice skill.
- Adds text-to-speech conversion using Microsoft Edge's TTS engine.
- Supports multiple voice options and customizable audio settings (voice, rate, volume, pitch).
- Integrates with the MEDIA system for audio playback.
- Automatically manages and cleans up temporary audio files.
- Includes actions for both TTS and manual or scheduled file cleanup.
Metadata
Slug voice
Version 1.0.1
License
All-time Installs 19
Active Installs 18
Total Versions 2
Frequently Asked Questions

What is Voice?

Convert text to speech using Microsoft Edge's TTS engine with customizable voices, direct playback, and automatic temporary file cleanup. It is an AI Agent Skill for Claude Code / OpenClaw, with 2937 downloads so far.

How do I install Voice?

Run "/install voice" in the OpenClaw or Claude Code chat to install it. The skill also depends on the edge-tts Python package, which its SKILL.md says to install with pip3 install edge-tts.

Is Voice free?

Yes, Voice is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Voice support?

Voice is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Voice?

It is built and maintained by zhaov (@zhaov1976); the current version is v1.0.1.
