
Whisper Piper Voice

Name: Whisper Piper Voice
Author: danielgrobelny

by DanielGrobelny · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 108

Install in OpenClaw

/install whisper-piper-voice

Description

Set up and run a local voice pipeline combining Whisper STT (speech-to-text) and Piper TTS (text-to-speech) as a single HTTP server. Use when asked to set up...

Usage Guidance

This skill appears to do what it claims, but take these precautions before installing and running the server:

1) By default the server binds to 0.0.0.0 and has no authentication. Restrict it to localhost, or put it behind a reverse proxy with auth (or firewall rules) if you will expose it on a network.
2) Run the service as an unprivileged user (as in the systemd example) and avoid running as root.
3) Verify that downloaded artifacts (Piper binary and voice models) come from the official project pages/releases, and review the license terms for model use.
4) Consider rate limiting or network controls. The server spawns subprocesses (piper, ffmpeg) for each request and has no built-in quotas, so untrusted input or heavy traffic could exhaust resources.
5) If you need remote access, add TLS and authentication (reverse proxy or API key). Do not expose the plain HTTP endpoints to the open internet.

Capability Analysis

Type: OpenClaw Skill
Name: whisper-piper-voice
Version: 1.0.0

The skill implements a local STT/TTS HTTP server that contains security vulnerabilities. Specifically, 'scripts/voice-server.py' uses the deprecated and insecure 'tempfile.mktemp()' function, which only returns a file name: another process can create a file or symlink at that path before the server opens it, which makes the function susceptible to race conditions and symlink attacks. Additionally, the server binds to all network interfaces ('0.0.0.0') without any authentication mechanism, potentially exposing the host to unauthorized access. These behaviors appear to be unintentional flaws rather than signs of malicious intent, but they represent significant security risks in a networked environment.
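The usual replacement for 'tempfile.mktemp()' is 'tempfile.mkstemp()', which creates and opens the file in one step. The sketch below is an illustration only: the helper name `write_temp_audio` and the `.wav` suffix are assumptions, and the code in 'scripts/voice-server.py' may be structured differently.

```python
import os
import tempfile


def write_temp_audio(data: bytes, suffix: str = ".wav") -> str:
    """Write data to a newly created private temp file and return its path.

    The caller is responsible for deleting the file when done.
    """
    # mkstemp creates the file itself and returns an open descriptor, so
    # there is no window in which another process can claim the path.
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
    except BaseException:
        # Do not leave a partial file behind if the write fails.
        os.unlink(path)
        raise
    return path


if __name__ == "__main__":
    tmp = write_temp_audio(b"RIFF")
    try:
        print("wrote", tmp)
    finally:
        os.unlink(tmp)
```

Where the path is only needed inside one block, `tempfile.NamedTemporaryFile` or `tempfile.TemporaryDirectory` used as a context manager gives the same safety with automatic cleanup.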

Capability Assessment

✓ Purpose & Capability

Name, description, SKILL.md, setup guide, and the included Python script all align: faster-whisper for STT, a Piper binary and ONNX voice model for TTS, and ffmpeg to produce Ogg/Opus output. Required downloads (GitHub releases, Hugging Face) and package installs are consistent with delivering a local offline pipeline.

ℹ Instruction Scope

Runtime instructions are focused on installing models/binaries, creating a venv, and running the bundled server. They also provide a systemd example to persist the service. The instructions do not request secrets or unrelated system access. However, the default server listens on 0.0.0.0 and no authentication or access control is documented; that is a security/operational concern (exposes the service to network access).
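One way to address the binding and authentication concern is sketched below using only the Python standard library. Everything here is an assumption for illustration: the page does not say which HTTP framework the bundled server uses, and the environment variables `VOICE_BIND_HOST` and `VOICE_API_TOKEN`, the `GuardedHandler` class, and the `make_server` helper are hypothetical names that the published skill does not read or define.

```python
import hmac
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical settings; the published skill does not read these variables.
BIND_HOST = os.environ.get("VOICE_BIND_HOST", "127.0.0.1")
API_TOKEN = os.environ.get("VOICE_API_TOKEN", "")


def is_authorized(header_value, expected_token):
    """Return True only for an 'Authorization: Bearer <token>' match."""
    # Fail closed: with no configured token, nothing is authorized.
    if not expected_token or not header_value:
        return False
    scheme, _, supplied = header_value.partition(" ")
    if scheme != "Bearer":
        return False
    # compare_digest avoids leaking the token through timing differences.
    return hmac.compare_digest(
        supplied.strip().encode(), expected_token.encode()
    )


class GuardedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if not is_authorized(self.headers.get("Authorization"), API_TOKEN):
            self.send_response(401)
            self.end_headers()
            return
        # Placeholder: real STT/TTS request handling would go here.
        self.send_response(204)
        self.end_headers()


def make_server(port=0):
    """Create a server bound to BIND_HOST (loopback unless overridden)."""
    return ThreadingHTTPServer((BIND_HOST, port), GuardedHandler)
```

Binding to 127.0.0.1 keeps the service reachable only from the same host. The token check matters mainly when the service is deliberately exposed, and in that case TLS should be terminated in front of it, for example by a reverse proxy.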

✓ Install Mechanism

There is no automated install spec; installation is manual via documented commands. Downloads are from GitHub releases and Hugging Face (well-known hosts). No obscure or shortened URLs are used, and no archives are extracted from servers other than these sources.
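Because the downloads are manual, a downloaded file can be checked against a published digest before use. This is a minimal sketch that assumes a trusted SHA-256 value is available for the artifact from its official release or model page; the function names are hypothetical, and no real digest for the Piper binary or any voice model is given here.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large model files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A setup script could call `matches_checksum(downloaded_path, published_digest)` and refuse to continue when it returns False.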

✓ Credentials

The skill requests no environment variables or credentials. The code and instructions do not attempt to read secrets or unrelated config paths. Required artifacts (piper binary, ONNX voice file, ffmpeg, faster-whisper) are appropriate for the stated functionality.

ℹ Persistence & Privilege

The SKILL.md/setup guide suggests enabling a systemd service for auto-start, which is a reasonable convenience for a local server. The skill does not set always:true and does not modify other skills. Still, running the server as a persistent service that binds to 0.0.0.0 increases attack surface; advise running as an unprivileged user and exposing only the intended interfaces.

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install whisper-piper-voice
3) After installation, invoke the skill by name or use /whisper-piper-voice
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.0.0

Initial release: Combined Whisper STT + Piper TTS HTTP server. Single-port voice pipeline, runs fully local on CPU or GPU. Includes voice-server.py script, setup guide, and model recommendations.

Metadata

Slug: whisper-piper-voice
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1

Frequently Asked Questions

What is Whisper Piper Voice?

Whisper Piper Voice is an AI agent skill for Claude Code / OpenClaw that sets up and runs a local voice pipeline combining Whisper STT (speech-to-text) and Piper TTS (text-to-speech) as a single HTTP server. It has 108 downloads so far.

How do I install Whisper Piper Voice?

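Run "/install whisper-piper-voice" in the OpenClaw or Claude Code chat to install the skill. The voice server itself still requires the manual setup documented with the skill: the Piper binary, an ONNX voice model, ffmpeg, and faster-whisper.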

Is Whisper Piper Voice free?

Yes, Whisper Piper Voice is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Whisper Piper Voice support?

Whisper Piper Voice is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Whisper Piper Voice?

It is built and maintained by DanielGrobelny (@danielgrobelny); the current version is v1.0.0.
