Voice Assistant

Name: Voice Assistant
Author: kurtivy

Description

Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds. Supports multi-turn conversation with automatic follow-up listening, mic suppression to prevent feedback, and a system tray with pause/resume. Recommended voices: Matilda (XrExE9yKIg1WjnnlVkGX, free tier) or Ivy (MClEFoImJXBTgLwdLI5n, paid tier). Fully customizable wake word, voice, hotkey, and silence thresholds.

Usage Guidance

This skill appears to do what it says: it listens on your microphone, transcribes speech locally, sends text to your configured OpenClaw gateway, and uses ElevenLabs to generate speech audio. Before installing:

- Understand what keys you provide: ELEVENLABS_API_KEY will cause text sent to the assistant to be sent to ElevenLabs for TTS (possible privacy and billing implications). Use a dedicated API key if you’re concerned about data exposure or costs.
- Keep GATEWAY_URL pointed to a local gateway (default is ws://127.0.0.1:18789). Do not supply a token that grants broad access to a remote gateway unless you trust that endpoint; the client authenticates as a node with operator.write in the example handshake.
- PORCUPINE_ACCESS_KEY is required for wake-word detection; Picovoice keys may expire.
- The code creates temporary WAV files and asset files under the skill folder; ensure you run it from a directory you control.
- The skill installs third-party Python packages and will (on first run) download Whisper models (~100–150MB, or larger for bigger models). Consider installing and running in a virtualenv or isolated environment.
- If you need stronger guarantees, review the included Python files yourself (we inspected them) or run the assistant in a sandboxed/isolated VM to monitor network calls and file writes.

Overall: coherent and consistent with its stated purpose, but treat the API keys and gateway token as sensitive and verify the gateway target before use.
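The advice to verify the gateway target before use can be checked mechanically. The following is a minimal sketch, not code from the skill: the helper name `is_local_gateway` is hypothetical, and it assumes only that GATEWAY_URL is a ws:// or wss:// URL as in the default shown above.

```python
import ipaddress
from urllib.parse import urlparse

# Default taken from the guidance above.
DEFAULT_GATEWAY_URL = "ws://127.0.0.1:18789"


def is_local_gateway(url: str) -> bool:
    """Return True when a WebSocket URL targets the loopback interface.

    Hypothetical helper for illustration; it is not part of the skill.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("ws", "wss"):
        return False
    host = parsed.hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as remote.
        return False


if __name__ == "__main__":
    print(is_local_gateway(DEFAULT_GATEWAY_URL))
```

If the check returns False for the value in your .env, that is the case where the token scope matters most, since the token would be sent to a remote endpoint.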

Capability Analysis

Type: OpenClaw Skill
Name: openclaw-voice-assistant
Version: 1.0.4

The skill bundle implements a voice assistant for OpenClaw, utilizing local speech-to-text (faster-whisper), wake word detection (Porcupine), and text-to-speech (ElevenLabs). All network communication is directed to the local OpenClaw gateway, ElevenLabs, and Picovoice, which are essential for its stated functionality. Sensitive API keys are loaded from `.env` and used appropriately by their respective client libraries. The `SKILL.md` provides user-facing instructions, including an optional manual step for auto-start, but contains no prompt injection attempts against the AI agent or evidence of stealthy persistence. All dependencies are legitimate, and no malicious execution, data exfiltration, or obfuscation was found.

Capability Assessment

✓ Purpose & Capability

Name/description match the implementation. ELEVENLABS_API_KEY for TTS, PORCUPINE_ACCESS_KEY for Porcupine wake-word, and GATEWAY_URL/GATEWAY_TOKEN for the OpenClaw gateway are all required and used in the code. Requiring Python and listed pip packages is expected for a local Python assistant.

✓ Instruction Scope

SKILL.md instructions are concrete and limited to installing dependencies, populating .env, running the assistant, and optionally generating audio assets. Runtime code only reads .env (declared), listens to the microphone, connects to the configured gateway, calls ElevenLabs for TTS, and uses Picovoice and Whisper for wake/STT. There are no instructions to read unrelated system files or to transmit data to unexpected external endpoints beyond ElevenLabs and the configured gateway.

✓ Install Mechanism

No special install script or remote archive is pulled; the developer expects users to pip-install dependencies from requirements.txt. Dependencies are standard (pvporcupine, faster-whisper, elevenlabs, av, etc.). This has normal risk (third‑party packages, model downloads) but no unusual download URLs or extraction of arbitrary remote code.

ℹ Credentials

Requested environment variables (GATEWAY_URL, GATEWAY_TOKEN, ELEVENLABS_API_KEY, PORCUPINE_ACCESS_KEY) are all used by the code and justified by the skill's purpose. Caveats: ELEVENLABS_API_KEY is the primary credential and will cause assistant text to be sent to ElevenLabs for TTS (possible privacy/cost implications). GATEWAY_TOKEN is used to authenticate to the OpenClaw gateway (the client requests operator role/scopes in the handshake), so if you point GATEWAY_URL to a remote gateway or provide a token with broad scopes it could carry more risk; by default GATEWAY_URL targets localhost.
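Before the first run it can help to confirm that all four settings are present. The sketch below is an illustration under stated assumptions, not the skill's own startup code: the helper name `missing_settings` is hypothetical, and it reads the process environment, so it assumes the values from .env have already been loaded or exported.

```python
import os

# Variable names as listed in the Credentials assessment above.
REQUIRED_VARS = (
    "GATEWAY_URL",
    "GATEWAY_TOKEN",
    "ELEVENLABS_API_KEY",
    "PORCUPINE_ACCESS_KEY",
)


def missing_settings(env=None):
    """Return the names of required settings that are unset or blank.

    Hypothetical helper for illustration; pass a dict to check values
    other than the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```

The helper reports only variable names and never prints values, which keeps the API keys and the gateway token out of terminal output and logs.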

✓ Persistence & Privilege

always:false and user-invocable:true (defaults) — the skill does not request permanent platform-level inclusion. It does offer optional auto-start guidance (creating a startup shortcut), but that is a user action rather than an automatic change. The skill does not modify other skills or system-wide agent settings.

Version History

v1.0.4

Reduce gateway scope from operator.admin to operator.write (least privilege)

v1.0.3

Default to Matilda voice (free tier); recommend Ivy for paid. Free tier now supported.

v1.0.2

Improved description with implementation details and recommended voice

v1.0.1

Fix metadata: declare GATEWAY_TOKEN and GATEWAY_URL in requires.env, fix python3 -> python for Windows

v1.0.0

Initial release of OpenClaw Voice Assistant.

- Enables always-on, hands-free voice control for OpenClaw on Windows PCs.
- Features wake word detection (Porcupine), fast speech-to-text (faster-whisper), and natural text-to-speech (ElevenLabs).
- Flexible configuration with .env for API keys, models, hotkeys, and tuning.
- Custom wake word support and personalized chime/thinking sounds.
- System tray app with background mode and Windows auto-start instructions.
- Comprehensive troubleshooting and architecture documentation included.

Metadata

Slug openclaw-voice-assistant

Version 1.0.4

License —

All-time Installs 1

Active Installs 1

Total Versions 5

Frequently Asked Questions

What is Voice Assistant?

Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds. Supports multi-turn conversation with automatic follow-up listening, mic suppression to prevent feedback, and a system tray with pause/resume. Recommended voices: Matilda (XrExE9yKIg1WjnnlVkGX, free tier) or Ivy (MClEFoImJXBTgLwdLI5n, paid tier). Fully customizable wake word, voice, hotkey, and silence thresholds. It is an AI Agent Skill for Claude Code / OpenClaw, with 820 downloads so far.

How do I install Voice Assistant?

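Run "/install openclaw-voice-assistant" in the OpenClaw or Claude Code chat to install it. Afterwards, pip-install the dependencies from requirements.txt and populate .env with your API keys and gateway settings before running the assistant.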

Is Voice Assistant free?

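Yes, the skill itself is free (open-source) to download, install, and use. The ElevenLabs and Picovoice services it calls require their own API keys, and ElevenLabs usage may have billing implications, although the default Matilda voice is available on the free tier.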

Which platforms does Voice Assistant support?

Voice Assistant runs on Windows (win32), wherever OpenClaw / Claude Code is available.

Who created Voice Assistant?

It is built and maintained by kurtivy (@kurtivy); the current version is v1.0.4.
