Voice Assistant

Name: Voice Assistant
Author: kurtivy

Description

Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds. Supports multi-turn conversation with automatic follow-up listening, mic suppression to prevent feedback, and a system tray with pause/resume. Recommended voices: Matilda (XrExE9yKIg1WjnnlVkGX, free tier) or Ivy (MClEFoImJXBTgLwdLI5n, paid tier). Fully customizable wake word, voice, hotkey, and silence thresholds.
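
The listing does not show how the configurable silence thresholds are applied. As a rough illustration only, the sketch below shows one common way to decide that a speaker has finished: compute the RMS energy of each audio frame and stop after several consecutive quiet frames. The function names, the threshold of 500, and the three-frame window are all assumptions made for this example and are not taken from the skill's code.

```python
import math
from typing import Iterable, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square energy of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def frames_until_silence(
    frames: Iterable[Sequence[int]],
    threshold: float = 500.0,   # hypothetical default, tune per microphone
    max_quiet_frames: int = 3,  # hypothetical default
) -> int:
    """Return how many frames were consumed before the utterance ended.

    The utterance is treated as finished once `max_quiet_frames`
    consecutive frames fall below `threshold`. If that never happens,
    every frame is consumed.
    """
    quiet = 0
    consumed = 0
    for frame in frames:
        consumed += 1
        if rms(frame) < threshold:
            quiet += 1
            if quiet >= max_quiet_frames:
                return consumed
        else:
            # Any loud frame resets the run of quiet frames.
            quiet = 0
    return consumed
```

A lower threshold or a longer quiet window makes the assistant wait longer before it stops recording, which suits slow speakers at the cost of added latency.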

Security Recommendations

This skill appears to do what it says: it listens on your microphone, transcribes speech locally, sends text to your configured OpenClaw gateway, and uses ElevenLabs to generate speech audio. Before installing:

- Understand what the keys you provide do: with ELEVENLABS_API_KEY set, the assistant's text is sent to ElevenLabs for TTS (possible privacy and billing implications). Use a dedicated API key if you are concerned about data exposure or costs.
- Keep GATEWAY_URL pointed to a local gateway (default is ws://127.0.0.1:18789). Do not supply a token that grants broad access to a remote gateway unless you trust that endpoint; the client authenticates as a node with operator.write in the example handshake.
- PORCUPINE_ACCESS_KEY is required for wake-word detection; Picovoice keys may expire.
- The code creates temporary WAV files and asset files under the skill folder; ensure you run it from a directory you control.
- The skill installs third-party Python packages and will (on first run) download Whisper models (~100–150MB, or larger for bigger models). Consider installing and running in a virtualenv or isolated environment.
- If you need stronger guarantees, review the included Python files yourself (we inspected them) or run the assistant in a sandboxed/isolated VM to monitor network calls and file writes.

Overall: coherent and consistent with its stated purpose, but treat the API keys and gateway token as sensitive and verify the gateway target before use.
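
One way to act on the advice about verifying the gateway target is a small pre-flight check that the configured URL points at the local machine. The sketch below is not part of the skill; `is_loopback_gateway` is a hypothetical helper that uses only the Python standard library.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_gateway(url: str) -> bool:
    """Return True if a ws:// or wss:// URL targets the local machine."""
    parsed = urlparse(url)
    if parsed.scheme not in ("ws", "wss"):
        return False
    host = parsed.hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and the IPv6 loopback address ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name is treated as non-local.
        return False
```

A launcher could call this on GATEWAY_URL and refuse to start, or ask for confirmation, when the result is False.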

Functional Analysis

Type: OpenClaw Skill
Name: openclaw-voice-assistant
Version: 1.0.4

The skill bundle implements a voice assistant for OpenClaw, utilizing local speech-to-text (faster-whisper), wake word detection (Porcupine), and text-to-speech (ElevenLabs). All network communication is directed to the local OpenClaw gateway, ElevenLabs, and Picovoice, which are essential for its stated functionality. Sensitive API keys are loaded from `.env` and used appropriately by their respective client libraries. The `SKILL.md` provides user-facing instructions, including an optional manual step for auto-start, but contains no prompt injection attempts against the AI agent or evidence of stealthy persistence. All dependencies are legitimate, and no malicious execution, data exfiltration, or obfuscation was found.

Capability Assessment

✓ Purpose & Capability

Name/description match the implementation. ELEVENLABS_API_KEY for TTS, PORCUPINE_ACCESS_KEY for Porcupine wake-word, and GATEWAY_URL/GATEWAY_TOKEN for the OpenClaw gateway are all required and used in the code. Requiring Python and listed pip packages is expected for a local Python assistant.

✓ Instruction Scope

SKILL.md instructions are concrete and limited to installing dependencies, populating .env, running the assistant, and optionally generating audio assets. Runtime code only reads .env (declared), listens to the microphone, connects to the configured gateway, calls ElevenLabs for TTS, and uses Picovoice and Whisper for wake/STT. There are no instructions to read unrelated system files or to transmit data to unexpected external endpoints beyond ElevenLabs and the configured gateway.

✓ Install Mechanism

No special install script or remote archive is pulled; the developer expects users to pip-install dependencies from requirements.txt. Dependencies are standard (pvporcupine, faster-whisper, elevenlabs, av, etc.). This has normal risk (third‑party packages, model downloads) but no unusual download URLs or extraction of arbitrary remote code.

ℹ Credentials

Requested environment variables (GATEWAY_URL, GATEWAY_TOKEN, ELEVENLABS_API_KEY, PORCUPINE_ACCESS_KEY) are all used by the code and justified by the skill's purpose. Caveats: ELEVENLABS_API_KEY is the primary credential and will cause assistant text to be sent to ElevenLabs for TTS (possible privacy/cost implications). GATEWAY_TOKEN is used to authenticate to the OpenClaw gateway (the client requests operator role/scopes in the handshake), so if you point GATEWAY_URL to a remote gateway or provide a token with broad scopes it could carry more risk; by default GATEWAY_URL targets localhost.
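
Because all four variables are required, a startup check that fails early and never prints full secrets can be useful. The following is a minimal sketch, not the skill's own code: `missing_vars` and `mask` are hypothetical helpers, and the sketch assumes the values from `.env` have already been loaded into a mapping such as `os.environ`.

```python
import os
from typing import List, Mapping

# Variable names as listed in the skill's requirements.
REQUIRED_VARS = (
    "GATEWAY_URL",
    "GATEWAY_TOKEN",
    "ELEVENLABS_API_KEY",
    "PORCUPINE_ACCESS_KEY",
)


def missing_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters, for safe logging."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        raise SystemExit("Missing required settings: " + ", ".join(missing))
    print("ELEVENLABS_API_KEY =", mask(os.environ["ELEVENLABS_API_KEY"]))
```

Masking matters here because the gateway token and API keys would otherwise end up in console output or log files.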

✓ Persistence & Privilege

always:false and user-invocable:true (defaults) — the skill does not request permanent platform-level inclusion. It does offer optional auto-start guidance (creating a startup shortcut), but that is a user action rather than an automatic change. The skill does not modify other skills or system-wide agent settings.

Version History

v1.0.4

Reduce gateway scope from operator.admin to operator.write (least privilege)

v1.0.3

Default to Matilda voice (free tier); recommend Ivy for paid. Free tier now supported.

v1.0.2

Improved description with implementation details and recommended voice

v1.0.1

Fix metadata: declare GATEWAY_TOKEN and GATEWAY_URL in requires.env, fix python3 -> python for Windows

v1.0.0

Initial release of OpenClaw Voice Assistant.

- Enables always-on, hands-free voice control for OpenClaw on Windows PCs.
- Features wake word detection (Porcupine), fast speech-to-text (faster-whisper), and natural text-to-speech (ElevenLabs).
- Flexible configuration with .env for API keys, models, hotkeys, and tuning.
- Custom wake word support and personalized chime/thinking sounds.
- System tray app with background mode and Windows auto-start instructions.
- Comprehensive troubleshooting and architecture documentation included.

Metadata

Slug openclaw-voice-assistant

Version 1.0.4

License —

Total installs 1

Current installs 1

Number of versions 5

FAQ

What is Voice Assistant?

Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds. Supports multi-turn conversation with automatic follow-up listening, mic suppression to prevent feedback, and a system tray with pause/resume. Recommended voices: Matilda (XrExE9yKIg1WjnnlVkGX, free tier) or Ivy (MClEFoImJXBTgLwdLI5n, paid tier). Fully customizable wake word, voice, hotkey, and silence thresholds. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 820 cumulative downloads to date.

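How do I install Voice Assistant?

Run the command "/install openclaw-voice-assistant" in the OpenClaw or Claude Code chat box for a one-click install. The install step itself needs no extra configuration, but before the first run you still need to pip-install the dependencies from requirements.txt and populate .env with GATEWAY_URL, GATEWAY_TOKEN, ELEVENLABS_API_KEY, and PORCUPINE_ACCESS_KEY.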

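Is Voice Assistant free?

Yes. Voice Assistant itself is free (listed as free and open source) to download, install, and use. Note that it calls ElevenLabs with your own API key: the default Matilda voice works on the free tier, while Ivy requires a paid tier.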

Which platforms does Voice Assistant support?

Voice Assistant is listed for win32: it is a Windows voice companion and runs in any OpenClaw / Claude Code environment deployed on Windows.

Who develops Voice Assistant?

Developed and maintained by kurtivy (@kurtivy); the current version is v1.0.4.
