Voice Assistant

Name: Voice Assistant
Author: charantejmandali18

Description

Real-time voice assistant for OpenClaw. Streams mic audio through configurable STT (Deepgram or ElevenLabs) into your OpenClaw agent, then speaks the response via configurable TTS (Deepgram Aura or ElevenLabs). Sub-2s time-to-first-audio with full streaming at every stage.

Usage Guidance

This package implements the described voice pipeline and will stream your microphone audio and transcripts to third-party STT/TTS services (Deepgram and/or ElevenLabs) and to whatever OpenClaw gateway URL you provide. Before installing:

1) Be aware that you must supply API keys (DEEPGRAM_API_KEY and/or ELEVENLABS_API_KEY) and your OPENCLAW_GATEWAY_URL/OPENCLAW_MODEL. The registry metadata does NOT list these, so the manifest is misleading.
2) Only install if you trust the skill author and the third-party providers; audio and transcripts will leave your machine.
3) Inspect scripts/server.py locally (already included) and run it in a limited environment (local machine or sandbox) before granting broader access.
4) If you don’t want to expose real data, test with dummy keys and a local gateway first.
5) Consider updating the manifest to correctly declare required secrets (primaryEnv should reference the actual API key variable) or ask the publisher for clarification.
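A preflight check can make the undeclared requirements visible before the server is started. The following is a minimal sketch, not part of the skill: the variable names come from the guidance above, while the function name `missing_settings` and the rule that one provider key is sufficient (read from the "and/or" wording) are assumptions.

```python
import os
from typing import Mapping

# Variable names are taken from the guidance above; the grouping is an assumption.
API_KEYS = ("DEEPGRAM_API_KEY", "ELEVENLABS_API_KEY")
GATEWAY_SETTINGS = ("OPENCLAW_GATEWAY_URL", "OPENCLAW_MODEL")


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Describe every required setting that is absent or blank in env."""
    problems = []
    # Assumption: a single provider key is enough ("and/or" in the guidance).
    if not any(env.get(name, "").strip() for name in API_KEYS):
        problems.append(" or ".join(API_KEYS))
    # Both gateway settings are treated as mandatory.
    for name in GATEWAY_SETTINGS:
        if not env.get(name, "").strip():
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Report names only; never print the secret values themselves.
    for item in missing_settings(os.environ):
        print(f"missing: {item}")
```

Passing a plain dictionary instead of `os.environ` makes it possible to try the check with dummy keys and a local gateway URL, in line with item 4 above.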

Capability Analysis

Type: OpenClaw Skill
Name: voice-assistant
Version: 0.1.0

The OpenClaw Voice Assistant skill is designed to provide a real-time voice interface. It runs a local FastAPI server (`scripts/server.py`) that handles audio streaming from the browser, interacts with external Speech-to-Text (STT) and Text-to-Speech (TTS) providers (Deepgram/ElevenLabs), and communicates with the OpenClaw gateway. The skill requires API keys for STT/TTS services, which are loaded from a local `.env` file. All network access (to STT/TTS APIs and the OpenClaw gateway) and file operations (reading `.env`, serving static files) are directly aligned with its stated purpose. The `SKILL.md` instructions guide the OpenClaw agent to perform setup and execution tasks (e.g., `cp .env.example .env`, `uv run scripts/server.py`), which are necessary for the skill's operation and do not exhibit prompt injection attempts for malicious ends. No evidence of data exfiltration, malicious execution, persistence mechanisms, or obfuscation for harmful intent was found.
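The shape of the pipeline described above (audio in, transcript, agent reply, audio out, with each stage streaming into the next) can be sketched with chained async generators. This is an illustration only and not the code in `scripts/server.py`: every function here is a hypothetical stand-in that makes no network calls, and the stubbed agent simply upper-cases each word so the data flow is easy to follow.

```python
import asyncio
from typing import AsyncIterator


async def mic(words: list[str]) -> AsyncIterator[bytes]:
    """Hypothetical microphone: emits one audio chunk per word."""
    for word in words:
        await asyncio.sleep(0)  # yield control, as a real audio source would
        yield word.encode("utf-8")


async def fake_stt(chunks: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stand-in for a streaming STT provider."""
    async for chunk in chunks:
        yield chunk.decode("utf-8")


async def fake_agent(words: AsyncIterator[str]) -> AsyncIterator[str]:
    """Stand-in for the gateway: emits a reply token per transcript word."""
    async for word in words:
        yield word.upper()


async def fake_tts(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stand-in for a streaming TTS provider."""
    async for token in tokens:
        yield f"<audio:{token}>".encode("utf-8")


async def run_pipeline(words: list[str]) -> list[bytes]:
    """Chain the stages; each output chunk is produced as soon as its input arrives."""
    played = []
    async for audio in fake_tts(fake_agent(fake_stt(mic(words)))):
        played.append(audio)
    return played


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(["hello", "world"])))
```

Because no stage waits for the previous one to finish, the first output chunk is available after the first input chunk has passed through all three stages, which is the property a low time-to-first-audio depends on.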

Capability Assessment

⚠ Purpose & Capability

The code and SKILL.md implement a real-time STT→LLM→TTS voice pipeline (Deepgram/ElevenLabs + OpenClaw gateway), which matches the name/description. However, the registry metadata is inconsistent: it declares no required env vars and lists VOICE_STT_PROVIDER as the primary credential, but the server actually expects and uses sensitive API keys (DEEPGRAM_API_KEY, ELEVENLABS_API_KEY) plus OPENCLAW_GATEWAY_URL/OPENCLAW_MODEL. The primaryEnv should point at a secret like DEEPGRAM_API_KEY/ELEVENLABS_API_KEY (not the provider selector). As published, the manifest understates what the skill needs in order to run.

✓ Instruction Scope

SKILL.md provides concrete runtime instructions (copy .env.example to .env, fill in API keys, run uv run scripts/server.py, open browser). The runtime instructions and server code only reference expected files (.env) and the OpenClaw gateway; they stream microphone audio to configured STT/TTS providers and the OpenClaw gateway as described. There are no instructions to read unrelated system files or exfiltrate secrets beyond the STT/TTS and gateway endpoints.
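After filling in `.env`, it can be useful to confirm which settings are present without echoing any secret values. The sketch below assumes the file uses plain `KEY=VALUE` lines with optional `#` comments; how `scripts/server.py` actually reads the file is not shown in this listing, and `parse_env_text` is a hypothetical helper.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and one layer of surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


if __name__ == "__main__":
    from pathlib import Path

    env_file = Path(".env")
    if env_file.exists():
        # Print only the names of non-empty settings, never their values.
        for name, value in parse_env_text(env_file.read_text()).items():
            if value:
                print(f"set: {name}")
```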

✓ Install Mechanism

The install spec is a single brew formula, 'uv', which is a standard package-manager install path (lower risk). The skill includes Python code and a pyproject.toml declaring normal Python dependencies (fastapi, uvicorn, httpx, websockets). No arbitrary downloads, URL shorteners, or extracted remote archives are present in the provided install spec.

⚠ Credentials

The skill requires multiple sensitive environment variables at runtime (DEEPGRAM_API_KEY, ELEVENLABS_API_KEY, OPENCLAW_GATEWAY_URL, OPENCLAW_MODEL) but the registry metadata lists no required env vars and sets primaryEnv to VOICE_STT_PROVIDER (a non-secret). This is misleading: users will need to supply API keys for third-party STT/TTS providers and a gateway URL, but the manifest does not declare them. Requesting multiple third-party API keys is reasonable for a voice skill, but the metadata/manifest should reflect that clearly.
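A mismatch of this kind can be checked mechanically by listing the environment variables a script references and comparing them with what the manifest declares. The following is a rough sketch under stated assumptions: it only recognizes the common `os.environ[...]`, `os.environ.get(...)`, and `os.getenv(...)` forms with literal names, and the function names and the `declared` set are hypothetical rather than part of any registry tooling.

```python
import re

# Matches os.environ["NAME"], os.environ.get("NAME"), and os.getenv("NAME").
ENV_REFERENCE = re.compile(
    r"""os\.(?:environ\.get|environ|getenv)\s*[\(\[]\s*["']([A-Za-z_][A-Za-z0-9_]*)["']"""
)


def referenced_env_vars(source: str) -> set[str]:
    """Collect environment variable names referenced by literal string."""
    return set(ENV_REFERENCE.findall(source))


def undeclared(source: str, declared: set[str]) -> list[str]:
    """Names the source reads that the manifest does not declare."""
    return sorted(referenced_env_vars(source) - declared)


if __name__ == "__main__":
    from pathlib import Path

    script = Path("scripts/server.py")
    if script.exists():
        # Assumption: the manifest declares only the provider selector.
        print(undeclared(script.read_text(), {"VOICE_STT_PROVIDER"}))
```

A regex scan like this can miss variables read indirectly (for example through a settings library), so an empty result is not proof that nothing else is required.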

✓ Persistence & Privilege

The skill does not request always:true and does not modify other skills or system-wide settings. It runs as a local server and uses normal network connections to STT/TTS providers and the OpenClaw gateway. Autonomous invocation remains possible (platform default) but is not combined with unusual privileges here.

Version History

v0.1.0

Initial release: real-time voice interface with configurable STT (Deepgram/ElevenLabs) and TTS (Deepgram/ElevenLabs), sub-2s latency, barge-in support

Metadata

Slug voice-assistant

Version 0.1.0

License —

All-time Installs 17

Active Installs 15

Total Versions 1

Frequently Asked Questions

What is Voice Assistant?

Real-time voice assistant for OpenClaw. Streams mic audio through configurable STT (Deepgram or ElevenLabs) into your OpenClaw agent, then speaks the response via configurable TTS (Deepgram Aura or ElevenLabs). Sub-2s time-to-first-audio with full streaming at every stage. It is an AI Agent Skill for Claude Code / OpenClaw, with 1875 downloads so far.

How do I install Voice Assistant?

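Run "/install voice-assistant" in the OpenClaw or Claude Code chat to install it. Before first use, copy .env.example to .env, fill in your API keys and gateway settings, and start the server with uv run scripts/server.py.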

Is Voice Assistant free?

Yes, Voice Assistant is free (open-source) to download, install, and use. You still need to supply your own API keys for Deepgram and/or ElevenLabs.

Which platforms does Voice Assistant support?

Voice Assistant is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Voice Assistant?

It is built and maintained by Charan Tej Mandali (@charantejmandali18); the current version is v0.1.0.
