Voice Assistant
Author: Charan Tej Mandali · v0.1.0
Total downloads: 1875
Favorites: 4
Current installs: 15
Versions: 1
Install in OpenClaw
/install voice-assistant
Description
Real-time voice assistant for OpenClaw. Streams mic audio through configurable STT (Deepgram or ElevenLabs) into your OpenClaw agent, then speaks the response via configurable TTS (Deepgram Aura or ElevenLabs). Sub-2s time-to-first-audio with full streaming at every stage.
Security recommendations
This package implements the described voice pipeline and will stream your microphone audio and transcripts to third-party STT/TTS services (Deepgram and/or ElevenLabs) and to whatever OpenClaw gateway URL you provide. Before installing:
1) Be aware you must supply API keys (DEEPGRAM_API_KEY and/or ELEVENLABS_API_KEY) and your OPENCLAW_GATEWAY_URL/OPENCLAW_MODEL. The registry metadata does NOT list these, so the manifest is misleading.
2) Only install if you trust the skill author and the third-party providers; audio and transcripts will leave your machine.
3) Inspect scripts/server.py locally (already included) and run it in a limited environment (local machine or sandbox) before granting broader access.
4) If you don't want to expose real data, test with dummy keys and a local gateway first.
5) Consider updating the manifest to correctly declare required secrets (primaryEnv should reference the actual API key variable) or ask the publisher for clarification.
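Since the manifest does not declare the variables named above, a small pre-flight check can show what is still unset before the server is started. The sketch below is not part of the skill. The function name `missing_settings` and the provider values `"deepgram"` and `"elevenlabs"` are assumptions; only the environment variable names come from this listing.

```python
# Variable names are taken from the listing; the provider labels are assumed.
PROVIDER_KEYS = {
    "deepgram": "DEEPGRAM_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
}
ALWAYS_REQUIRED = ("OPENCLAW_GATEWAY_URL", "OPENCLAW_MODEL")


def missing_settings(env, providers):
    """Return the required variable names that are unset or empty in env.

    env is any mapping (for example os.environ); providers lists the
    STT/TTS providers you intend to use.
    """
    required = list(ALWAYS_REQUIRED)
    for provider in providers:
        key = PROVIDER_KEYS.get(provider.lower())
        if key is None:
            raise ValueError(f"unknown provider: {provider!r}")
        if key not in required:
            required.append(key)
    return [name for name in required if not env.get(name)]
```

Calling `missing_settings(os.environ, ["deepgram"])` before starting the server returns an empty list when everything needed for a Deepgram-only setup is present. It can also be pointed at a dict of dummy values when testing against a local gateway.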
Functional analysis
Type: OpenClaw Skill
Name: voice-assistant
Version: 0.1.0
The OpenClaw Voice Assistant skill is designed to provide a real-time voice interface. It runs a local FastAPI server (`scripts/server.py`) that handles audio streaming from the browser, interacts with external Speech-to-Text (STT) and Text-to-Speech (TTS) providers (Deepgram/ElevenLabs), and communicates with the OpenClaw gateway. The skill requires API keys for STT/TTS services, which are loaded from a local `.env` file. All network access (to STT/TTS APIs and the OpenClaw gateway) and file operations (reading `.env`, serving static files) are directly aligned with its stated purpose. The `SKILL.md` instructions guide the OpenClaw agent to perform setup and execution tasks (e.g., `cp .env.example .env`, `uv run scripts/server.py`), which are necessary for the skill's operation and do not exhibit prompt injection attempts for malicious ends. No evidence of data exfiltration, malicious execution, persistence mechanisms, or obfuscation for harmful intent was found.
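The data flow described here (audio in, transcript to the agent, reply spoken back) can be pictured as a chain of streaming stages. The following is a minimal sketch using stand-in stages and plain asyncio. It does not call Deepgram, ElevenLabs, or the OpenClaw gateway, and none of the function names come from `scripts/server.py`; they are hypothetical.

```python
import asyncio


async def mic(chunks):
    """Stand-in microphone: yields pre-recorded chunks one at a time."""
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as a real audio source would
        yield chunk


async def fake_stt(audio_chunks):
    """Stand-in STT: decodes each audio chunk into one transcript word."""
    async for chunk in audio_chunks:
        yield chunk.decode("utf-8")


async def fake_agent(words):
    """Stand-in agent: produces a reply for each transcript word as it arrives."""
    async for word in words:
        yield f"heard {word}"


async def fake_tts(replies):
    """Stand-in TTS: encodes each reply as bytes, as if it were audio."""
    async for reply in replies:
        yield reply.encode("utf-8")


async def run_pipeline(chunks):
    """Chain the stages and collect the 'audio' they emit, in order."""
    out = []
    async for audio in fake_tts(fake_agent(fake_stt(mic(chunks)))):
        out.append(audio)
    return out
```

Because every stage is an async generator, the first output chunk is produced as soon as the first input chunk has passed through the chain, without waiting for the full utterance. A streaming design like this is one plausible way to keep time-to-first-audio low, though the skill's actual implementation may differ.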
Capability assessment
Purpose & Capability
The code and SKILL.md implement a real-time STT→LLM→TTS voice pipeline (Deepgram/ElevenLabs + OpenClaw gateway), which matches the name/description. However the registry metadata is inconsistent: it declares no required env vars and lists VOICE_STT_PROVIDER as the primary credential, but the server actually expects and uses sensitive API keys (DEEPGRAM_API_KEY, ELEVENLABS_API_KEY) plus OPENCLAW_GATEWAY_URL/OPENCLAW_MODEL. The primaryEnv should point at a secret like DEEPGRAM_API_KEY/ELEVENLABS_API_KEY (not the provider selector). The mismatch is confusing and understates the credentials the skill actually needs.
Instruction Scope
SKILL.md provides concrete runtime instructions (copy .env.example to .env, fill in API keys, run uv run scripts/server.py, open browser). The runtime instructions and server code only reference expected files (.env) and the OpenClaw gateway; they stream microphone audio to configured STT/TTS providers and the OpenClaw gateway as described. There are no instructions to read unrelated system files or exfiltrate secrets beyond the STT/TTS and gateway endpoints.
Install Mechanism
Install spec is a single brew formula 'uv' which is a standard package-manager install path (lower risk). The skill includes Python code and a pyproject.toml declaring normal Python dependencies (fastapi, uvicorn, httpx, websockets). No arbitrary downloads, URL shorteners, or extracted remote archives are present in the provided install spec.
Credentials
The skill requires multiple sensitive environment variables at runtime (DEEPGRAM_API_KEY, ELEVENLABS_API_KEY, OPENCLAW_GATEWAY_URL, OPENCLAW_MODEL) but the registry metadata lists no required env vars and sets primaryEnv to VOICE_STT_PROVIDER (a non-secret). This is misleading: users will need to supply API keys for third-party STT/TTS providers and a gateway URL, but the manifest does not declare them. Requesting multiple third-party API keys is reasonable for a voice skill, but the metadata/manifest should reflect that clearly.
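The mismatch described here amounts to a set difference between the variables the code reads and the variables the manifest declares. A minimal sketch, assuming the manifest is available as a dict: `primaryEnv` is the field named in this assessment, while `requiredEnv` and the function name `undeclared_env` are hypothetical.

```python
def undeclared_env(manifest, used_at_runtime):
    """Return runtime variables the manifest neither requires nor marks as primary."""
    declared = set(manifest.get("requiredEnv", []))  # hypothetical field name
    primary = manifest.get("primaryEnv")
    if primary:
        declared.add(primary)
    return sorted(set(used_at_runtime) - declared)


# Values mirror the assessment: no required vars, primaryEnv is the provider selector.
manifest = {"requiredEnv": [], "primaryEnv": "VOICE_STT_PROVIDER"}
used = [
    "DEEPGRAM_API_KEY",
    "ELEVENLABS_API_KEY",
    "OPENCLAW_GATEWAY_URL",
    "OPENCLAW_MODEL",
]
print(undeclared_env(manifest, used))
```

With the values above, all four runtime variables are reported as undeclared. A corrected manifest would bring that list down to empty.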
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system-wide settings. It runs as a local server and uses normal network connections to STT/TTS providers and the OpenClaw gateway. Autonomous invocation remains possible (platform default) but is not combined with unusual privileges here.
How to use
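- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install voice-assistant
- Once installation finishes, invoke the skill by name or trigger it with /voice-assistant
- Provide the required inputs according to the skill's parameter description to get structured output.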
Version history
v0.1.0
Initial release: real-time voice interface with configurable STT (Deepgram/ElevenLabs) and TTS (Deepgram/ElevenLabs), sub-2s latency, barge-in support
FAQ
What is Voice Assistant?
Real-time voice assistant for OpenClaw. Streams mic audio through configurable STT (Deepgram or ElevenLabs) into your OpenClaw agent, then speaks the response via configurable TTS (Deepgram Aura or ElevenLabs). Sub-2s time-to-first-audio with full streaming at every stage. It is an AI agent skill plugin for Claude Code / OpenClaw, with 1875 total downloads to date.
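How do I install Voice Assistant?
Run the command `/install voice-assistant` in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no extra configuration, but running the skill requires API keys and gateway settings in a local `.env` file.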
Is Voice Assistant free?
Yes. Voice Assistant is completely free (free and open source) and can be freely downloaded, installed, and used.
Which platforms does Voice Assistant support?
Voice Assistant is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Voice Assistant?
It is developed and maintained by Charan Tej Mandali (@charantejmandali18). The current version is v0.1.0.