← 返回 Skills 市场
openclaw-voice
作者
frank-bot07
· GitHub ↗
· v1.0.0
722
总下载
2
收藏
4
当前安装
1
版本数
在 OpenClaw 中安装
/install openclaw-voice
功能描述
Transcribe audio to text and generate spoken AI responses using Whisper and ElevenLabs via CLI with transcript storage and search.
安全使用建议
What doesn't add up: the README and SKILL.md promise Whisper/ElevenLabs STT/TTS and a Twilio/real-time calling roadmap, but the shipped code only implements a CLI-backed SQLite transcript/profile manager and an 'interchange' MD generator. Before installing or supplying API keys:
- Ask the author which features are implemented now vs. planned. If you expect live STT/TTS or calling, confirm where that code lives and how it will be executed.
- Treat the interchange/voice directory as potentially public to other local skills: it writes MD summaries into the workspace root. If transcripts are sensitive, run the skill in an isolated workspace or change the output path.
- Don't provide Twilio/ElevenLabs/Anthropic keys until you see explicit code that uses them and you understand where audio/text will be sent and stored.
- If you plan to run npm install, be aware better-sqlite3 has native build steps (normal but requires build tooling).
Given the mismatches, proceed carefully and request clarification from the package owner; the inconsistencies look more like incomplete/unstable engineering than clearly malicious code, but they affect trust and data exposure.
功能分析
Type: OpenClaw Skill
Name: openclaw-voice
Version: 1.0.0
The skill contains multiple vulnerabilities. A path traversal vulnerability exists in `src/backup.js` (exposed via `src/cli.js`'s `backup` and `restore` commands), allowing user-controlled paths to potentially write files or create directories in arbitrary locations. More critically, `src/interchange.js` writes user-controlled data (conversation summaries and voice profile descriptions) directly into markdown files (`interchange/voice/state/recent.md` and `interchange/voice/ops/profiles.md`). Since `SKILL.md` and `README.md` explicitly state these interchange files are read by other AI agents, this creates a significant prompt injection vulnerability, allowing an attacker to inject malicious instructions into other agents.
能力评估
Purpose & Capability
The package description and SKILL.md claim Whisper STT, ElevenLabs TTS and (in v1.1) Twilio/Claude realtime call handling. The actual code provides CLI DB management, transcript storage, profile management, backups, and file-based interchange generation but contains no code that calls Whisper, ElevenLabs, Twilio, or external LLM APIs. Dependencies in package.json are only better-sqlite3, commander, and uuid. This is a substantive mismatch between claimed capabilities and implemented capabilities.
Instruction Scope
SKILL.md and VOICE_CALLING_SPEC.md describe use of child_process wrappers for sox/rec/ffplay, realtime WebSocket media servers, and many cloud API flows; none of those commands/APIs appear in the runtime code. The interchange generator writes .md files into a workspace-level 'interchange/voice' directory (three levels up from src), which will make conversation summaries available to other agents/tools on the same workspace. That file-write behavior is explicit and may expose transcripts or metadata beyond the local DB.
Install Mechanism
No external install script or remote downloads are declared; this is an instruction-plus-code skill that relies on standard npm packages (present in package.json and package-lock). There are no URLs or archive extracts in the install spec. Installing via npm would be the normal way to get native dependencies like better-sqlite3 (which has native build steps).
Credentials
Registry metadata lists no required env vars, but VOICE_CALLING_SPEC.md documents multiple sensitive environment variables (TWILIO_*, ELEVENLABS_API_KEY, ANTHROPIC_API_KEY, etc.) for planned features. The current code does not read those env vars, so requesting none is internally consistent for v1 — but the docs indicate future features that will require many secrets. Also, generateInterchange writes conversation summaries into a shared workspace directory; if you later enable networked TTS/STT or calling features, those transcripts could be shared externally if combined with other skills.
Persistence & Privilege
The skill does not request always:true and does not appear to modify other skills. It creates files and directories: data/voice.db, backups/, and workspace-level interchange/voice ops/state files. That gives it persistent disk state and the ability to expose data to other local skills via the interchange files — a functional but noteworthy level of presence.
如何使用
- 确保已安装 OpenClaw(本地或 Docker 部署)
- 在对话框中输入安装命令:
/install openclaw-voice - 安装完成后,直接呼叫该 Skill 的名称或使用
/openclaw-voice触发 - 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
Initial release: Voice interaction. 10 tests.
元数据
常见问题
openclaw-voice 是什么?
Transcribe audio to text and generate spoken AI responses using Whisper and ElevenLabs via CLI with transcript storage and search. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 722 次。
如何安装 openclaw-voice?
在 OpenClaw 或 Claude Code 对话框中运行命令「/install openclaw-voice」即可一键安装,无需额外配置。
openclaw-voice 是免费的吗?
是的,openclaw-voice 完全免费(开源免费),可自由下载、安装和使用。
openclaw-voice 支持哪些平台?
openclaw-voice 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。
谁开发了 openclaw-voice?
由 frank-bot07(@frank-bot07)开发并维护,当前版本 v1.0.0。
推荐 Skills