Voice Agent

Name: Voice Agent
Author: ricardotrevisan

By Ricardo Trevisan · GitHub ↗ · v1.1.0

cross-platform ⚠ suspicious

Total downloads: 3716

Current installs: 27

Versions: 3

Install in OpenClaw

/install voice-agent

Description

Local Voice Input/Output for Agents using the AI Voice Agent API.

Security recommendations

This client-only skill is coherent, but before installing or using it:

1) Ensure you actually run and trust the backend that must be reachable at http://localhost:8000. The backend handles Whisper and AWS Polly and holds any cloud credentials; review its source or run it locally in an isolated environment.
2) Be aware that the client uploads the audio file you specify to localhost and writes synthesized audio to the output path you provide. Avoid pointing it at sensitive files or at paths where overwriting is a risk.
3) The client reads entire files into memory for upload, so very large files may cause memory pressure.
4) If you rely on production AWS credentials, ensure the backend, not this client, stores and uses them securely.

If you want extra assurance, inspect and run the backend code locally before connecting the skill to non-test data.
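Points 2 and 3 can be checked before the client is ever invoked. The following is a minimal sketch, not part of the skill: the helper names `check_upload` and `check_output` and the 25 MB cap are assumptions chosen for illustration.

```python
import os

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumed cap; the skill defines no limit


def check_upload(path: str, max_bytes: int = MAX_UPLOAD_BYTES) -> int:
    """Return the file size if the input is safe to upload (hypothetical helper)."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"input audio not found: {path}")
    size = os.path.getsize(path)
    if size > max_bytes:
        # The client reads the whole file into memory, so reject large inputs early.
        raise ValueError(f"input is {size} bytes, over the {max_bytes}-byte cap")
    return size


def check_output(path: str) -> None:
    """Refuse an output path that already exists (hypothetical helper)."""
    if os.path.exists(path):
        raise FileExistsError(f"output path already exists: {path}")
```

Calling `check_upload` before a transcription and `check_output` before a synthesis turns both risks into explicit errors rather than silent memory pressure or an overwritten file.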

Analysis

Type: OpenClaw Skill
Name: voice-agent
Version: 1.1.0

The `scripts/client.py` file contains critical vulnerabilities that allow for arbitrary file read and write operations. The `transcribe` function reads the content of an arbitrary file path provided as an argument and sends it to `http://localhost:8000/transcribe`. Similarly, the `synthesize` function writes the generated audio content to an arbitrary file path provided as an argument. These flaws, while not explicitly malicious in intent, enable an attacker to read sensitive local files or write arbitrary content to any location on the filesystem, posing a significant risk for data exfiltration or local privilege escalation.
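One common mitigation for this class of issue is to confine user-supplied paths to a single working directory before any read or write. The sketch below is illustrative only: `resolve_inside` is a hypothetical helper, not a function in `scripts/client.py`, and the listing does not say whether the author applies any such check.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path and require it to stay under base_dir (hypothetical helper)."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, which the check below catches.
    candidate = (base / user_path).resolve()
    # resolve() collapses ".." segments and follows symlinks before the comparison.
    if base not in candidate.parents:
        raise ValueError(f"{user_path!r} is not inside {base}")
    return candidate
```

A wrapper could pass both the audio input path and the synthesis output path through this function, so that `../` traversal and absolute paths outside the chosen directory fail with `ValueError`.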

Capability assessment

✓ Purpose & Capability

The name/description (local voice I/O) matches the included client.py and SKILL.md: the skill is a file-based client that calls a local backend for Whisper STT and AWS Polly TTS. It does claim use of 'local Whisper' and 'AWS Polly' but those services are invoked by the backend at localhost:8000, not by the client — this is reasonable and proportionate for a client-only skill.

✓ Instruction Scope

SKILL.md clearly limits runtime behavior to running the provided client script (transcribe, synthesize, health) against the local backend and explicitly forbids service management. The client uploads user-selected audio files to http://localhost:8000 and writes synthesized audio to a user-specified output path. It does not read other system files or access environment variables beyond standard Python operation.
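Because every command depends on the local backend, a reachability check is a natural first step. This is a rough sketch using only the standard library; the `/health` route and the `backend_is_up` name are assumptions, since the listing names a health command but not its endpoint or implementation.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address named in the listing
HEALTH_PATH = "/health"  # assumed route; confirm against the backend's docs


def backend_is_up(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if the backend answers the assumed health route with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + HEALTH_PATH, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as down.
        return False
```

If this returns `False`, the skill's stated scope means the user, not the skill, is responsible for starting the backend.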

✓ Install Mechanism

There is no install spec (instruction-only) and included code is zero-dependency Python using the stdlib urllib — nothing is downloaded or installed automatically. This is low-risk from an install perspective.

ℹ Credentials

The skill declares no required env vars or credentials, which is consistent because the client talks to a local backend. However, SKILL.md mentions AWS Polly and local Whisper; those will require credentials/configuration in the backend (not in this package). Users should be aware the backend — not this client — will hold any cloud credentials.

✓ Persistence & Privilege

The skill is not marked always:true, does not persist or modify other skills, and does not request elevated privileges. It is user-invocable and uses the agent only when invoked.

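How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install voice-agent
Once installation completes, invoke the Skill by name or trigger it with /voice-agent
Provide the required inputs as described in the Skill's parameter documentation to get structured output.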

Version history

v1.1.0

- Switched to client-only operation: service management and container startup are no longer included.
- Now requires an external backend API running at http://localhost:8000; setup instructions moved to repo documentation.
- Uses local Whisper for speech-to-text and AWS Polly for text-to-speech.
- Updated documentation and workflows to clarify prerequisites and error handling.
- Removed scripts/start.sh; skill no longer attempts to start backend services automatically.

v1.0.1

voice-agent 1.0.1
- Added documentation for starting the voice agent service if a health check fails or a connection error occurs.
- Updated example output file extension for synthesized audio from .ogg to .mp3 in documentation.
- No functional code changes included.

v1.0.0

Initial release of the Voice Agent skill.
- Enables local speech-to-text and text-to-speech interactions using the AI Voice Agent API.
- Prioritizes audio input and output; responses to audio input are delivered primarily as audio files.
- Provides clear workflow and guidelines to ensure seamless voice-based user interactions.
- Includes easy-to-use commands for audio transcription, speech synthesis, and system health checks.

Metadata

Slug voice-agent

Version 1.1.0

License: not specified

Total installs 28

Current installs 27

Versions 3

FAQ

What is Voice Agent?

Local Voice Input/Output for Agents using the AI Voice Agent API. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 3716 total downloads to date.

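How do I install Voice Agent?

Run the command "/install voice-agent" in the OpenClaw or Claude Code chat box to install it in one step. The skill itself needs no further configuration, but as of v1.1.0 it requires an external backend API running at http://localhost:8000.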

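Is Voice Agent free?

Yes. Voice Agent is completely free (free and open source) and can be downloaded, installed, and used without charge.

Which platforms does Voice Agent support?

Voice Agent is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Voice Agent?

It is developed and maintained by Ricardo Trevisan (@ricardotrevisan); the current version is v1.1.0.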