Vocal Chat
by Rubén Fernández Boullón · v1.0.0
Downloads: 3639
Stars: 12
Active Installs: 21
Versions: 1
Install in OpenClaw
/install vocal-chat
Description
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
Usage Guidance
Before installing or enabling this skill, verify the following:
(1) Confirm which binaries and scripts it requires (ffmpeg, whisper-cpp, sherpa-onnx-tts, tools/transcribe_voice.sh, bin/sherpa-onnx-tts) and install them from trusted sources; the manifest currently lists none.
(2) Ensure your agent actually has a 'message' tool and a WhatsApp integration set up, and understand what credentials or API access that requires; the skill does not declare any credentials.
(3) Ask the publisher to update the manifest to list required binaries, install instructions, and any needed credentials.
(4) Consider running the skill in a sandbox or test account first, since audio processing can involve sensitive content and the skill assumes local filesystem access, which could fail or be abused.
(5) Note that the performance constraint (RTF < 0.5) may be unrealistic for local models and could lead to degraded behavior; confirm resource needs.
If the publisher cannot clarify these gaps, treat the skill as untrusted.
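To make the RTF < 0.5 constraint concrete, here is a minimal sketch. It assumes the conventional definition of real-time factor (processing time divided by the duration of the audio processed), which the listing itself does not spell out. The function names and the `max_rtf` parameter are hypothetical and are not part of the skill.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (assumed RTF definition)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def meets_budget(processing_seconds: float, audio_seconds: float,
                 max_rtf: float = 0.5) -> bool:
    """Check a measured run against an RTF budget (0.5 is taken from the listing)."""
    return real_time_factor(processing_seconds, audio_seconds) < max_rtf


if __name__ == "__main__":
    # A 10-second clip processed in 2 seconds has RTF 0.2 and fits the budget.
    print(real_time_factor(2.0, 10.0), meets_budget(2.0, 10.0))
    # The same clip processed in 6 seconds has RTF 0.6 and does not.
    print(real_time_factor(6.0, 10.0), meets_budget(6.0, 10.0))
```

Under this definition, RTF < 0.5 means a 10-second voice note must be fully processed in under 5 seconds. Timing the local models on representative audio is one way to confirm whether the host can meet that.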
Capability Analysis
Type: OpenClaw Skill
Name: vocal-chat
Version: 1.0.0
The skill bundle describes a 'walkie-talkie' mode for an AI agent, enabling voice-to-voice conversations using local transcription and text-to-speech. The `SKILL.md` clearly outlines the workflow, triggers, and constraints, explicitly stating the use of 'local tools only' (ffmpeg, whisper-cpp, sherpa-onnx-tts). There is no evidence of intentional harmful behavior such as data exfiltration, malicious execution, persistence, or prompt injection attempts against the agent from the skill's instructions themselves. The instructions are aligned with the stated purpose and do not exhibit high-risk behaviors.
Capability Assessment
Purpose & Capability
The description (voice-to-voice on WhatsApp) is plausible, but the manifest declares no required binaries, no install steps, and no WhatsApp integration credentials or endpoints. The SKILL.md explicitly requires local tools (ffmpeg, whisper-cpp, sherpa-onnx-tts) and scripts (tools/transcribe_voice.sh, bin/sherpa-onnx-tts) which are not declared in the registry metadata. This mismatch between the declared and the actual requirements means the skill may fail or assume access it hasn't requested.
Instruction Scope
The instructions tell the agent to run local scripts and binaries and to send audio via a `message` tool, but they do not explain how incoming audio is surfaced to the agent, where the scripts come from, or what the `message` tool's required parameters/permissions are. The SKILL.md restricts use to 'local tools only' (no cloud) and asks the agent to always return both text and audio — no steps ask to read unrelated files or environment variables, but the instructions assume filesystem and binary access that aren't guaranteed.
Install Mechanism
There is no install spec (instruction-only), which lowers install risk. However, the skill depends on external binaries and scripts that would need to be present on the host. The lack of an install mechanism or references to known release sources means the agent or operator must manually install/verify those dependencies; that operational gap is noteworthy but not inherently malicious.
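Because the dependencies must be verified by hand, a pre-flight check along the following lines may help. This is a sketch under assumptions: the names are taken from the review above, but the executable names as they appear on `PATH` may differ on a given host (a whisper-cpp build, for example, may install its binary under another name), and the function `find_missing` and its `root` parameter are hypothetical.

```python
import shutil
from pathlib import Path

# Names as listed in the review; actual executable names on PATH are an assumption.
REQUIRED_BINARIES = ("ffmpeg", "whisper-cpp", "sherpa-onnx-tts")
# Script paths as listed in the review, resolved relative to a chosen root directory.
REQUIRED_SCRIPTS = ("tools/transcribe_voice.sh", "bin/sherpa-onnx-tts")


def find_missing(binaries=REQUIRED_BINARIES, scripts=REQUIRED_SCRIPTS, root="."):
    """Return the binaries not found on PATH and the scripts not found under root."""
    missing = [name for name in binaries if shutil.which(name) is None]
    base = Path(root)
    missing += [rel for rel in scripts if not (base / rel).is_file()]
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

The check only confirms that the files exist. It does not verify where they came from, so installing them from trusted sources remains the operator's responsibility.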
Credentials
The skill declares no environment variables or credentials, which is consistent with its claim to use local-only tools. However, because it targets WhatsApp conversations, the absence of any declared messaging/WhatsApp credential or integration details is suspicious — the skill assumes the agent has access to a messaging tool capable of sending files but doesn't declare what access is required.
Persistence & Privilege
The skill does not request always:true and uses default invocation settings. It does not attempt to modify system-wide settings in the provided instructions. No persistence or elevated platform privileges are requested in the manifest.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install vocal-chat
- After installation, invoke the skill by name or use /vocal-chat
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
- Updated skill.
Frequently Asked Questions
What is Vocal Chat?
Vocal Chat is an AI agent skill for OpenClaw / Claude Code that handles voice-to-voice conversations on WhatsApp. It automatically transcribes incoming audio and responds with local TTS audio, for use when the user wants to "talk" instead of type. It has 3639 downloads so far.
How do I install Vocal Chat?
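Run "/install vocal-chat" in the OpenClaw or Claude Code chat. The skill itself has no install spec, but it depends on local tools (ffmpeg, whisper-cpp, sherpa-onnx-tts) and scripts that the manifest does not install, so these must be set up separately on the host.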
Is Vocal Chat free?
Yes, Vocal Chat is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Vocal Chat support?
Vocal Chat runs anywhere OpenClaw / Claude Code is available, provided the local tools it depends on (ffmpeg, whisper-cpp, sherpa-onnx-tts) are installed on that platform.
Who created Vocal Chat?
It is built and maintained by Rubén Fernández Boullón (@rubenfb23); the current version is v1.0.0.