Walkie-Talkie Mode
by Rubén Fernández Boullón · v1.0.0
Downloads: 1747 · Stars: 1 · Active Installs: 2 · Versions: 1
Install in OpenClaw
/install walkie-talkie-mode
Description
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
Usage Guidance
This skill's behavior (transcribe incoming audio, produce local TTS, send .ogg back) matches its description, but the SKILL.md depends on local tools and scripts that are not declared anywhere. Before installing or enabling:
1) Verify that the agent environment actually has the required binaries (ffmpeg, whisper-cpp, sherpa-onnx-tts) and the helper script paths (tools/transcribe_voice.sh, bin/sherpa-onnx-tts).
2) Ask the author to update the metadata to list required binaries, exact paths, and any model files or hardware needs.
3) Confirm that the 'message' tool used to send files is the authorized platform tool (so audio is sent only to the intended chat) and that no unexpected external endpoints are contacted.
4) Review file permissions around /tmp and any model data to avoid exposing unrelated data.
5) Test in a sandboxed agent first. If the required local tools are missing, the skill will fail, and if files are later created at the expected paths, the agent may run whatever programs they contain.
If you cannot verify or supply these undeclared dependencies, treat this skill as untrusted.
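The first check above can be scripted. The following is a minimal sketch, assuming the three tools are exposed on PATH under the names used in this listing (the actual executable names may differ per installation) and that the helper paths are relative to the agent's working directory. `find_missing` is a hypothetical helper and not part of the skill.

```python
import os
import shutil

# Names taken from this listing; the real executable names on a given
# system are an assumption and may differ.
REQUIRED_BINARIES = ["ffmpeg", "whisper-cpp", "sherpa-onnx-tts"]
REQUIRED_PATHS = ["tools/transcribe_voice.sh", "bin/sherpa-onnx-tts"]


def find_missing(binaries=REQUIRED_BINARIES, paths=REQUIRED_PATHS, root="."):
    """Return binaries not found on PATH, followed by helper paths that
    are missing or not executable under `root`."""
    missing = [name for name in binaries if shutil.which(name) is None]
    for rel in paths:
        full = os.path.join(root, rel)
        if not (os.path.isfile(full) and os.access(full, os.X_OK)):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

Running it from the agent's working directory before enabling the skill gives a quick yes/no answer; it does not verify model files or that the found binaries are the expected ones.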
Capability Analysis
Type: OpenClaw Skill
Name: walkie-talkie-mode
Version: 1.0.0
The skill bundle describes a 'walkie-talkie' mode for voice-to-voice conversations, primarily using local tools for transcription and text-to-speech. The `SKILL.md` file clearly outlines the workflow, triggers, and constraints, instructing the agent to execute local scripts (`tools/transcribe_voice.sh`) and binaries (`bin/sherpa-onnx-tts`) for its stated purpose. There is no evidence of prompt injection attempting to subvert the agent, exfiltrate data, establish persistence, or perform other malicious actions. The instructions are straightforward and align with the benign functionality described.
Capability Assessment
Purpose & Capability
The name/description (voice-to-voice WhatsApp) matches the SKILL.md workflow, but the skill metadata declares no required binaries, env vars, or installs while the instructions explicitly require local tooling (ffmpeg, whisper-cpp, sherpa-onnx-tts), a helper script (tools/transcribe_voice.sh), and a local TTS binary (bin/sherpa-onnx-tts). That inconsistency means the skill either omits necessary requirements or assumes access to arbitrary local executables.
Instruction Scope
Runtime instructions tell the agent to run local scripts/binaries and read/write files (e.g., /tmp/reply.ogg) and to use a 'message' tool to send files. These actions are coherent with the stated purpose, but they reference specific local paths and tools not declared in metadata. This grants the skill broad discretion to execute unspecified local programs and rely on local model artifacts.
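The flow described here (transcribe, reply, synthesize, send) can be sketched as below. This is an illustration under stated assumptions, not the skill's actual implementation: the listing does not document the interfaces of tools/transcribe_voice.sh, bin/sherpa-onnx-tts, or the 'message' tool, so the synthesis and sending steps are passed in as callables, and `transcribe_with_script` assumes the script takes the audio path as its only argument and prints the transcript to stdout. All function names are hypothetical.

```python
import subprocess
from typing import Callable


def transcribe_with_script(audio_path: str,
                           script: str = "tools/transcribe_voice.sh") -> str:
    """Run the helper script and return its stdout as the transcript.

    ASSUMPTION: the script accepts the audio path as its only argument
    and prints the transcript to stdout; its real interface is not
    documented in the listing.
    """
    result = subprocess.run([script, audio_path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def handle_voice_message(audio_path: str,
                         transcribe: Callable[[str], str],
                         compose_reply: Callable[[str], str],
                         synthesize: Callable[[str, str], None],
                         send_file: Callable[[str, str], None],
                         out_path: str = "/tmp/reply.ogg") -> str:
    """Transcribe the audio, compose a reply, synthesize it to
    `out_path`, send the file with the reply text, and return the text."""
    transcript = transcribe(audio_path)
    reply_text = compose_reply(transcript)
    synthesize(reply_text, out_path)  # e.g. a wrapper around the local TTS binary
    send_file(out_path, reply_text)   # e.g. a wrapper around the 'message' tool
    return reply_text
```

Keeping each step behind a callable makes it explicit which local program each one runs, which is the information the skill metadata currently omits.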
Install Mechanism
There is no install spec (lowest install risk), which is fine for an instruction-only skill. Here, however, it is problematic because the skill expects several local binaries and scripts. Because nothing will be installed by the skill, the operator must supply these dependencies; the missing install/dependency declarations are an integrity/usability risk.
Credentials
The skill requests no environment variables or credentials (appropriate). However, it implicitly requires access to local filesystem paths and local model binaries; the SKILL.md does not request or document any permissions or configuration for those resources.
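One way to narrow the filesystem exposure noted above is to write each reply into a private, per-reply directory instead of a fixed path such as /tmp/reply.ogg. The sketch below uses Python's standard `tempfile` module; `private_reply_path` is a hypothetical helper, and whether the skill's workflow can be pointed at a different output path is an assumption.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def private_reply_path(filename: str = "reply.ogg"):
    """Yield a path inside a freshly created directory that only the
    current user can access, and delete the directory on exit."""
    with tempfile.TemporaryDirectory(prefix="walkie-") as tmpdir:
        yield os.path.join(tmpdir, filename)
```

The audio should be written to the yielded path and sent before the `with` block exits, since the directory and its contents are removed afterwards.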
Persistence & Privilege
The skill does not request always:true and does not declare persistent/system-wide changes. It appears to be user-invocable only and does not request elevated persistent privileges.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install walkie-talkie-mode
- After installation, invoke the skill by name or use /walkie-talkie-mode
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of walkie-talkie-mode: enables seamless voice-to-voice conversations on WhatsApp.
- Automatically transcribes incoming WhatsApp audio messages using local tools.
- Generates voice note replies using local TTS and replies with both audio and text.
- Activates when users send audio messages or use trigger phrases such as "activa modo walkie-talkie" ("activate walkie-talkie mode").
- Prioritizes fast, fully offline processing for privacy and speed.
- Includes manual execution instructions for internal use.
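The activation rule in the release notes (an incoming audio message, or a text containing a trigger phrase) could be checked roughly as follows. This is a sketch only: the listing names a single phrase, so the phrase list, the accent-insensitive matching, and the `should_activate` helper are all assumptions rather than the skill's documented behavior.

```python
import unicodedata

# Only this phrase appears in the listing; any others would be assumptions.
TRIGGER_PHRASES = ("activa modo walkie-talkie",)


def _normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.split())


def should_activate(text: str = "", has_audio: bool = False) -> bool:
    """Return True for an audio message or a text containing a trigger phrase."""
    if has_audio:
        return True
    normalized = _normalize(text)
    return any(_normalize(phrase) in normalized for phrase in TRIGGER_PHRASES)
```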
Frequently Asked Questions
What is Walkie-Talkie Mode?
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type. It is an AI Agent Skill for Claude Code / OpenClaw, with 1747 downloads so far.
How do I install Walkie-Talkie Mode?
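Run "/install walkie-talkie-mode" in the OpenClaw or Claude Code chat to install it in one step. The install does not provide the local tools the skill relies on (ffmpeg, whisper-cpp, sherpa-onnx-tts) or its helper scripts, so these must already be present in the agent environment.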
Is Walkie-Talkie Mode free?
Yes, Walkie-Talkie Mode is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Walkie-Talkie Mode support?
Walkie-Talkie Mode is cross-platform and runs anywhere OpenClaw / Claude Code is available, provided the local tools it depends on are installed.
Who created Walkie-Talkie Mode?
It is built and maintained by Rubén Fernández Boullón (@rubenfb23); the current version is v1.0.0.