Bidirectional Voice Chat System

by patrickgeek · GitHub ↗ · v1.1.0

cross-platform ⚠ suspicious

Downloads: 426

Install in OpenClaw

/install voice-chat-bridge

Description

Bidirectional voice chat system: speech recognition to text + Edge TTS speech synthesis + Cloudflare Tunnel public access

Usage Guidance

What to check before installing or running this skill:

- Missing scripts: SKILL.md references hotkey_recorder.py, voice_chat_loop.py, chat.py and other runtime components that are not included. Ask the author for the missing files or disable the features that call them. Instructions that reference absent scripts will fail or mislead.
- Public exposure risk: If you enable the 'ngrok' or 'cloudflared' modes, or set a real domain, generated voice files under ~/.openclaw/workspace/voice_output become reachable from the Internet. The bundled HTTP server suppresses access logs, so that traffic may not be visible locally. Only expose the server if you understand who can access the URLs and you are comfortable with voice data being public.
- Credentials and tokens: The skill metadata does not declare any credentials, but ngrok and cloudflared require auth tokens (and Cloudflare Tunnel may require zone credentials). Manage those tokens carefully and do not paste them into untrusted code. The provided code does not automatically upload data to any remote service, but Edge TTS (the edge-tts CLI) likely uses an online service, so check its privacy policy.
- Telemetry / monitoring: daily_monitor.py writes local reports and runs a local test that invokes generate_voice.py. It does not appear to exfiltrate telemetry. The code mentions ClawHub stats ('需手动从 ClawHub 获取', i.e. "must be obtained manually from ClawHub") but performs no automated upload. If you are uncomfortable with local reports under ~/.openclaw/workspace/memory, inspect or remove that script.
- Run in a sandbox first: Execute the scripts in a controlled environment (VM or container) to confirm their behavior. Inspect generated URLs and verify that the public-tunnel steps are manual and require your explicit tokens and configuration before you go public.
- Review edge-tts and third-party binaries: edge-tts and 'hear' are third-party programs. Verify their source and CLI behavior (whether they send audio or text to external servers), and install them intentionally. SKILL.md recommends fetching hear from GitHub releases, so confirm checksums before placing binaries into ~/.local/bin.
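For the sandbox test recommended above, a local-only file server can be sketched as follows. This is a minimal illustration, not the skill's bundled voice_server: the names `LoggingHandler` and `make_local_server` and the default port are assumptions made for the example. It binds to the loopback address only and keeps access logging on, so every request is visible locally.

```python
import functools
import http.server
import logging
from pathlib import Path

# Directory named in the guidance above; adjust if your install differs.
VOICE_DIR = Path.home() / ".openclaw" / "workspace" / "voice_output"


class LoggingHandler(http.server.SimpleHTTPRequestHandler):
    """Hypothetical handler: serves files and logs every request."""

    def log_message(self, format, *args):
        # Route access lines through the logging module instead of dropping them.
        logging.info("%s - %s", self.address_string(), format % args)


def make_local_server(directory, port=8080):
    """Build a server that is reachable only from this machine."""
    handler = functools.partial(LoggingHandler, directory=str(directory))
    # Binding to 127.0.0.1 (not 0.0.0.0) keeps the server off the network.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

To try it, call `logging.basicConfig(level=logging.INFO)`, then `make_local_server(VOICE_DIR).serve_forever()`, and stop it with Ctrl+C. No tunnel is involved, so the files stay private to the machine.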

Capability Analysis

Type: OpenClaw Skill
Name: voice-chat-bridge
Version: 1.1.0

The skill is classified as suspicious due to several risky capabilities and potential vulnerabilities, though without clear malicious intent. The `SKILL.md` contains instructions for the AI agent to modify its internal state (`connection` and `habits.json`), which is a form of prompt injection vulnerability. Additionally, the installation instructions in `SKILL.md` for the `hear` tool involve downloading and executing a binary from a remote URL (`curl -LO ... unzip ... cp`), posing a supply chain risk if the source were compromised. Finally, the skill leverages tools like Cloudflare Tunnel and Ngrok to expose local services to the public internet, a powerful capability that, while intended for benign purposes (serving voice files), introduces a significant attack surface if misused or if the exposed service were to become vulnerable.
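One way to reduce the supply chain risk of a downloaded binary is to compare its SHA-256 digest against a value published by the upstream project before copying it anywhere. The sketch below assumes you have obtained such a digest from a trusted source; the function names are hypothetical and nothing here is part of the skill.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """True only if the file's digest matches the published one."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

Only if `verify_download` returns `True` would the archive be unpacked and the binary placed on the PATH. A mismatch means the file differs from what the publisher hashed.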

Capability Assessment

✓ Purpose & Capability

Name/description, included scripts (transcribe, generate_voice, voice_server), and declared tools (ffmpeg, edge-tts, optional cloudflared/ngrok) are consistent with a bidirectional voice chat bridge that converts speech→text and text→speech and can serve files over HTTP.

⚠ Instruction Scope

SKILL.md refers to many runtime scripts and features (hotkey_recorder.py, voice_chat_loop.py, chat.py, chat-related behavior, habits.json updates, .voice_trigger file) that are not present in the package. It instructs users to open public tunnels (ngrok/cloudflared) and to serve voice files with an HTTP server that intentionally suppresses access logs. This combination raises privacy and exposure concerns, because generated voice files could become publicly accessible without any local record of the access. The instructions also call for adding AGENTS.md behaviors (writing to habits.json and emotion updates) that are not implemented here.
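The gap between referenced and shipped scripts can be checked mechanically. The following is a rough sketch, assuming the package is unpacked into a directory with SKILL.md at its root. The function name and the regular expression are illustrative, and the pattern only catches references that end in `.py`.

```python
import re
from pathlib import Path

# Matches bare script names such as "chat.py", also inside paths like "scripts/chat.py".
SCRIPT_REF = re.compile(r"\b[\w\-]+\.py\b")


def missing_scripts(skill_dir):
    """List .py files named in SKILL.md that are absent from the package."""
    skill_dir = Path(skill_dir)
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    referenced = set(SCRIPT_REF.findall(text))
    # Compare by file name only, wherever the file sits in the package.
    present = {p.name for p in skill_dir.rglob("*.py")}
    return sorted(referenced - present)
```

An empty result means every script named in SKILL.md ships with the package. Any names returned are the ones to raise with the author.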

ℹ Install Mechanism

No formal install spec (instruction-only with shipped scripts). The instructions recommend installing third-party binaries (ffmpeg, cloudflared, ngrok, hear) via brew/npm and downloading hear from a GitHub release — these are common but still involve executing fetched binaries. No archive downloads from obscure servers are present in the provided files.
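Because there is no formal install spec, it can help to see which of the external tools are already on the PATH before following any install instructions. The sketch below uses the tool names mentioned in this review. Grouping `hear` with the optional tools is an assumption, and `tool_report` is a hypothetical helper.

```python
import shutil

# Tool names taken from the review text; the required/optional split is partly assumed.
REQUIRED_TOOLS = ("ffmpeg", "edge-tts")
OPTIONAL_TOOLS = ("cloudflared", "ngrok", "hear")


def tool_report(names, which=shutil.which):
    """Map each tool name to its resolved path, or None if not found."""
    return {name: which(name) for name in names}
```

Calling `tool_report(REQUIRED_TOOLS)` returns a dictionary whose `None` values mark tools that still need to be installed. The resolved paths also show which binary would actually be executed.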

ℹ Credentials

The skill requests no environment variables or credentials in metadata, which matches the included code. However, optional deployment modes (ngrok/cloudflared) require external tokens/credentials that are not listed or discussed in the skill metadata; daily_monitor mentions collecting 'installation data' but only writes local reports (no remote exfiltration in code).

ℹ Persistence & Privilege

The skill does not request always:true and does not modify other skills. It creates local state under ~/.openclaw/workspace and ~/.openclaw/workspace/memory. The HTTP server suppresses logging (QuietHTTPRequestHandler), which reduces visibility into external access when used with public tunnels — a design choice with privacy implications but not an explicit elevated privilege.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install voice-chat-bridge
After installation, invoke the skill by name or use /voice-chat-bridge
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.1.0

Voice Chat Bridge Skill 1.1.0 adds robust, flexible two-way voice assistant capabilities.

- Supports bi-directional voice interaction: speech-to-text transcription + Edge TTS natural speech synthesis.
- Multiple deployment options: local playback, web interface, Cloudflare Tunnel, Ngrok, and LocalTunnel for global access.
- Integrates with Telegram, Discord, Slack, webhook, and supports both command-line and GUI modes.
- Customizable voice, hotkey recording, language selection, and local/online recognition engine support.
- Quick setup guides and full conversational loop for hands-free, voice-first experiences.

Metadata

Slug voice-chat-bridge

Version 1.1.0

License —

All-time Installs 1

Active Installs 1

Total Versions 1

Frequently Asked Questions

What is Bidirectional Voice Chat System?

Bidirectional voice chat system: speech recognition to text + Edge TTS speech synthesis + Cloudflare Tunnel public access. It is an AI Agent Skill for Claude Code / OpenClaw, with 426 downloads so far.

How do I install Bidirectional Voice Chat System?

Run "/install voice-chat-bridge" in the OpenClaw or Claude Code chat to install it. The external tools the skill relies on (ffmpeg, edge-tts, and optionally cloudflared or ngrok) are installed separately.

Is Bidirectional Voice Chat System free?

Yes, Bidirectional Voice Chat System is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Bidirectional Voice Chat System support?

Bidirectional Voice Chat System is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Bidirectional Voice Chat System?

It is built and maintained by patrickgeek (@patrickgeek); the current version is v1.1.0.
