Virtual voice builder
by Suhas Rudra · v1.0.0 · MIT-0
Downloads: 115
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install virtual-voice-ai
Description
Wires a real microphone through an AI brain (STT → LLM → TTS) and routes the output to a virtual audio cable so apps like Google Meet hear the processed voic...
Usage Guidance
This skill appears to implement the described mic → AI → virtual-cable pipeline, but review these points before installing:
1) Inspect and confirm the required environment variables: ensure TTS_VOICE_ID is provided (the code expects it), and update your .env only if you accept storing keys in a file.
2) The LLM env-check in scripts/04_llm_stream.js contains a bug that bypasses LLM_API_KEY validation; either fix that check or be prepared for runtime errors.
3) The SKILL.md was flagged for a 'system-prompt-override' pattern. Read the file yourself and ensure it doesn't include instructions that would alter agent/system prompts or perform unexpected actions.
4) Run the scripts in a controlled environment first (not as root), test each stage independently (device listing, capture, STT, LLM stream, TTS), and avoid granting more privileges than necessary.
5) If you will use real API keys, consider using short-lived keys or scoped accounts and remove keys from .env after testing; monitor outbound connections (Deepgram, your LLM provider, and TTS provider endpoints) to confirm they match expectations.
If any of these inconsistencies concern you or you don't want to store API keys in a project file, treat this skill as potentially risky until corrected.
Capability Analysis
Type: OpenClaw Skill
Name: virtual-voice-ai
Version: 1.0.0
The skill bundle provides a legitimate framework for building a real-time AI voice pipeline (STT → LLM → TTS) routed through virtual audio devices for use in applications like Google Meet. It utilizes ffmpeg for audio capture, resampling, and playback, and integrates with standard AI service providers (Deepgram, OpenAI, ElevenLabs). While the code utilizes powerful system capabilities such as audio device access and child process execution, the implementation is transparent, well-documented, and lacks any indicators of malicious intent, data exfiltration, or unauthorized persistence. A minor bug exists where 'scripts/02_capture_resample.js' is required by other scripts but is missing from the bundle, which is classified as a functional flaw rather than a security risk.
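To make the ffmpeg capture-and-resample step concrete, here is a minimal sketch of how the argument list for such a capture might be built. `buildCaptureArgs` is a hypothetical helper, not a function from the bundle, and the 16 kHz mono 16-bit PCM target is an assumption about what the STT stage expects; the skill's own scripts may choose different settings.

```javascript
// Sketch only: buildCaptureArgs is a hypothetical helper, not part of the skill.
// Assumption: the STT stage wants 16 kHz, mono, signed 16-bit little-endian PCM.
function buildCaptureArgs(platform, device) {
  // Pick ffmpeg's input format and device syntax for the host OS.
  let input;
  if (platform === "win32") {
    input = ["-f", "dshow", "-i", `audio=${device}`]; // DirectShow device name
  } else if (platform === "darwin") {
    input = ["-f", "avfoundation", "-i", `:${device}`]; // ":<audio index>", no video
  } else {
    input = ["-f", "pulse", "-i", device]; // PulseAudio source name, e.g. "default"
  }
  return [
    "-hide_banner", "-loglevel", "error",
    ...input,
    "-ac", "1",     // downmix to mono
    "-ar", "16000", // resample to 16 kHz
    "-f", "s16le",  // raw PCM, no container
    "pipe:1",       // write the audio to stdout
  ];
}
```

The result could be passed to Node's `child_process.spawn("ffmpeg", buildCaptureArgs(process.platform, "default"))`, with the child's stdout feeding the next stage.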
Capability Assessment
Purpose & Capability
The name/description match the code and scripts: audio capture via ffmpeg, Deepgram STT, an LLM call, sentence chunking, TTS and writing to a virtual cable. Required external services (Deepgram, LLM provider, TTS provider) are appropriate for the stated purpose. Minor mismatch: the index.js REQUIRED list includes TTS_VOICE_ID but the top-level registry-required env list omitted TTS_VOICE_ID; the repo references TTS_VOICE_ID in several places (scripts/06_tts_ws.js and package docs), so the registry metadata is incomplete.
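The sentence-chunking stage mentioned above sits between the LLM stream and TTS. A minimal sketch of the idea follows, assuming tokens arrive as plain strings; `createSentenceChunker` and its callback are hypothetical names, and the punctuation rule is a simplification (it would split after abbreviations such as "e.g."), not the skill's actual logic.

```javascript
// Sketch only: createSentenceChunker is a hypothetical helper, not part of the skill.
// It buffers streamed LLM tokens and emits each complete sentence to a callback.
function createSentenceChunker(onSentence) {
  let buffer = "";
  return {
    // Append a token, then emit every sentence that is now complete.
    push(token) {
      buffer += token;
      let match;
      // A sentence ends at ., ! or ? followed by whitespace.
      while ((match = /^([\s\S]*?[.!?])\s+/.exec(buffer)) !== null) {
        onSentence(match[1].trim());
        buffer = buffer.slice(match[0].length);
      }
    },
    // Emit whatever is left once the LLM stream ends.
    flush() {
      const rest = buffer.trim();
      buffer = "";
      if (rest) onSentence(rest);
    },
  };
}
```

Emitting whole sentences rather than single tokens lets the TTS stage start speaking before the LLM has finished, while still giving it enough text for natural prosody.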
Instruction Scope
SKILL.md instructs running the provided scripts, installing ffmpeg/virtual audio driver, and explicitly instructs to write API keys into the project's .env file. The code reads environment vars and opens WebSocket connections to external services (Deepgram, provider LLM endpoints, ElevenLabs/Cartesia). These actions are consistent with the function but the instruction to persist secrets into .env increases risk if users expect keys to remain only in memory. A pre-scan detected a 'system-prompt-override' pattern in SKILL.md — the file contains 'Critical Rules' and runtime guidance that could be interpreted as attempting to influence agent/system prompts; surface this for review.
Install Mechanism
There is no automated install that downloads arbitrary binaries; the package is instruction- and code-based (no external URL downloads or archive extraction). It depends on ffmpeg and the user installing a virtual audio driver; both are expected for this functionality.
Credentials
The skill requires API keys for STT, LLM, and TTS — which is proportional to its function. However there are two issues: (1) index.js and several scripts require TTS_VOICE_ID but the top-level registry 'required env' list omitted it (metadata mismatch). (2) scripts/04_llm_stream.js contains a bug in its env-check logic that effectively skips enforcing LLM_API_KEY (it filters REQUIRED with an expression that excludes 'LLM_API_KEY' from the check), meaning the code may attempt to run without the LLM key or behave unexpectedly — this is an implementation bug that could lead to confusing failures or accidental credential misuse. Also the SKILL.md explicitly tells the agent to write keys to a .env file, which may not be acceptable to all users.
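For comparison, an env check that enforces every required variable, including LLM_API_KEY and TTS_VOICE_ID, might look like the sketch below. The function names and the STT_API_KEY and TTS_API_KEY entries are assumptions for illustration; the real REQUIRED list lives in the skill's own code.

```javascript
// Sketch only: these helpers are hypothetical, not the skill's actual code.
// Assumed variable names: only LLM_API_KEY and TTS_VOICE_ID appear in the review above.
const REQUIRED = ["STT_API_KEY", "LLM_API_KEY", "TTS_API_KEY", "TTS_VOICE_ID"];

// Return the names of required variables that are unset or blank.
function missingEnv(required, env) {
  return required.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Fail fast with one error naming every missing variable.
function assertEnv(required, env) {
  const missing = missingEnv(required, env);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv(REQUIRED, process.env)` at startup would stop the pipeline before any network connection is opened, so no variable is silently skipped.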
Persistence & Privilege
The skill does not request permanent 'always:true' inclusion, does not modify other skills, and has no install step that alters system-wide agent settings. It exports a start/stop API and spawns child processes (ffmpeg) — appropriate for its role.
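A start/stop API around a spawned child process can be sketched as follows. `createPipeline` is a hypothetical name and this is not the skill's exported interface; stdio is ignored here for brevity, whereas a real pipeline would pipe the child's stdout into the next stage.

```javascript
const { spawn } = require("node:child_process");

// Sketch only: createPipeline is a hypothetical wrapper, not the skill's API.
function createPipeline(command, args) {
  let child = null;
  return {
    // Spawn the child unless one is already running.
    start() {
      if (child) return false;
      const proc = spawn(command, args, { stdio: "ignore" });
      // Clear the handle if the process dies or fails to launch on its own.
      proc.once("exit", () => { if (child === proc) child = null; });
      proc.once("error", () => { if (child === proc) child = null; });
      child = proc;
      return true;
    },
    // Kill switch: ask the child to terminate and forget the handle.
    stop() {
      if (!child) return false;
      child.kill("SIGTERM");
      child = null;
      return true;
    },
    isRunning() {
      return child !== null;
    },
  };
}
```

Keeping the only reference to the child inside the closure means `stop()` is the single place that ends it, which makes the kill switch easy to audit.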
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install virtual-voice-ai
- After installation, invoke the skill by name or use /virtual-voice-ai
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of virtual-voice-ai skill.
- Guides users through building a real-time audio interception pipeline in Node.js using mic input, AI processing, and virtual audio output.
- Provides step-by-step instructions for OS and dependency setup, audio device discovery, and stress-tested script execution.
- Enforces key rules for audio handling, such as mandatory resampling, TTS buffering, and strict compatibility requirements.
- Includes a kill switch function for safely stopping the pipeline.
- Supplies environment variable requirements and progressive documentation for easy setup.
Frequently Asked Questions
What is Virtual voice builder?
Wires a real microphone through an AI brain (STT → LLM → TTS) and routes the output to a virtual audio cable so apps like Google Meet hear the processed voic... It is an AI Agent Skill for Claude Code / OpenClaw, with 115 downloads so far.
How do I install Virtual voice builder?
Run "/install virtual-voice-ai" in the OpenClaw or Claude Code chat to install the skill. The pipeline itself also depends on ffmpeg, a virtual audio driver, and API keys for the STT, LLM, and TTS providers.
Is Virtual voice builder free?
Yes, Virtual voice builder is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Virtual voice builder support?
Virtual voice builder is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Virtual voice builder?
It is built and maintained by Suhas Rudra (@suhas12345685-pro); the current version is v1.0.0.