Voice Listener

by fanqing203 · GitHub ↗ · v0.1.0
cross-platform ⚠ suspicious
Downloads 427
Stars 0
Active Installs 1
Versions 1
Install in OpenClaw
/install voice-listener
Description
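Wakes on the keyword "小龙虾" ("crayfish"), uses Baidu's high-accuracy speech recognition, listens continuously and automatically types the recognized speech; saying "停止" ("stop") pauses input.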
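The wake and pause behaviour in the description can be pictured as a small two-state gate. The sketch below is an illustration only: the class name `VoiceGate` and its `feed` method are hypothetical and are not taken from the skill's source, and it assumes the wake and stop words are matched as substrings of each recognized phrase.

```python
from typing import Optional


class VoiceGate:
    """Hypothetical helper: decides whether recognized speech is typed or ignored."""

    def __init__(self, wake_word: str = "小龙虾", stop_word: str = "停止") -> None:
        self.wake_word = wake_word
        self.stop_word = stop_word
        self.active = False  # start paused until the wake word is heard

    def feed(self, transcript: str) -> Optional[str]:
        """Return the text to type, or None if nothing should be typed."""
        text = transcript.strip()
        if not text:
            return None
        if not self.active:
            # While paused, only the wake word has any effect.
            if self.wake_word in text:
                self.active = True
            return None
        if self.stop_word in text:
            # The stop word pauses input and is not typed itself.
            self.active = False
            return None
        return text
```

Once the gate is active, every later phrase is passed through without repeating the wake word, which matches the continuous-input behaviour the listing describes.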
Usage Guidance
This skill appears to do exactly what it claims: listen to your microphone, call Baidu's speech APIs using credentials you provide in baidu_config.json, and paste recognized text wherever the cursor is. Before installing or running it:
  1. Only provide your Baidu APP_ID/API_KEY/SECRET_KEY if you trust the code, and preferably run it locally in a controlled environment.
  2. Be aware that it simulates Ctrl+V keystrokes; do not run it while sensitive forms or password fields are focused.
  3. The repository lists dependencies (sounddevice, numpy, keyboard, pyperclip, requests) but does not auto-install them; install them from trusted sources.
  4. The script writes temporary WAV files (deleted after use) and sends audio to Baidu's official endpoints (token and vop.server_api).
  5. If you need higher assurance, review the full voice_input_baidu_smart.py file yourself and run it in a sandboxed account or VM first.
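Because the dependencies are not installed automatically, a quick pre-flight check can show which ones are absent before the skill is started. This is a minimal sketch, assuming each package's import name equals the name listed above; `missing_dependencies` is a hypothetical helper, not part of the skill.

```python
import importlib.util
from typing import Iterable, List

# Package names as listed in the usage guidance above.
REQUIRED = ["sounddevice", "numpy", "keyboard", "pyperclip", "requests"]


def missing_dependencies(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

`find_spec` only locates the modules and does not import them, so the check itself does not touch the microphone or keyboard.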
Capability Analysis
Type: OpenClaw Skill
Name: voice-listener
Version: 0.1.0
The skill bundle implements a voice-to-text assistant using the Baidu Speech API. It is classified as suspicious because it utilizes several high-risk capabilities, including microphone recording (sounddevice), clipboard access (pyperclip), and keyboard simulation (keyboard) to inject recognized text into the active window. While these are necessary for the stated functionality, the direct injection of unsanitized strings from an external API into the keyboard buffer presents a potential risk for accidental command execution. Furthermore, the scripts contain hardcoded local environment paths (e.g., referencing 'C:\Users\11666' in voice_input_baidu_smart.py and skill.json), which is a security anti-pattern, though no evidence of intentional malice or unauthorized data exfiltration was identified.
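One way to reduce the accidental-execution risk noted above is to clean the recognized text before it reaches the clipboard. The sketch below is an assumption about what such a step could look like, not something the skill is known to do; `sanitize_transcript` and its `max_len` parameter are hypothetical.

```python
import unicodedata


def sanitize_transcript(text: str, max_len: int = 500) -> str:
    """Replace control/format characters with spaces, collapse whitespace, cap length."""
    cleaned = "".join(
        # Unicode categories starting with "C" cover newlines, tabs, escape
        # characters and other non-printing code points.
        " " if unicodedata.category(ch).startswith("C") else ch
        for ch in text
    )
    collapsed = " ".join(cleaned.split())
    return collapsed[:max_len]
```

Removing newlines means a pasted phrase cannot submit itself in a terminal or chat box, but the text is still inserted into whatever window has focus, so this narrows the risk rather than eliminating it.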
Capability Assessment
Purpose & Capability
The skill's name/description (Baidu speech recognition + wake word + auto-input) matches the provided code and docs. The code reads a local baidu_config.json (APP_ID/API_KEY/SECRET_KEY) and calls Baidu token and speech endpoints; it uses sounddevice for audio, keyboard for keystroke simulation, pyperclip for clipboard — all appropriate for the stated purpose. One minor oddity: a WORKSPACE path is defined (C:\Users\11666\.openclaw\workspace) but not used for network credentials; this appears to be an environment artifact from the author rather than required functionality.
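The credential flow described above can be sketched as follows. This is a hedged illustration, not the skill's actual code: the helper names are hypothetical, the config is assumed to be a flat JSON object with exactly the keys APP_ID, API_KEY and SECRET_KEY, and the token URL and parameter names reflect Baidu's OAuth client-credentials flow as commonly documented and should be checked against Baidu's current documentation. No network request is made here.

```python
import json
from pathlib import Path
from typing import Dict

REQUIRED_KEYS = ("APP_ID", "API_KEY", "SECRET_KEY")
# Assumed token endpoint; verify against Baidu's documentation before use.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"


def parse_baidu_config(raw: str) -> Dict[str, str]:
    """Parse config JSON text and fail early if a credential is absent or empty."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("baidu_config.json must contain a JSON object")
    missing = [k for k in REQUIRED_KEYS if not str(data.get(k) or "").strip()]
    if missing:
        raise ValueError("baidu_config.json is missing: " + ", ".join(missing))
    return {k: str(data[k]).strip() for k in REQUIRED_KEYS}


def load_baidu_config(path: str = "baidu_config.json") -> Dict[str, str]:
    """Read and validate the config file from disk."""
    return parse_baidu_config(Path(path).read_text(encoding="utf-8"))


def token_request_params(config: Dict[str, str]) -> Dict[str, str]:
    """Build the query parameters for the token request (nothing is sent here)."""
    return {
        "grant_type": "client_credentials",
        "client_id": config["API_KEY"],
        "client_secret": config["SECRET_KEY"],
    }
```

Keeping the path relative to the script, rather than hardcoding a user-specific directory, avoids the environment artifact mentioned above.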
Instruction Scope
SKILL.md and the scripts only instruct running local Python scripts or a .bat and editing baidu_config.json. The runtime instructions cause continuous microphone capture and automatic pasting of recognized text into the current cursor position — this is coherent with the skill but has obvious privacy/usability implications (it will paste into whatever window has focus). The instructions do not attempt to read unrelated config or secret stores or POST data to unexpected endpoints; network calls are limited to Baidu's documented APIs.
Install Mechanism
There is no automated install step and no downloads from arbitrary URLs; the repository consists of instructions and code files. With no install spec, nothing is pulled silently during install. Users must install the dependencies manually (sounddevice, numpy, keyboard, pyperclip, requests). The package does not formally declare them as requirements; package.json and the READMEs only list them as the tech stack.
Credentials
The skill requests Baidu API credentials via a local baidu_config.json (APP_ID/API_KEY/SECRET_KEY), which is exactly what a Baidu REST-based recognizer needs. It does not request unrelated cloud keys, tokens, or system credentials. Note: the skill does require access to the microphone and the ability to synthesize keyboard events (keyboard module), which are legitimate for its purpose but require user privileges.
Persistence & Privilege
always is false and the skill does not modify other skills or system-wide agent settings. It runs as a normal user process and creates temporary WAV files (deleted promptly). The skill will run continuously while active and simulate keystrokes — this is necessary for its function but increases blast radius if misused (e.g., if left running when entering passwords).
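The "temporary WAV files (deleted promptly)" pattern can be made robust with a context manager that removes the file even if recognition fails. This is a sketch under stated assumptions: `temp_wav` is a hypothetical helper, and the audio is assumed to be 16-bit mono PCM at 16 kHz, which the listing does not confirm.

```python
import os
import tempfile
import wave
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_wav(pcm: bytes, sample_rate: int = 16000) -> Iterator[str]:
    """Write raw 16-bit mono PCM to a temporary WAV file and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # reopen through the wave module instead of the raw descriptor
    try:
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)   # mono (assumed)
            wav.setsampwidth(2)   # 16-bit samples (assumed)
            wav.setframerate(sample_rate)
            wav.writeframes(pcm)
        yield path
    finally:
        # Runs on normal exit and on exceptions, so no recording is left behind.
        if os.path.exists(path):
            os.remove(path)
```

Any code that uploads the file would sit inside the `with temp_wav(...) as path:` block, so the recording exists only for the duration of the request.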
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install voice-listener
  3. After installation, invoke the skill by name or use /voice-listener
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
Initial release with smart voice recognition and wake-up features:
- Added Baidu Speech Recognition integration for high-accuracy voice input.
- Introduced smart wake and pause with custom keywords ("小龙虾" to activate, "停止" to pause).
- Enabled continuous voice-to-input after activation, without repeating wake word.
- Provided multiple start options: OpenClaw command, batch script, or CLI.
- Included configurable API keys and wake/stop words.
- Added basic troubleshooting and API application guide.
Metadata
Slug voice-listener
Version 0.1.0
License
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is Voice Listener?

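Voice Listener wakes on the keyword "小龙虾" ("crayfish"), uses Baidu's high-accuracy speech recognition, listens continuously and automatically types the recognized speech; saying "停止" ("stop") pauses input. It is an AI Agent Skill for Claude Code / OpenClaw, with 427 downloads so far.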

How do I install Voice Listener?

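Run "/install voice-listener" in the OpenClaw or Claude Code chat. The install itself is one step, but the skill still needs its Python dependencies (sounddevice, numpy, keyboard, pyperclip, requests) installed manually and your Baidu credentials entered in baidu_config.json before it can run.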

Is Voice Listener free?

Yes, Voice Listener is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Voice Listener support?

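Voice Listener is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available. Note that the bundled scripts reference a Windows path (C:\Users\11666) and include a .bat launcher, so use on other systems may need adjustments.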

Who created Voice Listener?

It is built and maintained by fanqing203 (@fanqing203); the current version is v0.1.0.
