Voice Listener
Author: fanqing203 (GitHub) · v0.1.0
Total downloads: 427
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install voice-listener
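Description
Wakes on the keyword “小龙虾” ("crayfish"), uses Baidu's high-accuracy speech recognition, listens continuously, and automatically types the recognized speech. Saying “停止” ("stop") pauses input.
Security recommendations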
This skill appears to do exactly what it claims: it listens to your microphone, calls Baidu's speech APIs using the credentials you provide in baidu_config.json, and pastes the recognized text wherever the cursor is. Before installing or running it:
1) Only provide your Baidu APP_ID/API_KEY/SECRET_KEY if you trust the code, and preferably run it locally in a controlled environment.
2) Be aware that it simulates Ctrl+V keystrokes; do not run it while sensitive forms or password fields are focused.
3) The repository lists its dependencies (sounddevice, numpy, keyboard, pyperclip, requests) but does not auto-install them; install them from trusted sources.
4) The script writes temporary WAV files (which it deletes) and sends audio to Baidu's official endpoints (token and vop.server_api).
5) If you need higher assurance, review the full voice_input_baidu_smart.py file yourself and run it in a sandboxed account or VM first.
Functional analysis
Type: OpenClaw Skill
Name: voice-listener
Version: 0.1.0
The skill bundle implements a voice-to-text assistant using the Baidu Speech API. It is classified as suspicious because it utilizes several high-risk capabilities, including microphone recording (sounddevice), clipboard access (pyperclip), and keyboard simulation (keyboard) to inject recognized text into the active window. While these are necessary for the stated functionality, the direct injection of unsanitized strings from an external API into the keyboard buffer presents a potential risk for accidental command execution. Furthermore, the scripts contain hardcoded local environment paths (e.g., referencing 'C:\Users\11666' in voice_input_baidu_smart.py and skill.json), which is a security anti-pattern, though no evidence of intentional malice or unauthorized data exfiltration was identified.
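One way to reduce the accidental-command-execution risk described above is to filter the recognized text before it reaches the clipboard. The following is a minimal sketch only: `sanitize_transcript` and its `max_len` limit are hypothetical and are not taken from voice_input_baidu_smart.py.

```python
import unicodedata


def sanitize_transcript(text: str, max_len: int = 500) -> str:
    """Remove control characters from recognized text and cap its length.

    Hypothetical helper for illustration; not part of the skill's code.
    """
    kept = []
    for ch in text:
        # Categories starting with "C" cover control characters such as
        # newline and tab, plus format, private-use and unassigned code points.
        if unicodedata.category(ch).startswith("C"):
            continue
        kept.append(ch)
    return "".join(kept).strip()[:max_len]
```

Dropping newlines matters most here: a pasted line break in a terminal window would act like pressing Enter.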
Capability assessment
Purpose & Capability
The skill's name/description (Baidu speech recognition + wake word + auto-input) matches the provided code and docs. The code reads a local baidu_config.json (APP_ID/API_KEY/SECRET_KEY) and calls Baidu token and speech endpoints; it uses sounddevice for audio, keyboard for keystroke simulation, pyperclip for clipboard — all appropriate for the stated purpose. One minor oddity: a WORKSPACE path is defined (C:\Users\11666\.openclaw\workspace) but not used for network credentials; this appears to be an environment artifact from the author rather than required functionality.
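To make the two network calls mentioned above more concrete, here is a sketch of how the token URL and the recognition request body might be assembled. The endpoint URLs and field names are assumptions based on Baidu's public REST documentation as commonly described, not confirmed by this listing; the function names and the `cuid` default are hypothetical. No network request is made.

```python
import base64
import json
from urllib.parse import urlencode

# Assumed endpoints; verify against Baidu's current documentation.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
ASR_URL = "https://vop.baidu.com/server_api"


def build_token_url(api_key: str, secret_key: str) -> str:
    """Build the OAuth token URL from API_KEY and SECRET_KEY (assumed scheme)."""
    query = urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return f"{TOKEN_URL}?{query}"


def build_asr_payload(audio: bytes, token: str,
                      audio_format: str = "wav", rate: int = 16000,
                      cuid: str = "voice-listener-demo") -> str:
    """Build a JSON body for the recognition endpoint (assumed field names)."""
    body = {
        "format": audio_format,
        "rate": rate,
        "channel": 1,
        "cuid": cuid,          # hypothetical client identifier
        "token": token,
        "speech": base64.b64encode(audio).decode("ascii"),
        "len": len(audio),     # byte length before base64 encoding
    }
    return json.dumps(body)
```

Keeping the request construction separate from the actual HTTP call makes it easy to inspect exactly what would leave the machine before sending anything.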
Instruction Scope
SKILL.md and the scripts only instruct running local Python scripts or a .bat and editing baidu_config.json. The runtime instructions cause continuous microphone capture and automatic pasting of recognized text into the current cursor position — this is coherent with the skill but has obvious privacy/usability implications (it will paste into whatever window has focus). The instructions do not attempt to read unrelated config or secret stores or POST data to unexpected endpoints; network calls are limited to Baidu's documented APIs.
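The runtime behavior described here (idle until the wake word, then continuous input until the stop word) can be pictured as a two-state machine. This is a sketch under assumptions: the class name and the choice to discard the utterance that contains the wake or stop word are hypothetical, and the skill's real logic may differ.

```python
from typing import Optional


class WakeStateMachine:
    """Decide whether a recognized utterance should be typed."""

    def __init__(self, wake_word: str = "小龙虾", stop_word: str = "停止"):
        self.wake_word = wake_word
        self.stop_word = stop_word
        self.active = False

    def handle(self, transcript: str) -> Optional[str]:
        """Return the text to type, or None if nothing should be typed."""
        if not self.active:
            # Idle: only the wake word has any effect.
            if self.wake_word in transcript:
                self.active = True
            return None
        if self.stop_word in transcript:
            # Active: the stop word pauses input until the next wake word.
            self.active = False
            return None
        return transcript or None
```

With this design, text spoken before the wake word or after the stop word never reaches the paste step, which limits what can end up in the focused window.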
Install Mechanism
There is no automated install step and no downloads from arbitrary URLs; the repository consists of instructions and code files. Because there is no install spec, nothing is pulled silently during install. Users must manually install the dependencies (sounddevice, numpy, keyboard, pyperclip, requests); the package does not formally declare them as requirements, although package.json and the READMEs list them as the tech stack.
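Because the dependencies are not installed automatically, it can help to check which ones are importable before starting the skill. The sketch below uses only the standard library; `missing_modules` is a hypothetical helper, and it assumes each package's import name matches the name listed above.

```python
from importlib.util import find_spec

# Dependency names as listed in the assessment above.
REQUIRED_MODULES = ["sounddevice", "numpy", "keyboard", "pyperclip", "requests"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found, without importing them."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```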
Credentials
The skill requests Baidu API credentials via a local baidu_config.json (APP_ID/API_KEY/SECRET_KEY), which is exactly what a Baidu REST-based recognizer needs. It does not request unrelated cloud keys, tokens, or system credentials. Note: the skill does require access to the microphone and the ability to synthesize keyboard events (keyboard module), which are legitimate for its purpose but require user privileges.
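A small loader can confirm that baidu_config.json contains the three credentials before any audio is captured. This sketch assumes the file is a flat JSON object with top-level APP_ID, API_KEY and SECRET_KEY keys; the listing names the keys but not the file layout, and `load_baidu_config` is a hypothetical name.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("APP_ID", "API_KEY", "SECRET_KEY")


def load_baidu_config(path="baidu_config.json"):
    """Read the config file and return only the three required credentials."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        # Treat absent, null, and blank values as missing.
        if value is None or not str(value).strip():
            missing.append(key)
    if missing:
        raise ValueError("baidu_config.json is missing: " + ", ".join(missing))
    return {key: str(config[key]) for key in REQUIRED_KEYS}
```

Returning only the required keys keeps any unrelated entries in the file from being passed further into the program.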
Persistence & Privilege
always is false and the skill does not modify other skills or system-wide agent settings. It runs as a normal user process and creates temporary WAV files (deleted promptly). The skill will run continuously while active and simulate keystrokes — this is necessary for its function but increases blast radius if misused (e.g., if left running when entering passwords).
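The "temporary WAV files (deleted promptly)" pattern can be sketched with the standard library alone. This is not the skill's actual code: `with_temp_wav` and its `handler` callback are hypothetical, and the 16 kHz mono 16-bit format is an assumption.

```python
import os
import tempfile
import wave


def with_temp_wav(pcm_bytes: bytes, handler, rate: int = 16000):
    """Write PCM data to a temporary WAV file, pass its path to handler, then delete it."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # reopen through the wave module below
    try:
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)   # mono (assumed)
            wav.setsampwidth(2)   # 16-bit samples (assumed)
            wav.setframerate(rate)
            wav.writeframes(pcm_bytes)
        return handler(path)
    finally:
        # Delete the file even if handler raises.
        if os.path.exists(path):
            os.remove(path)
```

Putting the deletion in a `finally` block means a failed recognition call does not leave recorded audio behind on disk.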
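How to use
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install voice-listener
- Once installation finishes, invoke the Skill by name or trigger it with /voice-listener
- Provide the required inputs as described in the Skill's parameter documentation to get structured output.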
Version history
v0.1.0
Initial release with smart voice recognition and wake-up features:
- Added Baidu Speech Recognition integration for high-accuracy voice input.
- Introduced smart wake and pause with custom keywords ("小龙虾" to activate, "停止" to pause).
- Enabled continuous voice-to-input after activation, without repeating wake word.
- Provided multiple start options: OpenClaw command, batch script, or CLI.
- Included configurable API keys and wake/stop words.
- Added basic troubleshooting and API application guide.
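FAQ
What is Voice Listener?
It wakes on the keyword “小龙虾”, uses Baidu's high-accuracy speech recognition, listens continuously, and automatically types the recognized speech; saying “停止” pauses input. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 427 total downloads to date.
How do I install Voice Listener?
Run the command "/install voice-listener" in the OpenClaw or Claude Code chat box to install it in one step. The Baidu credentials in baidu_config.json and the Python dependencies still have to be set up manually, as noted in the capability assessment.
Is Voice Listener free?
Yes. Voice Listener is free and open source; you can download, install, and use it freely.
Which platforms does Voice Listener support?
Voice Listener is described as cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Voice Listener?
It is developed and maintained by fanqing203 (@fanqing203); the current version is v0.1.0.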