whatsappVoiceOpenSkill

Name: whatsappVoiceOpenSkill
Author: syedateebulislam

Description

Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS.

Safe usage recommendations

What to check before installing or running this skill:

- Audio format support: The README claims OGG/Opus works without FFmpeg, but transcribe.py uses soundfile/libsndfile. libsndfile often cannot read Opus-in-OGG; test with real WhatsApp files. If you see failures, add FFmpeg-based conversion or use a tool that supports Opus.
- Language handling: transcribe.py forces language="en" when calling Whisper, which will hurt Hindi and other non-English transcripts. The skill does language detection only after transcription. If you expect multi-language input, update transcribe.py to let Whisper detect the language (or pass the correct language parameter).
- Missing/implicit dependencies: package.json lists no dependencies, yet the code uses fetch (Node versions <18 may not have global fetch), and the examples require drone-sdk or music-api, which are not provided. Install and pin the dependencies you actually need.
- Network and device actions: The built-in weather handler makes an HTTP request to wttr.in (expected). Example handlers show controlling drones or smart-home devices; these are only executed if you wire in such handlers, but be careful: adding handlers can introduce network/device access and will need appropriate credentials and safety checks.
- Resource and security posture: Whisper models are large and memory- and CPU-intensive; running locally will download models and consume ~1+ GB RAM (per docs). Run in a sandboxed environment first; don't point it at directories with sensitive files you wouldn't want processed or uploaded.
- Testing: Run the daemon in a controlled environment, feed it a few sample WhatsApp voice files, and verify transcription and language detection. Inspect the log (.voice-processed.log) and make sure the parent process that handles the printed JSON will not leak data to unexpected endpoints.
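The FFmpeg fallback suggested in the audio-format item could look roughly like the sketch below. It is not taken from the skill: the function names `build_ffmpeg_command` and `convert_to_wav` are hypothetical, and it assumes an `ffmpeg` binary is installed and on PATH. It decodes a voice note to mono 16 kHz WAV, a format soundfile can read.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg argument list that decodes `src` to mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file, e.g. a WhatsApp OGG/Opus note
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample; 16 kHz matches Whisper's input rate
        dst,
    ]


def convert_to_wav(src: str, dst: Optional[str] = None) -> str:
    """Convert `src` to WAV next to the original (hypothetical helper)."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    dst = dst or str(Path(src).with_suffix(".wav"))
    # Arguments are passed as a list, so the file name is never shell-parsed.
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

A caller would try the existing soundfile path first and fall back to `convert_to_wav` only when reading the original file fails.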

Functional analysis

Type: OpenClaw Skill
Name: whatsapp-voice-chat-integration-open-source
Version: 1.0.0

The skill is classified as suspicious because it uses two high-risk capabilities, both in `scripts/voice-processor.js`: `child_process.execSync` to run a Python script, and an external network call via `fetch` to `https://wttr.in/Delhi?format=j1`. These actions are plausibly aligned with the skill's stated purpose (transcription and weather information), but `execSync` allows arbitrary command execution, and external network calls can become a vector for data exfiltration or command-and-control (C2) if abused. There is no clear evidence of intentional malicious behavior, but the presence of these powerful primitives without strict sandboxing warrants a 'suspicious' classification.
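The command-execution risk is mostly about how arguments reach the child process. The sketch below shows the principle in Python, to match transcribe.py; the skill's actual call is in Node, where the analogous function would be `child_process.execFileSync`. The function name `run_script` and the timeout value are assumptions made for illustration.

```python
import subprocess
import sys
from typing import List


def run_script(script_path: str, audio_path: str, timeout_s: float = 120.0) -> str:
    """Run a Python script on one audio file without involving a shell."""
    # Each list element becomes exactly one argv entry in the child, so shell
    # metacharacters in a file name (";", "&&", "$(...)") stay plain text.
    args: List[str] = [sys.executable, script_path, audio_path]
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout_s,    # do not let a stuck transcription hang the daemon
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Whether the skill builds its command string from untrusted file names is not stated in the review; the sketch only shows a pattern that avoids the question.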

Capability assessment

⚠ Purpose & Capability

The code and docs implement transcription via a local Whisper model, intent parsing, and handlers, which matches the stated purpose. However, there are two important mismatches. First, SKILL.md and the docs claim that OGG/Opus (the WhatsApp format) works with "no FFmpeg", but the Python transcription uses soundfile/libsndfile, which typically cannot read Opus-in-OGG without additional codecs or FFmpeg, so the "no FFmpeg" claim is likely incorrect. Second, transcribe.py calls model.transcribe(..., language="en"), which forces English even though the skill advertises automatic multi-language detection. The pipeline actually detects the language in JS after transcription, which contradicts the claim of automatic language detection at the transcription stage.
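A minimal sketch of letting Whisper detect the language instead of forcing English is shown below. It is not the skill's code: `transcribe_with_detection` is a hypothetical helper, and it takes the model as a parameter so it can be exercised without loading Whisper. It relies on openai-whisper's `model.transcribe` detecting the language when none is passed and returning a dict with `text` and `language` keys.

```python
from typing import Any, Dict, Optional


def transcribe_with_detection(
    model: Any, audio_path: str, language: Optional[str] = None
) -> Dict[str, str]:
    """Transcribe `audio_path`, passing a language only if the caller knows it."""
    options: Dict[str, Any] = {}
    if language is not None:
        options["language"] = language  # explicit override, e.g. "hi" or "en"
    # With no language option, Whisper runs its own language detection.
    result = model.transcribe(audio_path, **options)
    return {
        "text": result["text"].strip(),
        "language": result.get("language", language or "unknown"),
    }


# Usage with the real library (assumes openai-whisper is installed):
#   import whisper
#   model = whisper.load_model("base")
#   print(transcribe_with_detection(model, "note.wav"))
```

Returning the detected language alongside the text would let the JS side reuse it rather than guessing from the transcript.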

ℹ Instruction Scope

Runtime instructions are narrowly scoped to watching ~/.clawdbot/media/inbound/, saving temp files under TEMP, running the bundled transcribe.py via child_process.execSync, parsing intents, and optionally making outbound HTTP requests (weather handler fetches wttr.in). The daemon prints JSON to stdout for a parent process to handle sending via WhatsApp. The instructions do not request external credentials, nor do they read arbitrary system config files, but they do read/write local files (.voice-processed.log and temp files) and spawn a Python process — both expected for a local transcription pipeline.

ℹ Install Mechanism

There is no install spec; SKILL.md and requirements.txt ask users to pip install openai-whisper, soundfile, numpy. Installing openai-whisper will download models and runtime dependencies (large downloads, potential CPU/GPU usage). This is a normal distribution mechanism but it is non-trivial (model downloads, large memory use). There are no remote downloads of arbitrary archives in an install script.

✓ Credentials

The skill declares no required environment variables or credentials (and none are necessary for the provided handlers). The code uses common env variables for paths (HOME/APPDATA/TEMP) only. Note: example/custom handlers reference external SDKs (drone-sdk, music-api) which, if enabled by a user, would require their own credentials — but those are optional user modifications, not required by the skill.

✓ Persistence & Privilege

The skill sets always:false and does not request persistent platform-wide privileges. It writes a local processed log and temporary audio files (expected for a daemon). It does not modify other skills or system-wide settings.

Version history

v1.0.0

Open-source skill setup for WhatsApp voice chat with your bot

Metadata

Slug whatsapp-voice-chat-integration-open-source

Version 1.0.0

License —

Total installs 5

Current installs 5

Number of released versions 1

FAQ

What is whatsappVoiceOpenSkill?

Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for WhatsApp. Supports English and Hindi, customizable intents (weather, status, commands), automatic language detection, and streaming responses via TTS. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 2304 times in total.

How do I install whatsappVoiceOpenSkill?

Run the command "/install whatsapp-voice-chat-integration-open-source" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is whatsappVoiceOpenSkill free?

Yes, whatsappVoiceOpenSkill is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does whatsappVoiceOpenSkill support?

whatsappVoiceOpenSkill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed whatsappVoiceOpenSkill?

It is developed and maintained by Syed Ateebul Islam (@syedateebulislam); the current version is v1.0.0.
