alimostafaradwan

Gemini Voice Assistant

Author: Ali Mostafa Radwan · GitHub · v1.0.0
cross-platform ⚠ suspicious
Total downloads: 688
Favorites: 1
Current installs: 1
Versions: 1
Install in OpenClaw
/install gemini-voice-assistant
Description
Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI...
Usage Instructions (SKILL.md)

Gemini Voice Assistant

A voice-to-voice AI assistant powered by Google's Gemini Live API. Speak to the AI and it responds with natural-sounding voice.

Usage

Text Mode

cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py "Your question or message"

Voice Mode

cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py --audio /path/to/audio.ogg "optional context"
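
The two invocations above can also be driven from Python. The following is a minimal sketch, assuming the skill lives at the path shown in the commands and that handler.py prints its response to standard output (the listing does not state this explicitly). The helper names `build_command` and `run_handler` are hypothetical.

```python
import subprocess
from pathlib import Path

# Skill directory taken from the usage commands above.
SKILL_DIR = Path.home() / ".openclaw/agents/kashif/skills/gemini-assistant"


def build_command(message="", audio_path=None):
    """Return the argv list for handler.py in text or voice mode."""
    cmd = ["python3", "handler.py"]
    if audio_path is not None:
        # Voice mode: pass the audio file, optionally followed by context text.
        cmd += ["--audio", str(audio_path)]
    if message:
        cmd.append(message)
    return cmd


def run_handler(message="", audio_path=None, timeout=120):
    """Run handler.py inside the skill directory and return its stdout.

    Assumption: the handler writes its JSON response to stdout.
    """
    result = subprocess.run(
        build_command(message, audio_path),
        cwd=SKILL_DIR,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the message contains spaces or quotes.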

Response Format

The handler returns a JSON response:

{
  "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
  "text": "Text response from Gemini"
}
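
A caller usually needs the text and the audio path separately. The sketch below assumes the `message` field holds the `[[audio_as_voice]]` tag and a `MEDIA:` line separated by a newline, as in the example response. The function name `parse_response` is hypothetical, and the behavior for text-only replies (no `MEDIA:` line) is a guess.

```python
import json

MEDIA_PREFIX = "MEDIA:"


def parse_response(raw):
    """Split the handler's JSON into (text, media_path).

    media_path is None when no MEDIA: line is present.
    """
    data = json.loads(raw)
    media_path = None
    for line in data.get("message", "").splitlines():
        line = line.strip()
        if line.startswith(MEDIA_PREFIX):
            media_path = line[len(MEDIA_PREFIX):]
    return data.get("text"), media_path


sample = json.dumps({
    "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
    "text": "Text response from Gemini",
})
print(parse_response(sample))
# prints ('Text response from Gemini', '/tmp/gemini_voice_xxx.ogg')
```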

Configuration

Set your Gemini API key:

export GEMINI_API_KEY="your-api-key-here"

Or create a .env file in the skill directory:

GEMINI_API_KEY=your-api-key-here
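
For reference, here is one way a script could resolve the key from either source. This is a sketch, not the code in handler.py: it prefers the environment variable, falls back to the .env file, and reads only the `GEMINI_API_KEY` entry so that other values in the file are never loaded. The function name `read_api_key` is hypothetical.

```python
import os
from pathlib import Path


def read_api_key(env_file=".env", environ=None):
    """Return GEMINI_API_KEY from the environment, else from env_file, else None."""
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key
    path = Path(env_file)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        name, sep, value = line.strip().partition("=")
        # Ignore every entry except the one key this skill needs.
        if sep and name.strip() == "GEMINI_API_KEY":
            return value.strip().strip('"').strip("'") or None
    return None
```

Returning `None` instead of raising lets the caller decide how to report a missing key.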

Model Options

The default model is gemini-2.5-flash-native-audio-preview-12-2025 for audio support.

To use a different model, edit handler.py:

MODEL = "gemini-2.0-flash-exp"  # For text-only

Requirements

  • google-genai>=1.0.0
  • numpy>=1.24.0
  • soundfile>=0.12.0
  • librosa>=0.10.0 (for audio input)
  • FFmpeg (for audio conversion)
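
Before the first run it can help to confirm that everything in this list is present. The sketch below checks that each package can be located and that an `ffmpeg` binary is on `PATH`. The import names are assumptions based on the usual mapping for these distributions (`google-genai` is imported as `google.genai`), and the helper names are hypothetical. It does not verify version numbers.

```python
import importlib.util
import shutil

# Import names assumed for the distributions listed above.
REQUIRED_MODULES = ["google.genai", "numpy", "soundfile", "librosa"]


def module_available(name):
    """Return True if the module can be located (parent packages are imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError covers a missing parent package such as `google`.
        return False


def missing_requirements(modules=REQUIRED_MODULES, binary="ffmpeg"):
    """Return the names of modules and binaries that could not be found."""
    missing = [m for m in modules if not module_available(m)]
    if shutil.which(binary) is None:
        missing.append(binary)
    return missing


if __name__ == "__main__":
    print(missing_requirements() or "all requirements found")
```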

Features

  • 🎙️ Voice input/output support
  • 💬 Text conversations
  • 🔧 Configurable system instructions
  • ⚡ Fast responses with Gemini Flash
Security Recommendations
What to consider before installing:

  • Metadata mismatch: the registry metadata claims no required env vars, but skill.json and handler.py require GEMINI_API_KEY. Verify the source and ask the publisher to correct the metadata before trusting the package.
  • Secrets: the skill reads a .env file in its directory, if present, and imports its values into the process environment. Do not put unrelated secrets in that .env file; store the Gemini API key there only if you accept the risk.
  • Network and privacy: the skill uses google-genai to connect to Google's Gemini service, so any voice or text you send goes to Google's servers. If you have privacy concerns, do not use it with sensitive data.
  • Local files: the skill writes audio to /tmp/gemini_voice_<id>.ogg and removes an intermediate WAV file. OGG files may persist until cleared; consider automatic cleanup or a different output directory if multiple users share the system.
  • Dependencies and binaries: you must install the listed Python packages and ensure FFmpeg is available at the expected path (handler.py uses /usr/bin/ffmpeg). Confirm that the google-genai package you install is the official one and review its network behavior.
  • Source trust: the skill has no homepage and an unknown source/owner. If you need strong assurance, request a verified source, a repository link, or an upstream release to inspect before running it with your API key.

If those concerns are acceptable and you trust the publisher, the code itself is consistent with its stated functionality; otherwise treat this as untrusted until the metadata and provenance issues are resolved.
Functional Analysis
Type: OpenClaw Skill
Name: gemini-voice-assistant
Version: 1.0.0

The skill is classified as suspicious due to a significant prompt injection vulnerability in `scripts/handler.py`. The `system_instruction` parameter, which can be provided via `request_data` or CLI arguments, is directly passed to the Gemini API without sanitization or validation. This allows an attacker or a compromised OpenClaw agent to inject arbitrary instructions into the LLM's system prompt, potentially manipulating the AI's behavior, extracting sensitive information, or causing unintended actions. While the skill itself does not contain malicious instructions, this design flaw creates a critical attack surface.
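
One possible hardening step is to validate `system_instruction` before it reaches the API. The sketch below is illustrative only and is not part of the skill: the function name, the length cap, and the optional allow-list are all assumptions. Checks like these narrow the attack surface but cannot by themselves prevent prompt injection; restricting who may set the value matters more.

```python
MAX_INSTRUCTION_CHARS = 2000  # illustrative cap, not a value from the skill


def validate_system_instruction(value, allowed=None):
    """Return a cleaned system instruction, or None if none was given.

    Raises TypeError or ValueError when the value is rejected.
    """
    if value is None:
        return None
    if not isinstance(value, str):
        raise TypeError("system_instruction must be a string")
    # Drop control characters but keep newlines and tabs.
    cleaned = "".join(
        ch for ch in value if ch.isprintable() or ch in "\n\t"
    ).strip()
    if len(cleaned) > MAX_INSTRUCTION_CHARS:
        raise ValueError("system_instruction is too long")
    # Optional allow-list: accept only instructions the operator approved.
    if allowed is not None and cleaned not in allowed:
        raise ValueError("system_instruction is not in the allow-list")
    return cleaned or None
```

An allow-list of pre-approved instructions is the stricter option, since it removes free-form input entirely.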
Capability Assessment
Purpose & Capability
The handler.py implements a Gemini Live audio/text client, depends on google-genai and audio libraries, and uses ffmpeg for conversion — which is coherent with a 'Gemini Voice Assistant'. However the registry metadata provided to the evaluator claimed 'Required env vars: none' while skill.json and the code require GEMINI_API_KEY. That metadata mismatch is an inconsistency you should resolve before trusting the package source.
Instruction Scope
SKILL.md instructions map directly to the CLI entrypoint in handler.py. The runtime reads a .env file in the skill directory (documented) and uses GEMINI_API_KEY from the environment; it writes temporary audio to /tmp and invokes ffmpeg. The instructions do not attempt to read unrelated system files or send data to endpoints other than the Gemini API.
Install Mechanism
There is no automated install spec (instruction-only behavior plus a Python script). Dependencies are standard Python packages and FFmpeg is expected to be present on the host. No external archive downloads or custom installers are present in the skill bundle.
Credentials
Requiring a single GEMINI_API_KEY is proportionate to contacting Gemini. The code will also load any key-value pairs from a local .env file into the process environment (only if present), so any secrets stored there may be read by the skill — ensure that .env contains only the intended API key. The earlier registry claim of 'no env vars' contradicts the code and skill.json, which is concerning.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or global config. It does create audio files under /tmp and leaves OGG output there; this is local persistence but not an elevated platform privilege.
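
Because the OGG output is left in /tmp, a periodic cleanup may be worth adding on shared systems. The sketch below assumes the files follow the `gemini_voice_<id>.ogg` naming described in this review. The function name and the one-hour default are hypothetical.

```python
import glob
import os
import time


def remove_stale_audio(directory="/tmp", max_age_seconds=3600, now=None):
    """Delete gemini_voice_*.ogg files older than max_age_seconds.

    Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for path in glob.glob(os.path.join(directory, "gemini_voice_*.ogg")):
        try:
            if now - os.path.getmtime(path) > max_age_seconds:
                os.remove(path)
                removed.append(path)
        except OSError:
            # The file vanished or is not ours to delete; skip it.
            continue
    return removed
```

Matching on the filename pattern keeps the cleanup from touching unrelated files in the same directory.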
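How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install gemini-voice-assistant
  3. Once installation finishes, invoke the Skill by name or trigger it with /gemini-voice-assistant
  4. Supply the inputs described in the Skill's parameter documentation to get structured output.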
Version History
v1.0.0
Initial release - Voice-to-voice AI assistant using Gemini Live API
Metadata
Slug: gemini-voice-assistant
Version: 1.0.0
License:
Total installs: 1
Current installs: 1
Version count: 1
FAQ

What is Gemini Voice Assistant?

Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 688 total downloads to date.

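How do I install Gemini Voice Assistant?

Run the command /install gemini-voice-assistant in the OpenClaw or Claude Code chat box to install it in one step. The skill still needs a Gemini API key (GEMINI_API_KEY) before it can be used.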

Is Gemini Voice Assistant free?

Yes. Gemini Voice Assistant is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Gemini Voice Assistant support?

Gemini Voice Assistant is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Gemini Voice Assistant?

It is developed and maintained by Ali Mostafa Radwan (@alimostafaradwan). The current version is v1.0.0.
