
Gemini Assistant

Name: Gemini Assistant
Author: alimostafaradwan

Author: Ali Mostafa Radwan · GitHub · v1.0.0

cross-platform ⚠ suspicious

Total downloads: 589

Install in OpenClaw

/install gemini-assistant

Description

General-purpose AI assistant using Gemini API with voice and text support. Use when you need a smart AI assistant that can answer questions, have conversatio...

Usage Instructions (SKILL.md)

Gemini Assistant

A general-purpose AI assistant powered by Google's Gemini API. Supports both text and voice interactions.

Usage

Text Mode

cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py "Your question or message"

Voice Mode

cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py --audio /path/to/audio.ogg "optional context"

Response Format

The handler returns a JSON response:

{
  "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
  "text": "Text response from Gemini"
}
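A caller might unpack this response along the following lines. This is a minimal sketch, not code from the skill: the function name `parse_handler_output` is hypothetical, and it assumes the audio path always follows a `MEDIA:` prefix on its own line inside `message`, as in the documented example. It treats `text` as optional, so a response without that field does not raise.

```python
import json
from typing import Optional, Tuple

MEDIA_PREFIX = "MEDIA:"  # prefix shown before the audio path in the documented response


def parse_handler_output(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (audio_path, text) from the handler's JSON output.

    Either element is None when the corresponding part is absent.
    Hypothetical helper; not part of the skill itself.
    """
    payload = json.loads(raw)
    audio_path = None
    # The message holds a marker line followed by a MEDIA:<path> line.
    for line in str(payload.get("message", "")).splitlines():
        line = line.strip()
        if line.startswith(MEDIA_PREFIX):
            audio_path = line[len(MEDIA_PREFIX):].strip()
            break
    text = payload.get("text")  # may be missing; do not assume it exists
    return audio_path, text


if __name__ == "__main__":
    sample = json.dumps({
        "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
        "text": "Text response from Gemini",
    })
    print(parse_handler_output(sample))
```

Returning `None` instead of raising keeps the caller in control of what to do when only audio, or only an error message, comes back.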

Configuration

Set your Gemini API key:

export GEMINI_API_KEY="your-api-key-here"

Or create a .env file in the skill directory:

GEMINI_API_KEY=your-api-key-here
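The two configuration routes can be combined in a small lookup. The sketch below is illustrative only and is not taken from handler.py: the helper names are hypothetical, and the precedence (environment variable first, then `.env`) is an assumption. It reads only the `GEMINI_API_KEY` entry from the file and ignores everything else in it.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_key_from_dotenv(dotenv_path: Path, name: str = "GEMINI_API_KEY") -> Optional[str]:
    """Return the value of `name` from a simple KEY=value file, or None."""
    if not dotenv_path.is_file():
        return None
    for raw_line in dotenv_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not KEY=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip('"').strip("'")
    return None


def resolve_api_key(skill_dir: Path, environ: Mapping[str, str] = os.environ) -> str:
    """Prefer the environment variable, then fall back to <skill_dir>/.env.

    The precedence order is an assumption made for this sketch.
    """
    key = environ.get("GEMINI_API_KEY") or read_key_from_dotenv(skill_dir / ".env")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set in the environment or .env")
    return key
```

Reading a single named key, rather than exporting the whole file into the process environment, means unrelated entries in `.env` are never loaded.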

Model Options

The default model is gemini-2.5-flash-native-audio-preview-12-2025 for audio support.

To use a different model, edit handler.py:

MODEL = "gemini-2.0-flash-exp"  # For text-only

Requirements

google-genai>=1.0.0
numpy>=1.24.0
soundfile>=0.12.0
librosa>=0.10.0 (for audio input)
FFmpeg (for audio conversion)
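Before running the handler, it can help to confirm these requirements are present. The following is a rough preflight sketch under stated assumptions: the function names are hypothetical, `google.genai` is assumed to be the import name for the google-genai package, and FFmpeg is looked up on `PATH`. It checks presence only and does not verify the minimum versions listed above.

```python
import importlib.util
import shutil
from typing import List, Sequence

# Import names assumed for the packages listed above.
REQUIRED_MODULES = ("google.genai", "numpy", "soundfile", "librosa")
REQUIRED_BINARIES = ("ffmpeg",)


def module_available(name: str) -> bool:
    """Return True if `name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec.
        return False


def missing_requirements(
    modules: Sequence[str] = REQUIRED_MODULES,
    binaries: Sequence[str] = REQUIRED_BINARIES,
) -> List[str]:
    """List requirements that could not be found (presence only, no versions)."""
    missing = [f"python module: {m}" for m in modules if not module_available(m)]
    missing += [f"binary: {b}" for b in binaries if shutil.which(b) is None]
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("All requirements found" if not problems else "\n".join(problems))
```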

Features

🎙️ Voice input/output support
💬 Text conversations
🔧 Configurable system instructions
⚡ Fast responses with Gemini Flash

Security Recommendations

This skill appears to be an ordinary Gemini voice/text assistant, but there are multiple packaging/documentation mismatches you should clear up before use:

1) Confirm that GEMINI_API_KEY is required (skill.json and SKILL.md require it; the registry summary omitted it).
2) Be aware the handler auto-loads a .env file in the skill folder, so don't store other secrets there.
3) The documented response format (including a 'text' field) doesn't match handler.py, which only returns audio MEDIA paths; callers expecting text will fail.
4) The skill requires ffmpeg on the host and will write OGG files to /tmp.
5) If you rely on the skill for both audio and text, either test it or modify handler.py to include text outputs (or update model/config to request TEXT modality).

If you don't trust the source, inspect or run the code in a sandboxed environment and provide only a dedicated Gemini API key with limited scope rather than reusing broader credentials.
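One way to act on the advice about credentials is to launch the handler with a stripped-down environment, so that only the dedicated key is visible to it. This is a sketch under assumptions, not a sandbox: the helper names and the allow-list of variables are hypothetical, and the process still has normal filesystem and network access.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping

# Assumption: these variables are enough for python3 and ffmpeg to start.
ALLOWED_VARS = ("PATH", "HOME", "LANG")


def build_minimal_env(api_key: str, base: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Copy only allow-listed variables from `base` and add the Gemini key."""
    env = {k: base[k] for k in ALLOWED_VARS if k in base}
    env["GEMINI_API_KEY"] = api_key
    return env


def run_handler(skill_dir: str, prompt: str, api_key: str) -> subprocess.CompletedProcess:
    """Run handler.py in text mode with the minimal environment.

    Other secrets exported in the parent shell are not passed to the child.
    """
    return subprocess.run(
        [sys.executable, "handler.py", prompt],
        cwd=skill_dir,
        env=build_minimal_env(api_key),
        capture_output=True,
        text=True,
        timeout=120,
        check=False,
    )
```

Passing `env=` replaces the child's environment entirely, which is why the allow-list has to include `PATH`. For stronger isolation, a container or a separate user account would still be needed.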

Functional Analysis

Type: OpenClaw Skill
Name: gemini-assistant
Version: 1.0.0

The skill is classified as suspicious due to the user's ability to provide arbitrary `system_instruction` to the Gemini model via `handler.py`, which allows for prompt injection against the external AI service. While not directly compromising the host system or OpenClaw agent, this capability allows a user to manipulate the AI's behavior beyond its intended persona. Additionally, `handler.py` modifies the `LD_LIBRARY_PATH` environment variable for the `ffmpeg` subprocess, which, while likely a benign workaround, is a risky practice that could be exploited in a compromised environment.

Capability Assessment

ℹ Purpose & Capability

The name/description (Gemini-based voice+text assistant) aligns with the code, which calls google.genai and handles audio/text. However, the metadata is inconsistent: the registry summary claims no required env vars while skill.json and SKILL.md require GEMINI_API_KEY. skill.json also advertises both AUDIO and TEXT modalities, but the handler's runtime configuration requests only AUDIO. These mismatches reduce confidence in CI/packaging quality.

⚠ Instruction Scope

SKILL.md instructs setting GEMINI_API_KEY (and mentions a .env file) and running handler.py; the code does auto-load a .env file in the skill directory. SKILL.md shows an expected JSON response including a 'text' field, but handler.py does not populate a 'text' key — it only produces an audio MEDIA path or an error message. The discrepancy means the documented response format is inaccurate and callers expecting a 'text' field may break.

✓ Install Mechanism

There is no install spec (instruction-only install), and dependencies are standard Python/audio libraries (google-genai, numpy, librosa, soundfile). No remote downloads or obscure installers are used. The skill does call the system ffmpeg binary (/usr/bin/ffmpeg), which must exist on the host.

⚠ Credentials

The skill uses a single service credential (GEMINI_API_KEY), which is proportionate for a Gemini client. However, the registry metadata asserted 'required env vars: none' while both SKILL.md and skill.json list GEMINI_API_KEY; this inconsistency is suspicious and should be clarified. The handler also auto-loads any .env file found in the skill directory, which may cause it to pick up secrets placed there.

✓ Persistence & Privilege

The skill does not request always:true, does not modify other skills or global configs, and only writes temporary files to /tmp and accesses the skill directory (to read .env). It sets LD_LIBRARY_PATH for the ffmpeg subprocess but does not persist system-wide changes.

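How to Use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install gemini-assistant
Once installed, invoke the skill by name or trigger it with /gemini-assistant
Provide the required inputs according to the skill's parameter documentation to receive structured output.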

Version History

v1.0.0

Initial release: general AI assistant using the Gemini API with voice and text support

Metadata

Slug gemini-assistant

Version 1.0.0

License (none listed)

Total installs 0

Current installs 0

Version count 1

FAQ