
Gemini Tts

Name: Gemini Tts
Author: yangzhe1991

By Zhe (Phil) Yang · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 401

Install in OpenClaw

/install gemini-tts

Description

Custom TTS using Gemini 2.5 Flash for high-quality, persona-driven voice output.

Usage instructions (SKILL.md)

Gemini TTS Skill

This skill generates custom audio output using the Gemini 2.5 Flash model.

Usage

uv run ~/.openclaw/workspace/skills/gemini-tts/generate_voice.py --text "your text" --voice "little-claw-persona"

Features

Custom voice generation
Persona-driven tone
High quality output

Configuration

Requires the GEMINI_API_KEY environment variable.
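
A minimal sketch of how a script might read this key and fail early with a clear message when it is missing. The helper name `load_api_key` is hypothetical and is not taken from generate_voice.py.

```python
import os


def load_api_key(env=None):
    """Return GEMINI_API_KEY from the environment, or raise a clear error.

    `load_api_key` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before running the script."
        )
    return key
```

Failing before any network call keeps a missing key from surfacing later as a less obvious HTTP error.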

Security recommendations

This skill appears to be a simple Gemini TTS client, but there are a few red flags to address before installing:
(1) The script and SKILL.md require GEMINI_API_KEY but the registry metadata does not declare it; assume you must supply a sensitive API key. Only provide a key with minimal permissions and rotate it after testing.
(2) The script sends the API key in the URL query string, which can expose it in logs; prefer sending it in a request header (the Gemini API accepts API keys in the x-goog-api-key header).
(3) The --voice argument is ignored and the code hard-codes a specific prebuilt voice. Verify that voice selection works as you expect, or update the code.
(4) The script writes audio files to the current directory; run it in a sandboxed workspace to avoid accidental overwrites.
(5) Confirm that the endpoint (generativelanguage.googleapis.com) is the intended Google API for your account and that billing and quotas are acceptable.
If you want to proceed, ask the publisher to update the registry metadata to declare GEMINI_API_KEY and to fix the hard-coded voice and the API key handling so that the skill's declared requirements match its behavior.

Functional analysis

Type: OpenClaw Skill
Name: gemini-tts
Version: 1.0.0

The skill is a legitimate implementation of a Text-to-Speech (TTS) generator using the Google Gemini API. The script `generate_voice.py` correctly handles environment variables for authentication, makes standard API calls to the official Google endpoint, and processes the resulting audio data without any signs of malicious intent, data exfiltration, or obfuscation.

Capability assessment

ℹ Purpose & Capability

The name, description, SKILL.md and the code all indicate a Gemini TTS client (calls Google generativelanguage endpoint and writes audio to disk), so purpose and capability mostly align. However, the code hard-codes a prebuilt voice ('Puck') and ignores the --voice/voice_id argument, which is inconsistent with the advertised 'persona-driven' voice selection.
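
One way the --voice argument could be wired into the request instead of being ignored is sketched below. This is an illustration under assumptions, not the author's code: `build_payload` and `parse_args` are hypothetical names, and the request field names reflect the Gemini REST speech configuration as commonly documented, so check them against the current API reference.

```python
import argparse


def build_payload(text, voice_name):
    """Build a request body that asks for audio in the named prebuilt voice.

    Field names are assumed from the Gemini REST API's speech configuration.
    """
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice_name}}
            },
        },
    }


def parse_args(argv=None):
    """Parse --text and --voice; hypothetical stand-in for the script's CLI."""
    parser = argparse.ArgumentParser(description="Sketch of a TTS command line")
    parser.add_argument("--text", required=True)
    # The default mirrors the voice the review reports as hard-coded.
    parser.add_argument("--voice", default="Puck")
    return parser.parse_args(argv)
```

With this shape, `build_payload(args.text, args.voice)` carries whatever the caller passed. A persona label such as "little-claw-persona" would still need to be mapped to a voice name the API actually offers.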

⚠ Instruction Scope

SKILL.md tells the agent to run the included script and states 'Needs GEMINI_API_KEY', but the registry metadata did not declare any required env vars. The script contacts a remote API (generativelanguage.googleapis.com), decodes base64 inline audio, and writes a local file (output_voice.*). These actions are consistent with TTS but the mismatch between SKILL.md, metadata, and the script is problematic and could lead to accidental exposure of credentials or unexpected behavior.
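
The decode-and-write step described above might look roughly like the following sketch. It assumes the response is a dict whose first part carries base64 audio under `inlineData.data`, and that the audio is raw 16-bit mono PCM at 24 kHz. Both the response shape and the audio format are assumptions to confirm against the actual API response, and `save_inline_audio` is a hypothetical name.

```python
import base64
import wave


def save_inline_audio(response, path, sample_rate=24000):
    """Decode base64 audio from a response dict and write it as a WAV file.

    Assumes raw 16-bit mono PCM; the default sample rate is an assumption.
    Returns the number of PCM bytes written.
    """
    part = response["candidates"][0]["content"]["parts"][0]
    pcm = base64.b64decode(part["inlineData"]["data"])
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return len(pcm)
```

Wrapping the PCM in a WAV container lets ordinary audio players open the file without being told the sample format.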

✓ Install Mechanism

There is no install spec (instruction-only plus one script). That’s low install risk because nothing is automatically downloaded or written to disk during installation beyond the included files.

⚠ Credentials

The code and SKILL.md require GEMINI_API_KEY, which is a sensitive credential. The skill metadata does not declare any required environment variables or a primary credential; this is inconsistent and may hide the fact that you must provide a secret. The script also places the API key in the URL query parameter (?key=...), which can expose the key in logs and is less secure than sending it in a request header such as x-goog-api-key.
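
A hedged sketch of keeping the key out of the URL: the request below carries it in the x-goog-api-key header. The endpoint path and the model name are assumptions, not values read from generate_voice.py, and `build_request` is a hypothetical helper. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed endpoint pattern and model name; verify both before use.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "{model}:generateContent"
)


def build_request(api_key, payload, model="gemini-2.5-flash-preview-tts"):
    """Build a POST request that carries the key in a header, not the URL."""
    return urllib.request.Request(
        ENDPOINT.format(model=model),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )
```

Because the URL no longer contains the secret, it is less likely to end up in proxy logs, shell history, or error messages that echo the request URL.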

✓ Persistence & Privilege

The skill does not request persistent presence (always is false) and does not attempt to modify other skills or system-wide config. It only writes output audio files to the working directory when run.
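
Since the script writes into the working directory, a small guard against overwriting earlier output may be useful. The sketch below picks the first unused file name; `unique_output_path` is a hypothetical helper, and the `.wav` suffix is an assumption because the review only names the file as output_voice.*.

```python
from pathlib import Path


def unique_output_path(directory, stem="output_voice", suffix=".wav"):
    """Return a path in `directory` that does not exist yet.

    Tries output_voice.wav first, then output_voice_1.wav,
    output_voice_2.wav, and so on.
    """
    directory = Path(directory)
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```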

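How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install gemini-tts
Once installed, call the skill by name or trigger it with /gemini-tts
Provide the inputs described in the skill's parameters to get its output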

Version history

v1.0.0

- Initial release of gemini-tts skill.
- Provides high-quality, persona-driven text-to-speech using Gemini 2.5 Flash.
- Supports custom voice generation via command line.
- Requires GEMINI_API_KEY for configuration.

Metadata

Slug: gemini-tts

Version: 1.0.0

License: MIT-0

Total installs: 0

Current installs: 0

Number of versions: 1

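FAQ

What is Gemini Tts?

Custom TTS using Gemini 2.5 Flash for high-quality, persona-driven voice output. It is an AI agent skill plugin for Claude Code / OpenClaw and has been downloaded 401 times so far.

How do I install Gemini Tts?

Run the command "/install gemini-tts" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but running the skill requires GEMINI_API_KEY.

Is Gemini Tts free?

Yes. The skill itself is free under the MIT-0 license and can be freely downloaded, installed, and used. Calls to the Gemini API remain subject to your own quota and billing.

Which platforms does Gemini Tts support?

Gemini Tts is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Gemini Tts?

It is developed and maintained by Zhe (Phil) Yang (@yangzhe1991). The current version is v1.0.0.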