
Gemini Tts

Name: Gemini Tts
Author: yangzhe1991

by Zhe (Phil) Yang · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

401 Downloads

Install in OpenClaw

/install gemini-tts

Description

Custom TTS using Gemini 2.5 Flash for high-quality, persona-driven voice output.

README (SKILL.md)

Gemini TTS Skill

This skill generates custom audio output using the Gemini 2.5 Flash model.

Usage

uv run ~/.openclaw/workspace/skills/gemini-tts/generate_voice.py --text "your content" --voice "little-claw-persona"

Features

Custom voice generation
Persona-driven tone
High quality output

Configuration

Needs GEMINI_API_KEY
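
A minimal sketch of how a script could check for the key before making any request. The helper name `load_api_key` is hypothetical and is not taken from `generate_voice.py`; only the variable name GEMINI_API_KEY comes from the README.

```python
import os


def load_api_key(env=os.environ):
    """Return the Gemini API key, or raise a clear error if it is missing."""
    # Hypothetical helper: the skill's own script may read the variable differently.
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before running the skill"
        )
    return key
```

Failing early like this keeps a missing key from surfacing later as an opaque HTTP error.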

Usage Guidance

This skill appears to be a simple Gemini TTS client, but there are a few red flags to address before installing:
(1) The script and SKILL.md require GEMINI_API_KEY, but the registry metadata does not declare it, so assume you must supply a sensitive API key. Only provide a key with minimal permissions and rotate it after testing.
(2) The script sends the API key in the URL query string, which can expose it in logs; prefer sending it in a request header (the Gemini API accepts API keys in the x-goog-api-key header).
(3) The --voice argument is ignored and the code hard-codes a specific prebuilt voice; verify the voice selection works as you expect or update the code.
(4) The script writes audio files to the current directory; run it in a sandboxed workspace to avoid accidental overwrites.
(5) Confirm the endpoint (generativelanguage.googleapis.com) is the intended Google API for your account and that billing/quotas are acceptable.
If you want to proceed, ask the publisher to update the registry metadata to declare GEMINI_API_KEY and to fix the hard-coded voice and API key handling so the skill’s declared requirements match its behavior.
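
As a hedged illustration of point (2), the sketch below builds a request that carries the key in a header instead of the query string. It is not the skill's actual code: the function name `build_tts_request`, the model name, the API version path, and the request body layout are assumptions for illustration.

```python
import json
import urllib.request

# Assumed values; the listing does not show the script's real URL or model name.
API_HOST = "https://generativelanguage.googleapis.com"
MODEL = "gemini-2.5-flash-preview-tts"


def build_tts_request(api_key, text, voice_name="Puck"):
    """Build (but do not send) a TTS request with the key in a header."""
    url = f"{API_HOST}/v1beta/models/{MODEL}:generateContent"
    body = {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice_name}}
            },
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # The key travels in a header, so it never appears in the URL.
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
```

Because the URL no longer contains the secret, it can be logged or shown in error messages without leaking the key.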

Capability Analysis

Type: OpenClaw Skill
Name: gemini-tts
Version: 1.0.0

The skill is a legitimate implementation of a Text-to-Speech (TTS) generator using the Google Gemini API. The script `generate_voice.py` correctly handles environment variables for authentication, makes standard API calls to the official Google endpoint, and processes the resulting audio data without any signs of malicious intent, data exfiltration, or obfuscation.

Capability Assessment

ℹ Purpose & Capability

The name, description, SKILL.md and the code all indicate a Gemini TTS client (calls Google generativelanguage endpoint and writes audio to disk), so purpose and capability mostly align. However, the code hard-codes a prebuilt voice ('Puck') and ignores the --voice/voice_id argument, which is inconsistent with the advertised 'persona-driven' voice selection.
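
One possible way to make the --voice argument take effect is sketched below. The persona table and the function name `resolve_voice` are hypothetical; the listing does not say how the author intends personas to map to prebuilt voices.

```python
import argparse

# Hypothetical mapping from persona names to prebuilt voice names.
PERSONA_VOICES = {
    "little-claw-persona": "Puck",
    "narrator": "Kore",
}
DEFAULT_VOICE = "Puck"


def resolve_voice(argv=None):
    """Parse --text/--voice and return (text, prebuilt voice name)."""
    parser = argparse.ArgumentParser(description="Gemini TTS (sketch)")
    parser.add_argument("--text", required=True)
    parser.add_argument("--voice", default=DEFAULT_VOICE)
    args = parser.parse_args(argv)
    # A known persona maps to its voice; any other value is passed through
    # unchanged and treated as a prebuilt voice name.
    return args.text, PERSONA_VOICES.get(args.voice, args.voice)
```

With this shape the hard-coded 'Puck' becomes only a default, and the value passed on the command line decides which voice is requested.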

⚠ Instruction Scope

SKILL.md tells the agent to run the included script and states 'Needs GEMINI_API_KEY', but the registry metadata did not declare any required env vars. The script contacts a remote API (generativelanguage.googleapis.com), decodes base64 inline audio, and writes a local file (output_voice.*). These actions are consistent with TTS but the mismatch between SKILL.md, metadata, and the script is problematic and could lead to accidental exposure of credentials or unexpected behavior.
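
The decode-and-write step described above can be sketched as follows. This assumes the response is already parsed JSON, that the audio sits in the first part's `inlineData.data` field, and that it is 16-bit mono PCM at 24 kHz; the real script may handle other formats, and `save_inline_audio` is a hypothetical name.

```python
import base64
import wave


def save_inline_audio(response, path, sample_rate=24000):
    """Decode the first inline audio part and write it as a mono 16-bit WAV."""
    # Assumed response layout; adjust if the API returns a different shape.
    part = response["candidates"][0]["content"]["parts"][0]
    pcm = base64.b64decode(part["inlineData"]["data"])
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return len(pcm)              # number of raw PCM bytes written
```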

✓ Install Mechanism

There is no install spec (instruction-only plus one script). That’s low install risk because nothing is automatically downloaded or written to disk during installation beyond the included files.

⚠ Credentials

The code and SKILL.md require GEMINI_API_KEY (sensitive). The skill metadata did not declare any required env vars or a primary credential, which is inconsistent and may hide the fact that you must provide a secret. Also, the script places the API key in the URL query parameter (?key=...), which can expose the key in logs and is less secure than sending it in a request header such as x-goog-api-key.
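
If a URL containing the key must still be logged, masking the parameter first limits the exposure. The helper below is a hypothetical sketch, not part of the skill; it assumes the secret is carried in a query parameter named `key`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redact_key(url, param="key"):
    """Return the URL with the value of the given query parameter masked."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name == param else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Masking only reduces what ends up in the application's own logs; proxies and servers along the way still see the full URL, which is why moving the key out of the query string is the stronger fix.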

✓ Persistence & Privilege

The skill does not request persistent presence (`always` is false) and does not attempt to modify other skills or system-wide config. It only writes output audio files to the working directory when run.
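
Since output lands in the working directory, a caller may want to avoid overwriting an earlier file. The sketch below picks an unused file name; the function name, the `output_voice` stem, and the `.wav` suffix are assumptions for illustration.

```python
from pathlib import Path


def unique_output_path(directory, stem="output_voice", suffix=".wav"):
    """Return a path in `directory` that does not exist yet."""
    directory = Path(directory)
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    # Append _1, _2, ... until an unused name is found.
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```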

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install gemini-tts
After installation, invoke the skill by name or use /gemini-tts
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

- Initial release of gemini-tts skill.
- Provides high-quality, persona-driven text-to-speech using Gemini 2.5 Flash.
- Supports custom voice generation via command line.
- Requires GEMINI_API_KEY for configuration.

Metadata

Slug gemini-tts

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Gemini Tts?

Custom TTS using Gemini 2.5 Flash for high-quality, persona-driven voice output. It is an AI Agent Skill for Claude Code / OpenClaw, with 401 downloads so far.

How do I install Gemini Tts?

Run "/install gemini-tts" in the OpenClaw or Claude Code chat to install it in one step. To use the skill you must also set GEMINI_API_KEY.

Is Gemini Tts free?

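Yes, the skill itself is free and licensed under MIT-0, so you can download, install and use it at no cost. Requests go to the Gemini API with your own GEMINI_API_KEY, so Google's billing and quotas for your account still apply.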

Which platforms does Gemini Tts support?

Gemini Tts is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Gemini Tts?

It is built and maintained by Zhe (Phil) Yang (@yangzhe1991); the current version is v1.0.0.
