Gemini Tts
by Zhe (Phil) Yang · v1.0.0 · MIT-0
401 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install gemini-tts
Description
Custom TTS using Gemini 2.5 Flash for high-quality, persona-driven voice output.
README (SKILL.md)
Gemini TTS Skill
This skill generates custom audio output using the Gemini 2.5 Flash model.
Usage
uv run ~/.openclaw/workspace/skills/gemini-tts/generate_voice.py --text "your text" --voice "little-claw-persona"
Features
- Custom voice generation
- Persona-driven tone
- High quality output
Configuration
- Needs GEMINI_API_KEY
Usage Guidance
This skill appears to be a simple Gemini TTS client, but there are a few red flags to address before installing:
- The script and SKILL.md require GEMINI_API_KEY but the registry metadata does not declare it, so assume you must supply a sensitive API key. Only provide a key with minimal permissions and rotate it after testing.
- The script sends the API key in the URL query string, which can expose it in logs; prefer using an Authorization header (Bearer token).
- The --voice argument is ignored and the code hard-codes a specific prebuilt voice. Verify the voice selection works as you expect or update the code.
- The script writes audio files to the current directory; run it in a sandboxed workspace to avoid accidental overwrites.
- Confirm the endpoint (generativelanguage.googleapis.com) is the intended Google API for your account and that billing/quotas are acceptable.
If you want to proceed, ask the publisher to update the registry metadata to declare GEMINI_API_KEY and to fix the hard-coded voice and API key handling so the skill’s declared requirements match its behavior.
Capability Analysis
Type: OpenClaw Skill
Name: gemini-tts
Version: 1.0.0
The skill is a legitimate implementation of a Text-to-Speech (TTS) generator using the Google Gemini API. The script `generate_voice.py` correctly handles environment variables for authentication, makes standard API calls to the official Google endpoint, and processes the resulting audio data without any signs of malicious intent, data exfiltration, or obfuscation.
Capability Assessment
Purpose & Capability
The name, description, SKILL.md and the code all indicate a Gemini TTS client (calls Google generativelanguage endpoint and writes audio to disk), so purpose and capability mostly align. However, the code hard-codes a prebuilt voice ('Puck') and ignores the --voice/voice_id argument, which is inconsistent with the advertised 'persona-driven' voice selection.
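As an illustration of what honoring the --voice argument could look like, here is a minimal sketch. It is not the publisher's code: the function names `parse_args` and `build_payload` are hypothetical, and the request body layout (`speechConfig` / `voiceConfig` / `prebuiltVoiceConfig` / `voiceName`) is an assumption about the Gemini REST API that should be checked against Google's documentation.

```python
import argparse

# 'Puck' is the voice the review says is hard-coded; here it is only a default.
DEFAULT_VOICE = "Puck"


def parse_args(argv=None):
    """Parse the two flags shown in the SKILL.md usage line."""
    parser = argparse.ArgumentParser(description="Gemini TTS sketch")
    parser.add_argument("--text", required=True, help="Text to speak")
    parser.add_argument("--voice", default=DEFAULT_VOICE, help="Prebuilt voice name")
    return parser.parse_args(argv)


def build_payload(text: str, voice_name: str) -> dict:
    """Build a request body whose voice comes from the caller, not a constant.

    The field layout below is an assumed shape, not confirmed by the source.
    """
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice_name}}
            },
        },
    }
```

With this shape, the value passed on the command line reaches the request unchanged. Mapping a persona label such as "little-claw-persona" to a real prebuilt voice would need an extra lookup that the source does not describe.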
Instruction Scope
SKILL.md tells the agent to run the included script and states 'Needs GEMINI_API_KEY', but the registry metadata did not declare any required env vars. The script contacts a remote API (generativelanguage.googleapis.com), decodes base64 inline audio, and writes a local file (output_voice.*). These actions are consistent with TTS but the mismatch between SKILL.md, metadata, and the script is problematic and could lead to accidental exposure of credentials or unexpected behavior.
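The decode-and-write step described above can be sketched roughly as follows. The response layout (`candidates[0].content.parts[0].inlineData.data`) and the audio format (16-bit mono PCM at 24 kHz) are assumptions, not details taken from the script, and the helper names are hypothetical.

```python
import base64
import wave


def extract_audio_bytes(response: dict) -> bytes:
    """Pull the base64 inline audio out of a parsed JSON response.

    Assumes the audio sits in the first part of the first candidate.
    """
    part = response["candidates"][0]["content"]["parts"][0]
    return base64.b64decode(part["inlineData"]["data"])


def write_wav(path: str, pcm: bytes, sample_rate: int = 24000) -> None:
    """Wrap raw PCM in a WAV container (assumed: mono, 16-bit samples)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)        # mono
        wav_file.setsampwidth(2)        # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
```

If the API returns a different encoding than assumed here, the WAV header values would need to change accordingly.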
Install Mechanism
There is no install spec (instruction-only plus one script). That’s low install risk because nothing is automatically downloaded or written to disk during installation beyond the included files.
Credentials
The code and SKILL.md require GEMINI_API_KEY (sensitive). The skill metadata did not declare any required env vars or a primary credential, which is inconsistent and may hide the fact that you must provide a secret. Also the script places the API key in the URL query parameter ( ?key=... ), which can expose the key in logs and is less secure than using an Authorization header.
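One way to keep the key out of the URL is to send it as a request header instead. The sketch below only builds the request and does not send it. The endpoint path, the model name, and the use of the `x-goog-api-key` header are assumptions to verify against Google's API documentation; `build_request` is a hypothetical helper, not part of the skill.

```python
import json
import urllib.request

# Assumed endpoint and model name; the source only names the host.
API_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.5-flash-preview-tts:generateContent"
)


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Create a POST request that carries the key in a header, not in ?key=..."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # assumed header name for API keys
        },
        method="POST",
    )
```

In a real script the key would be read from the environment, for example `os.environ["GEMINI_API_KEY"]`, and the request passed to `urllib.request.urlopen`. Because the URL no longer contains the secret, it can be logged without exposing the key.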
Persistence & Privilege
The skill does not request persistent presence (always is false) and does not attempt to modify other skills or system-wide config. It only writes output audio files to the working directory when run.
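Since the script writes into the working directory, a small guard against overwriting earlier output may be useful. The following is a sketch under assumptions: `unique_output_path` is a hypothetical helper, and the `.wav` suffix is a guess because the review only names the file as output_voice.*.

```python
from pathlib import Path


def unique_output_path(directory, stem: str = "output_voice", suffix: str = ".wav") -> Path:
    """Return a path in `directory` that does not collide with an existing file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    # Append _1, _2, ... until an unused name is found.
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

Pointing `directory` at a dedicated scratch folder also keeps generated audio out of the rest of the workspace.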
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install gemini-tts
- After installation, invoke the skill by name or use /gemini-tts
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
- Initial release of gemini-tts skill.
- Provides high-quality, persona-driven text-to-speech using Gemini 2.5 Flash.
- Supports custom voice generation via command line.
- Requires GEMINI_API_KEY for configuration.
Frequently Asked Questions
What is Gemini Tts?
Custom TTS using Gemini 2.5 Flash for high-quality, persona-driven voice output. It is an AI Agent Skill for Claude Code / OpenClaw, with 401 downloads so far.
How do I install Gemini Tts?
Run "/install gemini-tts" in the OpenClaw or Claude Code chat to install it in one step. Before running the skill, set the GEMINI_API_KEY environment variable, which the script requires.
Is Gemini Tts free?
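Yes, the skill itself is free and licensed under MIT-0, so you can download, install and use it at no cost. Calls to the Gemini API are made with your own GEMINI_API_KEY and are subject to your account's billing and quotas.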
Which platforms does Gemini Tts support?
Gemini Tts is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Gemini Tts?
It is built and maintained by Zhe (Phil) Yang (@yangzhe1991); the current version is v1.0.0.