Description

High-definition generative speech synthesis using Google Cloud Chirp 3 HD voices. Delivers superior realism, emotional expressiveness, and natural pacing usi...

README (SKILL.md)

# Google Chirp 3 HD TTS Skill

Name: Google Chirp 3 HD TTS Skill
Author: jarar21

## Overview

Generate ultra-realistic, human-like speech using Google's latest Chirp 3 HD generative models. This skill handles its own dependencies locally to remain portable.

**Security Note:** On first execution, this skill runs `npm install` locally within its own folder to fetch the official `@google-cloud/text-to-speech` library from the public npm registry.

---

## Trigger Detection

Recognize keywords like "tts", "speak", "voice", or "read this out loud" as TTS requests.

**Action:** Extract the target text, strip the trigger keyword, apply "Natural Speech" formatting (see below), then call the tool using the auto-run logic below.

---
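
The trigger-detection step above could be sketched roughly as follows. This is only an illustration: the helper name `extractTtsText` is hypothetical, and it assumes the trigger keyword appears at the start of the message, which the skill does not require.

```javascript
// Trigger phrases taken from the keyword list above.
const TRIGGERS = ['read this out loud', 'speak', 'voice', 'tts'];

// Hypothetical helper: returns the text to synthesize, or null when the
// message does not start with a trigger keyword (an assumption of this sketch).
function extractTtsText(message) {
  const trimmed = message.trim();
  const lower = trimmed.toLowerCase();
  for (const trigger of TRIGGERS) {
    if (!lower.startsWith(trigger)) continue;
    // Require a word boundary so that "speaker" does not match "speak".
    const next = lower[trigger.length];
    if (next !== undefined && /[a-z0-9]/.test(next)) continue;
    // Strip the trigger keyword and any separator that follows it.
    const rest = trimmed.slice(trigger.length).replace(/^[\s:,\-]+/, '');
    return rest.length > 0 ? rest : null;
  }
  return null;
}
```

For example, `extractTtsText('tts: Hello there')` yields `'Hello there'`, while a message without a trigger yields `null`.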

## Auto-Run Logic (Agent Instructions)

Follow these steps every time TTS is needed. Never assume `gtts` is in `PATH`; always run via `node` directly.

### Step 1: Find the skill folder

`gtts.js` lives in the same folder as this SKILL.md. Resolve that path:

```bash
# Assumes this runs from a script inside the skill folder, so $0 points there.
SKILL_DIR="$(cd "$(dirname "$0")" && pwd)"
```

### Step 2: Check Node.js (Version 18+ Required)

```bash
# Abort early if Node.js is not on PATH.
if ! command -v node &>/dev/null; then
    echo "ERROR: Node.js is not installed."
    exit 1
fi

# Extract the major version from output such as "v18.19.0".
NODE_MAJOR=$(node -v | cut -d'v' -f2 | cut -d'.' -f1)
if [ "$NODE_MAJOR" -lt 18 ]; then
    echo "ERROR: Node.js 18 or higher is required for Google Chirp 3 HD."
    exit 1
fi
```
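
The same version gate could also be expressed inside a Node script, for instance as a guard at the top of an entry point. The sketch below is an assumption about how that might look; `isSupportedNode` is a hypothetical name, and nothing here indicates that `gtts.js` performs this check itself.

```javascript
// Hypothetical helper: true when the given Node.js version string
// (e.g. "v18.19.0", the format of process.version) meets the minimum major.
function isSupportedNode(version = process.version, minMajor = 18) {
  const major = parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Example guard, mirroring the error message of the shell check above.
if (!isSupportedNode()) {
  console.log('ERROR: Node.js 18 or higher is required for Google Chirp 3 HD.');
}
```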

### Step 3: Auto-install dependency if missing
This installs the package **locally inside the skill folder** only.

```bash
if [ ! -d "$SKILL_DIR/node_modules/@google-cloud/text-to-speech" ]; then
    npm install @google-cloud/text-to-speech --prefix "$SKILL_DIR" --silent
fi
```

### Step 4: Run the script

```bash
node "$SKILL_DIR/gtts.js" --text "$TEXT" --voice "$VOICE" --out "$OUTFILE"
```

---

## Command Arguments

| Argument | Description | Default |
| :--- | :--- | :--- |
| `--text` | Text to synthesize. Supports `[pause]` tags. | *(required)* |
| `--voice` | Voice short-name (e.g. Aoede, Charon, Puck) | `Aoede` |
| `--out` | Output filename (saved to `$OPENCLAW_WORKSPACE`) | `output.mp3` |

Returns `SUCCESS:/absolute/path/to/file.mp3` on success, or `ERROR: ...` on failure.
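
A caller might interpret that status line along these lines. This is a minimal sketch, assuming the status is the last line written to standard output; `parseGttsResult` is a hypothetical helper, not part of the skill.

```javascript
// Hypothetical helper: turn the script's status line into a small result object.
// Assumes the SUCCESS:/ERROR: line is the last line of stdout.
function parseGttsResult(stdout) {
  const line = stdout.trim().split(/\r?\n/).pop() || '';
  if (line.startsWith('SUCCESS:')) {
    return { ok: true, path: line.slice('SUCCESS:'.length).trim() };
  }
  if (line.startsWith('ERROR:')) {
    return { ok: false, error: line.slice('ERROR:'.length).trim() };
  }
  // Anything else is treated as a failure rather than guessed at.
  return { ok: false, error: `Unrecognized output: ${line}` };
}
```

Treating unrecognized output as a failure keeps the agent from confirming a file that may not exist.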

---

## Voice Selection

| Gender | Recommended HD Voices |
| :--- | :--- |
| **Female** | Achernar (Default), Aoede, Leda, Kore, Zephyr, Despina, Gacrux, Vindemiatrix |
| **Male** | Charon, Puck, Fenrir, Orus, Achird, Algenib, Enceladus |
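
Google Cloud identifies Chirp 3 HD voices by full names of the form `<locale>-Chirp3-HD-<Name>`, so a short name from the table has to be expanded somewhere before the API call. The sketch below shows one way that expansion could work; `resolveVoiceName` is a hypothetical helper, the `en-US` default locale is an assumption, and how `gtts.js` actually does this is not shown here.

```javascript
// Short names from the voice table above.
const HD_VOICES = [
  'Achernar', 'Aoede', 'Leda', 'Kore', 'Zephyr', 'Despina', 'Gacrux', 'Vindemiatrix',
  'Charon', 'Puck', 'Fenrir', 'Orus', 'Achird', 'Algenib', 'Enceladus',
];

// Hypothetical helper: expand a short name into a full Chirp 3 HD voice name.
// The default locale "en-US" is an assumption of this sketch.
function resolveVoiceName(shortName, languageCode = 'en-US') {
  const wanted = String(shortName).trim().toLowerCase();
  const match = HD_VOICES.find((voice) => voice.toLowerCase() === wanted);
  if (!match) {
    throw new Error(`Unknown voice: ${shortName}`);
  }
  return `${languageCode}-Chirp3-HD-${match}`;
}
```

Rejecting unknown short names up front gives a clearer error than a failed API request would.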

---

## Prompt Engineering for Natural Speech

### 1. Pause Tags
Converted automatically into SSML `<break>` tags:

| Tag | Duration |
| :--- | :--- |
| `[pause short]` | 300ms |
| `[pause]` | 600ms |
| `[pause long]` | 900ms |
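
The mapping in the table could be implemented roughly as shown below. This is a sketch of the conversion as described, not the code in `gtts.js`; the helper name `pauseTagsToSsml`, the XML escaping step, and the `<speak>` wrapper are assumptions.

```javascript
// Durations in milliseconds, copied from the table above.
const PAUSE_MS = { 'pause short': 300, 'pause': 600, 'pause long': 900 };

// Hypothetical helper: convert [pause ...] tags into SSML <break> elements.
function pauseTagsToSsml(text) {
  // Escape XML special characters first so user text cannot break the SSML.
  const escaped = text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  // Replace each recognized tag with a break of the matching duration.
  const body = escaped.replace(
    /\[(pause(?: short| long)?)\]/g,
    (_, tag) => `<break time="${PAUSE_MS[tag]}ms"/>`
  );
  return `<speak>${body}</speak>`;
}
```

Escaping before substitution matters: doing it afterwards would also escape the generated `<break>` elements.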

### 2. Human-Like Formatting
- **Contractions**: Use "I'm", "don't", "can't" for a conversational tone.
- **Ellipses**: Use `...` for trailing hesitation.
- **Fillers**: Use "Well,", "Um,", or "So," to mimic natural thought.

---

## Authentication

Uses **Google Application Default Credentials (ADC)**. One-time setup:

```bash
gcloud auth application-default login
```

---

## Requirements

| Requirement | Status |
| :--- | :--- |
| Node.js 18+ | ❌ Must be installed on system |
| `OPENCLAW_WORKSPACE` | ℹ️ Optional (defaults to current dir) |
| `@google-cloud/text-to-speech` | ✅ Auto-installed locally in skill folder |
| Google Cloud SDK + ADC login | ❌ One-time manual step required |

---

## Workflow

1. **Detect Intent**: Identify a TTS trigger keyword.
2. **Format Text**: Apply contractions, ellipses, and `[pause]` tags.
3. **Check Environment**: Confirm `node` (18+) is available and `OPENCLAW_WORKSPACE` is known.
4. **Auto-install deps**: Run Step 3 if `node_modules` is missing.
5. **Execute**: `node "$SKILL_DIR/gtts.js" --text "..." --voice "..." --out "..."`
6. **Confirm Output**: Reference the `SUCCESS:` path and confirm to the user.

Usage Guidance

This skill appears to do what it says: synthesize speech via Google Cloud. Before installing, be aware that: (1) you must have Node.js 18+ and run `gcloud auth application-default login` (this grants the skill access to your ADC credentials and any allowed Google Cloud projects); (2) the skill will run `npm install` in its folder to fetch @google-cloud/text-to-speech — consider running this in an isolated environment or verifying the package/version to reduce supply-chain risk; (3) the skill will write node_modules and generated audio files into its folder or your OPENCLAW_WORKSPACE; (4) if you are concerned about automatic invocation, note the agent will run the skill when it detects TTS triggers — restrict invocation or review triggers if needed. If you want extra assurance, inspect the installed package version and run the skill in a controlled/test environment first.

Capability Analysis

Type: OpenClaw Skill
Name: g-tts
Version: 1.0.2

The g-tts skill bundle is a legitimate implementation for Google Cloud Chirp 3 HD text-to-speech synthesis. It uses standard Google Cloud SDK libraries and follows a transparent process for dependency management (local npm install) and authentication (Google ADC). No signs of data exfiltration, malicious execution, or harmful prompt injection were found in SKILL.md or gtts.js.

Capability Assessment

✓ Purpose & Capability

The skill name/description (Google Chirp 3 HD TTS) match the declared needs: Node.js 18+, Google Application Default Credentials, network access to call Google Cloud and to run npm. No unrelated secrets, binaries, or surprising capabilities are requested.

✓ Instruction Scope

SKILL.md instructs the agent to detect TTS triggers, ensure Node.js is present, run a local npm install if missing, and execute the bundled Node script. The instructions only reference the skill folder and an optional OPENCLAW_WORKSPACE; they do not read unrelated system files or send data to unexpected endpoints.

ℹ Install Mechanism

There is no packaged install spec, but the runtime instructions perform a local `npm install @google-cloud/text-to-speech --prefix <skill_dir>` from the public npm registry. This is expected for this use case but carries ordinary npm supply-chain risk (package compromise, typosquatting). The install is local to the skill folder and uses no obscure URLs.

✓ Credentials

The only external credential workflow documented is Google ADC (`gcloud auth application-default login`), which is appropriate and required for calling Google Cloud TTS. No unrelated environment variables or credentials are requested; OPENCLAW_WORKSPACE is optional and used only to place output files.

✓ Persistence & Privilege

The skill does not request always:true, does not modify other skills or system-wide agent settings, and only writes node_modules locally plus generated audio to the workspace. This level of persistence is proportional to its purpose.

Version History

v1.0.2

- Removed legacy dependency management files: scripts/package.json and scripts/package-lock.json.
- Simplified project by eliminating unneeded Node.js package files from the scripts directory.
- No user-facing functionality changes.

v1.0.1

- Added explicit Node.js 18+ requirement and version check.
- Introduced OPENCLAW_WORKSPACE environment variable for output file path; falls back to current directory if unset.
- Security note added about local-only npm install on first execution.
- Requirements and environment variables are now stated in structured lists.
- Minor clarifications and safety improvements in auto-run logic and documentation.

v1.0.0

- Initial release of g-tts skill leveraging Google Cloud Chirp 3 HD voices for high-definition, expressive text-to-speech synthesis.
- Detects TTS triggers and processes requests using Node.js with automated local dependency installation.
- Supports natural speech formatting with contractions, ellipses, fillers, and custom pause tags.
- Allows easy voice selection from a curated list of male and female HD voices.
- Integrates Google Application Default Credentials (ADC) authentication, with clear setup instructions.

Metadata

Slug g-tts

Version 1.0.2

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 3

Frequently Asked Questions

What is Google Chirp 3 HD TTS Skill?

High-definition generative speech synthesis using Google Cloud Chirp 3 HD voices. Delivers superior realism, emotional expressiveness, and natural pacing usi... It is an AI Agent Skill for Claude Code / OpenClaw, with 109 downloads so far.

How do I install Google Chirp 3 HD TTS Skill?

Run "/install g-tts" in the OpenClaw or Claude Code chat to install it in one step. Node.js 18+ and a one-time `gcloud auth application-default login` are still required before first use.

Is Google Chirp 3 HD TTS Skill free?

Yes, Google Chirp 3 HD TTS Skill is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Google Chirp 3 HD TTS Skill support?

Google Chirp 3 HD TTS Skill is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Google Chirp 3 HD TTS Skill?

It is built and maintained by jarar21 (@jarar21); the current version is v1.0.2.
