
Coze Tts

Name: Coze Tts
Author: franklu0819-lang

by xiaofei · v1.0.3 · MIT-0

cross-platform ✓ Security Clean

Downloads: 259

Install in OpenClaw

/install coze-tts

Description

Text-to-Speech (TTS) using Coze API. Convert text to natural-sounding speech audio files. Supports multiple voices and output formats (mp3, ogg_opus, wav, pcm).

README (SKILL.md)

Coze Text-to-Speech (TTS)

Convert text to natural-sounding speech using Coze API.

Setup

1. Get your API key from the Coze Platform

2. Set it in your environment:

export COZE_API_KEY="your-key-here"
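Before running the script, it can help to confirm that the variable is actually exported in the current shell. The following is a minimal sketch; the function name `require_coze_key` is hypothetical and is not part of the skill.

```shell
# Hypothetical helper (not part of the skill): fail early if COZE_API_KEY is missing.
require_coze_key() {
    if [ -z "${COZE_API_KEY:-}" ]; then
        echo "COZE_API_KEY is not set; export it before running the script" >&2
        return 1
    fi
    return 0
}
```

For example: `require_coze_key && bash scripts/text_to_speech.sh "你好"`.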

Supported Output Formats

MP3 - Default format, widely compatible
OGG_OPUS - Optimized for streaming and messaging
WAV - Uncompressed audio
PCM - Raw audio data

Usage

Basic TTS

Convert text to speech with default settings:

bash scripts/text_to_speech.sh "你好，这是测试语音"

Save to Specific File

bash scripts/text_to_speech.sh "你好世界" -o output.mp3

Use Different Voice

bash scripts/text_to_speech.sh "你好" -v 2

Change Output Format

bash scripts/text_to_speech.sh "你好" -f ogg_opus

Full Options

bash scripts/text_to_speech.sh "要转换的文本" -o output.mp3 -v 1 -f mp3

Parameters:

text (required): Text to convert to speech
-o, --output (optional): Output file path (default: auto-generated)
-v, --voice (optional): Voice ID (default: 1)
-f, --format (optional): Output format - mp3/ogg_opus/wav/pcm (default: mp3)

Output

The script saves the audio file and outputs:

File path
File size
Audio duration (if ffprobe is available)

Example output:

✓ Audio saved: coze_tts_20260324_235030_a1b2c3d4.mp3
  Size: 25.3 KB
  Duration: ~3 seconds
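The auto-generated name in the example output looks like a timestamp followed by a short hex suffix. The exact scheme the script uses is not documented here, so the sketch below is only one plausible way to build such a name. It assumes the suffix is the first 8 hex characters of an MD5 of the text and that `ogg_opus` maps to the `.ogg` extension; the function name `default_output_name` is hypothetical.

```shell
# Hypothetical helper: build a name shaped like coze_tts_YYYYMMDD_HHMMSS_xxxxxxxx.ext.
# Assumption: the suffix comes from an MD5 of the text; the real script may differ.
default_output_name() {
    local text format ext stamp suffix
    text=$1
    format=${2:-mp3}
    case "$format" in
        ogg_opus) ext=ogg ;;            # matches the message.ogg example in this README
        mp3|wav|pcm) ext=$format ;;
        *) echo "unsupported format: $format" >&2; return 1 ;;
    esac
    stamp=$(date +%Y%m%d_%H%M%S)
    suffix=$(printf '%s' "$text" | md5sum | cut -c1-8)
    printf 'coze_tts_%s_%s.%s\n' "$stamp" "$suffix" "$ext"
}
```

Deriving the suffix from the text keeps names stable for identical input within the same second; a random suffix would be an equally reasonable choice.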

Workflow Examples

Generate Notification Audio

bash scripts/text_to_speech.sh "您有一条新消息" -o notification.mp3

Create Voice Greeting

bash scripts/text_to_speech.sh "欢迎使用 Coze 语音服务" -v 2 -o greeting.mp3

Generate OGG for Messaging

bash scripts/text_to_speech.sh "你好" -f ogg_opus -o message.ogg

Batch Generate

for text in "你好" "谢谢" "再见"; do
    bash scripts/text_to_speech.sh "$text" -o "${text}.mp3"
done
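The loop above uses each phrase as its own file name, which can be awkward for long text or text containing slashes. A possible variant, sketched under assumptions, reads one phrase per line from a file, writes numbered files, and reports failures without stopping. The names `batch_tts` and `TTS_CMD` are hypothetical and only introduced here for illustration.

```shell
# Hypothetical batch helper: one phrase per line in a text file, numbered outputs.
# TTS_CMD is an assumption added for illustration so the command can be swapped out.
TTS_CMD=${TTS_CMD:-"bash scripts/text_to_speech.sh"}

batch_tts() {
    local list_file out_dir n failed line
    list_file=$1
    out_dir=${2:-.}
    n=0
    failed=0
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue      # skip blank lines
        n=$((n + 1))
        # $TTS_CMD is intentionally unquoted so "bash scripts/..." splits into words.
        # stdin is redirected so the command cannot consume the phrase list.
        if ! $TTS_CMD "$line" -o "$out_dir/$(printf 'phrase_%03d.mp3' "$n")" < /dev/null; then
            echo "failed: $line" >&2
            failed=$((failed + 1))
        fi
    done < "$list_file"
    # Succeed only if every phrase was converted.
    [ "$failed" -eq 0 ]
}
```

For example, `batch_tts phrases.txt audio` would write `audio/phrase_001.mp3`, `audio/phrase_002.mp3`, and so on, assuming the `audio` directory already exists.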

Integration with Other Skills

Combine with coze-asr for voice conversation:

# 1. User speaks -> ASR converts to text
bash coze-asr/scripts/speech_to_text.sh input.ogg

# 2. Process text with AI...

# 3. AI response -> TTS converts to speech
bash coze-tts/scripts/text_to_speech.sh "AI的回复" -o response.mp3

Troubleshooting

Authentication Error:

Check COZE_API_KEY is set correctly
Verify API key has TTS permissions

Invalid Voice ID:

Voice ID should be a number (int64 format)
Try voice_id: 1 as default
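A quick local check can catch a non-numeric voice ID before any request is sent. This sketch only verifies that the value consists of decimal digits; it does not check the int64 range or whether the voice exists on the Coze side. The function name `is_valid_voice_id` is hypothetical.

```shell
# Hypothetical check: accept only non-empty strings of decimal digits.
is_valid_voice_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
        *) return 0 ;;
    esac
}
```

For example: `is_valid_voice_id "$voice" || echo "voice ID must be numeric" >&2`.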

File Not Created:

Check write permissions in output directory
Ensure sufficient disk space

Limitations

Text length limits apply (check Coze documentation)
Rate limits may apply based on your plan
Some voices may not support all output formats
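If rate limits are a concern, one common approach is to retry with an increasing delay. The wrapper below is a generic sketch, not something the skill provides. It assumes the wrapped command exits non-zero on failure (whether the script does so on API errors is not stated here), and it retries on any failure, not only on rate limiting. The name `retry_with_backoff` is hypothetical.

```shell
# Hypothetical wrapper: retry a command, doubling the delay between attempts.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
    local max_attempts delay attempt
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}
```

For example: `retry_with_backoff 3 2 bash scripts/text_to_speech.sh "你好" -o hello.mp3` would wait 2 and then 4 seconds between its three attempts.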

API Reference

Endpoint: POST https://api.coze.cn/v1/audio/speech
Authentication: Bearer token (COZE_API_KEY)
Content-Type: application/json
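To make the reference above concrete, here is a rough sketch of what a direct request could look like with jq and curl. The endpoint, bearer authentication, and content type come from this README; the JSON field names (`input`, `voice_id`, `response_format`) are assumptions that should be checked against the Coze API documentation, and the function names are hypothetical. This is not necessarily how the bundled script builds its request.

```shell
# Sketch only: the JSON field names (input, voice_id, response_format) are
# assumptions, not confirmed by this README; check the Coze API documentation.
build_speech_payload() {
    # jq --arg escapes quotes, newlines, and non-ASCII text safely.
    jq -n --arg input "$1" --arg voice "$2" --arg format "${3:-mp3}" \
        '{input: $input, voice_id: $voice, response_format: $format}'
}

# Hypothetical wrapper: POST the payload and write the response body to a file.
request_speech() {
    local text voice format output
    text=$1
    voice=$2
    format=$3
    output=$4
    build_speech_payload "$text" "$voice" "$format" |
        curl --fail --silent --show-error \
            -X POST "https://api.coze.cn/v1/audio/speech" \
            -H "Authorization: Bearer ${COZE_API_KEY}" \
            -H "Content-Type: application/json" \
            --data-binary @- \
            --output "$output"
}
```

Building the body with jq instead of string concatenation avoids broken JSON when the text contains quotes or line breaks.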

Required Environment Variables

Variable	Description	Required
`COZE_API_KEY`	Coze API authentication key	Yes

Required Tools

Tool	Purpose	Required
`jq`	JSON processing	Yes
`ffprobe`	Audio duration detection	Optional
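A small preflight check can report missing tools before the script is run. The sketch below is not part of the skill and the function name `missing_tools` is hypothetical. Besides jq from the table above, the capability analysis on this page reports that the script also uses curl, md5sum, stat, bc, and date, so those are included in the usage example.

```shell
# Hypothetical preflight: print each named tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    local tool missing
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: `missing_tools jq curl md5sum stat bc date || echo "install the tools listed above" >&2`.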

License

MIT

Usage Guidance

This skill is coherent with its TTS purpose, but review the following before installing:

1. Confirm the API endpoint (https://api.coze.cn) and that you trust Coze with your text and API key: the script sends the text you provide to that external service.
2. Note the mismatch between the documented default voice (1) and the script's VOICE_ID=6; test and adjust the default if needed.
3. The version in _meta.json differs from the registry metadata; this is likely packaging drift but worth noting.
4. The skill declares only jq as required, but the script also expects curl, md5sum, stat, and bc (and optionally ffprobe); ensure those tools exist on your system.
5. Limit the scope of the COZE_API_KEY (least privilege, appropriate plan) and do not expose it publicly.

If any of these points worry you, or you need the script to behave differently, inspect or modify the script locally before use.

Capability Analysis

Type: OpenClaw Skill
Name: coze-tts
Version: 1.0.3

The skill is a legitimate Text-to-Speech utility that interfaces with the official Coze API (api.coze.cn). The primary script, scripts/text_to_speech.sh, uses standard tools like curl and jq to send user-provided text to the API and save the resulting audio file, with no evidence of data exfiltration, malicious execution, or prompt injection.

Capability Assessment

ℹ Purpose & Capability

Name/description align with the files and behavior: the script posts text to https://api.coze.cn/v1/audio/speech and saves audio. However there are minor inconsistencies: SKILL.md and references state default voice_id is 1, while the script sets VOICE_ID=6 and help text claims default 1. _meta.json version (1.0.2) differs from registry metadata (1.0.3). These look like packaging/documentation drift, not maliciousness.

✓ Instruction Scope

SKILL.md instructs running the included shell script and only documents use of COZE_API_KEY and jq; the script's runtime actions are confined to building JSON, calling the documented Coze API endpoint, writing an audio file locally, and optionally using ffprobe. It does not attempt to read unrelated system files or other env vars.

✓ Install Mechanism

This is an instruction-only skill with a shipped shell script and no install spec or remote downloads. Nothing is pulled from arbitrary URLs or executed during install.

ℹ Credentials

The only required env var is COZE_API_KEY, which is appropriate for calling the Coze service. One minor proportionality issue: the required-binaries list names only jq, but the script also uses common utilities (curl, md5sum, stat, bc, date, and optionally ffprobe). These are typical but should be documented explicitly.

✓ Persistence & Privilege

The skill does not request elevated or persistent platform privileges (always:false). It does not modify other skills or system-wide settings.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install coze-tts
After installation, invoke the skill by name or use /coze-tts
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.3

Changed the default voice to 6

v1.0.2

Fixed script syntax errors; added API reference documentation

v1.0.1

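Improvement: removed the undeclared Feishu integration script and updated the documentation to accurately reflect functionality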

v1.0.0

Initial release: Coze speech synthesis skill with support for mp3/ogg_opus/wav/pcm formats and speech-rate adjustment

Metadata

Slug coze-tts

Version 1.0.3

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 4

Frequently Asked Questions

What is Coze Tts?

Text-to-Speech (TTS) using Coze API. Convert text to natural-sounding speech audio files. Supports multiple voices and output formats (mp3, ogg_opus, wav, pcm). It is an AI Agent Skill for Claude Code / OpenClaw, with 259 downloads so far.

How do I install Coze Tts?

Run "/install coze-tts" in the OpenClaw or Claude Code chat to install it in one step. To use it, you also need COZE_API_KEY set in your environment and the tools listed under Required Tools.

Is Coze Tts free?

Yes, Coze Tts is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Coze Tts support?

Coze Tts is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Coze Tts?

It is built and maintained by xiaofei (@franklu0819-lang); the current version is v1.0.3.
