franklu0819-lang

Coze Tts

by xiaofei · GitHub ↗ · v1.0.3 · MIT-0
cross-platform ✓ Security Clean
Downloads: 259
Stars: 0
Active Installs: 0
Versions: 4
Install in OpenClaw
/install coze-tts
Description
Text-to-Speech (TTS) using Coze API. Convert text to natural-sounding speech audio files. Supports multiple voices and output formats (mp3, ogg_opus, wav, pcm).
README (SKILL.md)

Coze Text-to-Speech (TTS)

Convert text to natural-sounding speech using Coze API.

Setup

1. Get an API key from the Coze Platform.

2. Set it in your environment:

export COZE_API_KEY="your-key-here"
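
Before running the script, it can help to confirm that the variable is actually visible to the current shell. The following is a minimal sketch; the function name `check_coze_api_key` is hypothetical and not part of the skill.

```shell
# Hypothetical helper, not part of the skill: succeeds only when
# COZE_API_KEY is set to a non-empty value in the current environment.
check_coze_api_key() {
    if [ -z "${COZE_API_KEY:-}" ]; then
        echo "COZE_API_KEY is not set; export it before running the script" >&2
        return 1
    fi
    return 0
}
```

It could be chained in front of the script, for example `check_coze_api_key && bash scripts/text_to_speech.sh "你好"`.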

Supported Output Formats

  • MP3 - Default format, widely compatible
  • OGG_OPUS - Optimized for streaming and messaging
  • WAV - Uncompressed audio
  • PCM - Raw audio data
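
When choosing an output file name by hand, the extension should match the requested format. A small sketch of one way to map the four format names above to extensions; the helper name is hypothetical, and the `ogg_opus` to `.ogg` mapping is an assumption based on the `message.ogg` example in this README.

```shell
# Hypothetical helper, not part of the skill: map a supported format name
# to a file extension. The ogg_opus -> ogg mapping is an assumption taken
# from the message.ogg example in this README.
extension_for_format() {
    case "$1" in
        mp3)      echo "mp3" ;;
        ogg_opus) echo "ogg" ;;
        wav)      echo "wav" ;;
        pcm)      echo "pcm" ;;
        *)        echo "unsupported format: $1" >&2; return 1 ;;
    esac
}
```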

Usage

Basic TTS

Convert text to speech with default settings:

bash scripts/text_to_speech.sh "你好,这是测试语音"

Save to Specific File

bash scripts/text_to_speech.sh "你好世界" -o output.mp3

Use Different Voice

bash scripts/text_to_speech.sh "你好" -v 2

Change Output Format

bash scripts/text_to_speech.sh "你好" -f ogg_opus

Full Options

bash scripts/text_to_speech.sh "要转换的文本" -o output.mp3 -v 1 -f mp3

Parameters:

  • text (required): Text to convert to speech
  • -o, --output (optional): Output file path (default: auto-generated)
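  • -v, --voice (optional): Voice ID (documented default: 1; the registry review reports that the v1.0.3 script sets VOICE_ID=6, so pass -v explicitly if the voice matters)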
  • -f, --format (optional): Output format - mp3/ogg_opus/wav/pcm (default: mp3)

Output

The script saves the audio file and outputs:

  • File path
  • File size
  • Audio duration (if ffprobe is available)

Example output:

✓ Audio saved: coze_tts_20260324_235030_a1b2c3d4.mp3
  Size: 25.3 KB
  Duration: ~3 seconds
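
The size and duration lines above could be produced along the following lines. This is a sketch under assumptions, not the script's actual implementation: both function names are hypothetical, and the duration lookup falls back to "unknown" when ffprobe is missing or cannot read the file.

```shell
# Hypothetical helpers, not taken from the skill's script.

# Print a byte count as kilobytes with one decimal place.
human_size() {
    LC_ALL=C awk -v bytes="$1" 'BEGIN { printf "%.1f KB\n", bytes / 1024 }'
}

# Print the duration in seconds reported by ffprobe, or "unknown" when
# ffprobe is not installed or the file cannot be read.
audio_duration() {
    if command -v ffprobe >/dev/null 2>&1 &&
       secs=$(ffprobe -v error -show_entries format=duration \
                  -of default=noprint_wrappers=1:nokey=1 "$1" 2>/dev/null) &&
       [ -n "$secs" ]; then
        printf '%s\n' "$secs"
    else
        printf 'unknown\n'
    fi
}
```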

Workflow Examples

Generate Notification Audio

bash scripts/text_to_speech.sh "您有一条新消息" -o notification.mp3

Create Voice Greeting

bash scripts/text_to_speech.sh "欢迎使用 Coze 语音服务" -v 2 -o greeting.mp3

Generate OGG for Messaging

bash scripts/text_to_speech.sh "你好" -f ogg_opus -o message.ogg

Batch Generate

for text in "你好" "谢谢" "再见"; do
    bash scripts/text_to_speech.sh "$text" -o "${text}.mp3"
done

Integration with Other Skills

Combine with coze-asr for voice conversation:

# 1. User speaks -> ASR converts to text
bash coze-asr/scripts/speech_to_text.sh input.ogg

# 2. Process text with AI...

# 3. AI response -> TTS converts to speech
bash coze-tts/scripts/text_to_speech.sh "AI的回复" -o response.mp3

Troubleshooting

Authentication Error:

  • Check COZE_API_KEY is set correctly
  • Verify API key has TTS permissions

Invalid Voice ID:

  • Voice ID should be a number (int64 format)
  • Try voice ID 1 (-v 1), the documented default, to rule out an unsupported voice
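
A voice ID can be checked for the expected numeric shape before the script is called. A minimal sketch with a hypothetical function name; it only checks that the value consists of digits and does not verify the int64 range or that the voice exists on the Coze side.

```shell
# Hypothetical helper, not part of the skill: accept only non-empty
# strings made entirely of decimal digits.
is_valid_voice_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}
```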

File Not Created:

  • Check write permissions in output directory
  • Ensure sufficient disk space
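
The write-permission check can be done up front. The sketch below uses a hypothetical function name and tests only that the target directory exists and is writable; free disk space would need a separate check, for example with `df`.

```shell
# Hypothetical helper, not part of the skill: succeed when the directory
# that would contain the output file exists and is writable.
can_write_output() {
    out_dir=$(dirname -- "$1")
    [ -d "$out_dir" ] && [ -w "$out_dir" ]
}
```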

Limitations

  • Text length limits apply (check Coze documentation)
  • Rate limits may apply based on your plan
  • Some voices may not support all output formats

API Reference

  • Endpoint: POST https://api.coze.cn/v1/audio/speech
  • Authentication: Bearer token (COZE_API_KEY)
  • Content-Type: application/json
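
For reference, a direct call to this endpoint might look like the sketch below. It is not the skill's script. The JSON field names `input` and `response_format` are assumptions not stated in this README (`voice_id` appears under Troubleshooting), as is sending the voice ID as a string, so check the Coze API documentation before relying on them. Both function names are hypothetical.

```shell
# Sketch only. Assumed request fields: input, voice_id, response_format.

# Build the JSON request body; jq takes care of escaping the text.
# $1 = text, $2 = voice ID, $3 = output format
build_speech_payload() {
    jq -n --arg input "$1" --arg voice_id "$2" --arg fmt "$3" \
        '{input: $input, voice_id: $voice_id, response_format: $fmt}'
}

# POST the payload and write the response body to a file.
# $1 = text, $2 = voice ID, $3 = output format, $4 = output file
request_speech() {
    build_speech_payload "$1" "$2" "$3" |
        curl --fail --silent --show-error \
            -X POST "https://api.coze.cn/v1/audio/speech" \
            -H "Authorization: Bearer ${COZE_API_KEY}" \
            -H "Content-Type: application/json" \
            --data-binary @- \
            --output "$4"
}
```

Separating payload construction from the HTTP call makes it possible to inspect the JSON without sending anything to the service.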

Required Environment Variables

Variable | Description | Required
COZE_API_KEY | Coze API authentication key | Yes

Required Tools

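Tool | Purpose | Required
jq | JSON processing | Yes
ffprobe | Audio duration detection | Optional

The registry review of v1.0.3 reports that the script also calls curl, md5sum, stat, bc, and date, which are not declared here.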
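
A quick way to see which tools are missing before the first run is sketched below. The function name is hypothetical; the tool list in the usage line combines jq from the table above with the additional utilities the registry review says the script uses.

```shell
# Hypothetical helper, not part of the skill: print the name of every
# listed tool that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, `missing_tools jq curl md5sum stat bc` prints nothing when all of those tools are installed.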

License

MIT

Usage Guidance
This skill is coherent with its TTS purpose, but review the following before installing:
  1. Confirm the API endpoint (https://api.coze.cn) and that you trust Coze with your API key; the script sends the text you provide to that external service.
  2. Note the mismatch between the documented default voice (1) and the script's VOICE_ID=6; test and adjust the default if needed.
  3. The version in _meta.json differs from the registry metadata; this is likely packaging drift but worth noting.
  4. The skill declares jq as required, but the script also expects curl, md5sum, stat, and bc (and optionally ffprobe); ensure those tools exist on your system.
  5. Limit the scope of the COZE_API_KEY (use least privilege and an appropriate plan) and do not expose it publicly.
If any of these points worry you, or you need the script to behave differently, inspect or modify the script locally before use.
Capability Analysis
Type: OpenClaw Skill
Name: coze-tts
Version: 1.0.3
The skill is a legitimate Text-to-Speech utility that interfaces with the official Coze API (api.coze.cn). The primary script, scripts/text_to_speech.sh, uses standard tools like curl and jq to send user-provided text to the API and save the resulting audio file, with no evidence of data exfiltration, malicious execution, or prompt injection.
Capability Assessment
Purpose & Capability
Name/description align with the files and behavior: the script posts text to https://api.coze.cn/v1/audio/speech and saves audio. However there are minor inconsistencies: SKILL.md and references state default voice_id is 1, while the script sets VOICE_ID=6 and help text claims default 1. _meta.json version (1.0.2) differs from registry metadata (1.0.3). These look like packaging/documentation drift, not maliciousness.
Instruction Scope
SKILL.md instructs running the included shell script and only documents use of COZE_API_KEY and jq; the script's runtime actions are confined to building JSON, calling the documented Coze API endpoint, writing an audio file locally, and optionally using ffprobe. It does not attempt to read unrelated system files or other env vars.
Install Mechanism
This is an instruction-only skill with a shipped shell script and no install spec or remote downloads. Nothing is pulled from arbitrary URLs or executed during install.
Credentials
The only required env var is COZE_API_KEY, which is appropriate for calling the Coze service. One minor proportionality issue: the required-binaries list names only jq, but the script also uses common utilities (curl, md5sum, stat, bc, date, and optionally ffprobe). These are typical but should be documented explicitly.
Persistence & Privilege
The skill does not request elevated or persistent platform privileges (always:false). It does not modify other skills or system-wide settings.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install coze-tts
  3. After installation, invoke the skill by name or use /coze-tts
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
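v1.0.3
Changed the default voice to 6
v1.0.2
Fixed a script syntax error; added API reference documentation
v1.0.1
Improvement: removed the undeclared Feishu integration script; updated the documentation to accurately reflect functionality
v1.0.0
Initial release: Coze speech synthesis skill supporting mp3/ogg_opus/wav/pcm formats and speech rate adjustment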
Metadata
Slug coze-tts
Version 1.0.3
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 4
Frequently Asked Questions

What is Coze Tts?

Text-to-Speech (TTS) using Coze API. Convert text to natural-sounding speech audio files. Supports multiple voices and output formats (mp3, ogg_opus, wav, pcm). It is an AI Agent Skill for Claude Code / OpenClaw, with 259 downloads so far.

How do I install Coze Tts?

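Run "/install coze-tts" in the OpenClaw or Claude Code chat to install it in one step. Before first use, set COZE_API_KEY and make sure jq is available, as described in the README's Setup and Required Tools sections.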

Is Coze Tts free?

Yes, Coze Tts is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Coze Tts support?

Coze Tts is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Coze Tts?

It is built and maintained by xiaofei (@franklu0819-lang); the current version is v1.0.3.
