Azure Speech TTS
Use Azure Speech to turn text or SSML into a local audio file under download/.
What this skill does
- Synthesize plain text into speech
- Synthesize full SSML payloads directly
- Choose voice, output format, rate, pitch, style, and role
- Save the result as a local audio file and print a JSON summary
Configuration
This skill uses a small default config file plus environment variables.
Default config file
File: config.json
Default values:
- default_voice: zh-CN-Yunqi:DragonHDOmniLatestNeural
- default_format: mp3
- default_output_dir: download
- default_timeout_seconds: 60
Secret values
Set these in the local shell environment:
- AZURE_SPEECH_KEY
- AZURE_SPEECH_REGION
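A minimal sketch of how a caller might check that both secrets are present before attempting synthesis. The helper name load_credentials is hypothetical and is not taken from scripts/azure_tts.py; only the two variable names come from this document.

```python
import os

# The two secret variables named in this document.
REQUIRED_VARS = ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION")


def load_credentials(env=None):
    """Return (key, region), or raise if either variable is missing or empty.

    Hypothetical helper for illustration; the real script may behave differently.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AZURE_SPEECH_KEY"], env["AZURE_SPEECH_REGION"]
```

Passing a plain dict instead of os.environ keeps the check easy to exercise without touching the real shell environment.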
Optional environment overrides
- AZURE_SPEECH_VOICE
- AZURE_SPEECH_FORMAT
Precedence
Use this order:
- CLI flag
- Environment variable
- config.json
- Built-in fallback
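The precedence order above can be sketched as a single lookup function. This is an illustration under stated assumptions, not the script's actual code: resolve_setting, ENV_NAMES, and BUILTIN_DEFAULTS are hypothetical names, and the built-in fallbacks simply mirror the default values listed in this document.

```python
import os

# Assumed built-in fallbacks, mirroring the defaults listed in this document.
BUILTIN_DEFAULTS = {"voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural", "format": "mp3"}
# Environment overrides named in this document.
ENV_NAMES = {"voice": "AZURE_SPEECH_VOICE", "format": "AZURE_SPEECH_FORMAT"}


def resolve_setting(name, cli_value=None, env=None, config=None):
    """Resolve a setting: CLI flag, then environment, then config.json, then fallback."""
    env = os.environ if env is None else env
    config = {} if config is None else config
    if cli_value:                      # 1. CLI flag
        return cli_value
    env_value = env.get(ENV_NAMES[name])
    if env_value:                      # 2. Environment variable
        return env_value
    config_value = config.get("default_" + name)
    if config_value:                   # 3. config.json (already parsed into a dict)
        return config_value
    return BUILTIN_DEFAULTS[name]      # 4. Built-in fallback
```

For example, resolve_setting("format", env={"AZURE_SPEECH_FORMAT": "wav"}, config={"default_format": "ogg"}) yields "wav", because the environment variable outranks config.json.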
Quick start
python3 scripts/azure_tts.py \
--text "你好,这是一段测试语音。" \
--voice zh-CN-Yunqi:DragonHDOmniLatestNeural \
--format mp3 \
--output download/test.mp3
For SSML:
python3 scripts/azure_tts.py \
--ssml-file temp/input.ssml \
--format wav \
--output download/test.wav
Workflow
- Decide whether the input is plain text or full SSML.
- Use --text / --text-file for normal narration.
- Use --ssml / --ssml-file only when the payload already contains a complete <speak> document.
- Pick the voice and output format, or let config.json supply the defaults.
- Run scripts/azure_tts.py.
- Return the generated audio path to the user.
Rules
- Prefer plain text unless the user needs pauses, emphasis, multi-voice content, or expressive styling.
- --ssml input must include a full <speak> root element.
- Default voice is zh-CN-Yunqi:DragonHDOmniLatestNeural if nothing else is set.
- Default output folder is download/.
- If the user does not specify format, use the default MP3 output.
- Do not put secrets in config.json.
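One way to decide between plain-text and SSML mode is to test whether the payload parses as XML with a speak root element. The sketch below uses only the Python standard library; has_speak_root is a hypothetical helper, and the script's own validation may differ.

```python
import xml.etree.ElementTree as ET


def has_speak_root(payload):
    """Return True if payload is well-formed XML whose root element is <speak>."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        # Plain text or malformed markup: treat it as not being full SSML.
        return False
    # ElementTree reports namespaced tags as "{namespace}speak"; keep the local name.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name == "speak"
```

A payload that fails this check is a candidate for --text rather than --ssml.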
Common formats
See references/azure-speech-cheatsheet.md for the format map and examples.
Short aliases supported by the script:
- mp3
- wav
- pcm
- ogg
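A short alias most likely expands to a full Azure output format name. In the sketch below, only the mp3 entry is confirmed by the sample JSON output in this document. The wav, pcm, and ogg entries are assumptions that use real Azure format names but may not match the script's map. Consult references/azure-speech-cheatsheet.md for the authoritative values.

```python
# Hypothetical alias table; only "mp3" is confirmed by this document's sample output.
FORMAT_ALIASES = {
    "mp3": "audio-24khz-48kbitrate-mono-mp3",  # shown in the sample JSON summary
    "wav": "riff-24khz-16bit-mono-pcm",        # assumption
    "pcm": "raw-24khz-16bit-mono-pcm",         # assumption
    "ogg": "ogg-24khz-16bit-mono-opus",        # assumption
}


def resolve_format(name):
    """Expand a short alias; pass any other value through unchanged."""
    return FORMAT_ALIASES.get(name.lower(), name)
```

Passing unknown values through unchanged lets a caller supply a full Azure format name directly.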
Useful options
- --voice: Azure voice name, for example en-US-AriaNeural
- --language: SSML xml:lang for plain-text mode
- --rate: speaking rate, for example +10%
- --pitch: pitch adjustment, for example +2st
- --style: expressive style such as cheerful, sad, chat
- --style-degree: strength of the expressive style
- --role: voice role when supported
- --save-ssml: write the generated SSML to a file for inspection
- --dry-run: print the generated SSML without calling Azure
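To show how these options plausibly map onto SSML in plain-text mode, here is a rough sketch. The function build_ssml and its parameter names are hypothetical, and the markup the script actually generates may differ; use --dry-run or --save-ssml to inspect the real output. The element and attribute names (prosody, mstts:express-as, styledegree, role) follow Azure's SSML conventions.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, language="zh-CN", rate=None, pitch=None,
               style=None, style_degree=None, role=None):
    """Wrap plain text in a complete <speak> document (illustrative only)."""
    body = escape(text)  # escape &, <, > so the text cannot break the XML

    # --rate / --pitch become attributes of a <prosody> element.
    prosody = [(k, v) for k, v in (("rate", rate), ("pitch", pitch)) if v]
    if prosody:
        attrs = " ".join(f"{k}={quoteattr(str(v))}" for k, v in prosody)
        body = f"<prosody {attrs}>{body}</prosody>"

    # --style / --style-degree / --role become an <mstts:express-as> wrapper.
    if style or role:
        express = (("style", style), ("styledegree", style_degree), ("role", role))
        attrs = " ".join(f"{k}={quoteattr(str(v))}" for k, v in express if v)
        body = f"<mstts:express-as {attrs}>{body}</mstts:express-as>"

    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )
```

The prosody element is nested inside express-as rather than the reverse, since Azure's SSML structure allows prosody as a child of mstts:express-as.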
Output
The helper script writes the audio file and prints JSON like:
{
"ok": true,
"output_path": "download/test.mp3",
"format": "audio-24khz-48kbitrate-mono-mp3",
"voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
"language": "zh-CN",
"bytes": 123456
}
Use the printed output_path as the deliverable path.
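A caller that captures the script's standard output could extract the deliverable path along these lines. The helper deliverable_path is hypothetical, and treating a false ok value as a failure is an assumption, since this document only shows the success case.

```python
import json


def deliverable_path(stdout_text):
    """Parse the JSON summary and return its output_path."""
    summary = json.loads(stdout_text)
    # Assumption: a failed run reports "ok": false in the same JSON shape.
    if not summary.get("ok"):
        raise RuntimeError("Synthesis failed: " + json.dumps(summary))
    return summary["output_path"]


sample = '{"ok": true, "output_path": "download/test.mp3", "bytes": 123456}'
print(deliverable_path(sample))  # download/test.mp3
```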
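- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install azure-speech-tts
- Once installation completes, invoke the Skill by name or trigger it with /azure-speech-tts.
- Provide the required inputs as described in the Skill's parameter documentation to receive structured output.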
What is Azure Speech Tts?
Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use when the user asks to use Azure Speech / Azure TTS / Microso... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 165 times to date.
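How do I install Azure Speech Tts?
Run the command "/install azure-speech-tts" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no additional configuration; synthesis still requires AZURE_SPEECH_KEY and AZURE_SPEECH_REGION in the shell environment.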
Is Azure Speech Tts free?
Yes. Azure Speech Tts is completely free and is released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Azure Speech Tts support?
Azure Speech Tts is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Azure Speech Tts?
It is developed and maintained by conanwhf (@conanwhf). The current version is v1.0.2.