
Azure Speech Tts

Name: Azure Speech Tts
Author: conanwhf

By conanwhf · GitHub · v1.0.2 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 165
Current installs: 1
Versions: 3

Install in OpenClaw

/install azure-speech-tts

Description

Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use when the user asks to use Azure Speech / Azure TTS / Microso...

Usage (SKILL.md)

Azure Speech TTS

Use Azure Speech to turn text or SSML into a local audio file under download/.

What this skill does

Synthesize plain text into speech
Synthesize full SSML payloads directly
Choose voice, output format, rate, pitch, style, and role
Save the result as a local audio file and print a JSON summary

Configuration

This skill uses a small default config file plus environment variables.

Default config file

File:

config.json

Default values:

default_voice: zh-CN-Yunqi:DragonHDOmniLatestNeural
default_format: mp3
default_output_dir: download
default_timeout_seconds: 60

Secret values

Set these in the local shell environment:

AZURE_SPEECH_KEY
AZURE_SPEECH_REGION

Optional environment overrides

AZURE_SPEECH_VOICE
AZURE_SPEECH_FORMAT

Precedence

Use this order:

CLI flag
Environment variable
config.json
Built-in fallback
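
The precedence order above can be sketched as a small lookup. This is a minimal illustration, not the code in scripts/azure_tts.py: the function names resolve_setting and load_config are hypothetical, and the key naming (AZURE_SPEECH_<NAME> and default_<name>) is inferred from the variables and config keys listed in this document.

```python
import json
import os
from pathlib import Path

# Built-in fallbacks, copied from the default values listed in this document.
FALLBACKS = {"voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural", "format": "mp3"}


def load_config(path="config.json"):
    """Read config.json if it exists; a missing file yields an empty dict."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.is_file() else {}


def resolve_setting(name, cli_value=None, env=None, config=None):
    """Return the first value found: CLI flag, env var, config.json, fallback."""
    env = os.environ if env is None else env
    config = config or {}
    candidates = (
        cli_value,                                 # 1. CLI flag
        env.get(f"AZURE_SPEECH_{name.upper()}"),   # 2. environment variable
        config.get(f"default_{name}"),             # 3. config.json
        FALLBACKS.get(name),                       # 4. built-in fallback
    )
    for value in candidates:
        if value:  # skip None and empty strings
            return value
    return None
```

For example, resolve_setting("format", cli_value=None, config=load_config()) falls through to AZURE_SPEECH_FORMAT, then to default_format, then to mp3.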

Quick start

python3 scripts/azure_tts.py \
  --text "你好，这是一段测试语音。" \
  --voice zh-CN-Yunqi:DragonHDOmniLatestNeural \
  --format mp3 \
  --output download/test.mp3

For SSML:

python3 scripts/azure_tts.py \
  --ssml-file temp/input.ssml \
  --format wav \
  --output download/test.wav

Workflow

Decide whether the input is plain text or full SSML.
Use --text / --text-file for normal narration.
Use --ssml / --ssml-file only when the payload already contains a complete <speak> document.
Pick the voice and output format, or let config.json supply the defaults.
Run scripts/azure_tts.py.
Return the generated audio path to the user.

Rules

Prefer plain text unless the user needs pauses, emphasis, multi-voice content, or expressive styling.
--ssml input must include a full <speak> root element.
Default voice is zh-CN-Yunqi:DragonHDOmniLatestNeural if nothing else is set.
Default output folder is download/.
If the user does not specify format, use the default MP3 output.
Do not put secrets in config.json.
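
One way to check the <speak> rule before choosing --ssml over --text is to parse the payload and inspect its root element. The helper below is a hypothetical sketch (has_speak_root is not part of the skill), and it assumes the payload declares any namespace prefixes it uses, as well-formed SSML does.

```python
import xml.etree.ElementTree as ET


def has_speak_root(payload: str) -> bool:
    """Return True if payload parses as XML and its root element is <speak>."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return False
    # ElementTree reports namespaced tags as "{namespace}speak".
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name == "speak"
```

Input that fails this check is better passed through --text, so the script can wrap it in SSML itself.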

Common formats

See references/azure-speech-cheatsheet.md for the format map and examples.

Short aliases supported by the script:

mp3
wav
pcm
ogg
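
The aliases presumably expand to full Azure output format names. In the sketch below only the mp3 entry is confirmed by the sample JSON output in this document; the wav, pcm, and ogg values are assumptions picked from Azure's published format names, and the authoritative map is the one in references/azure-speech-cheatsheet.md. The name resolve_format is hypothetical.

```python
# Only the mp3 entry is confirmed by this document's sample output.
# The other three values are assumed, not taken from the script.
FORMAT_ALIASES = {
    "mp3": "audio-24khz-48kbitrate-mono-mp3",
    "wav": "riff-24khz-16bit-mono-pcm",    # assumption
    "pcm": "raw-24khz-16bit-mono-pcm",     # assumption
    "ogg": "ogg-24khz-16bit-mono-opus",    # assumption
}


def resolve_format(name: str) -> str:
    """Expand a short alias; pass full Azure format names through unchanged."""
    return FORMAT_ALIASES.get(name.strip().lower(), name)
```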

Useful options

--voice: Azure voice name, for example en-US-AriaNeural
--language: SSML xml:lang for plain-text mode
--rate: speaking rate, for example +10%
--pitch: pitch adjustment, for example +2st
--style: expressive style such as cheerful, sad, chat
--style-degree: strength of the expressive style
--role: voice role when supported
--save-ssml: write the generated SSML to a file for inspection
--dry-run: print the generated SSML without calling Azure
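
To show how these options could map onto SSML in plain-text mode, here is a rough sketch. It is not the script's actual builder: build_ssml and its parameters are hypothetical, and the element layout (prosody nested inside mstts:express-as inside voice) is one common arrangement for Azure SSML. Running the real script with --dry-run shows what it actually generates.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, language="en-US", rate=None, pitch=None,
               style=None, style_degree=None, role=None):
    """Wrap plain text in a <speak> document, escaping all user input."""
    body = escape(text)  # turns &, <, > into entities
    if rate or pitch:
        attrs = "".join(
            f" {key}={quoteattr(value)}"
            for key, value in (("rate", rate), ("pitch", pitch)) if value
        )
        body = f"<prosody{attrs}>{body}</prosody>"
    if style or role:
        express = (("style", style), ("styledegree", style_degree), ("role", role))
        attrs = "".join(
            f" {key}={quoteattr(str(value))}"
            for key, value in express if value is not None
        )
        body = f"<mstts:express-as{attrs}>{body}</mstts:express-as>"
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )
```

Escaping the text and quoting every attribute value keeps characters such as & or < in user input from breaking the document.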

Output

The helper script writes the audio file and prints JSON like:

{
  "ok": true,
  "output_path": "download/test.mp3",
  "format": "audio-24khz-48kbitrate-mono-mp3",
  "voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
  "language": "zh-CN",
  "bytes": 123456
}

Use the printed output_path as the deliverable path.
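
A caller might consume that summary as follows. This sketch assumes the script prints only the JSON object on stdout and that a failed run would report "ok": false; neither detail is confirmed by this document, and extract_output_path is a hypothetical helper.

```python
import json


def extract_output_path(stdout: str) -> str:
    """Parse the script's JSON summary and return the audio file path."""
    summary = json.loads(stdout)
    if not summary.get("ok"):
        # Assumed failure shape: any summary without a truthy "ok" field.
        raise RuntimeError(f"synthesis failed: {summary}")
    return summary["output_path"]
```

The stdout string would typically come from subprocess.run([...], capture_output=True, text=True).stdout after invoking scripts/azure_tts.py.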

Security recommendations

This skill appears coherent and limited to Azure TTS. Before installing, (1) provide a dedicated Azure Speech key/region with minimal privileges, (2) do not store secrets in config.json (keep them in environment variables as instructed), (3) review or sandbox the included Python script if you plan to run it on sensitive hosts, and (4) avoid feeding SSML that contains or references sensitive data or external URLs you don't trust. If you need higher assurance, run the script in an isolated environment and verify Azure endpoints and network traffic match expectations.

Functional analysis

Type: OpenClaw Skill
Name: azure-speech-tts
Version: 1.0.2

The azure-speech-tts skill is a legitimate implementation for converting text or SSML to audio using Microsoft Azure Speech services. The core logic in scripts/azure_tts.py uses the standard Python library (urllib) to communicate exclusively with official Azure endpoints and includes proper XML escaping to prevent injection within the generated SSML. No evidence of malicious intent, data exfiltration, or suspicious obfuscation was found.

Capability assessment

✓ Purpose & Capability

Name and description match the included helper script and docs. The only secrets the skill asks for (AZURE_SPEECH_KEY and AZURE_SPEECH_REGION) are exactly what an Azure TTS client needs; requested files and paths (config.json, download/) align with generating local audio files.

✓ Instruction Scope

SKILL.md and the script limit actions to reading text/SSML (from CLI, files, or stdin), optionally writing generated SSML, calling Azure STS and TTS endpoints, and writing audio files plus a small JSON summary. There are no instructions to read unrelated system files or transmit arbitrary local data to third parties.
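
For readers who want to verify the network behaviour described above, the sketch below builds (without sending) a request to the regional Azure TTS REST endpoint using stdlib urllib. It is a simplification, not the script's code: it authenticates with the subscription key header directly rather than first fetching a token from the STS endpoint, and build_tts_request and the User-Agent value are hypothetical.

```python
import urllib.request


def build_tts_request(ssml, key, region, output_format):
    """Build (but do not send) a POST request to the regional TTS endpoint."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": "azure-speech-tts-sketch",  # hypothetical client name
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )
```

Sending it would be a matter of urllib.request.urlopen(req, timeout=60) and writing the response bytes to the output file; the 60-second value mirrors default_timeout_seconds.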

✓ Install Mechanism

No install spec is present (instruction-only install behavior). The repository includes a Python script that uses only stdlib urllib/argparse/json/file I/O. No downloads from untrusted URLs, no package manager pulls, and nothing that would execute arbitrary remote code during install.

✓ Credentials

Only AZURE_SPEECH_KEY and AZURE_SPEECH_REGION are used for operation (plus optional AZURE_SPEECH_VOICE/FORMAT). No unrelated secrets, keys, or config paths are requested. The SKILL.md correctly instructs not to put secrets in config.json.

✓ Persistence & Privilege

The skill is not always-enabled and does not request elevated or persistent platform privileges. It does not modify other skills or system-wide settings; it only writes outputs to the configured download/ folder and optionally the save-ssml path.

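How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install azure-speech-tts
Once installed, invoke the skill by name or trigger it with /azure-speech-tts
Provide the inputs described in the skill's parameter documentation to get structured output.

Version history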

v1.0.2

- Added _meta.json metadata file to the repository.
- No changes made to skill functionality or documentation content.

v1.0.1

Add README and polish the public skill docs for Azure Speech TTS.

v1.0.0

Initial public release: Azure Speech TTS skill with config defaults, env-based secrets, and text/SSML synthesis script.

Metadata

Slug: azure-speech-tts

Version: 1.0.2

License: MIT-0

Total installs: 1

Current installs: 1

Versions released: 3

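FAQ

What is Azure Speech Tts?

Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use when the user asks to use Azure Speech / Azure TTS / Microso... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 165 total downloads so far.

How do I install Azure Speech Tts?

Run the command "/install azure-speech-tts" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but the skill requires AZURE_SPEECH_KEY and AZURE_SPEECH_REGION to be set in the shell environment before use.

Is Azure Speech Tts free?

Yes. The Azure Speech Tts skill is free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Azure Speech Tts support?

Azure Speech Tts is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Azure Speech Tts?

It is developed and maintained by conanwhf (@conanwhf). The current version is v1.0.2.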