Total downloads: 113
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install ai-voice-synthesis-claw
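Description
An AI voice-over synthesis expert. It converts copy or scripts into highly realistic speech audio, with support for multiple voices, emotion control, SSML markup, and post-processing. Trigger scenarios: the user says "voice-over" (配音), "speech synthesis" (语音合成), "TTS", "narration" (旁白), "podcast audio" (播客音频), "audiobook" (有声读物), "AI voice-over" (AI配音), "read aloud" (朗读), or "audio generation" (音频生成), or asks to "read this copy in XX voice", "generate podcast audio", "turn the article into an audio version", and so on. Supports...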
Usage instructions (SKILL.md)
智能配音合成虾 (ai-voice-synthesis-claw)
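Turn text into a voice with warmth.
Workflow
Step 1: Understand the request
Collect the following information (use the defaults for anything not provided):
- Text content: the copy or script to be voiced
- Voice style: choose a suitable voice from references/voice-style-guide.md
- Speed: slow / normal (default) / fast
- Emotion: calm / warm / professional / energetic
- Output format: mp3 (default) / wav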
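Step 2: Text preprocessing
Process the text before calling TTS:
- Split into sentences (at punctuation marks)
- Convert numbers to Chinese numerals (100 → 一百)
- Annotate polyphonic characters (for example, the 重 in 重要)
- Add pause markers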
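The first two preprocessing steps can be sketched in Python as follows. This is a minimal illustration, not the logic inside scripts/synthesize-voice.py, which this page does not show. The function names `split_sentences`, `int_to_chinese`, and `numbers_to_chinese` are hypothetical, and the converter assumes plain integers from 0 to 9999; decimals, dates, and phone numbers would need separate handling.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]  # units for the ones, tens, hundreds, thousands places


def split_sentences(text: str) -> list[str]:
    """Split after sentence-ending punctuation, keeping the mark with its sentence."""
    parts = re.split(r"(?<=[。!?;!?])\s*", text)
    return [p for p in parts if p]


def int_to_chinese(n: int) -> str:
    """Convert an integer in [0, 9999] to Chinese numerals (illustrative only)."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only handles 0..9999")
    if n == 0:
        return DIGITS[0]
    digits = [int(c) for c in str(n)]
    out = []
    pending_zero = False
    for i, d in enumerate(digits):
        if d == 0:
            pending_zero = True  # remember the gap; emit 零 only if more digits follow
            continue
        if pending_zero:
            out.append(DIGITS[0])
            pending_zero = False
        out.append(DIGITS[d] + UNITS[len(digits) - 1 - i])
    result = "".join(out)
    # 10..19 are read 十, 十一, ... rather than 一十, 一十一, ...
    if result.startswith("一十"):
        result = result[1:]
    return result


def numbers_to_chinese(text: str) -> str:
    """Replace each digit run with Chinese numerals; leave out-of-range runs as-is."""
    def repl(match: re.Match) -> str:
        value = int(match.group())
        return int_to_chinese(value) if value <= 9999 else match.group()
    return re.sub(r"\d+", repl, text)
```

A call such as `split_sentences(numbers_to_chinese(text))` would then yield sentence-sized units ready for pause markers. Polyphonic-character annotation is left out here because it depends on a pronunciation lexicon that the page does not describe.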
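Step 3: Choose a TTS engine
Pick the first available engine in priority order:
- ElevenLabs (recommended): most natural, supports emotion control; requires ELEVENLABS_API_KEY
- OpenAI TTS: stable quality; requires OPENAI_API_KEY
- Azure TTS: multilingual support; requires AZURE_SPEECH_KEY + AZURE_SPEECH_REGION
- System TTS (fallback): synthesize directly with the tts tool (no API key needed, lower quality)
Check the environment variables to confirm which engines are available: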
echo "ElevenLabs: $ELEVENLABS_API_KEY" && echo "OpenAI: $OPENAI_API_KEY"
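The echo command above prints the key values themselves, which the security review further down this page flags as a leak risk. A hedged alternative is to report only whether each variable is set and to pick the engine from the priority list in this step. The sketch below is an assumption about how that could look; `pick_engine` and `engine_status` are hypothetical names, and according to the review the bundled script does not actually implement Azure.

```python
import os

# Engines in the priority order listed in Step 3.
# An engine counts as available only when all of its variables are non-empty.
ENGINE_ENV_VARS = [
    ("elevenlabs", ["ELEVENLABS_API_KEY"]),
    ("openai", ["OPENAI_API_KEY"]),
    ("azure", ["AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION"]),
]


def engine_status(env=None) -> dict:
    """Map each variable name to True/False without exposing its value."""
    env = os.environ if env is None else env
    return {
        var: bool(env.get(var))
        for _, required in ENGINE_ENV_VARS
        for var in required
    }


def pick_engine(env=None) -> str:
    """Return the first engine whose variables are all set, else the system fallback."""
    env = os.environ if env is None else env
    for name, required in ENGINE_ENV_VARS:
        if all(env.get(var) for var in required):
            return name
    return "system"


if __name__ == "__main__":
    for var, is_set in engine_status().items():
        print(f"{var}: {'set' if is_set else 'missing'}")
    print("selected engine:", pick_engine())
```

Passing a plain dictionary as `env` makes the selection easy to check without touching real credentials.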
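Step 4: Generate SSML (optional, for fine-grained control)
Add SSML markup to the text following references/ssml-guide.md.
Simple cases can skip this step and pass plain text directly.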
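As an illustration of what such markup could look like, the sketch below wraps pre-split sentences in standard SSML `<speak>`, `<prosody>`, and `<break>` elements. The contents of references/ssml-guide.md are not shown on this page, so the helper name `to_ssml`, the 300 ms default pause, and the mapping from the skill's speed names to SSML rate values are all assumptions. Individual engines may also require extra attributes (such as a version or voice element) that are omitted here.

```python
from xml.sax.saxutils import escape

# Assumed mapping from the skill's speed options to SSML prosody rate keywords.
RATE = {"slow": "slow", "normal": "medium", "fast": "fast"}


def to_ssml(sentences, speed="normal", pause_ms=300):
    """Join sentences with fixed pauses inside a single prosody element."""
    if speed not in RATE:
        raise ValueError(f"unknown speed: {speed!r}")
    pause = f'<break time="{int(pause_ms)}ms"/>'
    # Escape &, < and > so the text cannot break the XML structure.
    body = pause.join(escape(s) for s in sentences)
    return f'<speak><prosody rate="{RATE[speed]}">{body}</prosody></speak>'
```

For example, `to_ssml(["你好。", "欢迎收听本期节目。"], speed="slow")` produces one `<speak>` document with a pause between the two sentences.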
Step 5: Call the synthesis script
# Synthesize a single piece of text
python3 scripts/synthesize-voice.py \
--text "你好,欢迎收听本期节目" \
--voice warm-female \
--speed normal \
--output ./output.mp3
# Synthesize from a file
python3 scripts/synthesize-voice.py \
--script ./script.txt \
--voice professional-male \
--speed fast \
--output ./output.mp3
# Add background music
python3 scripts/synthesize-voice.py \
--script ./script.txt \
--bgm ./bgm/light-jazz.mp3 \
--bgm-volume 0.1 \
--output ./output.mp3
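When the script is driven from other Python code, the same invocations can be assembled as an argument list, which avoids shell quoting problems with arbitrary text. The sketch below uses only the flags that appear in the commands above; the wrapper names `build_command` and `synthesize` are hypothetical, and whether the script accepts every combination of these flags is an assumption.

```python
import subprocess


def build_command(output, text=None, script=None, voice=None,
                  speed=None, bgm=None, bgm_volume=None):
    """Build the argv list for scripts/synthesize-voice.py from keyword options."""
    if (text is None) == (script is None):
        raise ValueError("pass exactly one of text or script")
    cmd = ["python3", "scripts/synthesize-voice.py"]
    cmd += ["--text", text] if text is not None else ["--script", script]
    if voice is not None:
        cmd += ["--voice", voice]
    if speed is not None:
        cmd += ["--speed", speed]
    if bgm is not None:
        cmd += ["--bgm", bgm]
    if bgm_volume is not None:
        cmd += ["--bgm-volume", str(bgm_volume)]
    cmd += ["--output", output]
    return cmd


def synthesize(**options):
    """Run the script and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_command(**options), check=True)
```

For instance, `synthesize(output="./output.mp3", script="./script.txt", voice="professional-male", speed="fast")` mirrors the second command above.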
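Step 6: Post-processing
Following references/audio-processing-guide.md, the script automatically performs:
- Noise reduction
- Loudness normalization (-14 LUFS)
- Background music mixing (optional)
- Format conversion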
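For readers who want to reproduce the loudness step by hand, one common approach is ffmpeg's `loudnorm` filter with an integrated loudness target of -14 LUFS. The sketch below only builds the command; it is not taken from the script, whose actual method is not shown on this page. The true-peak and loudness-range values are assumed defaults for illustration, and `loudnorm_command` is a hypothetical helper name.

```python
def loudnorm_command(src, dst, target_lufs=-14.0, true_peak_db=-1.5, loudness_range=11.0):
    """Build a single-pass ffmpeg loudnorm command (does not execute it)."""
    # The loudnorm filter accepts integrated loudness targets between -70 and -5 LUFS.
    if not -70.0 <= target_lufs <= -5.0:
        raise ValueError("target_lufs must be within [-70, -5]")
    audio_filter = f"loudnorm=I={target_lufs}:TP={true_peak_db}:LRA={loudness_range}"
    # -y overwrites dst if it already exists; the output format follows dst's extension.
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]
```

The returned list can be passed to `subprocess.run(..., check=True)` once ffmpeg is installed. Because the output container is inferred from the file extension, the same call also covers a simple wav to mp3 conversion.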
Step 7: Delivery
Send the generated audio file to the user:
Synthesis complete! Here is your voice-over file.
MEDIA:./output.mp3
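Voice quick reference
| Scenario | Recommended voice |
|---|---|
| Educational / explainer | professional-male / professional-female |
| Emotional stories | warm-female |
| Commercial ads | magnetic-male |
| Light entertainment | young-energetic |
See references/voice-style-guide.md for the full voice library.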
Dependencies
pip install elevenlabs openai pydub requests
brew install ffmpeg # macOS
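Notes
- Keep a single synthesis run to no more than 10 minutes of audio
- Voice cloning needs at least 1 minute of clear sample audio
- Cloning someone else's voice requires their authorization
- Without an API key, fall back to the system tts tool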
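One way to respect the 10-minute guideline is to group sentences into batches whose estimated duration stays under the limit and to synthesize each batch separately. The sketch below estimates duration from character count; the reading rate of 240 characters per minute is purely an assumed placeholder that should be measured for the chosen voice and speed, and `chunk_sentences` is a hypothetical helper.

```python
def chunk_sentences(sentences, max_minutes=10.0, chars_per_minute=240.0):
    """Group sentences into chunks whose estimated duration stays within max_minutes."""
    max_chars = int(max_minutes * chars_per_minute)
    chunks, current, count = [], [], 0
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + len(sentence) > max_chars:
            chunks.append("".join(current))
            current, count = [], 0
        current.append(sentence)
        count += len(sentence)
    if current:
        chunks.append("".join(current))
    # Note: a single sentence longer than max_chars still becomes its own oversized chunk.
    return chunks
```

Each returned chunk can then be written to its own script file and synthesized in a separate run.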
Security recommendations
This skill appears to implement TTS via ElevenLabs and OpenAI, but there are a few red flags you should consider before installing or supplying API keys:
- Metadata vs. reality: The registry metadata lists no required environment variables, but the included script requires ELEVENLABS_API_KEY and OPENAI_API_KEY. Confirm with the author or expect to provide those keys.
- Azure mismatch: SKILL.md mentions Azure credentials, but the script does not implement Azure TTS — ask the maintainer for clarification if you need Azure support.
- Secret exposure: SKILL.md shows examples that echo environment variables (e.g., echo "ElevenLabs: $ELEVENLABS_API_KEY"). Avoid executing such commands in shared or logged environments since they may expose your API keys in logs. Instead, verify keys privately or use secure tooling to manage secrets.
- Dependency installation: The instructions tell you to pip install packages and brew install ffmpeg. Only install these in a trusted/isolated environment (virtualenv/container) to limit risk.
- Voice cloning / copyright: The skill notes voice cloning requires authorization. Do not pass audio samples or use someone else's voice without consent.
Suggested actions before use: inspect the code (you already have synthesize-voice.py), run it in an isolated environment, provide API keys with least-privilege credentials or test keys, and request the publisher update the skill metadata to list required env vars and clarify Azure support.
Functional analysis
Type: OpenClaw Skill
Name: ai-voice-synthesis-claw
Version: 1.0.0
The skill bundle contains an instruction in SKILL.md (Step 3) that directs the AI agent to execute a shell command to 'echo' sensitive environment variables (ELEVENLABS_API_KEY and OPENAI_API_KEY). This behavior risks exposing private API credentials in the agent's output logs or to the end-user. While the Python script 'scripts/synthesize-voice.py' appears to be a legitimate implementation of voice synthesis using ElevenLabs and OpenAI APIs, the explicit instruction to print secrets is a high-risk vulnerability often used for credential harvesting.
Capability assessment
Purpose & Capability
The skill's stated purpose is text-to-speech using ElevenLabs/OpenAI/Azure/system TTS, which matches the included synthesize-voice.py for ElevenLabs and OpenAI. However, the registry metadata declares no required env vars or credentials, while both SKILL.md and the script expect ELEVENLABS_API_KEY and OPENAI_API_KEY (SKILL.md also lists AZURE_SPEECH_KEY and a region, but the script does not implement Azure). The declared requirements and the actual code are therefore inconsistent.
Instruction Scope
SKILL.md provides a clear TTS workflow and example commands that invoke scripts/synthesize-voice.py and post-processing. However the docs demonstrate running echo "ElevenLabs: $ELEVENLABS_API_KEY" which would print API keys to stdout/logs (a potential secret-leak risk). The instructions ask the agent to read script files and write output audio files (expected), and there are no instructions to exfiltrate data to unexpected endpoints. The guide suggests installing dependencies via pip/brew but there is no install spec in the metadata.
Install Mechanism
There is no automated install spec (instruction-only plus a Python script). That is the lower-risk model because nothing is automatically downloaded or executed during install. The SKILL.md suggests pip/brew commands for dependencies, which is expected for a Python-based TTS script but will run arbitrary package installs if followed by a user.
Credentials
The package metadata declares no required environment variables, but the script reads ELEVENLABS_API_KEY and OPENAI_API_KEY from the environment and SKILL.md also references AZURE_SPEECH_* keys. Requiring API keys for the listed TTS services is reasonable, but the omission from metadata is inconsistent and the SKILL.md example of echoing env vars risks exposing secrets. There are no other unnecessary credentials requested.
Persistence & Privilege
The skill does not request privileged persistence (always:false) and does not modify other skills or system-wide configs. It only runs as a normal user CLI script and writes generated audio files to the working directory.
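How to use
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install ai-voice-synthesis-claw
- Once installed, call the Skill by name or trigger it with /ai-voice-synthesis-claw
- Provide the required inputs described in the Skill's parameter notes to get structured output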
Version history
v1.0.0
Initial release: multi-engine voice-over with ElevenLabs/OpenAI TTS, including a voice library, SSML conventions, and a post-processing guide
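FAQ
What is 智能配音合成虾?
An AI voice-over synthesis expert. It converts copy or scripts into highly realistic speech audio, with support for multiple voices, emotion control, SSML markup, and post-processing. Trigger scenarios: the user says "voice-over" (配音), "speech synthesis" (语音合成), "TTS", "narration" (旁白), "podcast audio" (播客音频), "audiobook" (有声读物), "AI voice-over" (AI配音), "read aloud" (朗读), or "audio generation" (音频生成), or asks to "read this copy in XX voice", "generate podcast audio", "turn the article into an audio version", and so on. Supports... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 113 downloads to date.
How do I install 智能配音合成虾?
Run the command "/install ai-voice-synthesis-claw" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.
Is 智能配音合成虾 free?
Yes. 智能配音合成虾 is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.
Which platforms does 智能配音合成虾 support?
智能配音合成虾 is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed 智能配音合成虾?
It is developed and maintained by Ricky (@tujinsama); the current version is v1.0.0.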