豆包语音合成 2.0

Name: 豆包语音合成 2.0
Author: yanmomuyu-sys

by yanmomuyu-sys · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 111

Install in OpenClaw

/install seedtts2

Description

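Doubao Speech Synthesis 2.0 (豆包语音合成 2.0) with emotion control, multiple voices, and voice instructions. 34 voices to choose from, including a JARVIS-style male voice.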

Usage Guidance

Things to check before installing or using this skill:

- The skill's registry metadata does not list any required environment variables, but both SKILL.md and the shipped Python code require VOLCANO_APP_ID and VOLCANO_ACCESS_TOKEN. Don't provide credentials until you have confirmed this is the intended provider and you trust the package source.
- Verify the API endpoint and token format against the official 火山引擎 (Volcano/ByteDance) docs. The code and docs use an unusual 'Bearer; {token}' header format (semicolon + space); confirm that this is required and not a typo that would force users to hand-edit their tokens.
- Inspect the included tts_client.py locally before running it. The code makes network calls to https://openspeech.bytedance.com, writes audio files under ~/.openclaw/tts_output, and reads ~/.openclaw/openclaw.json; make sure you are comfortable with that file access.
- Playback uses subprocess.run with system binaries (afplay/aplay/Windows start). This is typical, but make sure your environment has the expected players. The subprocess calls appear to use list form (no shell injection), but the scan of tts_client.py was partially truncated, so inspect the tail of the file to confirm there is no unexpected shell=True or unescaped string concatenation.
- For least privilege, create a dedicated, minimal-volume API key on the provider side (if supported) and test in a sandboxed account before using production credentials.

Given these inconsistencies and the need to handle credentials, proceed with caution: verify the token/header format and the skill's origin before granting credentials or installing the skill into a production agent.
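The manual inspection for shell=True suggested above can be partly automated. The following is a minimal sketch, not part of the skill: `find_shell_true_calls` is a hypothetical helper that uses Python's standard `ast` module to report every call passing `shell=True` as a literal keyword argument. The sample source is made up for illustration and is not the contents of tts_client.py.

```python
import ast


def find_shell_true_calls(source: str) -> list:
    """Return sorted line numbers of calls that pass shell=True literally."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # kw.arg is None for **kwargs, so those are skipped here.
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


# Made-up sample for illustration only; NOT the real tts_client.py.
SAMPLE = '''
import subprocess
subprocess.run(["afplay", path])
subprocess.run(f"start {path}", shell=True)
'''

if __name__ == "__main__":
    print(find_shell_true_calls(SAMPLE))
```

To check the real file, read it with `pathlib.Path("tts_client.py").read_text(encoding="utf-8")` and pass the text to the helper. A static scan like this misses `shell=True` supplied through a variable or `**kwargs`, so it complements reading the code and does not replace it.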

Capability Analysis

Type: OpenClaw Skill
Name: seedtts2
Version: 1.0.0

The skill provides a functional Python client and CLI for Volcengine's SeedTTS 2.0 API. While the logic is aligned with its stated purpose, tts_client.py contains a shell injection vulnerability in the say_and_play method: on Windows, it uses subprocess.run with shell=True on the output file path, which could allow arbitrary command execution if a user provides a crafted filename. No evidence of intentional malice or unauthorized data exfiltration was found, as API credentials are only sent to the official ByteDance endpoint (openspeech.bytedance.com).
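One way to avoid this class of issue is to never route the file path through a shell. The sketch below is an assumption about how a fix could look, not the author's code: `build_player_command` and `play` are hypothetical names. It passes the path as a single list element on macOS and Linux, and on Windows uses the standard-library `os.startfile`, which opens the file with its associated application without invoking a shell.

```python
import os
import subprocess
import sys
from typing import List, Optional


def build_player_command(path: str, platform: str = sys.platform) -> Optional[List[str]]:
    """Return an argument list for a local player, or None when no CLI player is used."""
    if platform == "darwin":
        return ["afplay", path]
    if platform.startswith("linux"):
        return ["aplay", path]
    return None


def play(path: str) -> None:
    """Play an audio file without passing the path through a shell."""
    cmd = build_player_command(path)
    if cmd is not None:
        # List form: the path is one argv entry, so shell metacharacters are inert.
        subprocess.run(cmd, check=False)
    elif sys.platform == "win32":
        # os.startfile exists only on Windows and does not spawn a shell.
        os.startfile(path)
    else:
        raise RuntimeError(f"no known audio player for platform {sys.platform!r}")
```

With this shape, a crafted filename such as `a.mp3 & calc` is handed to the player as a literal (and probably nonexistent) path instead of being interpreted as a command.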

Capability Assessment

ℹ Purpose & Capability

The name, description, SKILL.md, docs, and code all describe a TTS client calling a 火山引擎/openspeech.bytedance.com API, so requiring an APP_ID and Access Token is consistent with that purpose. However, the skill registry metadata lists no required environment variables or primary credential, which is inconsistent with the shipped code and documentation, both of which require VOLCANO_APP_ID and VOLCANO_ACCESS_TOKEN.

✓ Instruction Scope

Runtime instructions and code are limited to: reading env vars and an OpenClaw JSON config, posting JSON to the documented TTS API endpoint, parsing streamed responses, saving audio files under ~/.openclaw/tts_output, and optionally invoking local players (afplay/aplay/Windows start). The skill does not instruct collection of unrelated system data or sending data to unexpected external endpoints beyond openspeech.bytedance.com. Note: SKILL.md and docs repeatedly recommend placing credentials in ~/.openclaw/openclaw.json which the client will read.
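The credential lookup described above (environment variables plus an OpenClaw JSON config) might look roughly like the sketch below. This is illustrative only: `load_credentials` is a hypothetical function, the precedence (environment first, file as fallback) is an assumption, and so is the top-level `"env"` object in the JSON file, since the listing does not show the actual layout of openclaw.json.

```python
import json
import os
from pathlib import Path
from typing import Dict, Mapping

REQUIRED = ("VOLCANO_APP_ID", "VOLCANO_ACCESS_TOKEN")


def load_credentials(config_path: Path, environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Resolve required credentials; environment variables win over the file."""
    file_values = {}
    if config_path.is_file():
        data = json.loads(config_path.read_text(encoding="utf-8"))
        # ASSUMPTION: credentials live under a top-level "env" object.
        # The real openclaw.json layout is not shown in the listing.
        section = data.get("env", {}) if isinstance(data, dict) else {}
        if isinstance(section, dict):
            file_values = section
    creds = {name: environ.get(name) or file_values.get(name) for name in REQUIRED}
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise RuntimeError("missing credentials: " + ", ".join(missing))
    return creds
```

A caller would pass `Path.home() / ".openclaw" / "openclaw.json"` as `config_path`. Failing fast with the names of the missing variables makes the undeclared requirement visible before any network call is attempted.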

✓ Install Mechanism

There is no install spec (instruction-only skill), and the included files are pure Python with no arbitrary external downloads. This is low risk from an installation-mechanism perspective.

⚠ Credentials

The code and SKILL.md require VOLCANO_APP_ID and VOLCANO_ACCESS_TOKEN (and optionally VOLCANO_RESOURCE_ID / VOLCANO_API_URL), which are proportionate to a TTS client. However, the skill registry metadata declares no required env vars or primary credential, an inconsistency that may mislead users. The docs and code also use an unusual token header format, 'Authorization: Bearer; {token}' (semicolon + space). This is nonstandard and should be verified against the provider's documentation to ensure it is intentional and not a bug that would push users toward manual token workarounds.
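To make the difference concrete, the sketch below builds the header value in both forms. `build_authorization` is a hypothetical helper, not taken from tts_client.py. The "semicolon" style mirrors the 'Bearer; {token}' format reported in the skill's code and docs, while the "standard" style is the usual `Bearer <token>` form from RFC 6750. Which one the provider expects is not settled here and should be confirmed in its documentation.

```python
def build_authorization(token: str, style: str = "semicolon") -> str:
    """Build an Authorization header value from a raw token."""
    token = token.strip()
    if token.lower().startswith("bearer"):
        # Guard against the manual workaround of pasting a pre-prefixed token.
        raise ValueError("pass the raw token without a 'Bearer' prefix")
    if style == "semicolon":
        # Format reported in the skill's code and docs (semicolon + space).
        return f"Bearer; {token}"
    if style == "standard":
        # Conventional form defined by RFC 6750.
        return f"Bearer {token}"
    raise ValueError(f"unknown style: {style!r}")
```

Keeping the prefix logic in one place means a user never has to edit the token itself if the expected format turns out to differ.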

✓ Persistence & Privilege

The skill does write output to ~/.openclaw/tts_output and will read ~/.openclaw/openclaw.json; this is reasonable for a client that stores outputs and reads local configuration. The skill is not marked always:true and does not request system-wide persistent privileges beyond its own config and output directory.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install seedtts2
After installation, invoke the skill by name or use /seedtts2
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release: Doubao SeedTTS 2.0 with 34 voices and emotion control; improved end-marker detection, Session reuse, and timeout configuration.

Metadata

Slug seedtts2

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is 豆包语音合成 2.0?

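豆包语音合成 2.0 (Doubao Speech Synthesis 2.0) supports emotion control, multiple voices, and voice instructions, with 34 voices to choose from, including a JARVIS-style male voice. It is an AI Agent Skill for Claude Code / OpenClaw, with 111 downloads so far.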

How do I install 豆包语音合成 2.0?

Run "/install seedtts2" in the OpenClaw or Claude Code chat to install it. Before first use, supply VOLCANO_APP_ID and VOLCANO_ACCESS_TOKEN, which SKILL.md and the shipped code both require.

Is 豆包语音合成 2.0 free?

Yes, 豆包语音合成 2.0 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 豆包语音合成 2.0 support?

豆包语音合成 2.0 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 豆包语音合成 2.0?

It is built and maintained by yanmomuyu-sys (@yanmomuyu-sys); the current version is v1.0.0.