
IMA AI Text To Speech — seed-tts, DouBao

Author: allenfancy-gan · GitHub · v1.0.8 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 630 · Favorites: 0 · Current installs: 0 · Versions: 9
Install in OpenClaw
/install ima-tts-ai
Description
Convert text, scripts, and captions into natural voiceovers for videos, explainers, product demos, and social posts.
Usage Instructions (SKILL.md)

IMA TTS AI — Text-to-Speech Generator

For complete API documentation, security details, all parameters, speaker list, and Python examples, read SKILL-DETAIL.md.

Model ID Reference (CRITICAL)

| Friendly Name | model_id | Notes |
| --- | --- | --- |
| Seed TTS 2.0 | seed-tts-2.0 | ✅ Default and only supported model |

Sub-models (via extra-params):

  • seed-tts-2.0-expressive — More expressive, emotional (default)
  • seed-tts-2.0-standard — More stable, neutral

When User Says "帮我制作旁白/配音" ("help me make a narration/voiceover")

Must ask first:

| Question | Parameter | Required |
| --- | --- | --- |
| Content/copy to be read aloud | prompt | ✅ Yes |

Recommend asking:

| Question | Parameter | Options |
| --- | --- | --- |
| Voice/speaker | speaker | 魅力苏菲, Vivi, 云舟, 大壹, etc. (see SKILL-DETAIL.md) |

Optional:

| Question | Parameter | Range |
| --- | --- | --- |
| Emotion/mood | audio_params.emotion | neutral, sad, angry |
| Speech rate | audio_params.speech_rate | [-50, 100], 0=normal |
| Volume | audio_params.loudness_rate | [-50, 100], 0=normal |
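
The ranges above can be checked before a request is built. The following is a minimal sketch, assuming that out-of-range values should be rejected rather than clamped (the source does not say which) and that the three emotions listed here are the full set (SKILL-DETAIL.md may list more). `build_audio_params` is a hypothetical helper, not part of the skill's scripts.

```python
# Hypothetical helper; only the audio_params key names come from the table above.
ALLOWED_EMOTIONS = {"neutral", "sad", "angry"}  # assumed to be the complete set
RATE_MIN, RATE_MAX = -50, 100                   # documented range, 0 = normal


def build_audio_params(emotion=None, speech_rate=None, loudness_rate=None):
    """Return an audio_params dict containing only the options that were set."""
    params = {}
    if emotion is not None:
        if emotion not in ALLOWED_EMOTIONS:
            raise ValueError(f"unsupported emotion: {emotion!r}")
        params["emotion"] = emotion
    for key, value in (("speech_rate", speech_rate), ("loudness_rate", loudness_rate)):
        if value is None:
            continue  # option not requested, leave it out of the payload
        if not RATE_MIN <= value <= RATE_MAX:
            raise ValueError(f"{key} must be in [{RATE_MIN}, {RATE_MAX}], got {value}")
        params[key] = value
    return params
```

Leaving unset options out of the dict, instead of sending 0, lets the service apply its own defaults.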

User Input Parsing

| User says | Parameter | Value |
| --- | --- | --- |
| 旁白/配音/朗读 (narration / voiceover / read aloud) | prompt + speaker | Ask for content first |
| 女声/female | speaker | e.g. zh_female_vv_uranus_bigtts |
| 男声/male | speaker | e.g. zh_male_sophie_uranus_bigtts |
| 语速快/慢 (faster/slower) | audio_params.speech_rate | Positive/negative value |
| expressive/standard | model | Sub-model selection |
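
One way to read the mapping above is as a keyword lookup that produces a partial extra-params dict. The sketch below is illustrative only: `parse_request` is a hypothetical function, the speaker IDs are the examples from the table, and the ±20 speech_rate magnitudes are an assumption, since the source only says "positive/negative value".

```python
# Illustrative keyword mapping; not the skill's actual parsing logic.
def parse_request(text):
    """Map a free-form user request to a partial extra-params dict."""
    lowered = text.lower()
    extra = {}
    # Check "female" before "male": the word "female" contains "male".
    if "女声" in text or "female" in lowered:
        extra["speaker"] = "zh_female_vv_uranus_bigtts"
    elif "男声" in text or "male" in lowered:
        extra["speaker"] = "zh_male_sophie_uranus_bigtts"
    if "语速快" in text or "fast" in lowered:
        extra.setdefault("audio_params", {})["speech_rate"] = 20   # assumed magnitude
    elif "语速慢" in text or "slow" in lowered:
        extra.setdefault("audio_params", {})["speech_rate"] = -20  # assumed magnitude
    if "standard" in lowered:
        extra["model"] = "seed-tts-2.0-standard"
    elif "expressive" in lowered:
        extra["model"] = "seed-tts-2.0-expressive"
    return extra
```

A request that matches nothing yields an empty dict, which corresponds to the "ask for content first" row: the agent still has to collect the prompt text.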

Script Usage

# List available TTS models
python3 {baseDir}/scripts/ima_tts_create.py --api-key $IMA_API_KEY --list-models

# Generate speech (default model: seed-tts-2.0)
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "Text to be spoken here." \
  --user-id {user_id} \
  --output-json

# With speaker and emotion
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "阳光青年音色测试,你好世界。" \
  --extra-params '{"model":"seed-tts-2.0-expressive","speaker":"zh_male_sophie_uranus_bigtts","audio_params":{"emotion":"neutral"}}' \
  --user-id {user_id} \
  --output-json
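
When the script is driven from Python rather than a shell, the same invocation can be assembled as an argument list, which avoids quoting the JSON passed to --extra-params. This is a sketch under assumptions: `build_tts_command` is a hypothetical wrapper, only the flags shown above are used, and the format of whatever the script prints with --output-json is not described in this section.

```python
import json
import os


def build_tts_command(base_dir, prompt, user_id, extra_params=None,
                      model_id="seed-tts-2.0", api_key=None):
    """Assemble the argv list for ima_tts_create.py without running it."""
    # Fall back to the IMA_API_KEY environment variable used in the examples above.
    api_key = api_key or os.environ.get("IMA_API_KEY", "")
    argv = [
        "python3", os.path.join(base_dir, "scripts", "ima_tts_create.py"),
        "--api-key", api_key,
        "--model-id", model_id,
        "--prompt", prompt,
        "--user-id", user_id,
        "--output-json",
    ]
    if extra_params:
        # ensure_ascii=False keeps non-ASCII values readable in the argument
        argv += ["--extra-params", json.dumps(extra_params, ensure_ascii=False)]
    return argv
```

The returned list can be handed to `subprocess.run(argv, capture_output=True, text=True)`; because no shell is involved, the prompt text needs no escaping.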

Sending Results to User

# ✅ CORRECT: Use remote URL directly
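# Caption is user-facing Chinese: "Speech synthesis succeeded! Model / Time taken / Credits / Original link"
message(
    action="send",
    media=audio_url,
    caption="✅ 语音合成成功!\n"
            "• 模型:[Name]\n"
            "• 耗时:[X]s\n"
            "• 积分:[N pts]\n"
            "\n"
            "🔗 原始链接:[url]",
)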

# ❌ WRONG: Never download to local file

UX Protocol (Brief)

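  1. Pre-generation: "🔊 开始语音合成… 模型:[Name],预计[X~Y]秒,消耗[N]积分" ("Starting speech synthesis… Model: [Name], estimated [X~Y] seconds, costs [N] credits")
  2. Progress: Every 10-15s: "⏳ 语音合成中… [P]%" ("Synthesizing speech… [P]%")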
  3. Success: Send audio via media=audio_url + include link in caption
  4. Failure: Natural language error + suggest retry. See SKILL-DETAIL.md for error translation.

Never mention to users: script names, API endpoints, attribute_id, or technical parameter names.

Environment

Base URL: https://api.imastudio.com
Headers: Authorization: Bearer $IMA_API_KEY · x-app-source: ima_skills · x_app_language: en

Core Flow

  1. GET /open/v1/product/list?app=ima&platform=web&category=text_to_speech → get attribute_id, credit
  2. POST /open/v1/tasks/create → get task_id
  3. POST /open/v1/tasks/detail → poll every 2-5s until resource_status==1

MANDATORY: Always query product list first. attribute_id is required.
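
The polling step of the flow above can be sketched as follows. Request and response body shapes are not given in this section, so the sketch only builds the documented headers and isolates the loop behind a caller-supplied `fetch_detail` function (hypothetical) that is expected to perform the POST /open/v1/tasks/detail call. The `resource_status == 1` check comes from step 3; the 60-second timeout is an assumption.

```python
import time

BASE_URL = "https://api.imastudio.com"


def build_headers(api_key):
    """Headers listed in the Environment section."""
    return {
        "Authorization": f"Bearer {api_key}",
        "x-app-source": "ima_skills",
        "x_app_language": "en",
    }


def poll_until_ready(fetch_detail, interval=3.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_detail() until it reports resource_status == 1.

    fetch_detail is a caller-supplied function (hypothetical) that returns
    the task detail as a dict containing a resource_status field.
    """
    waited = 0.0
    while True:
        detail = fetch_detail()
        if detail.get("resource_status") == 1:
            return detail
        if waited >= timeout:
            raise TimeoutError(f"task not ready after {waited:.0f}s")
        sleep(interval)       # stay within the documented 2-5s polling range
        waited += interval
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in normal use the default `time.sleep` applies.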

Estimated Generation Time

| Model | Estimated Time | Poll Every |
| --- | --- | --- |
| seed-tts-2.0 | 5~30s | 3s |

User Preference Memory

Storage: ~/.openclaw/memory/ima_prefs.json

  • Save when user explicitly says "用XXX音色" ("use voice XXX") / "默认用XXX" ("use XXX by default")
  • Clear when user says "换个音色" ("switch to a different voice") / "推荐一个" ("recommend one")
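
The save and clear rules above can be sketched as a small JSON store. The file path is the one given above, but the layout of ima_prefs.json is not documented in this section: the per-user `{"<user_id>": {"speaker": ...}}` shape and all function names below are assumptions for illustration.

```python
import json
from pathlib import Path

PREFS_PATH = Path.home() / ".openclaw" / "memory" / "ima_prefs.json"


def load_prefs(path=PREFS_PATH):
    """Return the stored preferences, or an empty dict if none exist."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def _write_prefs(prefs, path):
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    Path(path).write_text(json.dumps(prefs, ensure_ascii=False, indent=2), encoding="utf-8")


def save_speaker(user_id, speaker, path=PREFS_PATH):
    """Remember a speaker the user explicitly asked for."""
    prefs = load_prefs(path)
    prefs.setdefault(user_id, {})["speaker"] = speaker
    _write_prefs(prefs, path)


def clear_speaker(user_id, path=PREFS_PATH):
    """Forget the stored speaker, e.g. when the user asks for a different voice."""
    prefs = load_prefs(path)
    if prefs.get(user_id, {}).pop("speaker", None) is not None:
        _write_prefs(prefs, path)
```

Saving only on an explicit request, and clearing only the speaker key, mirrors the two bullets: other stored preferences, if any, are left untouched.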

Popular Speakers (Quick Reference)

| Category | Speaker Name | speaker ID |
| --- | --- | --- |
| General | 魅力苏菲 | zh_male_sophie_uranus_bigtts |
| General | Vivi | zh_female_vv_uranus_bigtts |
| General | 云舟 | zh_male_m191_uranus_bigtts |
| Video dubbing | 大壹 | zh_male_dayi_uranus_bigtts |
| Role-play | 知性灿灿 | zh_female_cancan_uranus_bigtts |

Full speaker list: See volcengine_tts_timbre_list.json in project or SKILL-DETAIL.md.

⚠️ Important: Use native format (*_uranus_bigtts), NOT BV*_streaming format.
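
The quick-reference table and the format rule above can be combined into a lookup that accepts either a display name or a native ID. This is a hedged sketch: `resolve_speaker` is a hypothetical helper, the dictionary holds only the five speakers listed here, and the format check tests nothing beyond the `_uranus_bigtts` suffix.

```python
# Display name -> speaker ID, copied from the quick-reference table above.
POPULAR_SPEAKERS = {
    "魅力苏菲": "zh_male_sophie_uranus_bigtts",
    "Vivi": "zh_female_vv_uranus_bigtts",
    "云舟": "zh_male_m191_uranus_bigtts",
    "大壹": "zh_male_dayi_uranus_bigtts",
    "知性灿灿": "zh_female_cancan_uranus_bigtts",
}
NATIVE_SUFFIX = "_uranus_bigtts"


def resolve_speaker(name_or_id):
    """Return a native-format speaker ID, or raise ValueError."""
    if name_or_id in POPULAR_SPEAKERS:
        return POPULAR_SPEAKERS[name_or_id]
    # Accept any ID already in the native *_uranus_bigtts format.
    if name_or_id.endswith(NATIVE_SUFFIX) and len(name_or_id) > len(NATIVE_SUFFIX):
        return name_or_id
    raise ValueError(f"not a native-format speaker ID: {name_or_id!r}")
```

An ID in the BV*_streaming format fails the suffix check and is rejected, which matches the warning above.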

Security Recommendations
This skill appears to do what it advertises: it sends your text to IMA Studio's TTS API using the IMA_API_KEY and returns an audio URL. Before installing:
  1. Prefer a scoped or test API key if you're concerned about exposure.
  2. Be aware that the skill writes logs (~/.openclaw/logs/ima_skills/) and a preferences JSON (~/.openclaw/memory/ima_prefs.json). Logs are auto-deleted after 7 days but may contain non-sensitive metadata; verify that no prompts or secrets are logged if that matters to you.
  3. Confirm you trust https://imastudio.com / api.imastudio.com as the destination for your API key.
  4. The skill requires python3 and requests; install those in a controlled environment.
If you want greater assurance, inspect the full ima_tts_create.py before using the skill to verify that it never logs the API key or writes user prompts to disk.
Functional Analysis
Type: OpenClaw Skill
Name: ima-tts-ai
Version: 1.0.8
The IMA TTS Generator skill is a legitimate tool for converting text to speech via the IMA Studio API (api.imastudio.com). The Python scripts (ima_tts_create.py and ima_logger.py) implement a standard API interaction flow: querying product lists, creating tasks, and polling for results. Sensitive data handling is limited to the IMA_API_KEY, which is sent only to the official endpoint. Local file operations are restricted to the ~/.openclaw directory for managing user preferences and logs (with a 7-day auto-cleanup policy). The SKILL.md instructions provide appropriate UX guidelines for the AI agent without any evidence of prompt injection or malicious intent.
Capability Assessment
Purpose & Capability
Name/description (TTS) matches the required artifacts: a single IMA_API_KEY credential, python3 runtime, requests dependency, and Python scripts that call https://api.imastudio.com to list products, create tasks, and poll results. No unrelated services or secrets are requested.
Instruction Scope
SKILL.md and SKILL-DETAIL.md explicitly describe the exact HTTP calls (product list → create → poll) and UX behavior. Instructions do not request access to unrelated files, system configuration, or other credentials. The docs explicitly forbid exposing internal technical details to end users. The runtime scripts follow the documented flow.
Install Mechanism
There is no installer that downloads remote code; this is effectively an instruction-plus-local-scripts skill. The only required binary is python3 and requests is in requirements.txt — proportionate for a Python-based TTS client. No external arbitrary download URLs or archive extraction are present.
Credentials
Only one credential is required (IMA_API_KEY) and it is used solely for Authorization to api.imastudio.com per docs and code. Caveat: the skill writes operational logs to ~/.openclaw/logs/ima_skills/ and stores per-user preferences in ~/.openclaw/memory/ima_prefs.json. While SECURITY.md asserts the API key is not written to repo files, user-provided prompt text or other metadata could be recorded in logs depending on runtime logging calls; logs are auto-cleaned after 7 days. Use a scoped/test key if you want to limit exposure.
Persistence & Privilege
The skill requests read/write only to its own preference and log paths under ~/.openclaw which is consistent with storing user preferences and operational logs. always:false and no modifications to other skills or global agent config are requested. Autonomous invocation is enabled by default (normal) but not escalated.
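How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install ima-tts-ai
  3. Once installation finishes, invoke the skill by name or trigger it with /ima-tts-ai.
  4. Provide the required inputs as described in the skill's parameter documentation to get structured output.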
Version History
v1.0.8
IMA TTS AI 1.0.8 changelog:
  • Added detailed documentation file (`SKILL-DETAIL.md`) covering API usage, all parameters, and best practices.
  • Introduced `!keywords.txt` file to provide extended keyword support and improve skill discoverability.
  • No changes to logic or interface; this update focuses on completeness and clarity of reference material.
v1.0.7
  • Added SECURITY.md to document the security policy and vulnerability reporting process.
  • Updated SKILL.md with an explicit "requires" section outlining needed environment variables and credential handling.
  • Declared persistence details in SKILL.md, specifying file locations for preferences and logs, plus retention policy.
  • No changes were made to core logic or API usage.
v1.0.6
  • Added Chinese keywords (e.g. "语音合成", "文字转语音", "TTS", "多模型") to improve discoverability.
  • Enhanced argument hint to support Chinese input ("[text to speak or 要朗读的文本]").
  • Improved description for clearer usage scenarios, now includes Chinese explanations (TTS, 朗读, 语音合成, 配音, 有声内容).
  • Clarified that the skill only supports seed-tts-2.0 (not seed-tts-1.1); default model is seed-tts-2.0.
  • No changes to core functionality or integration flow.
v1.0.5
No user-facing changes (no file changes detected). Version bump only.
v1.0.4
No user-facing changes (no file changes detected). Version bump only.
v1.0.3
No user-facing changes detected in this version.
  • Version bump without detected file changes.
  • Functionality and documentation remain the same.
v1.0.2
ima-tts-ai 1.0.2
  • Added CHANGELOG_CLAWHUB.md and clawhub.json for improved metadata and changelog tracking.
  • No changes to core logic or APIs. This update focuses on project infrastructure and documentation.
  • No changes to user-facing functionality.
v1.0.1
  • Added bundled timbre list file (volcengine_tts_timbre_list.json) for seed-tts-2.0 speaker selection.
  • Clarified skill targets only seed-tts-2.0 model (seed-tts-1.1 is not supported); default model is now seed-tts-2.0.
  • Updated documentation to reference volcengine_tts_timbre_list.json for full speaker options.
  • No changes to API usage or flow; functionality unchanged.
v1.0.0
IMA Studio TTS 1.0.0 (Initial Release)
  • Enables text-to-speech (TTS) audio generation via IMA Open API, returning a playable audio URL (mp3/wav).
  • Requires always querying the model/product list (`/open/v1/product/list?category=text_to_speech`) to find valid model parameters before each request.
  • Handles creation and polling of TTS tasks, including retries on failure.
  • Maps user preferences (voice, speed, etc.) and remembers them locally per user.
  • Provides structured user experience guidelines: acknowledgment, progress updates, and result delivery with model and credit info.
  • Requires an IMA API key for operation.
Metadata
Slug: ima-tts-ai
Version: 1.0.8
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 9
FAQ

What is IMA AI Text To Speech?

Convert text, scripts, and captions into natural voiceovers for videos, explainers, product demos, and social posts. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 630 total downloads to date.

How do I install IMA AI Text To Speech?

Run the command "/install ima-tts-ai" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but at runtime the skill requires an IMA_API_KEY, python3, and the requests package.

Is IMA AI Text To Speech free?

Yes. The skill is released under the MIT-0 license and is free to download, install, and use. Speech generation itself consumes IMA credits, which the skill reports in its status messages.

Which platforms does IMA AI Text To Speech support?

It is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed IMA AI Text To Speech?

It is developed and maintained by allenfancy-gan (@allenfancy-gan). The current version is v1.0.8.
