
IMA AI Text To Speech — seed-tts, DouBao

by allenfancy-gan · GitHub ↗ · v1.0.8 · MIT-0
cross-platform · ✓ Security Clean · 630 downloads · 0 stars · 0 active installs · 9 versions
Install in OpenClaw
/install ima-tts-ai
Description
Convert text, scripts, and captions into natural voiceovers for videos, explainers, product demos, and social posts.
README (SKILL.md)

IMA TTS AI — Text-to-Speech Generator

For complete API documentation, security details, all parameters, speaker list, and Python examples, read SKILL-DETAIL.md.

Model ID Reference (CRITICAL)

| Friendly Name | model_id | Notes |
| --- | --- | --- |
| Seed TTS 2.0 | seed-tts-2.0 | ✅ Default and only supported model |

Sub-models (via extra-params):

  • seed-tts-2.0-expressive — More expressive, emotional (default)
  • seed-tts-2.0-standard — More stable, neutral

When User Says "帮我制作旁白/配音" ("Help me make a narration/voice-over")

Must ask first:

| Question | Parameter | Required |
| --- | --- | --- |
| Content/script to read aloud | prompt | ✅ Yes |

Recommend asking:

| Question | Parameter | Options |
| --- | --- | --- |
| Voice/speaker | speaker | 魅力苏菲, Vivi, 云舟, 大壹, etc. (see SKILL-DETAIL.md) |

Optional:

| Question | Parameter | Range |
| --- | --- | --- |
| Emotion/mood | audio_params.emotion | neutral, sad, angry |
| Speech rate | audio_params.speech_rate | [-50, 100], 0 = normal |
| Volume | audio_params.loudness_rate | [-50, 100], 0 = normal |
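The ranges above can be enforced before a request is built. The following is a minimal sketch, not the skill's own code: `build_audio_params` is a hypothetical helper, the clamping policy is an assumption, and the emotion set contains only the three values listed here (SKILL-DETAIL.md may list more).

```python
import json

# Range taken from the table above; clamping out-of-range input is an assumption.
RATE_MIN, RATE_MAX = -50, 100
# Only the emotions listed above; the full list may be longer.
EMOTIONS = {"neutral", "sad", "angry"}


def build_audio_params(emotion=None, speech_rate=0, loudness_rate=0):
    """Return an audio_params dict with both rates clamped to [-50, 100]."""

    def clamp(value):
        return max(RATE_MIN, min(RATE_MAX, int(value)))

    params = {
        "speech_rate": clamp(speech_rate),
        "loudness_rate": clamp(loudness_rate),
    }
    if emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unsupported emotion: {emotion!r}")
        params["emotion"] = emotion
    return params


print(json.dumps({"audio_params": build_audio_params("sad", speech_rate=150)}))
# → {"audio_params": {"speech_rate": 100, "loudness_rate": 0, "emotion": "sad"}}
```

Clamping keeps a request valid when a user asks for an extreme value; rejecting the input with an error would be an equally reasonable choice.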

User Input Parsing

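| User says | Parameter | Value |
| --- | --- | --- |
| Narration/voice-over/read aloud (旁白/配音/朗读) | prompt + speaker | Ask for content first |
| Female voice (女声/female) | speaker | e.g. zh_female_vv_uranus_bigtts |
| Male voice (男声/male) | speaker | e.g. zh_male_sophie_uranus_bigtts |
| Faster/slower speech (语速快/slow) | audio_params.speech_rate | Positive value for faster, negative for slower |
| expressive/standard | model | Sub-model selection |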
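As an illustration of the mapping above, here is a rough keyword-matching sketch. `parse_voice_request` is a hypothetical helper, the keyword matching is deliberately naive, and the rate magnitudes (+20 / -20) are arbitrary assumptions; the source only says positive or negative.

```python
def parse_voice_request(text):
    """Map a free-form request to tentative parameters (illustrative only)."""
    lowered = text.lower()
    params = {}

    # Check "female" before "male": the string "female" contains "male".
    if "女声" in text or "female" in lowered:
        params["speaker"] = "zh_female_vv_uranus_bigtts"
    elif "男声" in text or "male" in lowered:
        params["speaker"] = "zh_male_sophie_uranus_bigtts"

    if "expressive" in lowered:
        params["model"] = "seed-tts-2.0-expressive"
    elif "standard" in lowered:
        params["model"] = "seed-tts-2.0-standard"

    # Magnitudes are assumed; any value in [-50, 100] would be valid.
    if "语速快" in text or "faster" in lowered:
        params["audio_params"] = {"speech_rate": 20}
    elif "语速慢" in text or "slow" in lowered:
        params["audio_params"] = {"speech_rate": -20}

    return params


print(parse_voice_request("female voice, standard, a bit slower"))
```

The result is only a starting point: the content to read (prompt) still has to be requested from the user first.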

Script Usage

# List available TTS models
python3 {baseDir}/scripts/ima_tts_create.py --api-key $IMA_API_KEY --list-models

# Generate speech (default model: seed-tts-2.0)
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "Text to be spoken here." \
  --user-id {user_id} \
  --output-json

# With speaker and emotion (the sample prompt reads: "Sunny young voice test, hello world.")
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "阳光青年音色测试,你好世界。" \
  --extra-params '{"model":"seed-tts-2.0-expressive","speaker":"zh_male_sophie_uranus_bigtts","audio_params":{"emotion":"neutral"}}' \
  --user-id {user_id} \
  --output-json
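When the script is driven from Python rather than a shell, building the argument list directly avoids shell-quoting problems with the JSON in --extra-params. This is a sketch under assumptions: `build_tts_command` is a hypothetical wrapper, and it uses only the flags shown in the commands above.

```python
import json
import os


def build_tts_command(base_dir, prompt, user_id, extra_params=None):
    """Assemble an argv list for ima_tts_create.py from the flags shown above."""
    argv = [
        "python3",
        f"{base_dir}/scripts/ima_tts_create.py",
        "--api-key", os.environ.get("IMA_API_KEY", ""),
        "--model-id", "seed-tts-2.0",
        "--prompt", prompt,
        "--user-id", user_id,
    ]
    if extra_params:
        # ensure_ascii=False keeps any non-ASCII values readable in the argument.
        argv += ["--extra-params", json.dumps(extra_params, ensure_ascii=False)]
    argv.append("--output-json")
    return argv


cmd = build_tts_command(
    "/path/to/skill",  # placeholder for {baseDir}
    "Text to be spoken here.",
    "demo-user",       # placeholder for {user_id}
    {"model": "seed-tts-2.0-standard", "audio_params": {"emotion": "neutral"}},
)
print(cmd)
```

The list can then be passed to `subprocess.run(cmd, capture_output=True, text=True)` without going through a shell.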

Sending Results to User

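# ✅ CORRECT: Use remote URL directly
# Caption in English: "Speech synthesis succeeded! / Model: [Name] / Time: [X]s / Credits: [N pts] / Original link: [url]"
message(action="send", media=audio_url, caption="✅ 语音合成成功!\n• 模型:[Name]\n• 耗时:[X]s\n• 积分:[N pts]\n\n🔗 原始链接:[url]")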

# ❌ WRONG: Never download to local file
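The caption template can be filled in by a small helper. This is a sketch: `build_caption` is a hypothetical name, and the URL check is an assumed way to enforce the "remote URL only" rule. The `message(...)` call itself belongs to the host agent and is not defined here.

```python
def build_caption(model_name, seconds, credits, audio_url):
    """Fill in the success caption; refuses anything that is not a remote URL."""
    if not audio_url.startswith(("http://", "https://")):
        raise ValueError("audio_url must be a remote URL, not a local path")
    return (
        "✅ 语音合成成功!\n"
        f"• 模型:{model_name}\n"
        f"• 耗时:{seconds}s\n"
        f"• 积分:{credits} pts\n"
        "\n"
        f"🔗 原始链接:{audio_url}"
    )


print(build_caption("Seed TTS 2.0", 8, 2, "https://example.com/audio.mp3"))
```

The values passed in the example call (8 seconds, 2 credits, the example.com URL) are made up for illustration.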

UX Protocol (Brief)

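  1. Pre-generation: "🔊 开始语音合成… 模型:[Name],预计[X~Y]秒,消耗[N]积分" ("Starting speech synthesis… Model: [Name], estimated [X~Y] s, costs [N] credits")
  2. Progress: Every 10-15s: "⏳ 语音合成中… [P]%" ("Synthesizing speech… [P]%")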
  3. Success: Send audio via media=audio_url + include link in caption
  4. Failure: Natural language error + suggest retry. See SKILL-DETAIL.md for error translation.

Never say to users: script names, API endpoints, attribute_id, technical parameter names.

Environment

Base URL: https://api.imastudio.com
Headers: Authorization: Bearer $IMA_API_KEY · x-app-source: ima_skills · x_app_language: en

Core Flow

  1. GET /open/v1/product/list?app=ima&platform=web&category=text_to_speech → get attribute_id, credit
  2. POST /open/v1/tasks/create → get task_id
  3. POST /open/v1/tasks/detail → poll every 2-5s until resource_status==1

MANDATORY: Always query product list first. attribute_id is required.
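Step 3 of the flow is a polling loop. The sketch below shows only that loop and leaves the HTTP call to a caller-supplied function, because the request and response bodies are not documented here. `poll_until_ready` and `fetch_detail` are hypothetical names, and a flat response dict with a top-level `resource_status` field is an assumption; the real payload may nest it differently.

```python
import time


def poll_until_ready(fetch_detail, task_id, interval=3.0, timeout=60.0,
                     sleep=time.sleep):
    """Call fetch_detail(task_id) until resource_status == 1 or the timeout passes.

    fetch_detail is expected to wrap POST /open/v1/tasks/detail and return a dict.
    """
    waited = 0.0
    while True:
        detail = fetch_detail(task_id)
        if detail.get("resource_status") == 1:
            return detail
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} not ready after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Demo with a fake fetcher that becomes ready on the third call.
responses = iter([
    {"resource_status": 0},
    {"resource_status": 0},
    {"resource_status": 1},
])
result = poll_until_ready(lambda _task_id: next(responses), "demo-task",
                          sleep=lambda _seconds: None)
print(result)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in production the default `time.sleep` applies.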

Estimated Generation Time

| Model | Estimated Time | Poll Every |
| --- | --- | --- |
| seed-tts-2.0 | 5~30s | 3s |

User Preference Memory

Storage: ~/.openclaw/memory/ima_prefs.json

  • Save when user explicitly says "用XXX音色" ("use the XXX voice") / "默认用XXX" ("use XXX by default")
  • Clear when user says "换个音色" ("switch to a different voice") / "推荐一个" ("recommend one")
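The save/clear behaviour can be sketched as follows. The file path comes from the text above, but the JSON layout (a mapping from user ID to a dict with a `speaker` key) and the helper names are assumptions; the real ima_prefs.json schema is not shown here.

```python
import json
from pathlib import Path

PREFS_PATH = Path.home() / ".openclaw" / "memory" / "ima_prefs.json"


def _load(path):
    """Read the preferences file, or return an empty dict if it is missing."""
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def _store(path, prefs):
    """Write preferences, creating the parent directory when needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(prefs, ensure_ascii=False, indent=2),
                    encoding="utf-8")


def save_speaker(user_id, speaker, path=PREFS_PATH):
    """Remember a speaker after an explicit request such as "默认用XXX"."""
    prefs = _load(path)
    prefs.setdefault(user_id, {})["speaker"] = speaker
    _store(path, prefs)


def clear_speaker(user_id, path=PREFS_PATH):
    """Forget the saved speaker after a request such as "换个音色"."""
    prefs = _load(path)
    prefs.get(user_id, {}).pop("speaker", None)
    _store(path, prefs)


def get_speaker(user_id, path=PREFS_PATH):
    """Return the saved speaker ID, or None when nothing is stored."""
    return _load(path).get(user_id, {}).get("speaker")
```

Each helper accepts a `path` argument so the behaviour can be tried against a temporary file instead of the real preferences file.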

Popular Speakers (Quick Reference)

| Category | Speaker Name | speaker ID |
| --- | --- | --- |
| General | 魅力苏菲 | zh_male_sophie_uranus_bigtts |
| General | Vivi | zh_female_vv_uranus_bigtts |
| General | 云舟 | zh_male_m191_uranus_bigtts |
| Video dubbing | 大壹 | zh_male_dayi_uranus_bigtts |
| Role-play | 知性灿灿 | zh_female_cancan_uranus_bigtts |

Full speaker list: See volcengine_tts_timbre_list.json in project or SKILL-DETAIL.md.

⚠️ Important: Use native format (*_uranus_bigtts), NOT BV*_streaming format.
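A quick check for the two formats named above might look like this. `is_native_speaker_id` is a hypothetical helper that tests only the documented suffix and prefix; it does not confirm that an ID actually exists in the speaker list.

```python
def is_native_speaker_id(speaker_id):
    """True for IDs shaped like *_uranus_bigtts, False for BV*_streaming IDs."""
    if speaker_id.startswith("BV") and speaker_id.endswith("_streaming"):
        return False
    return speaker_id.endswith("_uranus_bigtts")


print(is_native_speaker_id("zh_male_dayi_uranus_bigtts"))  # → True
print(is_native_speaker_id("BV001_streaming"))             # → False
```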

Usage Guidance
This skill appears to do what it advertises: it sends your text to IMA Studio's TTS API using the IMA_API_KEY and returns an audio URL. Before installing:
  1. Prefer a scoped or test API key if you're concerned about exposure.
  2. Be aware the skill writes logs (~/.openclaw/logs/ima_skills/) and a preferences JSON (~/.openclaw/memory/ima_prefs.json); logs are auto-deleted after 7 days but may contain non-sensitive metadata (verify no prompts or secrets are logged if that matters).
  3. Confirm you trust https://imastudio.com / api.imastudio.com as the destination for your API key.
  4. The skill requires python3 and requests; install those in a controlled environment.
If you want greater assurance, inspect the full ima_tts_create.py to verify it never logs the API key or writes user prompts to disk before using the skill.
Capability Analysis
Type: OpenClaw Skill
Name: ima-tts-ai
Version: 1.0.8
The IMA TTS Generator skill is a legitimate tool for converting text to speech via the IMA Studio API (api.imastudio.com). The Python scripts (ima_tts_create.py and ima_logger.py) implement a standard API interaction flow: querying product lists, creating tasks, and polling for results. Sensitive data handling is limited to the IMA_API_KEY, which is sent only to the official endpoint. Local file operations are restricted to the ~/.openclaw directory for managing user preferences and logs (with a 7-day auto-cleanup policy). The SKILL.md instructions provide appropriate UX guidelines for the AI agent without any evidence of prompt injection or malicious intent.
Capability Assessment
Purpose & Capability
Name/description (TTS) matches the required artifacts: a single IMA_API_KEY credential, python3 runtime, requests dependency, and Python scripts that call https://api.imastudio.com to list products, create tasks, and poll results. No unrelated services or secrets are requested.
Instruction Scope
SKILL.md and SKILL-DETAIL.md explicitly describe the exact HTTP calls (product list → create → poll) and UX behavior. Instructions do not request access to unrelated files, system configuration, or other credentials. The docs explicitly forbid exposing internal technical details to end users. The runtime scripts follow the documented flow.
Install Mechanism
There is no installer that downloads remote code; this is effectively an instruction-plus-local-scripts skill. The only required binary is python3 and requests is in requirements.txt — proportionate for a Python-based TTS client. No external arbitrary download URLs or archive extraction are present.
Credentials
Only one credential is required (IMA_API_KEY) and it is used solely for Authorization to api.imastudio.com per docs and code. Caveat: the skill writes operational logs to ~/.openclaw/logs/ima_skills/ and stores per-user preferences in ~/.openclaw/memory/ima_prefs.json. While SECURITY.md asserts the API key is not written to repo files, user-provided prompt text or other metadata could be recorded in logs depending on runtime logging calls; logs are auto-cleaned after 7 days. Use a scoped/test key if you want to limit exposure.
Persistence & Privilege
The skill requests read/write only to its own preference and log paths under ~/.openclaw which is consistent with storing user preferences and operational logs. always:false and no modifications to other skills or global agent config are requested. Autonomous invocation is enabled by default (normal) but not escalated.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ima-tts-ai
  3. After installation, invoke the skill by name or use /ima-tts-ai
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.8
IMA TTS AI 1.0.8 changelog:
  • Added detailed documentation file (`SKILL-DETAIL.md`) covering API usage, all parameters, and best practices.
  • Introduced `!keywords.txt` file to provide extended keyword support and improve skill discoverability.
  • No changes to logic or interface; this update focuses on completeness and clarity of reference material.
v1.0.7
  • Added SECURITY.md to document the security policy and vulnerability reporting process.
  • Updated SKILL.md with an explicit "requires" section outlining needed environment variables and credential handling.
  • Declared persistence details in SKILL.md, specifying file locations for preferences and logs, plus retention policy.
  • No changes were made to core logic or API usage.
v1.0.6
  • Added Chinese keywords (e.g. "语音合成" (speech synthesis), "文字转语音" (text-to-speech), "TTS", "多模型" (multi-model)) to improve discoverability.
  • Enhanced argument hint to support Chinese input ("[text to speak or 要朗读的文本]").
  • Improved description for clearer usage scenarios; it now includes Chinese terms (TTS, 朗读 (read aloud), 语音合成 (speech synthesis), 配音 (voice-over), 有声内容 (audio content)).
  • Clarified that the skill only supports seed-tts-2.0 (not seed-tts-1.1); default model is seed-tts-2.0.
  • No changes to core functionality or integration flow.
v1.0.5
No user-facing changes (no file changes detected). Version bump only.
v1.0.4
No user-facing changes (no file changes detected). Version bump only.
v1.0.3
No user-facing changes detected in this version.
  • Version bump without detected file changes.
  • Functionality and documentation remain the same.
v1.0.2
ima-tts-ai 1.0.2
  • Added CHANGELOG_CLAWHUB.md and clawhub.json for improved metadata and changelog tracking.
  • No changes to core logic or APIs. This update focuses on project infrastructure and documentation.
  • No changes to user-facing functionality.
v1.0.1
  • Added bundled timbre list file (volcengine_tts_timbre_list.json) for seed-tts-2.0 speaker selection.
  • Clarified skill targets only seed-tts-2.0 model (seed-tts-1.1 is not supported); default model is now seed-tts-2.0.
  • Updated documentation to reference volcengine_tts_timbre_list.json for full speaker options.
  • No changes to API usage or flow; functionality unchanged.
v1.0.0
IMA Studio TTS 1.0.0 (Initial Release)
  • Enables text-to-speech (TTS) audio generation via IMA Open API, returning a playable audio URL (mp3/wav).
  • Requires always querying the model/product list (`/open/v1/product/list?category=text_to_speech`) to find valid model parameters before each request.
  • Handles creation and polling of TTS tasks, including retries on failure.
  • Maps user preferences (voice, speed, etc.) and remembers them locally per user.
  • Provides structured user experience guidelines: acknowledgment, progress updates, and result delivery with model and credit info.
  • Requires an IMA API key for operation.
Metadata
Slug ima-tts-ai
Version 1.0.8
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 9
Frequently Asked Questions

What is IMA AI Text To Speech — seed-tts, DouBao?

Convert text, scripts, and captions into natural voiceovers for videos, explainers, product demos, and social posts. It is an AI Agent Skill for Claude Code / OpenClaw, with 630 downloads so far.

How do I install IMA AI Text To Speech — seed-tts, DouBao?

Run "/install ima-tts-ai" in the OpenClaw or Claude Code chat to install it in one step. To run, the skill also needs python3, the requests package, and an IMA_API_KEY.

Is IMA AI Text To Speech — seed-tts, DouBao free?

The skill itself is free: it is licensed under MIT-0, so you can download, install, and use it at no cost. Generating audio goes through the IMA API, which requires an IMA API key and consumes credits per request.

Which platforms does IMA AI Text To Speech — seed-tts, DouBao support?

It is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created IMA AI Text To Speech — seed-tts, DouBao?

It is built and maintained by allenfancy-gan (@allenfancy-gan); the current version is v1.0.8.
