Description

Convert text, scripts, and captions into natural voiceovers for videos, explainers, product demos, and social posts.

README (SKILL.md)

IMA TTS AI — Text-to-Speech Generator

Name: IMA AI Text To Speech — seed-tts, DouBao
Author: allenfancy-gan

For complete API documentation, security details, all parameters, speaker list, and Python examples, read SKILL-DETAIL.md.

Model ID Reference (CRITICAL)

Friendly Name	model_id	Notes
Seed TTS 2.0	`seed-tts-2.0`	✅ Default and only supported model

Sub-models (via extra-params):

seed-tts-2.0-expressive — More expressive, emotional (default)
seed-tts-2.0-standard — More stable, neutral
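
The sub-model is selected through the `model` key of the `--extra-params` JSON. A minimal sketch of building that JSON follows; the helper name `build_extra_params` is hypothetical and is not part of the skill's scripts.

```python
import json

# Sub-model names as listed above; the expressive variant is the documented default.
SUB_MODELS = ("seed-tts-2.0-expressive", "seed-tts-2.0-standard")


def build_extra_params(sub_model=SUB_MODELS[0], speaker=None):
    """Return a JSON string suitable for --extra-params (hypothetical helper)."""
    if sub_model not in SUB_MODELS:
        raise ValueError(f"unknown sub-model: {sub_model}")
    params = {"model": sub_model}
    if speaker is not None:
        params["speaker"] = speaker
    # ensure_ascii=False keeps any non-ASCII values readable in the output.
    return json.dumps(params, ensure_ascii=False)
```

Rejecting unknown names early avoids sending a request for an unsupported model such as seed-tts-1.1.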

When User Says "帮我制作旁白/配音" ("Help me make a narration/voiceover")

Must ask first:

Question	Parameter	Required
Text or script to read aloud	`prompt`	✅ Yes

Recommended to ask:

Question	Parameter	Options
Voice/speaker	`speaker`	魅力苏菲, Vivi, 云舟, 大壹, and others (see SKILL-DETAIL.md)

Optional:

Question	Parameter	Range
Emotion	`audio_params.emotion`	neutral, sad, angry
Speech rate	`audio_params.speech_rate`	[-50, 100], 0=normal
Volume	`audio_params.loudness_rate`	[-50, 100], 0=normal
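
The ranges above can be checked before a request is sent. This is a minimal sketch, assuming the table is exhaustive for emotions (SKILL-DETAIL.md may list more); the helper name `build_audio_params` is hypothetical.

```python
# Ranges and emotion values as listed in the table above.
RATE_MIN, RATE_MAX = -50, 100
EMOTIONS = {"neutral", "sad", "angry"}


def build_audio_params(emotion=None, speech_rate=0, loudness_rate=0):
    """Validate the optional settings and return an audio_params dict."""
    for name, value in (("speech_rate", speech_rate), ("loudness_rate", loudness_rate)):
        if not RATE_MIN <= value <= RATE_MAX:
            raise ValueError(f"{name} must be in [{RATE_MIN}, {RATE_MAX}], got {value}")
    if emotion is not None and emotion not in EMOTIONS:
        raise ValueError(f"unsupported emotion: {emotion}")
    params = {"speech_rate": speech_rate, "loudness_rate": loudness_rate}
    if emotion is not None:
        params["emotion"] = emotion
    return params
```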

User Input Parsing

User says	Parameter	Value
旁白/配音/朗读 (narration/voiceover/read aloud)	prompt + speaker	Ask for content first
女声/female	speaker	e.g. `zh_female_vv_uranus_bigtts`
男声/male	speaker	e.g. `zh_male_sophie_uranus_bigtts`
语速快/慢 (faster/slower)	audio_params.speech_rate	Positive value for faster, negative for slower
expressive/standard	model	Sub-model selection
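
The mapping above could be approximated with simple keyword matching. The sketch below is illustrative only: the function name `parse_hints` and the `speech_rate_sign` key are hypothetical, and the magnitude of the rate change is left to the caller because the table specifies only its sign.

```python
def parse_hints(text):
    """Map phrases from the table above to parameter hints (illustrative only)."""
    lowered = text.lower()
    hints = {}
    # Check "female" before "male": the word "female" contains "male".
    if "女声" in text or "female" in lowered:
        hints["speaker"] = "zh_female_vv_uranus_bigtts"
    elif "男声" in text or "male" in lowered:
        hints["speaker"] = "zh_male_sophie_uranus_bigtts"
    if "standard" in lowered:
        hints["model"] = "seed-tts-2.0-standard"
    elif "expressive" in lowered:
        hints["model"] = "seed-tts-2.0-expressive"
    # Only the direction is recorded: +1 for faster, -1 for slower.
    if "语速快" in text or "fast" in lowered:
        hints["speech_rate_sign"] = 1
    elif "语速慢" in text or "slow" in lowered:
        hints["speech_rate_sign"] = -1
    return hints
```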

Script Usage

# List available TTS models
python3 {baseDir}/scripts/ima_tts_create.py --api-key $IMA_API_KEY --list-models

# Generate speech (default model: seed-tts-2.0)
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "Text to be spoken here." \
  --user-id {user_id} \
  --output-json

# With speaker and emotion
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "阳光青年音色测试，你好世界。" \
  --extra-params '{"model":"seed-tts-2.0-expressive","speaker":"zh_male_sophie_uranus_bigtts","audio_params":{"emotion":"neutral"}}' \
  --user-id {user_id} \
  --output-json
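
When the script is invoked from Python rather than a shell, passing the arguments as a list avoids shell-quoting problems with the JSON in `--extra-params`. The sketch below uses only the flags shown above; the helper name `build_tts_command` and its parameters are hypothetical, and `base_dir` stands in for `{baseDir}`.

```python
import json
import os


def build_tts_command(base_dir, api_key, prompt, user_id, extra_params=None):
    """Assemble the argv list for ima_tts_create.py from the documented flags."""
    cmd = [
        "python3",
        os.path.join(base_dir, "scripts", "ima_tts_create.py"),
        "--api-key", api_key,
        "--model-id", "seed-tts-2.0",
        "--prompt", prompt,
        "--user-id", user_id,
        "--output-json",
    ]
    if extra_params:
        # Serialize the dict once; no shell quoting is needed in list form.
        cmd += ["--extra-params", json.dumps(extra_params, ensure_ascii=False)]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)`. The shape of the script's JSON output is not described in this README, so parsing it is left out.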

Sending Results to User

# ✅ CORRECT: Use remote URL directly
# Caption text: "Speech synthesis succeeded! Model / Time taken / Credits / Original link"
message(action="send", media=audio_url, caption="✅ 语音合成成功！\n• 模型：[Name]\n• 耗时：[X]s\n• 积分：[N pts]\n\n🔗 原始链接：[url]")

# ❌ WRONG: Never download to local file

UX Protocol (Brief)

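Pre-generation: "🔊 开始语音合成… 模型：[Name]，预计[X~Y]秒，消耗[N]积分" (Starting speech synthesis… Model: [Name], estimated [X~Y] seconds, costs [N] credits)
Progress: every 10-15s, send "⏳ 语音合成中… [P]%" (Speech synthesis in progress… [P]%)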
Success: Send audio via media=audio_url + include link in caption
Failure: Natural language error + suggest retry. See SKILL-DETAIL.md for error translation.

Never say to users: script names, API endpoints, attribute_id, technical parameter names.

Environment

Base URL: https://api.imastudio.com
Headers: Authorization: Bearer $IMA_API_KEY · x-app-source: ima_skills · x_app_language: en
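
For reference, the headers above can be assembled as a plain dictionary. This is a minimal sketch; the function names `build_headers` and `headers_from_env` are hypothetical, and the header names are copied verbatim from the line above, including the underscore form of `x_app_language`.

```python
import os

BASE_URL = "https://api.imastudio.com"


def build_headers(api_key, language="en"):
    """Return the request headers listed above for the given API key."""
    return {
        "Authorization": f"Bearer {api_key}",
        "x-app-source": "ima_skills",
        "x_app_language": language,
    }


def headers_from_env(environ=os.environ):
    """Read IMA_API_KEY from the environment; raises KeyError if it is unset."""
    return build_headers(environ["IMA_API_KEY"])
```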

Core Flow

GET /open/v1/product/list?app=ima&platform=web&category=text_to_speech → get attribute_id, credit
POST /open/v1/tasks/create → get task_id
POST /open/v1/tasks/detail → poll every 2-5s until resource_status==1

MANDATORY: Always query product list first. attribute_id is required.
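
The polling step of this flow can be sketched as a small loop. This is a sketch under stated assumptions: `fetch_status` is a hypothetical caller-supplied function that POSTs to /open/v1/tasks/detail and returns the task's `resource_status`, and the 60-second timeout is an arbitrary choice rather than a documented limit. Request bodies, response shapes, and failure statuses are not described in this README and are not modeled here.

```python
import time


def poll_until_ready(fetch_status, interval=3.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_status() until it returns 1 or the timeout would be exceeded.

    Returns True when resource_status == 1 is observed, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if fetch_status() == 1:
            return True
        # Stop if waiting one more interval would pass the deadline.
        if time.monotonic() + interval > deadline:
            return False
        sleep(interval)
```

Injecting `sleep` keeps the loop easy to exercise in tests without real delays.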

Estimated Generation Time

Model	Estimated Time	Poll Every
seed-tts-2.0	5~30s	3s

User Preference Memory

Storage: ~/.openclaw/memory/ima_prefs.json

Save when user explicitly says "用XXX音色" ("use the XXX voice") / "默认用XXX" ("default to XXX")
Clear when user says "换个音色" ("change the voice") / "推荐一个" ("recommend one")
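
The save and clear rules could be implemented along these lines. The layout of ima_prefs.json is not described in this README, so the per-user `{"speaker": ...}` structure and the helper names below are assumptions made for illustration.

```python
import json
from pathlib import Path

PREFS_PATH = Path.home() / ".openclaw" / "memory" / "ima_prefs.json"


def _load(path):
    """Return the stored preferences, or an empty dict if the file is missing."""
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def _write(path, prefs):
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(prefs, ensure_ascii=False, indent=2), encoding="utf-8")


def save_speaker(user_id, speaker, path=PREFS_PATH):
    """Remember a speaker the user explicitly asked for (assumed schema)."""
    prefs = _load(path)
    prefs.setdefault(user_id, {})["speaker"] = speaker
    _write(path, prefs)


def clear_speaker(user_id, path=PREFS_PATH):
    """Forget the stored speaker, if any; does nothing for unknown users."""
    prefs = _load(path)
    if prefs.get(user_id, {}).pop("speaker", None) is not None:
        _write(path, prefs)
```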

Popular Speakers (Quick Reference)

Category	Speaker Name	speaker ID
General	魅力苏菲	`zh_male_sophie_uranus_bigtts`
General	Vivi	`zh_female_vv_uranus_bigtts`
General	云舟	`zh_male_m191_uranus_bigtts`
Video dubbing	大壹	`zh_male_dayi_uranus_bigtts`
Role play	知性灿灿	`zh_female_cancan_uranus_bigtts`

Full speaker list: See volcengine_tts_timbre_list.json in project or SKILL-DETAIL.md.

⚠️ Important: Use native format (*_uranus_bigtts), NOT BV*_streaming format.
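
A speaker ID can be checked against this rule before use. The sketch below infers both patterns only from the names in this README (`*_uranus_bigtts` and `BV*_streaming`); the function name `check_speaker_id` and the sample ID `BV001_streaming` in the usage note are illustrative.

```python
import re

# Legacy form named in the warning above: BV<something>_streaming.
LEGACY_ID = re.compile(r"BV\w+_streaming")


def check_speaker_id(speaker_id):
    """Return None if the ID looks native, otherwise a short reason string."""
    if LEGACY_ID.fullmatch(speaker_id):
        return "legacy BV*_streaming ID; use the *_uranus_bigtts form"
    if not speaker_id.endswith("_uranus_bigtts"):
        return "expected an ID ending in _uranus_bigtts"
    return None
```

For example, `check_speaker_id("zh_male_dayi_uranus_bigtts")` returns `None`, while `check_speaker_id("BV001_streaming")` returns the legacy-format reason.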

Usage Guidance

This skill appears to do what it advertises: it sends your text to IMA Studio's TTS API using the IMA_API_KEY and returns an audio URL. Before installing:

1) Prefer a scoped or test API key if you're concerned about exposure.
2) Be aware the skill writes logs (~/.openclaw/logs/ima_skills/) and a preferences JSON (~/.openclaw/memory/ima_prefs.json). Logs are auto-deleted after 7 days but may contain non-sensitive metadata; verify that no prompts or secrets are logged if that matters.
3) Confirm you trust https://imastudio.com / api.imastudio.com as the destination for your API key.
4) The skill requires python3 and requests; install those in a controlled environment.

If you want greater assurance, inspect the full ima_tts_create.py before using the skill to verify that it never logs the API key or writes user prompts to disk.

Capability Analysis

Type: OpenClaw Skill
Name: ima-tts-ai
Version: 1.0.8

The IMA TTS Generator skill is a legitimate tool for converting text to speech via the IMA Studio API (api.imastudio.com). The Python scripts (ima_tts_create.py and ima_logger.py) implement a standard API interaction flow: querying product lists, creating tasks, and polling for results. Sensitive data handling is limited to the IMA_API_KEY, which is sent only to the official endpoint. Local file operations are restricted to the ~/.openclaw directory for managing user preferences and logs (with a 7-day auto-cleanup policy). The SKILL.md instructions provide appropriate UX guidelines for the AI agent without any evidence of prompt injection or malicious intent.

Capability Assessment

✓ Purpose & Capability

Name/description (TTS) matches the required artifacts: a single IMA_API_KEY credential, python3 runtime, requests dependency, and Python scripts that call https://api.imastudio.com to list products, create tasks, and poll results. No unrelated services or secrets are requested.

✓ Instruction Scope

SKILL.md and SKILL-DETAIL.md explicitly describe the exact HTTP calls (product list → create → poll) and UX behavior. Instructions do not request access to unrelated files, system configuration, or other credentials. The docs explicitly forbid exposing internal technical details to end users. The runtime scripts follow the documented flow.

✓ Install Mechanism

There is no installer that downloads remote code; this is effectively an instruction-plus-local-scripts skill. The only required binary is python3 and requests is in requirements.txt — proportionate for a Python-based TTS client. No external arbitrary download URLs or archive extraction are present.

ℹ Credentials

Only one credential is required (IMA_API_KEY) and it is used solely for Authorization to api.imastudio.com per docs and code. Caveat: the skill writes operational logs to ~/.openclaw/logs/ima_skills/ and stores per-user preferences in ~/.openclaw/memory/ima_prefs.json. While SECURITY.md asserts the API key is not written to repo files, user-provided prompt text or other metadata could be recorded in logs depending on runtime logging calls; logs are auto-cleaned after 7 days. Use a scoped/test key if you want to limit exposure.

✓ Persistence & Privilege

The skill requests read/write access only to its own preference and log paths under ~/.openclaw, which is consistent with storing user preferences and operational logs. The always flag is set to false, and no modifications to other skills or global agent config are requested. Autonomous invocation is enabled by default (normal) but not escalated.

Version History

v1.0.8

IMA TTS AI 1.0.8 Changelog
- Added detailed documentation file (`SKILL-DETAIL.md`) covering API usage, all parameters, and best practices.
- Introduced `!keywords.txt` file to provide extended keyword support and improve skill discoverability.
- No changes to logic or interface; this update focuses on completeness and clarity of reference material.

v1.0.7

- Added SECURITY.md to document the security policy and vulnerability reporting process.
- Updated SKILL.md with an explicit "requires" section outlining needed environment variables and credential handling.
- Declared persistence details in SKILL.md, specifying file locations for preferences and logs, plus retention policy.
- No changes were made to core logic or API usage.

v1.0.6

- Added Chinese keywords (e.g. "语音合成", "文字转语音", "TTS", "多模型") to improve discoverability.
- Enhanced argument hint to support Chinese input ("[text to speak or 要朗读的文本]").
- Improved description for clearer usage scenarios, now includes Chinese explanations (TTS, 朗读, 语音合成, 配音, 有声内容).
- Clarified that the skill only supports seed-tts-2.0 (not seed-tts-1.1); default model is seed-tts-2.0.
- No changes to core functionality or integration flow.

v1.0.5

No user-facing changes (no file changes detected). Version bump only.

v1.0.4

No user-facing changes (no file changes detected). Version bump only.

v1.0.3

No user-facing changes detected in this version.
- Version bump without detected file changes.
- Functionality and documentation remain the same.

v1.0.2

ima-tts-ai 1.0.2
- Added CHANGELOG_CLAWHUB.md and clawhub.json for improved metadata and changelog tracking.
- No changes to core logic or APIs. This update focuses on project infrastructure and documentation.
- No changes to user-facing functionality.

v1.0.1

- Added bundled timbre list file (volcengine_tts_timbre_list.json) for seed-tts-2.0 speaker selection.
- Clarified skill targets only seed-tts-2.0 model (seed-tts-1.1 is not supported); default model is now seed-tts-2.0.
- Updated documentation to reference volcengine_tts_timbre_list.json for full speaker options.
- No changes to API usage or flow; functionality unchanged.

v1.0.0

IMA Studio TTS 1.0.0: Initial Release
- Enables text-to-speech (TTS) audio generation via IMA Open API, returning a playable audio URL (mp3/wav).
- Requires always querying the model/product list (`/open/v1/product/list?category=text_to_speech`) to find valid model parameters before each request.
- Handles creation and polling of TTS tasks, including retries on failure.
- Maps user preferences (voice, speed, etc.) and remembers them locally per user.
- Provides structured user experience guidelines: acknowledgment, progress updates, and result delivery with model and credit info.
- Requires an IMA API key for operation.

Metadata

Slug: ima-tts-ai
Version: 1.0.8
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 9

Frequently Asked Questions

What is IMA AI Text To Speech — seed-tts, DouBao?

Convert text, scripts, and captions into natural voiceovers for videos, explainers, product demos, and social posts. It is an AI Agent Skill for Claude Code / OpenClaw, with 630 downloads so far.

How do I install IMA AI Text To Speech — seed-tts, DouBao?

Run "/install ima-tts-ai" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is IMA AI Text To Speech — seed-tts, DouBao free?

Yes, IMA AI Text To Speech — seed-tts, DouBao is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does IMA AI Text To Speech — seed-tts, DouBao support?

IMA AI Text To Speech — seed-tts, DouBao is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created IMA AI Text To Speech — seed-tts, DouBao?

It is built and maintained by allenfancy-gan (@allenfancy-gan); the current version is v1.0.8.
