Aliyun Qwen Tts

Author: cinience · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 107
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install aliyun-qwen-tts
Description
Use when generating human-like speech audio with Model Studio DashScope Qwen TTS models (qwen3-tts-flash, qwen3-tts-instruct-flash). Use when converting text...
Usage (SKILL.md)

Category: provider

Model Studio Qwen TTS

Validation

mkdir -p output/aliyun-qwen-tts
python -m py_compile skills/ai/audio/aliyun-qwen-tts/scripts/generate_tts.py && echo "py_compile_ok" > output/aliyun-qwen-tts/validate.txt

Pass criteria: command exits 0 and output/aliyun-qwen-tts/validate.txt is generated.

Output And Evidence

  • Save generated audio links, sample audio files, and request payloads to output/aliyun-qwen-tts/.
  • Keep one validation log per execution.

Critical model names

Use one of the recommended models:

  • qwen3-tts-flash
  • qwen3-tts-instruct-flash
  • qwen3-tts-instruct-flash-2026-01-26

Prerequisites

  • Install SDK (recommended in a venv to avoid PEP 668 limits):
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials (env takes precedence).
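The precedence rule above (environment variable first, credentials file second) can be sketched as follows. This is a minimal illustration, not the skill's actual lookup code: the helper name `resolve_api_key` is hypothetical, and the sketch assumes the credentials file is INI-formatted with the key stored under a `[default]` section.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    credentials_path: Optional[Path] = None,
    profile: str = "default",
) -> Optional[str]:
    """Return the DashScope API key, preferring the environment variable."""
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key

    # Assumption: the credentials file is INI-formatted, as the [default] hint suggests.
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    # Disable interpolation so a '%' inside a key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read() silently skips files that do not exist.
    parser.read(path, encoding="utf-8")
    if parser.has_section(profile):
        value = parser.get(profile, "dashscope_api_key", fallback="").strip()
        return value or None
    return None
```

Passing `env` and `credentials_path` explicitly keeps the lookup easy to test without touching real secrets.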

Normalized interface (tts.generate)

Request

  • text (string, required)
  • voice (string, required)
  • language_type (string, optional; default Auto)
  • instruction (string, optional; recommended for instruct models)
  • stream (bool, optional; default false)

Response

  • audio_url (string, when stream=false)
  • audio_base64_pcm (string, when stream=true)
  • sample_rate (int, 24000)
  • format (string, wav or pcm depending on mode)
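One way to model the normalized interface in code is with a pair of dataclasses. The class names `TtsRequest` and `TtsResponse` and the `to_call_kwargs` helper are illustrative assumptions; only the field names and defaults come from the lists above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TtsRequest:
    text: str                           # required
    voice: str                          # required
    language_type: str = "Auto"         # optional; default Auto
    instruction: Optional[str] = None   # optional; recommended for instruct models
    stream: bool = False                # optional; default false

    def __post_init__(self) -> None:
        if not self.text or not self.voice:
            raise ValueError("text and voice are required")

    def to_call_kwargs(self) -> dict:
        """Drop unset optional fields so they are not sent as nulls."""
        return {k: v for k, v in asdict(self).items() if v is not None}


@dataclass
class TtsResponse:
    sample_rate: int = 24000
    format: str = "wav"                     # wav or pcm depending on mode
    audio_url: Optional[str] = None         # set when stream=false
    audio_base64_pcm: Optional[str] = None  # set when stream=true
```

For example, `TtsRequest(text="hi", voice="Cherry").to_call_kwargs()` leaves out `instruction`, which fits the guidance to send it only when style control is needed.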

Quick start (Python + DashScope SDK)

import os
import dashscope

# Prefer env var for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].
# Beijing region; for Singapore use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    instruction="Warm and calm tone, slightly slower pace.",
    stream=False,
)

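# If the call fails, output may be empty; check the status before reading the URL.
if response.status_code != 200:
    raise RuntimeError(f"TTS request failed: {response.code} {response.message}")

audio_url = response.output.audio.url
print(audio_url)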
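The quick start stops at printing the URL. To keep a local copy under the output directory, a small download helper along these lines may be enough. The name `download_audio` and the 30-second timeout are assumptions made for illustration; the skill's own script may handle downloads differently.

```python
import shutil
import urllib.request
from pathlib import Path


def download_audio(audio_url: str, dest: Path, timeout: float = 30.0) -> Path:
    """Stream the audio at audio_url to dest, creating parent directories as needed."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(audio_url, timeout=timeout) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

A call such as `download_audio(audio_url, Path("output/aliyun-qwen-tts/audio/sample.wav"))` would save the file under the default output location.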

Streaming notes

  • stream=True returns Base64-encoded PCM chunks at 24 kHz.
  • Decode each chunk, then play it or append it to a PCM buffer.
  • The response contains finish_reason == "stop" when the stream ends.
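The decode-and-concatenate step can be sketched with the standard library alone. The example assumes the chunks are already collected as Base64 strings, and it assumes mono 16-bit samples; only the 24 kHz rate is stated above. The function name `pcm_chunks_to_wav` is hypothetical.

```python
import base64
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks_b64: Iterable[str], sample_rate: int = 24000) -> bytes:
    """Decode Base64 PCM chunks, concatenate them, and wrap them in a WAV container."""
    pcm = b"".join(base64.b64decode(chunk) for chunk in chunks_b64)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # assumption: mono audio
        wav_file.setsampwidth(2)   # assumption: 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

Wrapping the raw PCM in a WAV header makes the result playable in ordinary audio tools, which raw PCM is not without knowing the sample format.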

Operational guidance

  • Keep requests concise; split long text into multiple calls if you hit size or timeout errors.
  • Use language_type consistent with the text to improve pronunciation.
  • Use instruction only when you need explicit style/tone control.
  • Cache by (text, voice, language_type) to avoid repeat costs.
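Two of the points above, splitting long text and caching by (text, voice, language_type), lend themselves to small helpers. The sketch below is illustrative only: `cache_key` and `split_text` are hypothetical names, and the 300-character default is a placeholder rather than a documented service limit.

```python
import hashlib
import json
import re
from typing import List


def cache_key(text: str, voice: str, language_type: str = "Auto") -> str:
    """Stable SHA-256 key over the three fields named in the guidance."""
    payload = json.dumps([text, voice, language_type], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def split_text(text: str, max_chars: int = 300) -> List[str]:
    """Greedily pack whole sentences into chunks of up to max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    Sentences are rejoined with a space, which also applies to CJK text.
    """
    sentences = re.split(r"(?<=[.!?])\s+|(?<=[。!?])", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

The key can serve as a file name, for example `<key>.wav`, so a repeated request can be answered from disk instead of calling the API again.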

Output location

  • Default output: output/aliyun-qwen-tts/audio/
  • Override base dir with OUTPUT_DIR.
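How OUTPUT_DIR combines with the default path is not spelled out here. The sketch below assumes it replaces the leading `output` component and leaves the `aliyun-qwen-tts/audio` suffix in place; the script's real behavior may differ, and `resolve_audio_dir` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_audio_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR when it is set."""
    # Assumption: OUTPUT_DIR replaces only the base "output" directory.
    base = Path(env.get("OUTPUT_DIR") or "output")
    return base / "aliyun-qwen-tts" / "audio"
```

The function only computes the path; a caller would still create the directory, for example with `mkdir(parents=True, exist_ok=True)`.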

Workflow

  1. Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
  2. Run one minimal read-only query first to verify connectivity and permissions.
  3. Execute the target operation with explicit parameters and bounded scope.
  4. Verify results and save output/evidence files.

References

  • references/api_reference.md for parameter mapping and streaming example.

  • Realtime mode is provided by skills/ai/audio/aliyun-qwen-tts-realtime/.

  • Voice cloning/design are provided by skills/ai/audio/aliyun-qwen-tts-voice-clone/ and skills/ai/audio/aliyun-qwen-tts-voice-design/.

  • Source list: references/sources.md

Security recommendations
Before installing or running this skill:
  1. Expect to provide a DASHSCOPE_API_KEY. The registry metadata omits this requirement, so confirm you are comfortable supplying that secret.
  2. The script tries to read .env files and ~/.alibabacloud/credentials. Avoid keeping high-privilege or unrelated secrets in those files, or run the skill in a sandbox.
  3. If you install the 'dashscope' Python package, do so in an isolated venv and verify the package source (PyPI or vendor site).
  4. Prefer a dedicated DashScope API key with minimal scope for TTS usage, and rotate or revoke it when you stop using the skill.
  5. For higher assurance, review the included generate_tts.py yourself, or run the skill in a controlled environment with no sensitive credentials present and observe its network calls and output.
Functional analysis
Type: OpenClaw Skill
Name: aliyun-qwen-tts
Version: 1.0.0
The skill is a legitimate integration for Alibaba Cloud's Qwen TTS service. It uses the official DashScope SDK and follows standard practices for credential management, including reading from environment variables and the '~/.alibabacloud/credentials' file. The core logic in 'scripts/generate_tts.py' is focused on generating and downloading audio files as described in 'SKILL.md', with no evidence of malicious intent, data exfiltration, or harmful prompt injection.
Capability assessment
Purpose & Capability
Name/description, SKILL.md, and the Python script all align: this is a DashScope (Alibaba) Qwen TTS client. However, the registry metadata lists no required environment variables or primary credential, while the instructions and script require DASHSCOPE_API_KEY (or credentials from ~/.alibabacloud/credentials). Users should be aware of this inconsistency.
Instruction Scope
Runtime instructions and the script stay within TTS functionality (compose request, call DashScope API, download audio, save outputs). The script intentionally reads .env files in cwd and in the repo root and will read ~/.alibabacloud/credentials (and honor ALIBABA_CLOUD_PROFILE / ALICLOUD_PROFILE). These file accesses are expected for fetching the API key but are broader than the declared registry requirements.
Install Mechanism
No install spec is embedded (instruction-only skill). SKILL.md recommends installing the 'dashscope' Python SDK via pip in a venv. This is standard, but installing third-party packages carries normal supply-chain risk; the package source should be verified.
Credentials
The skill requires an API key (DASHSCOPE_API_KEY) and reads ~/.alibabacloud/credentials and .env files, but the registry metadata declares no required env vars or primary credential. It also uses OUTPUT_DIR and honors ALIBABA_CLOUD_PROFILE / ALICLOUD_PROFILE implicitly. Requesting access to local credential files is proportionate to calling the DashScope API, but the missing declaration is a notable inconsistency.
Persistence & Privilege
The skill is not marked 'always: true', does not modify other skills or system-wide config, and relies on explicitly provided credentials. Autonomous invocation is allowed by default but not combined with other high-risk flags here.
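How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install aliyun-qwen-tts
  3. Once installed, invoke the skill by name or trigger it with /aliyun-qwen-tts.
  4. Provide the inputs described in the skill's parameter documentation to get structured output.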
Version history
v1.0.0
Initial release of aliyun-qwen-tts.
  • Provides text-to-speech conversion using DashScope Qwen TTS models (qwen3-tts-flash, qwen3-tts-instruct-flash).
  • Supports audio generation for short videos, news, and related use cases.
  • Outlines request/response fields and setup steps for the DashScope SDK.
  • Includes both standard and streaming TTS options.
  • Offers operational tips, output conventions, and validation instructions.
Metadata
Slug: aliyun-qwen-tts
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
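FAQ

What is Aliyun Qwen Tts?

Use when generating human-like speech audio with Model Studio DashScope Qwen TTS models (qwen3-tts-flash, qwen3-tts-instruct-flash). Use when converting text... It is an AI agent skill plugin for Claude Code / OpenClaw, with 107 total downloads to date.

How do I install Aliyun Qwen Tts?

Run the command "/install aliyun-qwen-tts" in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no extra configuration, but the skill requires a DashScope API key at run time (see Prerequisites).

Is Aliyun Qwen Tts free?

Yes. Aliyun Qwen Tts is released under the MIT-0 license and is free to download, install, and use. Requests to the DashScope API are a separate matter: the operational guidance refers to per-request costs.

Which platforms does Aliyun Qwen Tts support?

Aliyun Qwen Tts is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Aliyun Qwen Tts?

It is developed and maintained by cinience (@cinience). The current version is v1.0.0.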
