yhsi5358

ComfyUI TTS

Author: YHSI5358 · GitHub · v1.0.0
cross-platform ⚠ suspicious
Total downloads: 908
Favorites: 0
Current installs: 2
Versions: 1
Install in OpenClaw
/install comfyui-tts
Description
Convert text to speech audio via ComfyUI's Qwen-TTS API, supporting customizable voice, style, model, and output options.
Usage Instructions (SKILL.md)

ComfyUI TTS Skill

Generate speech audio using ComfyUI's Qwen-TTS service. This skill allows you to convert text to speech through ComfyUI's API.

Configuration

Environment Variables

Set these environment variables to configure the ComfyUI connection:

export COMFYUI_HOST="localhost"      # ComfyUI server host
export COMFYUI_PORT="8188"           # ComfyUI server port
export COMFYUI_OUTPUT_DIR=""         # Optional: Custom output directory
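
As a minimal sketch of how a script might consume these variables, shell parameter expansion can supply the documented defaults whenever a variable is unset or empty. The helper name `comfyui_base_url` is hypothetical and is not taken from `scripts/tts.sh`.

```shell
# Hypothetical helper (not part of scripts/tts.sh): build the ComfyUI base
# URL from the environment, falling back to the documented defaults.
comfyui_base_url() {
  local host="${COMFYUI_HOST:-localhost}"
  local port="${COMFYUI_PORT:-8188}"
  printf 'http://%s:%s\n' "$host" "$port"
}
```

Calling `comfyui_base_url` with nothing exported yields the local default, so the variables only need to be set when ComfyUI runs elsewhere.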

Usage

Basic Text-to-Speech

Generate audio from text using default settings:

scripts/tts.sh "你好,世界"

Advanced Options

Customize voice characteristics:

# Specify character and style
scripts/tts.sh "你好" --character "Girl" --style "Emotional"

# Change model size
scripts/tts.sh "你好" --model "3B"

# Specify output file
scripts/tts.sh "你好" --output "/path/to/output.wav"

# Combine options
scripts/tts.sh "你好,这是测试" \
  --character "Girl" \
  --style "Emotional" \
  --model "1.7B" \
  --output "$HOME/audio/test.wav"   # "~" is not expanded inside double quotes

Available Options

Option          Description                                 Default
--character     Voice character (Girl/Boy/etc.)             "Girl"
--style         Speaking style (Emotional/Neutral/etc.)     "Emotional"
--model         Model size (0.5B/1.7B/3B)                   "1.7B"
--output        Output file path                            Auto-generated
--temperature   Generation temperature (0-1)                0.9
--top-p         Top-p sampling                              0.9
--top-k         Top-k sampling                              50
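
To illustrate how these options and defaults fit together, here is a hedged sketch of an argument parser. It is not the parser in `scripts/tts.sh`; the function name `parse_tts_args` and the `TTS_*` variable names are assumptions made for this example, and only the flags and default values come from the table above.

```shell
# Hypothetical parser mirroring the documented flags and defaults.
# Results are left in TTS_* variables (names invented for this sketch).
parse_tts_args() {
  TTS_TEXT=""
  TTS_CHARACTER="Girl"
  TTS_STYLE="Emotional"
  TTS_MODEL="1.7B"
  TTS_OUTPUT=""            # empty means "let the script pick a file name"
  TTS_TEMPERATURE="0.9"
  TTS_TOP_P="0.9"
  TTS_TOP_K="50"

  while [ "$#" -gt 0 ]; do
    # Every documented flag takes a value, so check that one is present.
    case "$1" in
      --character|--style|--model|--output|--temperature|--top-p|--top-k)
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        ;;
    esac
    case "$1" in
      --character)   TTS_CHARACTER="$2";   shift 2 ;;
      --style)       TTS_STYLE="$2";       shift 2 ;;
      --model)       TTS_MODEL="$2";       shift 2 ;;
      --output)      TTS_OUTPUT="$2";      shift 2 ;;
      --temperature) TTS_TEMPERATURE="$2"; shift 2 ;;
      --top-p)       TTS_TOP_P="$2";       shift 2 ;;
      --top-k)       TTS_TOP_K="$2";       shift 2 ;;
      --*)           echo "unknown option: $1" >&2; return 2 ;;
      *)             TTS_TEXT="$1";        shift ;;
    esac
  done

  [ -n "$TTS_TEXT" ] || { echo "no text given" >&2; return 2; }
}
```

For example, `parse_tts_args "你好" --model "3B"` would leave `TTS_MODEL` set to `3B` and every other setting at its default.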

Workflow

The skill performs these steps:

  1. Construct Workflow: Builds a ComfyUI workflow JSON with your text and settings
  2. Submit Job: Sends the workflow to ComfyUI's /prompt endpoint
  3. Poll Status: Monitors job completion via /history endpoint
  4. Retrieve Audio: Returns the path to the generated audio file
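
Steps 2 and 3 can be sketched with curl and jq as follows. This is an illustration under stated assumptions, not the contents of `scripts/tts.sh`: the function names are hypothetical, the workflow JSON from step 1 is assumed to already exist in a file, and the sketch polls ComfyUI's per-job route `/history/<prompt_id>`, whereas this document only says `/history`.

```shell
# Assumes curl and jq are installed. Function names are hypothetical.
COMFYUI_URL="http://${COMFYUI_HOST:-localhost}:${COMFYUI_PORT:-8188}"

# Step 2: wrap the workflow file as {"prompt": ...}, POST it to /prompt,
# and print the prompt_id that ComfyUI returns.
submit_job() {
  jq -c '{prompt: .}' "$1" |
    curl -fsS -X POST -H 'Content-Type: application/json' \
         --data-binary @- "$COMFYUI_URL/prompt" |
    jq -er '.prompt_id'
}

# Step 3: poll /history/<prompt_id> every 2 seconds until the job's entry
# appears, or give up after the timeout (seconds, default 300).
# Prints the job's history entry as compact JSON.
wait_for_job() {
  local id="$1" timeout="${2:-300}" waited=0 entry
  while [ "$waited" -lt "$timeout" ]; do
    entry="$(curl -fsS "$COMFYUI_URL/history/$id" |
             jq -c --arg id "$id" '.[$id] // empty')"
    if [ -n "$entry" ]; then
      printf '%s\n' "$entry"
      return 0
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "timed out waiting for job $id" >&2
  return 1
}
```

The entry printed by `wait_for_job` contains an `outputs` object describing the generated files. Its exact shape depends on the nodes in the workflow, so step 4 is left out of the sketch rather than guessed at.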

Troubleshooting

Connection Refused

  • Verify ComfyUI is running: curl http://$COMFYUI_HOST:$COMFYUI_PORT/system_stats
  • Check host and port settings

Job Timeout

  • Large models (3B) take longer to generate
  • Try smaller models (0.5B, 1.7B) for faster results

Output Not Found

  • Check ComfyUI's output directory configuration
  • Verify file permissions

API Reference

The skill uses ComfyUI's native API endpoints:

  • POST /prompt - Submit workflow
  • GET /history - Check job status
  • Output files are saved to ComfyUI's configured output directory
Security Recommendations
This skill appears to do what it says: it sends TTS jobs to a ComfyUI server and downloads resulting audio. Before installing or running: (1) verify you intend to connect to the configured COMFYUI_HOST/COMFYUI_PORT — default is localhost; avoid pointing it at untrusted public hosts; (2) review the included scripts (scripts/tts.sh) if you have stricter security requirements; (3) be aware generated audio files are referenced by the ComfyUI output directory and the script may download files to paths you supply; (4) note the SKILL.md mentions environment variables (COMFYUI_HOST, COMFYUI_PORT, COMFYUI_OUTPUT_DIR) but the registry metadata did not declare them — set these explicitly as needed. If you plan to run against a remote ComfyUI instance, ensure that instance is trusted, since the script will send the text you provide to that server.
Functional Analysis
Type: OpenClaw Skill · Name: comfyui-tts · Version: 1.0.0
The `scripts/tts.sh` file contains significant vulnerabilities. Several parameters (e.g., `--character`, `--style`, `--model`) are directly interpolated into the JSON workflow sent to the ComfyUI API without proper sanitization, leading to a JSON injection vulnerability. An attacker controlling these inputs could inject arbitrary JSON into the ComfyUI prompt. Additionally, the `--output` argument is used directly for file download and directory creation, posing a path traversal vulnerability that could allow writing files to arbitrary locations on the agent's filesystem. While these are critical flaws, there is no clear evidence of intentional malicious behavior (e.g., data exfiltration, backdoor installation) within the provided code, classifying it as suspicious due to the exploitable vulnerabilities.
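
A hedged sketch of how both issues are commonly mitigated in shell: let jq encode option values instead of interpolating them into a JSON string, and accept only relative output paths without `..` components under a fixed base directory. The function names and the JSON field names are hypothetical; they do not reflect the real workflow schema or the code in `scripts/tts.sh`.

```shell
# JSON injection: pass values via --arg so jq encodes them as JSON strings.
# Quotes or braces in the input cannot change the document structure.
# Field names are illustrative only.
build_tts_inputs() {
  jq -n --arg text "$1" --arg character "$2" --arg style "$3" --arg model "$4" \
    '{text: $text, character: $character, style: $style, model: $model}'
}

# Path traversal: reject empty paths, absolute paths, and any ".." path
# component, then join the remainder onto an allowed base directory.
# Caveat: this does not detect symlinks inside the base directory.
safe_output_path() {
  local base="$1" path="$2"
  case "$path" in
    ''|/*|..|../*|*/..|*/../*)
      echo "refusing output path: $path" >&2
      return 1
      ;;
  esac
  printf '%s/%s\n' "${base%/}" "$path"
}
```

With these helpers, a `--character` value such as `Girl","x":"y` stays a single string in the generated JSON, and an `--output` value such as `../../etc/passwd` is rejected before any file is written.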
Capability Assessment
Purpose & Capability
Name/description (ComfyUI TTS) match the delivered artifacts: two shell scripts implement submitting a workflow to ComfyUI, polling /history, and retrieving audio. Required binaries (curl, jq) are reasonable for the stated purpose.
Instruction Scope
Runtime instructions and the scripts focus on contacting the ComfyUI endpoints (/prompt, /history, /view) and handling audio files. Minor inconsistency: SKILL.md documents environment variables (COMFYUI_HOST, COMFYUI_PORT, COMFYUI_OUTPUT_DIR) that are used by the scripts but were not declared in the skill's registry 'required env vars' metadata — this is informational, not evidence of hidden behavior. The scripts do not read unrelated system files or transmit data to third-party hosts beyond the configured COMFYUI_URL.
Install Mechanism
No install spec; this is instruction-only with included shell scripts. No downloads or remote installers are used, so nothing arbitrary is fetched or written by the skill itself during installation.
Credentials
The skill requests no credentials and only uses optional environment variables for the ComfyUI host/port/output. The lack of declared required env vars in registry metadata is a small mismatch with SKILL.md but not disproportionate: the env vars merely point the script at a ComfyUI server and do not grant access to unrelated services or secrets.
Persistence & Privilege
always is false and the skill does not request persistent/privileged system changes. It does not attempt to modify other skills or global agent settings.
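How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install comfyui-tts
  3. Once installation finishes, invoke the Skill by name or trigger it with /comfyui-tts.
  4. Provide the required inputs as described in the Skill's parameter documentation to get structured output.
Version History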
v1.0.0
Initial release of comfyui-tts: generate speech audio with ComfyUI Qwen-TTS.
  • Provides a shell script to convert text to speech using ComfyUI's Qwen-TTS service.
  • Supports customizable options: character, style, model size, output path, and sampling parameters.
  • Requires curl and jq; configurable via environment variables for host, port, and output directory.
  • Automatically submits jobs to ComfyUI, monitors completion, and retrieves audio files.
  • Includes usage instructions, troubleshooting tips, and API endpoint references.
Metadata
Slug: comfyui-tts
Version: 1.0.0
License:
Total installs: 2
Current installs: 2
Version count: 1
FAQ

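What is ComfyUI TTS?

Convert text to speech audio via ComfyUI's Qwen-TTS API, supporting customizable voice, style, model, and output options. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 908 total downloads to date.

How do I install ComfyUI TTS?

Run the command "/install comfyui-tts" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is ComfyUI TTS free?

Yes. ComfyUI TTS is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does ComfyUI TTS support?

ComfyUI TTS is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops ComfyUI TTS?

It is developed and maintained by YHSI5358 (@yhsi5358); the current version is v1.0.0.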
