
ComfyUI TTS

Name: ComfyUI TTS
Author: yhsi5358

By YHSI5358 · GitHub · v1.0.0

cross-platform ⚠ suspicious

Total downloads: 908

Install in OpenClaw

/install comfyui-tts

Description

Convert text to speech audio via ComfyUI's Qwen-TTS API, supporting customizable voice, style, model, and output options.

Usage Instructions (SKILL.md)

ComfyUI TTS Skill

Generate speech audio with ComfyUI's Qwen-TTS service. This skill converts text to speech by submitting jobs to the ComfyUI API.

Configuration

Environment Variables

Set these environment variables to configure the ComfyUI connection:

export COMFYUI_HOST="localhost"      # ComfyUI server host
export COMFYUI_PORT="8188"           # ComfyUI server port
export COMFYUI_OUTPUT_DIR=""         # Optional: Custom output directory
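
As a rough illustration of how a script might consume these variables, the sketch below derives a base URL and falls back to the documented defaults when a variable is unset or empty. The function name `comfyui_base_url` is hypothetical and is not taken from `scripts/tts.sh`; the `http://` scheme is an assumption that mirrors the curl check in the Troubleshooting section.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): build the ComfyUI base URL
# from COMFYUI_HOST / COMFYUI_PORT, using the documented defaults.
comfyui_base_url() {
  local host="${COMFYUI_HOST:-localhost}"
  local port="${COMFYUI_PORT:-8188}"
  printf 'http://%s:%s\n' "$host" "$port"
}

# Print the URL that would be used with the current environment.
comfyui_base_url
```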

Usage

Basic Text-to-Speech

Generate audio from text using default settings:

scripts/tts.sh "你好，世界"

Advanced Options

Customize voice characteristics:

# Specify character and style
scripts/tts.sh "你好" --character "Girl" --style "Emotional"

# Change model size
scripts/tts.sh "你好" --model "3B"

# Specify output file
scripts/tts.sh "你好" --output "/path/to/output.wav"

# Combine options
scripts/tts.sh "你好，这是测试" \
  --character "Girl" \
  --style "Emotional" \
  --model "1.7B" \
  --output "$HOME/audio/test.wav"   # a quoted "~" is not expanded by the shell

Available Options

| Option | Description | Default |
| --- | --- | --- |
| `--character` | Voice character (Girl/Boy/etc.) | "Girl" |
| `--style` | Speaking style (Emotional/Neutral/etc.) | "Emotional" |
| `--model` | Model size (0.5B/1.7B/3B) | "1.7B" |
| `--output` | Output file path | Auto-generated |
| `--temperature` | Generation temperature (0-1) | 0.9 |
| `--top-p` | Top-p sampling | 0.9 |
| `--top-k` | Top-k sampling | 50 |
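
To make the defaults concrete, here is a minimal sketch of an argument parser that accepts the options listed above. It is an illustration only: the function name `parse_tts_args` and the upper-case variable names are hypothetical, and the real parsing in `scripts/tts.sh` may differ.

```shell
#!/usr/bin/env bash
# Hypothetical parser for the documented options; not the code in scripts/tts.sh.
parse_tts_args() {
  # Defaults as documented.
  TEXT=""
  CHARACTER="Girl"
  STYLE="Emotional"
  MODEL="1.7B"
  OUTPUT=""            # empty means "auto-generate a path"
  TEMPERATURE="0.9"
  TOP_P="0.9"
  TOP_K="50"

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --character|--style|--model|--output|--temperature|--top-p|--top-k)
        # Every option takes exactly one value.
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 2
        fi
        case "$1" in
          --character)   CHARACTER="$2" ;;
          --style)       STYLE="$2" ;;
          --model)       MODEL="$2" ;;
          --output)      OUTPUT="$2" ;;
          --temperature) TEMPERATURE="$2" ;;
          --top-p)       TOP_P="$2" ;;
          --top-k)       TOP_K="$2" ;;
        esac
        shift 2
        ;;
      --*)
        echo "unknown option: $1" >&2
        return 2
        ;;
      *)
        # The first non-option argument is the text to speak.
        TEXT="$1"
        shift
        ;;
    esac
  done

  if [ -z "$TEXT" ]; then
    echo "usage: tts.sh TEXT [options]" >&2
    return 2
  fi
}
```

Rejecting unknown options and missing values up front keeps malformed input from reaching the workflow JSON.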

Workflow

The skill performs these steps:

1. Construct Workflow: Builds a ComfyUI workflow JSON from your text and settings
2. Submit Job: Sends the workflow to ComfyUI's /prompt endpoint
3. Poll Status: Monitors job completion via the /history endpoint
4. Retrieve Audio: Returns the path to the generated audio file
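
The Submit Job and Poll Status steps can be sketched roughly as follows, using curl and jq. This is not the skill's actual code: the function names are hypothetical, and the polling logic assumes that `/history/<prompt_id>` returns an empty object until the job has finished and an object keyed by the prompt ID afterwards.

```shell
#!/usr/bin/env bash
# Sketch of submit-then-poll against ComfyUI. Requires curl and jq.
COMFYUI_URL="http://${COMFYUI_HOST:-localhost}:${COMFYUI_PORT:-8188}"

# submit_job FILE: POST a JSON file shaped like {"prompt": {...}} and print
# the prompt_id field of the response. Fails if the request or parse fails.
submit_job() {
  local resp
  resp="$(curl -sf -X POST -H 'Content-Type: application/json' \
    --data-binary "@$1" "${COMFYUI_URL}/prompt")" || return 1
  printf '%s' "$resp" | jq -er '.prompt_id'
}

# fetch_history ID: print the history JSON for one job.
fetch_history() {
  curl -sf "${COMFYUI_URL}/history/$1"
}

# wait_for_job ID [TRIES] [DELAY_SECONDS]: poll until the history response
# contains the job ID, then print that response. Returns 1 on timeout.
wait_for_job() {
  local id="$1" tries="${2:-60}" delay="${3:-2}" body i=0
  while [ "$i" -lt "$tries" ]; do
    body="$(fetch_history "$id")" || body=""
    if [ -n "$body" ] &&
       printf '%s' "$body" |
         jq -e --arg id "$id" 'type == "object" and has($id)' >/dev/null 2>&1; then
      printf '%s\n' "$body"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

A caller might chain the two as `id="$(submit_job workflow.json)" && wait_for_job "$id" 120 2`, where `workflow.json` is a placeholder file name. Bounding the number of attempts gives the "Job Timeout" case a definite exit path.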

Troubleshooting

Connection Refused

Verify ComfyUI is running: curl http://$COMFYUI_HOST:$COMFYUI_PORT/system_stats
Check host and port settings

Job Timeout

Large models (3B) take longer to generate
Try smaller models (0.5B, 1.7B) for faster results

Output Not Found

Check ComfyUI's output directory configuration
Verify file permissions

API Reference

The skill uses ComfyUI's native API endpoints:

POST /prompt - Submit workflow
GET /history - Check job status
Output files are saved to ComfyUI's configured output directory
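
When the ComfyUI server is not on the same machine, the generated file has to be fetched over HTTP rather than read from the output directory. The sketch below shows one way to do that with ComfyUI's `/view` endpoint and its `filename`, `subfolder`, and `type` query parameters. The function name `download_audio` is hypothetical, and how the file name is extracted from the history response is left out because it depends on the workflow's output node.

```shell
#!/usr/bin/env bash
# Sketch: fetch a finished file through ComfyUI's /view endpoint.
COMFYUI_URL="${COMFYUI_URL:-http://${COMFYUI_HOST:-localhost}:${COMFYUI_PORT:-8188}}"

# download_audio FILENAME SUBFOLDER TYPE DEST
# -G sends the --data-urlencode pairs as a URL-encoded query string,
# so file names with spaces or non-ASCII characters are handled by curl.
download_audio() {
  local filename="$1" subfolder="$2" type="${3:-output}" dest="$4"
  curl -sf -G --connect-timeout 5 "${COMFYUI_URL}/view" \
    --data-urlencode "filename=${filename}" \
    --data-urlencode "subfolder=${subfolder}" \
    --data-urlencode "type=${type}" \
    -o "$dest"
}
```

The `-f` flag makes curl return a non-zero status on HTTP errors, so a caller can tell a missing file from a successful download.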

Security Recommendations

This skill appears to do what it says: it sends TTS jobs to a ComfyUI server and downloads resulting audio. Before installing or running: (1) verify you intend to connect to the configured COMFYUI_HOST/COMFYUI_PORT — default is localhost; avoid pointing it at untrusted public hosts; (2) review the included scripts (scripts/tts.sh) if you have stricter security requirements; (3) be aware generated audio files are referenced by the ComfyUI output directory and the script may download files to paths you supply; (4) note the SKILL.md mentions environment variables (COMFYUI_HOST, COMFYUI_PORT, COMFYUI_OUTPUT_DIR) but the registry metadata did not declare them — set these explicitly as needed. If you plan to run against a remote ComfyUI instance, ensure that instance is trusted, since the script will send the text you provide to that server.
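
One way to act on point (3) is to check a caller-supplied output path before writing to it. The sketch below is a conservative, hypothetical check (`is_safe_output` is not part of the skill): it rejects any path containing a `..` component and any path outside a chosen base directory. It does not resolve symlinks, so it is a first line of defence rather than a complete safeguard.

```shell
#!/usr/bin/env bash
# is_safe_output PATH BASE_DIR
# Succeeds only if PATH lies under BASE_DIR and has no ".." component.
is_safe_output() {
  local path="$1" base="${2%/}"

  # Refuse an empty base (this also refuses "/" as a base directory).
  [ -n "$base" ] || return 1

  # Reject any ".." path component.
  case "$path" in
    */../*|*/..|../*|..) return 1 ;;
  esac

  # Require the path to sit strictly below the base directory.
  case "$path" in
    "$base"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_safe_output "$OUTPUT" "$HOME/audio" || exit 1` would stop a script before it creates directories or downloads a file outside `$HOME/audio`; both names here are placeholders.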

Functional Analysis

Type: OpenClaw Skill
Name: comfyui-tts
Version: 1.0.0

The `scripts/tts.sh` file contains significant vulnerabilities. Several parameters (e.g., `--character`, `--style`, `--model`) are interpolated directly into the JSON workflow sent to the ComfyUI API without sanitization, which creates a JSON injection vulnerability: an attacker who controls these inputs could inject arbitrary JSON into the ComfyUI prompt. In addition, the `--output` argument is used directly for file download and directory creation, a path traversal vulnerability that could allow files to be written to arbitrary locations on the agent's filesystem. Although these are critical flaws, the provided code shows no clear evidence of intentional malicious behavior (e.g., data exfiltration or backdoor installation), so the skill is classified as suspicious on the basis of its exploitable vulnerabilities.
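
The JSON injection issue can be avoided by letting jq build the payload instead of interpolating strings. The sketch below is an assumption-laden illustration, not the skill's workflow: the node ID `"1"`, the class name `HypotheticalQwenTTSNode`, and the input field names are placeholders, since the real node definition is not shown here. Only the outer `{"prompt": {...}}` shape follows ComfyUI's API format.

```shell
#!/usr/bin/env bash
# build_prompt_json TEXT CHARACTER STYLE MODEL
# Passes every user value through --arg, so jq escapes quotes, backslashes,
# and control characters and the values can never change the JSON structure.
build_prompt_json() {
  local text="$1" character="$2" style="$3" model="$4"
  jq -n \
    --arg text "$text" \
    --arg character "$character" \
    --arg style "$style" \
    --arg model "$model" \
    '{prompt: {"1": {class_type: "HypotheticalQwenTTSNode",
                     inputs: {text: $text,
                              character: $character,
                              style: $style,
                              model_size: $model}}}}'
}
```

With this approach, a value such as `Girl", "injected": "x` ends up as a single string in the `character` field rather than as an extra key.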

Capability Assessment

✓ Purpose & Capability

Name/description (ComfyUI TTS) match the delivered artifacts: two shell scripts implement submitting a workflow to ComfyUI, polling /history, and retrieving audio. Required binaries (curl, jq) are reasonable for the stated purpose.

ℹ Instruction Scope

Runtime instructions and the scripts focus on contacting the ComfyUI endpoints (/prompt, /history, /view) and handling audio files. Minor inconsistency: SKILL.md documents environment variables (COMFYUI_HOST, COMFYUI_PORT, COMFYUI_OUTPUT_DIR) that are used by the scripts but were not declared in the skill's registry 'required env vars' metadata — this is informational, not evidence of hidden behavior. The scripts do not read unrelated system files or transmit data to third-party hosts beyond the configured COMFYUI_URL.

✓ Install Mechanism

No install spec; this is instruction-only with included shell scripts. No downloads or remote installers are used, so nothing arbitrary is fetched or written by the skill itself during installation.

ℹ Credentials

The skill requests no credentials and only uses optional environment variables for the ComfyUI host/port/output. The lack of declared required env vars in registry metadata is a small mismatch with SKILL.md but not disproportionate: the env vars merely point the script at a ComfyUI server and do not grant access to unrelated services or secrets.

✓ Persistence & Privilege

`always` is false, and the skill does not request persistent or privileged system changes. It does not attempt to modify other skills or global agent settings.

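How to Use

1. Make sure OpenClaw is installed (locally or via Docker).
2. Enter the install command in the chat box: /install comfyui-tts
3. Once installation completes, invoke the skill by name or trigger it with /comfyui-tts.
4. Provide the inputs described in the skill's parameter documentation to get structured output.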

Version History

v1.0.0

Initial release of comfyui-tts: generate speech audio with ComfyUI Qwen-TTS.
- Provides a shell script to convert text to speech using ComfyUI's Qwen-TTS service.
- Supports customizable options: character, style, model size, output path, and sampling parameters.
- Requires curl and jq; configurable via environment variables for host, port, and output directory.
- Automatically submits jobs to ComfyUI, monitors completion, and retrieves audio files.
- Includes usage instructions, troubleshooting tips, and API endpoint references.

Metadata

Slug: comfyui-tts

Version: 1.0.0

License: not specified

Total installs: 2

Current installs: 2

Version count: 1

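FAQ

What is ComfyUI TTS?

Convert text to speech audio via ComfyUI's Qwen-TTS API, supporting customizable voice, style, model, and output options. It is an AI agent skill plugin for Claude Code / OpenClaw and has been downloaded 908 times in total.

How do I install ComfyUI TTS?

Run the command /install comfyui-tts in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is ComfyUI TTS free?

Yes. ComfyUI TTS is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does ComfyUI TTS support?

ComfyUI TTS is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed ComfyUI TTS?

It is developed and maintained by YHSI5358 (@yhsi5358); the current version is v1.0.0.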