halr9000

ComfyUI ImageGen (Flux2)

Author: halr9000 · GitHub · v1.5.0
cross-platform ⚠ suspicious
Total downloads: 1426
Favorites: 0
Current installs: 0
Versions: 3
Install in OpenClaw
/install comfyui-imagegen
Description
Generate images via ComfyUI API (localhost:8188) using Flux2 workflow. Supports structured JSON prompts sent directly as positive prompt parameter, seed/steps customization. Async watcher via sub-agent for low-latency, token-efficient polling (every 5s).
Usage instructions (SKILL.md)

ComfyUI ImageGen

Changelog

  • [2026-02-11 20:42 EST]: v1.5.0 (published) - Standalone --structured-prompt fix (no positional arg required); workflow updated to 1920x1080 16:9; production-ready with live tests (JSON direct-to-positive-prompt, async sub-agent delivery).
  • [2026-02-11 10:10 EST]: v1.4.0 - Refactored prompting: agent converts human prompt to structured JSON string; script sends JSON directly as ComfyUI positive prompt (no prose conversion). Added text-in-image quoting rule.
  • [2026-02-11 09:52 EST]: v1.3.0 - Script now accepts --structured-prompt JSON directly (auto-generates prose internally). Removed agent-side prose step. Updated usage/examples.
  • [2026-02-11 09:20 EST]: v1.2.0 - Added structured prompt parsing: translate human requests into optional JSON schema fields, auto-generate optimized Flux.2 prose prompts. Updated description.
  • [2026-02-11 00:15 EST]: v1.1.0 - Added --submit-only (fast prompt_id return) + --watch prompt_id modes. SKILL.md docs async flow: submit → sessions_spawn watcher sub-agent (polls every 5s, auto-sends image to Telegram, ~10x token savings vs. main-agent block).
  • [2026-02-11 04:05 EST]: v1.0.3 - Added Flux.2 [klein] 9B prompting best practices section with guidelines from Black Forest Labs and fal.ai.
  • [2026-02-10 23:05]: v1.0.2 - Workflow v3: "Image Save with Prompt/Info (WLSH)" (node 85) for improved metadata embedding. Script updated to poll node 85.
  • [2026-02-10 23:00]: Added explicit instruction to always send generated image to user post-generation via message tool.
  • [2026-02-10]: Updated to new workflow: JPG output with embedded prompt/metadata via "Image Save with Prompt File (WLSH)". Model changed to darkBeastFeb0826Latest_dbkBlitzV15.safetensors. Script now polls node 84.

Usage

  1. Agent converts human request to mandatory structured JSON schema → compact string.

  2. Verify ComfyUI runs on localhost:8188.

  3. Async Mode (Recommended):

    1. exec python skills/comfyui-imagegen/scripts/generate.py --structured-prompt '{"subjects":[{"description":"fluffy tabby cat","position":"center","action":"sitting relaxed"}],"scene":"cozy room interior","lighting":"warm golden sunset rays","mood":"serene and peaceful","camera":{"angle":"low angle"}}' --submit-only → parse prompt_id
    2. sessions_spawn task="Set-Location 'C:\\Users\\hal\\.openclaw\\workspace'; python skills/comfyui-imagegen/scripts/generate.py --watch '{prompt_id}' --output ./gen-{prompt_id}.jpg; message action=send channel=telegram target=595846104 media=./gen-{prompt_id}.jpg; Remove-Item ./gen-{prompt_id}.jpg" label="img-{prompt_id}" cleanup=delete runTimeoutSeconds=180
    
    • Watcher polls /history/{prompt_id} every 5s (under 5s latency after completion, about 12 polls for a 60s job, tokens isolated from the main agent).
    • Auto-sends the JPG to this chat on completion (sub-agent pings back).
    • Timeout is enforced implicitly by the spawn's runTimeoutSeconds (180 in the command above).
  4. Sync Mode (blocks agent):

    exec python skills/comfyui-imagegen/scripts/generate.py --structured-prompt '{"scene":"your scene"}' [--seed N] [--steps 10] [--output ./my.jpg] [--host localhost:8188]
    message action=send channel=telegram media=./my.jpg
    
  5. Customize:

    Arg        Default              Notes
    --seed     random               Set for reproducibility
    --steps    5                    20-50 for higher quality
    --host     localhost:8188       Remote ComfyUI host
    --output   gen-{seed/pid}.jpg   Full path

Structured Prompt Schema (Mandatory Format)

Agent step 1: Convert human natural language request into this exact JSON structure (all fields optional; populate only relevant; subjects array supports multiples).

Rule: For text in images (signs, logos), surround in double quotes within description/action fields, e.g., "sign reading \"STOP\"" or "logo with text \"OpenClaw\""

{
  "scene": "overall scene description",
  "subjects": [
    {
      "description": "detailed subject description",
      "position": "where in frame",
      "action": "what they're doing"
    }
  ],
  "style": "artistic style",
  "color_palette": ["#FF0000", "#00AACC"],
  "lighting": "lighting description",
  "mood": "emotional tone",
  "background": "background details",
  "composition": "framing and layout",
  "camera": {
    "angle": "camera angle",
    "lens": "lens type",
    "depth_of_field": "focus behavior"
  }
}

Agent step 2: Stringify JSON (compact, single-line for shell escaping), pass to script --structured-prompt (sent directly as ComfyUI positive prompt).

Example:

User: "A cat sitting on a windowsill at sunset"

Structured JSON string (for --structured-prompt):

'{"subjects":[{"description":"fluffy tabby cat","position":"center","action":"sitting relaxed"}],"scene":"cozy room interior","lighting":"warm golden sunset rays","mood":"serene and peaceful","camera":{"angle":"low angle"}}'
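Agent step 2 can be sketched in Python as follows. This is a minimal illustration, not the skill's own code: `build_structured_prompt` is a hypothetical helper name, and the sample values are invented. It drops unpopulated fields, serializes to compact single-line JSON, and lets `json.dumps` handle the escaping of quoted text-in-image strings.

```python
import json
import shlex


def build_structured_prompt(fields: dict) -> str:
    """Drop empty fields and serialize to compact, single-line JSON."""
    # Hypothetical helper: keep only fields the agent actually populated.
    populated = {k: v for k, v in fields.items() if v not in (None, "", [], {})}
    # separators=(",", ":") removes the spaces json.dumps adds by default.
    return json.dumps(populated, separators=(",", ":"), ensure_ascii=False)


prompt = build_structured_prompt({
    "subjects": [{"description": 'street sign reading "STOP"', "position": "center"}],
    "scene": "quiet intersection at dusk",
    "style": "",  # empty, so it is omitted from the output
})

# shlex.quote targets POSIX shells; PowerShell quoting rules differ.
print("--structured-prompt", shlex.quote(prompt))
```

When the agent runtime accepts an argument list instead of a shell string, passing `prompt` as a single list element avoids shell quoting altogether.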

Workflow Details

  • Polls node 85 ("Image Save with Prompt/Info (WLSH)").
  • Model: darkBeastFeb0826Latest_dbkBlitzV15.safetensors
  • Template: workflows/flux2.json
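The polling behaviour can be sketched as below. This is an assumption-laden illustration rather than the contents of generate.py: the function names are hypothetical, and the response shape (a dict keyed by prompt_id, with "outputs" keyed by node id and an "images" list) is what stock ComfyUI save nodes report via /history; whether the WLSH node 85 reports its file the same way is assumed here.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_history(host: str, prompt_id: str) -> dict:
    """GET /history/{prompt_id} from a ComfyUI server and decode the JSON body."""
    url = f"http://{host}/history/{prompt_id}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def find_saved_image(history: dict, prompt_id: str, node_id: str = "85") -> Optional[dict]:
    """Return the first image record of the save node, or None if not ready."""
    outputs = history.get(prompt_id, {}).get("outputs", {})
    images = outputs.get(node_id, {}).get("images", [])  # assumed key name
    return images[0] if images else None


def wait_for_image(
    fetch: Callable[[str], dict],
    prompt_id: str,
    interval: float = 5.0,
    timeout: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll every `interval` seconds until an image appears or `timeout` elapses."""
    waited = 0.0
    while True:
        image = find_saved_image(fetch(prompt_id), prompt_id)
        if image is not None:
            return image
        if waited >= timeout:
            return None
        sleep(interval)
        waited += interval
```

A caller would bind the host, for example `wait_for_image(lambda pid: fetch_history("localhost:8188", pid), prompt_id)`. Passing `fetch` and `sleep` as parameters keeps the loop testable without a running ComfyUI instance.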

Prompting Best Practices (Flux.2 [klein] 9B)

  • Prose, not keywords. Subject → Scene → Lighting → Mood.
  • E.g., "A serene mountain lake at dawn, mist rising, golden light piercing peaks, photorealistic."
  • Sources: BFL, fal.ai

Examples

# Async test (structured JSON string → direct positive prompt)
python .../generate.py --structured-prompt '{"subjects":[{"description":"fluffy tabby cat","position":"center","action":"sitting relaxed"}],"scene":"cozy room interior","lighting":"warm golden sunset rays","mood":"serene and peaceful","camera":{"angle":"low angle"}}' --submit-only --steps 10
# → prompt_id=abc123; spawn watcher sub-agent
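
Extracting the prompt_id from the --submit-only output might look like the sketch below. The exact output format of generate.py is not documented here; this assumes a line of the form `prompt_id=<id>`, as the comment above suggests, and `parse_prompt_id` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed output format: "prompt_id=<id>" (a colon separator is also tolerated).
PROMPT_ID_RE = re.compile(r"prompt_id\s*[=:]\s*([A-Za-z0-9-]+)")


def parse_prompt_id(stdout: str) -> Optional[str]:
    """Return the first prompt_id found in the script output, or None."""
    match = PROMPT_ID_RE.search(stdout)
    return match.group(1) if match else None
```

Returning None instead of raising lets the agent report a failed submission rather than spawning a watcher for a missing id.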

Cron alternative (less optimal): cron add one-shot at=now+10s payload.systemEvent="Check img job {prompt_id}". Prefer spawn over cron for this use case.

Security recommendations
This skill appears to do what it says: submit a Flux2 workflow to a ComfyUI HTTP server (default localhost:8188), poll for completion, and download the saved image. Before installing/using: 1) ensure you trust the ComfyUI host you point it at (default is localhost; if you change --host to a remote server, your structured prompts and any metadata will be sent to that server), 2) be aware the SKILL.md examples auto-send images via the agent's message tool (Telegram) — confirm the agent's messaging channels/targets are ones you trust, 3) the example uses a user-specific workspace path for spawned jobs; modify that to a safe directory on your system if you run the watcher, and 4) the script writes downloaded images to disk and removes them in the example — check file paths and permissions you grant the agent. If you need stricter guarantees, run ComfyUI locally and avoid using the example sessions_spawn Telegram send until you verify messaging credentials and targets.
Functional analysis
Type: OpenClaw Skill · Name: comfyui-imagegen · Version: 1.5.0
The `scripts/generate.py` script contains vulnerabilities that could be exploited if the OpenClaw agent is compromised via prompt injection. Specifically, the `--host` argument allows specifying an arbitrary network target, enabling Server-Side Request Forgery (SSRF) against internal or external hosts. Additionally, the `--output` argument allows writing files to arbitrary paths on the system. The skill's stated purpose (image generation) is benign and there is no clear malicious intent from the developer, but these capabilities, combined with the `sessions_spawn` command in `SKILL.md`, which executes a shell command, introduce significant security risks.
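One way an operator could narrow the two risks above is to validate `--host` and `--output` before invoking the script. The sketch below is illustrative only and is not part of the skill: `ALLOWED_HOSTS`, `is_allowed_host`, and `confine_output` are hypothetical names, and the allowlist contents are an assumption.

```python
from pathlib import Path

# Hypothetical allowlist; extend it only with ComfyUI hosts you trust.
ALLOWED_HOSTS = {"localhost", "127.0.0.1"}


def is_allowed_host(host: str) -> bool:
    """Accept 'name' or 'name:port' only when name is on the allowlist."""
    name, _, port = host.partition(":")
    return name.lower() in ALLOWED_HOSTS and (port == "" or port.isdigit())


def confine_output(path: str, workspace: str) -> Path:
    """Resolve `path` inside `workspace`; raise ValueError if it escapes."""
    root = Path(workspace).resolve()
    target = (root / path).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        raise ValueError(f"output path escapes workspace: {path}") from None
    return target
```

Checks like these reduce exposure but do not replace reviewing what the agent is allowed to execute; a wrapper that rejects unknown hosts and out-of-workspace paths fails closed when a prompt-injected argument appears.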
Capability assessment
Purpose & Capability
Name/description match the included script and workflow: the Python script posts a Flux2 workflow JSON to a ComfyUI host, polls history, and downloads the saved JPG. There are no unexpected required env vars, binaries, or installers that would be unrelated to image generation.
Instruction Scope
SKILL.md stays largely within image-generation scope, but the examples include orchestration steps (sessions_spawn) that automatically send images via the agent's message tool (Telegram) and use a hardcoded example workspace path (C:\Users\hal\.openclaw\workspace). Those orchestration instructions reach into the agent's messaging/channel capabilities and the local filesystem; they are plausible but are external to core generation and deserve operator awareness.
Install Mechanism
No install spec is provided (instruction-only with a small included script), so nothing is downloaded or executed at install time. This minimizes supply-chain risk.
Credentials
The skill declares no environment variables or credentials, which matches its behavior (it talks to a ComfyUI HTTP host). However SKILL.md demonstrates auto-sending images to Telegram via the agent's message tool but does not declare Telegram credentials — this relies on the agent/runtime having messaging credentials configured. The example also references a user-specific workspace path; if followed, that grants the skill read/write in that folder.
Persistence & Privilege
The skill does not request permanent/always-on inclusion (always:false) and does not modify other skills or global agent settings. It uses a spawned watcher sub-agent in examples, which is normal for async jobs but not a privileged persistent presence.
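How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install comfyui-imagegen
  3. Once installed, call the Skill by name or trigger it with /comfyui-imagegen
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output.
Version history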
v1.5.0
v1.5.0: Standalone structured-prompt fix; 1920x1080 workflow; live-tested async JSON generation + delivery. Full changelog in SKILL.md.
v1.0.2
v1.0.2: Workflow v3 - Image Save with Prompt/Info (WLSH) node 85 for superior metadata embedding. Always send image post-gen.
v1.0.0
Initial release: Flux2 workflow + Python API client for OpenClaw agents. Custom prompts/seeds/steps via localhost:8188. Tested with Jeeves butler scenes.
Metadata
Slug comfyui-imagegen
Version 1.5.0
License
Total installs 0
Current installs 0
Version count 3
FAQ

What is ComfyUI ImageGen (Flux2)?

Generate images via ComfyUI API (localhost:8188) using Flux2 workflow. Supports structured JSON prompts sent directly as positive prompt parameter, seed/steps customization. Async watcher via sub-agent for low-latency, token-efficient polling (every 5s). It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1426 total downloads to date.

How do I install ComfyUI ImageGen (Flux2)?

Run the command "/install comfyui-imagegen" in an OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is ComfyUI ImageGen (Flux2) free?

Yes. ComfyUI ImageGen (Flux2) is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does ComfyUI ImageGen (Flux2) support?

ComfyUI ImageGen (Flux2) runs cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed ComfyUI ImageGen (Flux2)?

It is developed and maintained by halr9000 (@halr9000); the current version is v1.5.0.
