genortg

Genor-Comfy-Gate

Author: Krzysztof · GitHub · v2.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 31 · Favorites: 0 · Current installs: 0 · Versions: 2
Install in OpenClaw
/install genor-comfy-gate
Description
Comprehensive multi-modal gateway for ComfyUI enabling audio generation with ACE-Step 1.5 and photorealistic image creation via SDXL workflows.
Usage (SKILL.md)

Genor-Comfy-Gate — Comprehensive Skill

THE authoritative reference for ALL ComfyUI operations through our gateway. Multi-modal: audio, images, video (future). Read this before any generation. Updated as we learn.

Modalities

| Type | Status | Workflow | Model |
|---|---|---|---|
| 🎵 Audio | ✅ Active | acestep-rapcore | ACE-Step 1.5 SFT merge |
| 🖼️ Image | ✅ Active | lustify-sdxl | LUSTIFY SDXL |
| 🎬 Video | 🔜 Planned | | |

The gateway is modality-agnostic — it submits any workflow JSON to ComfyUI, polls, waits, downloads, and saves. Adding a new modality means adding a workflow file + WORKFLOW_INFO entry. The type field determines output dir (audio/ or images/).

Gateway

| Property | Value |
|---|---|
| Endpoint | http://127.0.0.1:8188 |
| Auth | x-api-key: gcg-4d... header (localhost exempt) |
| Managed by | pm2 (genor-comfy-gate) |
| Location | ~/projects/Genor-Comfy-Gate/ |
| Config | server.js (inline SERVERS array) |

Backend Servers

| ID | URL | GPU | VRAM | Priority |
|---|---|---|---|---|
| pri | http://100.125.137.96:8169 | RTX 3090 | 24GB | ★ PRIMARY |
| sec | http://100.80.161.74:8169 | RTX 3080 Laptop | 16GB | Secondary |

Load Balancing Logic (in pickServer())

  1. PRIMARY always preferred when IDLE (0 running tasks)
  2. If PRIMARY has ANY running task → ALL new requests → SECONDARY
  3. If SECONDARY offline → fallback to PRIMARY regardless
  4. Download ALWAYS from the server that generated the file (server.url)
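
The four rules above can be sketched as a small selection function. This is a Python rendition for illustration only, not the gateway's actual pickServer() from server.js. The Server fields, the example URLs, and the behaviour when PRIMARY itself is offline (fall through to SECONDARY) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Server:
    """Hypothetical view of one backend; field names are illustrative."""
    id: str
    url: str
    primary: bool
    online: bool
    running: int  # number of tasks currently executing


def pick_server(servers: Sequence[Server]) -> Optional[Server]:
    """Apply rules 1-3: idle PRIMARY first, else SECONDARY, else PRIMARY."""
    primary = next((s for s in servers if s.primary and s.online), None)
    secondary = next((s for s in servers if not s.primary and s.online), None)
    if primary is not None and primary.running == 0:
        return primary      # rule 1: PRIMARY is idle
    if secondary is not None:
        return secondary    # rule 2: PRIMARY busy (or, by assumption, offline)
    return primary          # rule 3: SECONDARY offline; None if nothing is online
```

For rule 4, the caller would keep the returned object and download the result from its url rather than picking a server again.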

Workflows

acestep-rapcore — ACE-Step 1.5 Audio Generation

Model: aceStep15Music_sft17BAIO.safetensors (ACE-Step 1.5 SFT merge)

Workflow Pipeline:
  CheckpointLoader(160) → AnySwitch(model/clip/vae) → TextEncode(94) → KSampler(35 steps, dpmpp_3m_sde, beta, cfg=1) → VAEDecodeTiled → SaveAudioMP3(104)
  Lyrics: String(252) → TextEncode.lyrics
  Duration: mxSlider(274) → TextEncode + EmptyLatent
  Negative: ConditioningZeroOut(47) → zeroes the positive conditioning

Node Map

| Node | Class | Role | Injections |
|---|---|---|---|
| 94 | TextEncodeAceStepAudio1.5 | Main text encoder | prompt → tags, lyrics ← 252, bpm, keyscale, duration ← 274, language |
| 252 | String | Lyrics feed into node 94 | lyrics → String |
| 3 | KSampler | Denoising (35 steps, dpmpp_3m_sde, beta, cfg=1) | seed ← 307 |
| 98 | EmptyAceStep1.5LatentAudio | Creates latent audio space | seconds ← 274 |
| 104 | SaveAudioMP3 | Output V0 MP3 | |
| 128 | VAEDecodeAudioTiled | VAE decode (tile=512, overlap=64) | |
| 160 | CheckpointLoaderSimple | Loads model | |
| 274 | mxSlider | Song duration (seconds) | duration → Xi and Xf |
| 307 | Seed (rgthree) | Global seed | seed → seed |
| 257 | Text Concatenate | Builds output filename | artist+title+path |
| 47 | ConditioningZeroOut | Negative prompt (zeroed) | |
| 78 | ModelSamplingAuraFlow | Shift=13 | Bypassed by default; use model_sampling: true to enable |

Reference Nodes (informational, in workflow but not connected)

Node Content
317 Genre description table (38 genres with tags)
318 Keyscale/BPM reference table (38 genres × scale + key + BPM)
320 Structure example (metalcore duet with timeline)
321 Preset example (detailed scene-by-scene prompt)
319 LLM input example (NSFW lyrics prompt format)
400 Disconnected tags node (original rapcore tags, kept for reference)

Generation Parameters

{
  "workflow": "acestep-rapcore",
  "prompt": "comma-separated tags (under 512 chars)",
  "lyrics": "structured lyrics with [section] tags",
  "duration": 180,
  "bpm": 150,
  "keyscale": "E minor",
  "language": "en",
  "seed": -1
}

All parameters EXCEPT prompt and lyrics are optional. Omitted parameters keep their workflow defaults.

model_sampling (optional, boolean): Enables ModelSamplingAuraFlow (shift=13) for acestep-aio. Bypassed by default — it's 50/50 whether it improves quality, so safer to leave off. Set model_sampling: true if you want to experiment with it on.
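
A minimal sketch of assembling the request body on the client side, assuming the parameter names listed above. The helper name build_audio_request and its validation choices are hypothetical; the gateway may validate differently.

```python
from typing import Any, Dict

MAX_PROMPT_CHARS = 512  # from the "under 512 chars" note in the parameter block
OPTIONAL_KEYS = {"duration", "bpm", "keyscale", "language", "seed", "model_sampling"}


def build_audio_request(prompt: str, lyrics: str, **optional: Any) -> Dict[str, Any]:
    """Build a JSON-serialisable body; optional values left as None are omitted
    so the workflow defaults apply."""
    if not prompt or not lyrics:
        raise ValueError("prompt and lyrics are required")
    if len(prompt) >= MAX_PROMPT_CHARS:
        raise ValueError(f"prompt must stay under {MAX_PROMPT_CHARS} characters")
    unknown = set(optional) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    body: Dict[str, Any] = {
        "workflow": "acestep-rapcore",
        "prompt": prompt,
        "lyrics": lyrics,
    }
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

Dropping None values rather than sending them keeps the "omitted parameters keep their workflow defaults" behaviour explicit on the caller's side.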


lustify-sdxl — Image Generation

Model: LUSTIFY SDXL NSFW photorealistic
Sampler: LCM, 4 steps, cfg=1
Output: PNG

{
  "workflow": "lustify-sdxl",
  "prompt": "photo of...",
  "aspect_ratio": "896x1152",
  "seed": -1
}

Supports GET /generate for UI options form.


Caption Engineering (ACE-Step)

The 8 Dimensions

Every caption should cover as many as possible, in 5-8 comma-separated tags:

  1. Style/Genre — metalcore, synthwave, drum and bass, pop, folk
  2. Emotion/Atmosphere — melancholic, euphoric, aggressive, dreamy, dark
  3. Instruments — distorted guitar, 808 bass, strings, piano, synths
  4. Timbre/Texture — warm, crisp, punchy, lush, airy, bright
  5. Vocal — male/female, raspy, clean, powerful, breathy, belting
  6. Production — polished, lo-fi, live, studio, dry, glossy
  7. Era — 80s, 90s, modern, retro, vintage
  8. Speed/Rhythm — driving, groovy, frantic, mid-tempo, laid-back

Rules

  • 5-8 tags max — more degrades quality
  • BPM/key in parameters, NOT caption — they're separate fields
  • No conflicting pairs — e.g. "classical strings" + "death metal growls"
  • Texture words matter heavily — they control mix/production quality
  • Specific > vague — "melancholic piano ballad, female breathy vocal" > "sad song"
  • Repeat what you want more of — repetition reinforces
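
Two of the rules above (tag count, and BPM/key kept out of the caption) are mechanical enough to check before submitting. The following is a rough sketch; lint_caption and its regular expressions are hypothetical helpers, not part of the gateway.

```python
import re
from typing import List


def lint_caption(caption: str) -> List[str]:
    """Return human-readable warnings for a comma-separated caption."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    problems: List[str] = []
    if not 5 <= len(tags) <= 8:
        problems.append(f"expected 5-8 tags, found {len(tags)}")
    # A number followed by "bpm" suggests tempo was written into the caption.
    if re.search(r"\b\d{2,3}\s*bpm\b", caption, re.IGNORECASE):
        problems.append("BPM belongs in the bpm parameter, not the caption")
    # A note name followed by major/minor suggests a key in the caption.
    if re.search(r"\b[A-G][#b]?\s+(major|minor)\b", caption):
        problems.append("key belongs in the keyscale parameter, not the caption")
    return problems
```

Some of the known good captions in this document exceed eight tags, so the count is better treated as a warning than as a hard failure.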

Known Good Captions

pop, piano+strings+guitar, female warm vocal, melancholic intimate, bedroom pop
rock, metal, heavy distorted guitar, powerful drums, melodic vocals, aggressive, epic, dramatic, guitar solo
heavy distorted guitar, fast thrash drums, pounding bass, aggressive, dark
rapcore metal fusion, nu-metal, punchy bass, warm distorted guitar, crisp drums, melodic chorus, heavy grooves, atmospheric, polished production, angsty female vocal, emotional

Tags That Cause Problems

  • raw, gritty, distorted (without balancing warmth) → metallic scraping, flat bass
  • heavy bass → boomy/muddy; prefer punchy bass, deep sub-bass, defined bass
  • aggressive on instruments → harsh overtones; use on emotion/vocal instead
  • Too many instrument tags → cluttered, muddy mix
  • "classical" + any heavy genre → contradictory, degrades both

Texture Word Guide

Word Effect
warm Analog-style saturation, smooth high end
crisp Clean transients, defined attacks
punchy Tight, compressed low-mids, good for bass/kick
bright Boosted highs, airy presence
lush Wide stereo, rich harmonics, reverb-heavy
dry Close-mic sound, minimal reverb
airy Spacious high end, breathy
polished Studio-quality, balanced EQ
raw USE WITH CAUTION — unprocessed, potentially harsh
gritty USE WITH CAUTION — distortion artifacts

Lyrics Engineering (ACE-Step)

Required Structure Tags

ACE-Step REQUIRES section markers to align music with lyrics:

[Intro], [Verse], [Pre-Chorus], [Chorus], [Bridge], [Build], [Drop],
[Breakdown], [Guitar Solo], [Piano Interlude], [Outro]

Vocal Control Tags (on own line inside sections)

[whispered], [raspy vocal], [powerful belting], [spoken word],
[falsetto], [harmonies], [clean vocal]

Energy Tags (on own line inside sections)

[high energy], [low energy], [building energy], [euphoric],
[melancholic], [dreamy], [aggressive]

Lyric Writing Rules

  • 6-10 syllables per line — fits the 5Hz LM planner
  • Natural phrasing — write like human speech, not poetry
  • Avoid AI clichés: "neon skies", "electric hearts/dreams", "breaking chains", "rising up", "fire inside"
  • Section description hints on intro/outro lines: (bass rumbles in), (drums fade to silence)
  • UPPERCASE = shouted/emphasized
  • (parentheses) = background vocals/harmonies
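
A small pre-flight check for the rules above, assuming lyrics arrive as one plain-text string. check_lyrics is a hypothetical helper; it covers only section markers and the cliché list, leaves syllable counting out because that needs language-specific handling, and assumes numbered markers such as [Verse 1] are acceptable.

```python
import re
from typing import List

SECTION_TAGS = {
    "intro", "verse", "pre-chorus", "chorus", "bridge", "build", "drop",
    "breakdown", "guitar solo", "piano interlude", "outro",
}
CLICHES = (
    "neon skies", "electric hearts", "electric dreams",
    "breaking chains", "rising up", "fire inside",
)


def check_lyrics(lyrics: str) -> List[str]:
    """Return warnings for missing section markers and overused phrases."""
    problems: List[str] = []
    # Collect bracketed tags that sit alone on a line.
    found = {m.lower() for m in re.findall(r"^\[([^\]]+)\]\s*$", lyrics, re.MULTILINE)}
    # Assumption: a trailing number ("verse 1") still counts as that section.
    sections = {re.sub(r"\s*\d+$", "", tag) for tag in found} & SECTION_TAGS
    if not sections:
        problems.append("no section markers such as [Verse] or [Chorus]")
    lowered = lyrics.lower()
    for phrase in CLICHES:
        if phrase in lowered:
            problems.append(f"overused phrase: {phrase!r}")
    return problems
```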

🔴 MANDATORY CHECKLIST BEFORE SENDING LYRICS FOR GENERATION

Before sending any lyrics to ACE-Step, answer each of these questions, and do not send until every answer is "YES":

  1. "Do the lyrics make sense?" Do they tell a coherent story? Do they flow from intro to outro? Do the sections connect logically?
  2. "Are they grammatically correct?" No spelling, punctuation, or syntax errors. Pay particular attention to Polish diacritics, inflection, and commas.
  3. "Do they fit the author/project?" Do the tone, style, profanity, and energy fit the artist (KOSTI/Bonnie Bones)? Do they sound like that persona?
  4. "Do the music and its ordering make sense?" Is the structure (Intro→Verse→Chorus→Verse→Bridge→Chorus→Outro) logical? Does the energy rise and fall naturally? Does the overall length make sense (~120-180s)?
  5. "Is the duration appropriate?" 120-180 seconds is standard. NEVER send duration=150 unless you have checked that this is the intended length.
  6. "Does the author's age sound credible?" Do not write "I'm 15", "young girl", or "teen" in lyrics for adult artists. KOSTI/Bonnie Bones are adult performers.

Only when the answer to every question is YES may you send the lyrics for generation.

Energy Flow Pattern

Intro       → [low energy]       — sparse, building
Verse 1     → [low energy]       — verse, storytelling, restrained
Pre-Chorus  → [building energy]  — tension rising
Chorus      → [high energy]      — maximum impact, full instrumentation
Verse 2     → [low energy]       — second verse, slightly more energy
Pre-Chorus  → [building energy]
Chorus      → [high energy]      — second chorus often bigger (harmonies)
Bridge      → [low energy]       — stripped back, different perspective
Breakdown   → [high energy]      — instrumental intensity (optional)
Final Chorus→ [high energy]      — biggest version
Outro       → [low energy]       — fade out
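
To illustrate how section markers and energy tags sit "on their own line" in the final lyrics string, here is a minimal rendering sketch. render_lyrics and its tuple layout are hypothetical; only the tag names come from this document.

```python
from typing import Iterable, List, Tuple

Section = Tuple[str, str, List[str]]  # (section name, energy tag, lyric lines)


def render_lyrics(sections: Iterable[Section]) -> str:
    """Emit each section as a marker line, an energy line, then its lyrics."""
    blocks = []
    for name, energy, lines in sections:
        blocks.append("\n".join([f"[{name}]", f"[{energy}]", *lines]))
    # A blank line between sections keeps the text readable.
    return "\n\n".join(blocks)
```

A call such as render_lyrics([("Intro", "low energy", ["(bass rumbles in)"]), ("Chorus", "high energy", ["WE SING it out tonight"])]) yields a string that can be passed as the lyrics parameter.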

Genre Reference (from workflow node 317)

Key Genres & Their Tags

Electronic

  • EDM/House: four-on-the-floor, bright synths, uplifting, dance-driven, glossy production, rhythmic, energetic
  • Techno: mechanical, hypnotic rhythms, minimalistic, pulsing bass, industrial textures, dark, repetitive
  • Trance: euphoric, soaring leads, emotional pads, rolling basslines, uplifting, spacious, melodic, anthemic
  • Drum & Bass: rapid breakbeats, deep sub-bass, high-energy, sharp percussion, rolling rhythms, crisp, driving
  • Dubstep: heavy bass drops, wobbling synths, aggressive textures, syncopated rhythms, dark, cinematic, gritty
  • Future Bass: shimmering chords, side-chained synths, emotional, bright leads, bouncy rhythms, glossy, melodic
  • Trap: booming 808s, sharp hi-hats, atmospheric pads, swaggering, dark, punchy, spacious

Rock/Metal

  • Classic Rock: crunchy guitars, steady drums, warm analog tone, energetic, melodic, vintage, riff-driven
  • Hard Rock: heavy riffs, powerful drums, gritty vocals, aggressive, energetic, distorted, bold, driving
  • Metal: distorted guitars, fast drums, dark atmosphere, aggressive, heavy, intense, powerful, tight
  • Progressive Metal: complex structures, technical riffs, atmospheric layers, dramatic, epic, polished, dynamic

Urban

  • Boom Bap: dusty drums, soulful samples, rhythmic, warm textures, punchy kicks, nostalgic, organic
  • Lo-Fi Hip-Hop: mellow beats, vinyl crackle, soft keys, relaxed, dreamy, warm, minimal, hazy
  • Drill: sliding 808s, haunting melodies, gritty textures, cold atmosphere, syncopated, tense, urban

Pop

  • Pop: catchy hooks, bright synths, polished production, upbeat, melodic, modern, radio-ready, clean
  • Synth-Pop: retro synths, bright pads, melodic, nostalgic, electronic, polished, dreamy, airy
  • K-Pop: glossy production, bright synths, genre-blending, catchy hooks, polished, theatrical, vibrant

Soft/Ambient

  • Ambient: soft pads, atmospheric textures, spacious, minimal, calm, evolving, dreamy, subtle, meditative
  • Cinematic: sweeping strings, dramatic percussion, epic, emotional, grand, polished, powerful

Keyscale & BPM Reference (from workflow node 318)

| Genre | Scale | Key Range | BPM Range |
|---|---|---|---|
| EDM/House | Minor, Dorian | D#m–Am | 120–128 |
| Techno | Phrygian, Minor | Fm–A#m | 125–135 |
| Trance | Major, Mixolydian | A–D | 130–142 |
| Drum & Bass | Minor, Dorian | Em–Gm | 170–178 |
| Dubstep | Minor, Phrygian | Fm–G#m | 138–150 |
| Future Bass | Major, Minor | C–F | 140–160 |
| Trap | Harmonic Minor | Fm–Am | 130–150 |
| Hip-Hop | Minor, Dorian | Dm–Gm | 85–95 |
| Lo-Fi | Dorian, Lydian | Cm–Fm | 60–85 |
| Pop | Major, Mixolydian | C–G | 90–130 |
| Classic Rock | Minor Pentatonic | Em–Am | 100–140 |
| Hard Rock | Minor, Phrygian | Em–Gm | 120–160 |
| Metal | Phrygian, Harmonic Minor | Dm–F#m | 140–200 |
| Prog Metal | Dorian, Melodic Minor | C#m–F#m | 120–180 |
| Blues | Blues Scale, Minor Pentatonic | Em–Am | 70–120 |
| Funk | Mixolydian, Dorian | E–A | 100–120 |
| Disco | Mixolydian, Major | F–Bb | 110–130 |
| R&B | Dorian, Minor | Dm–Gm | 60–100 |
| Ambient | Lydian, Dorian | C–F | 60–90 |
| Cinematic | Minor, Harmonic Minor | Cm–Fm | 60–120 |
| Reggae | Major, Mixolydian | A–D | 70–90 |
| K-Pop | Major, Minor | C–F# | 100–140 |
| Anime OST | Lydian, Major | C–E | 80–160 |
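
The table can double as a sanity check on the bpm parameter. The sketch below copies four rows as an example; BPM_RANGES, bpm_in_range, and clamp_bpm are hypothetical names, and the remaining rows would be added the same way.

```python
from typing import Dict, Tuple

# A few rows copied from the table above: genre -> (min BPM, max BPM).
BPM_RANGES: Dict[str, Tuple[int, int]] = {
    "Drum & Bass": (170, 178),
    "Metal": (140, 200),
    "Lo-Fi": (60, 85),
    "Pop": (90, 130),
}


def bpm_in_range(genre: str, bpm: int) -> bool:
    """True when bpm falls inside the reference range for the genre."""
    low, high = BPM_RANGES[genre]
    return low <= bpm <= high


def clamp_bpm(genre: str, bpm: int) -> int:
    """Pull an out-of-range bpm back to the nearest bound for the genre."""
    low, high = BPM_RANGES[genre]
    return max(low, min(bpm, high))
```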

Structure Planning (from workflow node 320)

The workflow includes an example of how to structure a caption WITH a song structure plan:

metalcore, symphonic elements, theatrical, duet, heavy distorted guitar,
bright piano, studio-polished, dramatic, melodic, epic, intense.

Structure:
- Intro: brief intro dramatically builds to first verse
- Verse 1: atmospheric piano, sets scene, raspy male vocal only
- Verse 2: guitar power chords, groovy, young female vocal only
- Chorus: anthemic, layered, male+female duet harmonies
- Bridge: atmospheric, dreamy, calm, female vocal only
- Build-up: builds to epic instrumental solo
- Instrumental: fast guitar solo, lead licks, virtuoso shred
- End: powerful ending

This can go in the caption to give the model a temporal roadmap.


Scene-by-Scene Prompting (from workflow node 321)

For maximum control, describe each section's instrumentation and mood in prose:

Intro: A metalcore-tinged, symphonic swell opens the track, with bright piano glimmering
over theatrical strings. Tension rises—studio-polished, dramatic—until it snaps into verse.

Verse 1: Drops to atmospheric piano, soft but charged. Raspy male vocal, intimate, whispered.
No guitars—just piano, subtle pads, suspended breath.

Verse 2: Guitar power chords crash in, groovy pulse. Young female vocal, bright and soaring.
Symphonic elements widen the space, cinematic lift.

Chorus: Erupts into anthemic, epic chorus. Male+female duet harmonies. Distorted guitars,
sweeping strings, pounding drums—polished, intense.

Bridge: Everything falls away. Dreamy, atmospheric, weightless. Soft pads, distant piano,
female vocal airy and ethereal. Suspended.

Build-up: Rhythmic pulses return. Low strings, tom rolls, rising synths. Guitars re-enter
in bursts. Energy coils toward instrumental break.

Instrumental: Fast guitar solo, virtuoso shred, rapid licks, melodic flourishes.
Symphonic backing, metalcore precision drums. Flashy, intense, climactic.

Full API Reference

Core Endpoints

| Method | Path | Description |
|---|---|---|
| GET | / | Health check + server statuses |
| GET | /workflows | List available workflows with types |
| POST | /generate-and-wait | PRIMARY: submit, wait, download, save. Use this for all generation. |
| POST | /prompt | Submit workflow, return prompt_id |
| GET | /history/:prompt_id | Get single prompt result |
| GET | /history | Aggregated history from all servers |
| GET | /queue | Aggregated queue (running + pending) |
| GET | /view | Proxy media file download |
| GET | /system_stats | First alive server system info |
| GET | /object_info | Proxy to ComfyUI object_info |
| GET | /extensions | Proxy to ComfyUI extensions |

Image Generation (legacy, use generate-and-wait instead)

| Method | Path | Description |
|---|---|---|
| GET | /generate | Get generation options form |
| POST | /generate | Submit image generation |
| POST | /upload/image | Upload image to ComfyUI input dir |

Media Management

| Method | Path | Description |
|---|---|---|
| GET | /media-list | List generated files (name, size, date, preview URLs) |
| POST | /media-link-once | Create one-time access token for a file |
| GET | /media-once/:token | Access file via one-time token (no API key needed) |

Workflow Injection

| Method | Path | Description |
|---|---|---|
| POST | /workflow/:name/prompt | Quick prompt submit for named workflow (auto-injects) |

POST /generate-and-wait — Full Reference

curl -s -X POST http://127.0.0.1:8188/generate-and-wait \
  -H "Content-Type: application/json" \
  -d '{
    "workflow": "acestep-rapcore",
    "prompt": "...",
    "lyrics": "...",
    "duration": 200,
    "bpm": 150,
    "keyscale": "E minor",
    "language": "en",
    "seed": -1
  }'

Audio params: prompt (required), lyrics, duration, bpm, keyscale, language, seed
Image params: prompt (required), aspect_ratio, seed, steps, cfg
Common: workflow (default: acestep-rapcore), client_id

Success response:

{
  "status": "ok",
  "file": "/home/genorbox1/.openclaw/workspace/media/comfy/audio/acestep-rapcore_2026-05-19T13-26-54_028.mp3",
  "filename": "acestep-rapcore_2026-05-19T13-26-54_028.mp3",
  "type": "audio",
  "server": "sec",
  "workflow": "acestep-rapcore",
  "file_size": 5882890
}

Output saved with metadata sidecar (.json) in ~/media/comfy/<audio|images>/.
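
The curl call above can also be made from Python with only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the 660-second client timeout is chosen only to exceed the gateway's 10-minute limit, and treating any status other than "ok" as a failure is a guess, since the error response shape is not documented here.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

GATEWAY = "http://127.0.0.1:8188"  # endpoint from the Gateway table


def make_request(body: Dict[str, Any], api_key: Optional[str] = None) -> urllib.request.Request:
    """Build the POST request without sending it."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # localhost callers are exempt, so the key is optional here
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        f"{GATEWAY}/generate-and-wait",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def generate_and_wait(body: Dict[str, Any], api_key: Optional[str] = None,
                      timeout: float = 660.0) -> Dict[str, Any]:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(make_request(body, api_key), timeout=timeout) as resp:
        result = json.loads(resp.read().decode("utf-8"))
    if result.get("status") != "ok":  # assumption: non-"ok" means failure
        raise RuntimeError(f"generation failed: {result}")
    return result
```

On success, result["file"] holds the saved path and result["server"] the backend that produced it, as in the response shown above.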


Operational Notes

Restart

pm2 restart genor-comfy-gate
pm2 logs genor-comfy-gate --lines 20

Status Check

curl -s http://127.0.0.1:8188/ | python3 -m json.tool
curl -s http://127.0.0.1:8188/queue | python3 -m json.tool

Media Location

~/media/comfy/audio/    — generated MP3 files + .json sidecars
~/media/comfy/images/   — generated PNG files + .json sidecars

Gateway Behavior

  • Submits workflow JSON with injected parameters
  • Polls /history/:prompt_id every 2s until complete/fail/timeout
  • Timeout: 600s (10 min) per generation
  • After completion: waits 3s for file write, then downloads
  • Saves to media dir with timestamped name + incrementing sequence number
  • Metadata sidecar written alongside media file
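
The polling behaviour listed above can be sketched as a loop with an injected status function. This is an illustration in Python, not the gateway's code: wait_for_completion is a hypothetical name, and the "pending" / "complete" / "failed" strings stand in for whatever the gateway derives from /history/:prompt_id.

```python
import time
from typing import Callable


def wait_for_completion(
    fetch_state: Callable[[], str],
    interval: float = 2.0,    # poll every 2s
    timeout: float = 600.0,   # give up after 10 minutes
    settle: float = 3.0,      # pause so the output file finishes writing
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until fetch_state() reports completion; raise on failure or timeout."""
    deadline = clock() + timeout
    while clock() < deadline:
        state = fetch_state()
        if state == "failed":
            raise RuntimeError("generation failed")
        if state == "complete":
            sleep(settle)
            return
        sleep(interval)
    raise TimeoutError(f"no result after {timeout:.0f}s")
```

Passing sleep and clock as parameters keeps the loop testable without real waiting; in normal use the defaults apply.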

Growing Our Knowledge

When we discover new caption patterns, texture word effects, or workflow tricks:

  1. Update this SKILL.md
  2. Note the date and what we learned in CHANGELOG.md (next to this skill)


lustify-sdxl — Image Generation Deep Dive

Model: lustifySDXLNSFW_ggwpV7.safetensors (Illustrious-based SDXL checkpoint)
Tag system: Danbooru-style tags, NOT natural language
Sampler: LCM, 12 steps, scheduler=exponential, cfg=1
Output: PNG via SaveImage (node 200) + PreviewImage (node 87)

Full Pipeline

CheckpointLoader(43) → LoRA stack(47,80) → Resolution(17) → KSampler(7, 12 steps LCM) →
  UltimateSDUpscale(88, 2x, 4x-UltraSharp) →
  FaceDetailer NIP(97) → FaceDetailer V(98) → FaceDetailer P(101) →
  FaceDetailer face(104, 1024px, 6 steps) → FaceDetailer hands(105, 2048px, 6 steps) →
  SeedVR2VideoUpscaler(114, 2048px final) → CRT Post-Process(115) → SaveImage(200)

Active LoRAs (node 80)

| LoRA | Strength | Purpose |
|---|---|---|
| AddMicroDetails v6 | 0.2 | Skin texture, fine details |
| PersonEnhanceV2 ILL | 0.1 | Better anatomy/face |
| TrendCraft Style Detailer v2.4I | 0.1 | Overall polish/detail |

Active LoRAs (node 47)

| LoRA | Strength | Purpose |
|---|---|---|
| DTLVVTT DMD2 V5-LITE | 1.0 | DMD2 distillation (faster/better LCM) |

FaceDetailer Pipeline

Sequential detailers with YOLO detectors:

  1. NIP (nipples_yolov8s-seg.pt) — nipple detection, 1024px, denoise 0.4
  2. V (nsfw-seg-vagina-x.pt) — vagina detection, 1024px, denoise 0.4
  3. P (nsfw-seg-penis-x.pt) — penis detection, 1024px, denoise 0.4
  4. Face (Anzhc Face seg 768MS v2 y8n.pt) — face detection, 1024px, 6 steps, denoise 0.4
  5. Hands (PitHandDetailer-v2-Test-v9c.pt) — hand detection, 2048px, 6 steps, denoise 0.5

SeedVR2 Upscaler (node 114)

  • Model: seedvr2_ema_7b_sharp-Q4_K_M.gguf (quantized 7B)
  • VAE: ema_vae_fp16.safetensors
  • Final resolution: 2048
  • Color correction: lab

CRT Post-Process (node 115)

  • Vibrance: +0.015 (subtle saturation boost)
  • Vignette: 0.5 strength, 0.7 radius, 2.0 softness

Danbooru Tag Prompting (LUSTIFY)

CRITICAL: LUSTIFY is Illustrious-based — use Danbooru-format tags, NOT natural language descriptions.

Quality/Priority Tags (always include)

masterpiece, best quality, amazing quality, very aesthetic, absurdres

Subject Tags

1girl, solo, cute, petite, pale skin, medium breasts

Clothing/Accessories

gym uniform, white shirt, sports shorts, sneakers, ponytail

Action/Pose (keep it SIMPLE — complex actions confuse the model)

jumping, dynamic pose, looking at viewer

Setting/Light

gym background, afternoon light, dutch angle, from below

Negative Prompt (always)

blurry, worst quality, bad quality, error, melted body, bad anatomy, bad hands, disfigured

What Works

  • Character portraits work best — this is a hentai/character model
  • Simple dynamic poses (jumping, running, leaning) — YES
  • Quality tags first: masterpiece, best quality are weighted
  • POV/camera tags: dutch angle, from below, from above, close-up
  • Lighting tags: sunlight, god rays, afternoon light, backlight
  • Keep tags under ~25 — more dilutes quality

What Fails

  • Natural language descriptions — "mid-jump over a vaulting horse" → model doesn't understand
  • Complex multi-object composition — "vaulting horse + girl midair" = garbled anatomy
  • "photorealistic" tag — fights the anime/illustrious base, produces uncanny results
  • Overloaded action tags — "jumping + spread legs + leaning forward + vaulting horse" = nightmare
  • Multiple characters — this workflow is tuned for 1girl, solo

Image Generation Parameters

{
  "workflow": "lustify-sdxl",
  "prompt": "masterpiece, best quality, 1girl, cute, ...",
  "aspect_ratio": "7:9 (Portrait)",
  "seed": -1
}

Valid aspect ratios:

  • 1:1 (Square)
  • 4:5 (Portrait)
  • 7:9 (Portrait) ← default, best for single character
  • 3:2 (Landscape)
  • 16:9 (Landscape)
  • 9:16 (Portrait)

Additional optional params: megapixels (default 1.5), steps, cfg, denoise, sampler_name, scheduler

Adding a New Workflow (any modality)

  1. Export workflow JSON from ComfyUI → save to workflows/<name>.json
  2. Add entry to WORKFLOW_INFO in server.js:
    '<name>': { file: '<name>.json', type: 'audio'|'image'|'video', ext: 'mp3'|'png'|'mp4',
                promptNode: '94', promptField: 'tags', lyricsNode: '252', lyricsField: 'String',
                outputNode: '104' }
    
  3. Restart: pm2 restart genor-comfy-gate
  4. Test, then document in this SKILL.md

The gateway auto-handles: prompt injection, duration, BPM/keyscale (audio), aspect_ratio (image), seed, polling, download from correct server, save to media dir, metadata sidecar.
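
To make "prompt injection" concrete, here is a rough Python sketch of how the promptNode/promptField and lyricsNode/lyricsField entries could be applied to a workflow exported in ComfyUI's API format, where each node ID maps to an object with an inputs dictionary. The inject function and the INFO dictionary are hypothetical mirrors of the WORKFLOW_INFO fields; the gateway's real implementation in server.js is not shown in this document.

```python
import copy
from typing import Any, Dict, Optional

# Hypothetical Python mirror of one WORKFLOW_INFO entry (field names from step 2).
INFO = {
    "promptNode": "94", "promptField": "tags",
    "lyricsNode": "252", "lyricsField": "String",
}


def inject(workflow: Dict[str, Any], info: Dict[str, str],
           prompt: str, lyrics: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of the workflow with prompt (and lyrics) written into
    the node inputs named by the info entry."""
    patched = copy.deepcopy(workflow)  # leave the loaded template untouched
    patched[info["promptNode"]]["inputs"][info["promptField"]] = prompt
    if lyrics is not None and "lyricsNode" in info:
        patched[info["lyricsNode"]]["inputs"][info["lyricsField"]] = lyrics
    return patched
```

Copying before patching means one loaded template can serve many requests without values leaking between them.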

Lessons Learned

2026-05-19 — Image Generation

  • LUSTIFY is Illustrious-based, uses Danbooru tags — natural language prompts produce garbled results
  • Quality tags (masterpiece, best quality) must come FIRST — they're weighted
  • Complex action scenes fail — model is trained for character portraits, keep poses simple
  • "photorealistic" tag on anime model = uncanny valley, avoid
  • Keep prompts under 25 tags — overloading dilutes quality
  • Pipeline has SeedVR2 upscaler (7B GGUF) + 5-stage FaceDetailer → 2048px final output
  • Face/hand detailers produce excellent close-up quality

2026-05-19 — Audio Generation

  • Download 400 bug: getOutputInfo() function returned undefined filenames despite reading them from history correctly. Fixed by inlining output scanning in the handler.
  • Load balancer: PRIMARY-first when idle, ALL→SECONDARY when PRIMARY busy (not round-robin).
  • Workflow cleanup: Removed duplicate nodes 401, 402. Lyrics now go through node 252 (String) → node 94.
  • Caption quality: raw, gritty, heavy drops cause metallic scraping and flat bass. Use warm, crisp, punchy, polished for clean instruments.
  • 5-8 tags sweet spot for SFT merge model. More degrades quality.
  • 8 dimensions matter: Missing emotion/timbre = flat results. Cover: genre, emotion, instruments, timbre, vocal, production, era, rhythm.
Security recommendations
Install only if you intend to run a persistent local media gateway. Before use, set a strong API_KEY, verify which backend ComfyUI servers will receive prompts, restrict the service to trusted local clients, review or remove MCP restart/upload tools, and treat generated media links as bearer links rather than true one-time private links.
Capability tags
crypto · requires-oauth-token · requires-sensitive-credentials
Capability assessment
Purpose & Capability
The stated purpose, a ComfyUI/OpenAI media gateway, matches much of the code, including generation, workflow management, and media storage. The concern is that the same gateway also exposes high-impact controls: MCP service restart, raw workflow execution, workflow upload, workflow deletion, tokenized media access, and default routing to hardcoded backend server addresses.
Instruction Scope
The instructions disclose many endpoints, but the sensitive surfaces are under-scoped: some workflow and media listing endpoints are unauthenticated, MCP clients can invoke restart and workflow upload after only gateway auth, and auth silently falls back to a hardcoded published API key if API_KEY is unset. The docs also mention GCG_API_KEY, while the code checks API_KEY, increasing the chance of an insecure default deployment.
Install Mechanism
The installer runs npm install, installs PM2 globally if missing, starts the gateway under PM2, and saves the PM2 process list. That persistent, host-level setup is not hidden, but it is high-impact for a skill package and lacks an explicit opt-in confirmation.
Credentials
Writing generated media and metadata sidecars is expected for this gateway, but prompts and generation metadata can persist locally, generated requests are sent to hardcoded HTTP backend servers by default, and tokenized media URLs grant access to anyone holding the link until expiry.
Persistence & Privilege
PM2 persistence, a background queue monitor, and an MCP-exposed restart command give the skill ongoing process-control authority beyond an ordinary on-demand skill. This is not evidence of malicious intent, but it requires clearer scoping and user control.
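How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install genor-comfy-gate
  3. Once installed, invoke the Skill by name or trigger it with /genor-comfy-gate
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output
Version history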
v2.0.0 — MAJOR: Bundled entire MCP server into skill package! Now includes server.js, lib/, install.sh, pm2 config. Install with clawhub and run directly. No more separate project checkout needed. Removed lustify-sdxl (deprecated). Personal data audit completed. Environment config via env.example.
v1.1.0 — Personal data audit: removed hardcoded paths, API key from console logs, hostname from fallbacks. Fixed MCP restart handler (ESM compat). Made MEDIA_DIR configurable via env. Added env.example.
Metadata
Slug: genor-comfy-gate
Version: 2.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 2
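FAQ

What is Genor-Comfy-Gate?

Comprehensive multi-modal gateway for ComfyUI enabling audio generation with ACE-Step 1.5 and photorealistic image creation via SDXL workflows. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 31 total downloads to date.

How do I install Genor-Comfy-Gate?

Run the command "/install genor-comfy-gate" in the OpenClaw or Claude Code chat box for one-step installation; no additional configuration is required.

Is Genor-Comfy-Gate free?

Yes. Genor-Comfy-Gate is completely free and released under the MIT-0 license; you may download, install, and use it freely.

Which platforms does Genor-Comfy-Gate support?

Genor-Comfy-Gate is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Genor-Comfy-Gate?

It is developed and maintained by Krzysztof (@genortg); the current version is v2.0.0.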
