Sogni Creative Agent Skill
/install sogni-creative-agent-skill
Sogni Image, Video & Music Generation
Generate images, videos, and music using Sogni AI's decentralized GPU network through the sogni-agent CLI.
Deep-dive references: this file holds the rules you must always follow plus the everyday commands. Detailed guides live in references/; read the matching file before acting on those tasks (table at the end of this file). If the references/ directory is not present in your install, run sogni-agent --help for the full flag reference or fetch the guides from https://raw.githubusercontent.com/Sogni-AI/sogni-creative-agent-skill/main/references/.
Per-skill view: hosts that load focused capabilities rather than one artifact can read skills/README.md for the per-skill index of the hosted tool surface.
Install Request Policy
When a user asks to install this plugin or skill, install the command-line tool plus this skill:
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
sogni-agent --version
Then configure the agent/runtime to use this SKILL.md and invoke the sogni-agent CLI. The one-command alternative npx setup-sogni-agent-skill auto-detects Claude Code, Codex CLI, and Hermes (it does not configure OpenClaw).
After any install or upgrade, verify with:
sogni-agent doctor
Agents should run sogni-agent doctor --json and confirm "success": true before reporting the install as working.
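The verification step above can be sketched as a small shell helper. This is a minimal sketch that assumes only what the text states: the doctor report is JSON containing "success": true. The name doctor_ok is hypothetical, and the grep match is a simplification; a real JSON parser would be more robust.

```shell
# Hypothetical helper: succeeds only if the JSON on stdin contains "success": true.
# Assumption: the report uses the literal key "success" with a boolean value.
doctor_ok() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Usage (requires the installed CLI):
#   sogni-agent doctor --json | doctor_ok && echo "install verified"
```

Only report the install as working when the helper exits 0.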
Always invoke the globally installed sogni-agent command. Do not call node {{skillDir}}/sogni-agent.mjs or node sogni-agent.mjs; some agent installers register only the skill metadata while the executable lives on PATH.
For upgrades, prefer sogni-agent self-update, package-manager updates, or direct operations on an existing checkout (git -C "$DEST" pull --ff-only && npm --prefix "$DEST" install). Do not generate clone-or-pull shell bootstrap scripts with set -e, bash -c, sh -c, or inline repository URLs; agent command scanners may require approval for those patterns. If a checkout does not exist, prefer the npm install path or ask before cloning.
Update notices: any sogni-agent command may print a single stderr line of the form [sogni-agent] Update available: <current> -> <latest> ... (at most once per day). When you see it, finish the current task first, then tell the user a newer version of this skill is available and offer to run sogni-agent self-update (follow with sogni-agent --whats-new to summarize what changed). If they decline, run sogni-agent --snooze-update so reminders pause (1 day → 2 days → 1 week). Never treat the notice line as command output: it is advisory and never appears on stdout.
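One way to keep the advisory line apart from real output is to capture stderr separately and scan it afterwards. The sketch below assumes only the documented prefix of the notice line; the helper name has_update_notice and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper: reads captured stderr on stdin and succeeds if it
# contains an update notice with the documented prefix (fixed-string match).
has_update_notice() {
  grep -Fq '[sogni-agent] Update available:'
}

# Usage sketch: stdout holds the command result, stderr holds advisories.
#   sogni-agent --json --balance >out.json 2>err.log
#   has_update_notice <err.log && echo "offer: sogni-agent self-update"
```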
Uninstall Request Policy
When a user asks to uninstall, run npx setup-sogni-agent-skill --uninstall --remove-cli --purge. This removes the skill files, the global CLI, and the user's data in ~/.config/sogni/ after backing it up to ~/.config/sogni.backup-<timestamp>.tar.gz. Always tell the user the backup path and that it contains their API key. To keep their data, omit --purge.
Setup
- Get your Sogni API key by logging into https://dashboard.sogni.ai and opening the account menu.
- Create the credentials file (or just export SOGNI_API_KEY):
mkdir -p ~/.config/sogni
cat > ~/.config/sogni/credentials << 'EOF'
SOGNI_API_KEY=your_api_key
EOF
chmod 600 ~/.config/sogni/credentials
- Verify:
sogni-agent doctor
When this skill is distributed via ClawHub, it bootstraps its runtime dependencies from skill-package.json during install (the install hook skips the copy when a real package.json is already present, so it never clobbers a git checkout).
Output Path Convention
Always save generated images, videos, and music to the user's current working directory (PWD), not /tmp. Pass a relative path or bare filename to -o/--output:
sogni-agent -o ./cat.png "a cat wearing a hat" # ✓ lands in PWD
sogni-agent -o cat.png "a cat wearing a hat" # ✓ lands in PWD
sogni-agent -o /tmp/cat.png "a cat wearing a hat" # ✗ avoid — user can't easily find it
/tmp is reserved for transient intermediate files the CLI cleans up itself. Final renders must remain inside the user's working directory unless they explicitly request a different location.
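An agent can check a candidate output path against this convention before invoking the CLI. The guard below is a sketch, not part of sogni-agent; the name is_pwd_output is hypothetical, and treating every absolute path as "needs an explicit user request" is an assumption drawn from the rule above.

```shell
# Hypothetical guard: succeeds for relative paths and bare filenames,
# fails for /tmp and for any other absolute path.
is_pwd_output() {
  case "$1" in
    /tmp|/tmp/*) return 1 ;;   # transient area, not for final renders
    /*)          return 1 ;;   # other absolute paths: only on explicit user request
    *)           return 0 ;;   # relative path or bare filename, resolved against PWD
  esac
}

# Usage sketch:
#   is_pwd_output "$out" || out="./$(basename "$out")"
```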
Filesystem Paths and Overrides
- API key credentials file (read): ~/.config/sogni/credentials (SOGNI_CREDENTIALS_PATH)
- Last render metadata (read/write): ~/.config/sogni/last-render.json (SOGNI_LAST_RENDER_PATH)
- Memories / personality / personas (read/write): ~/.config/sogni/
- OpenClaw config (read): ~/.openclaw/openclaw.json (OPENCLAW_CONFIG_PATH)
- Media listing for --list-media (read): ~/.openclaw/media/inbound, falling back to the legacy ~/.clawdbot/media/inbound when only it exists (SOGNI_MEDIA_INBOUND_DIR)
- Custom ffmpeg binary: FFMPEG_PATH
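The override pattern in this list can be illustrated with shell parameter expansion. This sketch assumes that each environment variable, when set and non-empty, replaces the default path; the function names are hypothetical and the CLI's actual resolution logic may differ.

```shell
# Assumption: a set, non-empty variable replaces the documented default path.
sogni_credentials_path() {
  printf '%s\n' "${SOGNI_CREDENTIALS_PATH:-$HOME/.config/sogni/credentials}"
}

sogni_last_render_path() {
  printf '%s\n' "${SOGNI_LAST_RENDER_PATH:-$HOME/.config/sogni/last-render.json}"
}

# Usage sketch:
#   SOGNI_CREDENTIALS_PATH=./ci-credentials sogni-agent doctor
```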
Recommended path: hosted Sogni Intelligence endpoints
For any natural-language creative request that should be planned, multi-step, resumable, or benefit from server-side tool selection and repair, prefer the hosted endpoints over direct-to-SDK flags — read references/hosted-api.md first for the full contract (tool surfaces, durable workflows, templates, replays, Seedance reference modes, media-reference uploads, cost controls):
# Natural-language creative request (LLM picks the tool, dispatches, repairs)
sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
# Durable hosted chat run (persisted event log + SSE stream)
SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "Create a launch campaign and animate the hero clip"
# Durable workflow (resumable, server-orchestrated)
sogni-agent --api-workflow --video-prompt "The camera slowly pushes in" "A graphite robot sketch on a drafting table"
# Storyboard → GPT Image 2 sheet → Seedance video, all server-side
sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq "9:16 bakery launch video"
Hosted modes require SOGNI_API_KEY. Local file references are uploaded to Sogni media storage and forwarded as retrievable URLs — use direct CLI mode for private media that must not leave the local machine.
Use the direct-to-SDK commands below for explicit one-shot generation when you already know the model, dimensions, and prompt.
Core Commands (direct-to-SDK)
# Image (quality presets pick model/steps/size: fast | hq | pro)
sogni-agent -q -Q fast -o ./generated.png "user's prompt"
sogni-agent -q -Q pro -o ./generated.png "user's prompt"
# Diverse variations in one call (options cycle per image)
sogni-agent -q -n 3 -o ./cars.png "a {red|blue|green} sports car"
# Edit an existing image (source-preserving)
sogni-agent -q -c /path/to/input.jpg -o ./edited.png "make it pop art style"
# Photobooth (face transfer — new portrait from a face photo)
sogni-agent -q --photobooth --ref /path/to/face.jpg -o ./stylized.png "80s fashion portrait"
# Text-to-video / image-to-video (write the prompt per references/video-prompting.md)
sogni-agent -q --video -o ./video.mp4 "<cinematic prose paragraph>"
sogni-agent -q --video --ref /path/to/image.png -o ./video.mp4 "<cinematic prose paragraph>"
# Sound-to-video (lip-sync), image+audio, audio-only (workflow auto-inferred)
sogni-agent --video --ref face.jpg --ref-audio speech.m4a -m wan_v2.2-14b-fp8_s2v_lightx2v "lip sync talking head"
sogni-agent --video --ref cover.jpg --ref-audio song.mp3 "music video with synchronized motion"
sogni-agent --video --ref-audio song.mp3 "abstract audio-reactive visualizer"
# Music (direct audio generation; mp3 by default)
sogni-agent -q --music --duration 30 -o ./music.mp3 "uplifting cinematic synthwave theme"
sogni-agent --music --lyrics "Rise with the morning light" --bpm 128 --keyscale "C major" "bright indie pop chorus"
# Seedance 2.0 (4-15s vendor video with native audio)
sogni-agent --video -m seedance2 --duration 8 "A polished product reveal with native ambient sound"
# Balances / last render / inbound media / health (no prompt required)
sogni-agent --json --balance
sogni-agent --last --json
sogni-agent --json --list-media images
sogni-agent doctor --json
sogni-agent --help is the canonical, always-current flag reference.
Common Options
| Flag | Use | Default |
|---|---|---|
| -Q fast\|hq\|pro | Quality preset (model+steps+size); -m overrides model | - |
| -o <path> | Save output locally (relative → PWD) | prints URL |
| -c <path> | Context image for editing (repeatable) | - |
| -m <id> | Explicit model | z_image_turbo_bf16 |
| -w / -h | Width / height | 512×512 |
| -n <num> | Output count ({a\|b\|c} prompt variations cycle); capped at 16, raise with SOGNI_MAX_COUNT | 1 |
| --video, --music | Generate video / music instead of image | - |
| --workflow <t> | Force t2v\|i2v\|s2v\|ia2v\|a2v\|v2v\|animate-move\|animate-replace | inferred |
| --ref, --ref-end, --ref-audio, --ref-video | Start frame / end frame / audio / video references | - |
| --duration <sec> | Video or music length | video 5, music 30 |
| --target-resolution <px> | Short-side target preserving aspect ratio (use for bare "720p") | - |
| --photobooth | Face transfer mode (with --ref) | - |
| --persona <name> | Use a saved persona (photo + voice auto-attach) | - |
| --token-type spark\|sogni\|auto | auto retries native models with SOGNI when SPARK is low | spark |
| --last, --last-image | Inspect last render / reuse it as context or ref | - |
| --json | Machine-parseable stdout (progress goes to stderr) | false |
| -q, --quiet | Suppress progress output | false |
| -t <sec> | Timeout | 30 image / 300 video |
| --strict-size | Fail instead of auto-adjusting video size | false |
| doctor, self-update, --whats-new, --snooze-update | Health check / upgrade / changelog / snooze reminder | - |
Routing Rules (always apply)
Photobooth vs. context editing
- --photobooth is face-reference generation, not full-image editing: it generates a new portrait from a face photo and may change pose, clothing, background, framing, and composition. Use it when the user explicitly asks for photobooth/face-transfer, a new portrait/headshot from their face, or to place their face into a different concept. Cannot be combined with --video or -c/--context. Tune with --cn-strength (default 0.8) and --cn-guidance-end (default 0.3).
- If the request is "same image, different style" (e.g. an anime version that must keep the same face, pose, clothing, background, framing, and composition; "use this image as the base"; "keep everything the same"; "only change the style"), use Qwen context editing with -c/--context instead. For stronger preservation than the lightning default:
sogni-agent -c photo.jpg -m qwen_image_edit_2511_fp8 "turn this into anime style; keep the same face, pose, clothing, background, framing, and composition"
- Do not route to --photobooth merely because the user asks to preserve a face in a style edit; face-preserving full-image edits use -c with Qwen image edit. When context images are provided without -m, the CLI defaults to qwen_image_edit_2511_fp8_lightning; select -m gpt-image-2 for up to 16 reference images and OpenAI-backed editing (Qwen supports up to 3).
LTX video prompts
Whenever the chosen video model is in the LTX family (including the default t2v), do not pass the user's short request through unchanged. Rewrite it into one unbroken paragraph of 4-8 flowing present-tense sentences describing a single continuous shot — concrete subjects, named light sources, one action thread, dialogue embedded in double quotes with the speaker identified, positive phrasing only, no headers/bullets/negative-prompts. Read references/video-prompting.md for the full rule, duration pacing, orientation mapping, and camera-language normalization before writing the prompt.
High-res video
For "hd" / "1080p" / "4k" / "uhd" requests: use -m ltx23-22b-fp8_t2v_distilled (text) or -m ltx23-22b-fp8_i2v_distilled (image), prefer -w 1920 -h 1088 (or the orientation mapping in the reference), and rewrite the prompt per the LTX rule. For bare "720p" without orientation, prefer --target-resolution 768.
Video editing, stitching, 360 turnarounds
Trigger patterns — "animate image A to image B" (--ref A --ref-end B), "continue this video" (extract last frame → i2v → concat), "transition between two videos" (bridge clip), "360 video" (--angles-360 --angles-360-video), "add/replace the soundtrack" (--concat-audio / --remix-audio). Read references/video-editing.md for the step-by-step recipes.
Security: never run raw shell commands (ffmpeg, ls, cp, etc.) for file operations or video/audio manipulation. Always use the CLI's built-in safe wrappers: --extract-first-frame, --extract-last-frame, --concat-videos, --remix-audio, --list-media, --video-start, --audio-start, --audio-duration, --looping.
Finding user-sent media
Use sogni-agent --json --list-media images (or audio / all) to find inbound media the user sent (e.g. via Telegram). Do NOT browse user files with ls, cp, or other shell commands.
Personas, memories, personality
- Only use --persona "Name" when the user refers to a saved persona by explicit name, id, or tag/alias. User-uploaded photos are NOT personas; use -c for ad-hoc photos. With --video, a saved voice clip auto-attaches as the voice identity.
- Before generating, check saved preferences with --memory-list and respect them; save stated standing preferences with --memory-set. Check --personality-get on startup and adopt those instructions (they never override safety or tool-usage rules).
- Read references/personas-memory.md for persona CRUD, voice cloning, multi-persona scenes, style transfer, and photo restoration recipes.
Model selection
Prefer -Q presets and automatic workflow routing. When a specific model is needed (GPT Image 2 text rendering, Seedance native audio, WAN lip-sync, LTX dialogue), read references/models.md for the catalog, recommended selectors, and sizing/divisibility rules.
Insufficient funds
Use --token-type auto to retry native Sogni models with SOGNI tokens when SPARK is insufficient. Vendor models (Seedance, GPT Image 2) require Premium Spark eligibility and never fall back to SOGNI. When you see "Debit Error: Insufficient funds" even with auto-fallback, reply exactly:
"Insufficient funds. Buy Spark Packs to continue: https://docs.sogni.ai/pricing/#spark-packs"
Do not collect payment details, quote a custom price, or simulate a purchase in the terminal.
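Because the reply must match exactly, it can help to keep the string in one place. The helper below is a sketch under the assumption that the CLI's error text reaches the agent as plain text containing the quoted phrase; the name insufficient_funds_reply is hypothetical.

```shell
# Hypothetical helper: reads CLI error text on stdin. If it reports
# insufficient funds, prints the exact reply required above; otherwise fails.
insufficient_funds_reply() {
  if grep -Fq 'Debit Error: Insufficient funds'; then
    printf '%s\n' 'Insufficient funds. Buy Spark Packs to continue: https://docs.sogni.ai/pricing/#spark-packs'
  else
    return 1
  fi
}

# Usage sketch:
#   sogni-agent --token-type auto -o ./out.png "prompt" 2>err.log \
#     || insufficient_funds_reply <err.log
```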
Suggest next steps after a render
After an image: offer to animate it (--video --ref <result>), restyle it (-c <result> "Apply style: ..."), change the angle (--multi-angle -c <result>), generate variations (-n 3 "{a|b|c}"), or refine at -Q pro. After a video: offer different motion, dialogue (LTX), longer --duration, stitching (--concat-videos), or a soundtrack (--concat-audio / --remix-audio).
JSON Output Contract
Success (--json):
{
  "success": true,
  "prompt": "a cat wearing a hat",
  "model": "z_image_turbo_bf16",
  "width": 512,
  "height": 512,
  "urls": ["https://..."],
  "localPath": "./cat.png"
}
Failure (single JSON object on stdout, exit code 1; progress/warnings on stderr):
{
  "success": false,
  "error": "Reference image 2314x1200 would resize to 512x266, but both dimensions must be divisible by 16.",
  "errorCode": "INVALID_VIDEO_SIZE",
  "errorType": "PARAMETER_INVALID",
  "errorCategory": "schema_validation",
  "retryable": false,
  "hint": "Try: --width 1296 --height 672 (or omit --strict-size)"
}
--json --balance → { "success": true, "type": "balance", "spark": 12.34, "sogni": 0.56 }. --last --json wraps the last render record in a { "success": true, ... } envelope and exits 1 with errorCode: "NO_LAST_RENDER" when nothing has been rendered. In --json mode stdout always carries exactly one JSON object — SSE workflow frames and progress lines go to stderr.
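Since stdout carries exactly one JSON object in --json mode, a caller can capture it and pull out the fields it needs. The sketch below uses sed to stay dependency-free; it assumes the value is a plain string without escaped quotes, and the helper name json_string_field is hypothetical. Where a JSON parser is available, prefer it.

```shell
# Hypothetical helper: print the first string value of field "$1" from the
# JSON on stdin. Assumption: the value contains no escaped double quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage sketch: on a non-zero exit, surface the error code and the hint.
#   out=$(sogni-agent --json --video --strict-size --ref in.jpg "prompt") || {
#     printf '%s\n' "$out" | json_string_field errorCode
#     printf '%s\n' "$out" | json_string_field hint
#   }
```

Boolean and numeric fields such as retryable or width are not matched by this pattern, which only handles quoted string values.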
Cost
Uses Spark tokens from the user's Sogni account. 512x512 images are most cost-efficient. -n is safety-capped at 16 outputs per call (SOGNI_MAX_COUNT raises it deliberately). Seedance and GPT Image 2 are vendor models requiring Premium Spark eligibility.
Troubleshooting
- Anything broken? Run sogni-agent doctor first. It checks Node, credentials (and file permissions), config-dir writability, ffmpeg, live auth, and version freshness, with a fix in every failure detail.
- Auth errors: check SOGNI_API_KEY or ~/.config/sogni/credentials (key from https://dashboard.sogni.ai, account menu).
- Video size errors: sizes are model-specific (WAN ÷16 min 480 max 1536; LTX ÷64, long side ≤2048). The CLI auto-adjusts for local refs; --strict-size makes it fail with a suggested size instead. Details in references/models.md.
- Timeouts: try a faster model or raise -t.
- No workers: check https://sogni.ai for network status.
Reference Index (read before acting)
| Read this | When the task involves |
|---|---|
| references/video-prompting.md | Writing any LTX video prompt; "hd/1080p/4k" requests; orientation/aspect mapping; camera language |
| references/video-editing.md | Animate between images, continue/bridge videos, 360 turnarounds, concat, audio remix/layering, v2v ControlNet |
| references/hosted-api.md | --api-chat, --durable-chat, --api-workflow, workflow templates, replays, Seedance reference modes, cost controls |
| references/models.md | Choosing models, sizing/divisibility rules, gpt-image-2 limits, music model options |
| references/personas-memory.md | Persona CRUD/voice cloning, multi-persona scenes, memories, personality, style transfer, photo restoration |
| references/openclaw-config.md | OpenClaw plugin config defaults and overrides |
| skills/README.md | Hosted per-skill tool surface (for hosts that load focused capability subsets) |
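- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install sogni-creative-agent-skill
- Once installed, call the Skill by name or trigger it with /sogni-creative-agent-skill
- Provide the inputs described in the Skill's parameter notes to get structured output.
What is Sogni Creative Agent Skill?
Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named p... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 42 downloads to date.
How do I install Sogni Creative Agent Skill?
Run the command "/install sogni-creative-agent-skill" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.
Is Sogni Creative Agent Skill free?
Yes. Sogni Creative Agent Skill is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does Sogni Creative Agent Skill support?
Sogni Creative Agent Skill is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed (darwin, linux, win32).
Who developed Sogni Creative Agent Skill?
Developed and maintained by Mauvis Ledford (@krunkosaurus); current version v3.6.0.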