Baoyu Imagine
/install baoyu-imagine
Image Generation (AI SDK)
Official API-based image generation. Supports OpenAI GPT Image 2, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), Z.AI GLM-Image, MiniMax, Jimeng (即梦), Seedream (豆包) and Replicate.
User Input Tools
When this skill prompts the user, follow this tool-selection rule (priority order):
- Prefer built-in user-input tools exposed by the current agent runtime, e.g. `AskUserQuestion`, `request_user_input`, `clarify`, `ask_user`, or any equivalent.
- Fallback: if no such tool exists, emit a numbered plain-text message and ask the user to reply with the chosen number/answer for each question.
- Batching: if the tool supports multiple questions per call, combine all applicable questions into a single call; if it accepts only one question per call, ask them one at a time in priority order.
Concrete AskUserQuestion references below are examples — substitute the local equivalent in other runtimes.
Script Directory
{baseDir} = this SKILL.md's directory. Main script: {baseDir}/scripts/main.ts. Resolve ${BUN_X}: prefer bun; else npx -y bun; else suggest brew install oven-sh/bun/bun.
Step 0: Load Preferences ⛔ BLOCKING
This step MUST complete before any image generation — generation is blocked until EXTEND.md exists.
Check these paths in order; first hit wins:
| Path | Scope |
|---|---|
| `.baoyu-skills/baoyu-imagine/EXTEND.md` | Project |
| `${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-imagine/EXTEND.md` | XDG |
| `$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md` | User home |
- Found → load, parse, apply. If `default_model.[provider]` is null → ask for the model only.
- Not found → run first-time setup (`references/config/first-time-setup.md`) using AskUserQuestion to collect provider + model + quality + save location. Save EXTEND.md, then continue. Do not generate images before this completes.
Legacy compatibility: if .baoyu-skills/baoyu-image-gen/EXTEND.md exists and the new path doesn't, the runtime renames it to baoyu-imagine. If both exist, the runtime leaves them alone and uses the new path.
EXTEND.md keys: default provider, default quality, default aspect ratio, default image size, OpenAI image API dialect, default models, batch worker cap, provider-specific batch limits. Schema: references/config/preferences-schema.md.
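The first-hit-wins lookup in Step 0 can be sketched as below. This is a minimal illustration, not the skill's actual loader: the names `candidatePaths` and `resolveExtendPath` are hypothetical, the project path is assumed to be relative to the current working directory, and the `fileExists` callback is injected so the logic can be exercised without touching disk.

```typescript
// Hypothetical sketch of the Step 0 lookup order: project, then XDG, then user home.
type PreferenceScope = "project" | "xdg" | "home";

interface PreferenceHit {
  scope: PreferenceScope;
  path: string;
}

const EXTEND_SUFFIX = "baoyu-skills/baoyu-imagine/EXTEND.md";

function candidatePaths(
  cwd: string,
  env: { HOME: string; XDG_CONFIG_HOME?: string },
): PreferenceHit[] {
  // Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: fall back when unset or empty.
  const xdgBase = env.XDG_CONFIG_HOME ? env.XDG_CONFIG_HOME : `${env.HOME}/.config`;
  return [
    { scope: "project", path: `${cwd}/.${EXTEND_SUFFIX}` }, // assumption: project root is cwd
    { scope: "xdg", path: `${xdgBase}/${EXTEND_SUFFIX}` },
    { scope: "home", path: `${env.HOME}/.${EXTEND_SUFFIX}` },
  ];
}

function resolveExtendPath(
  cwd: string,
  env: { HOME: string; XDG_CONFIG_HOME?: string },
  fileExists: (path: string) => boolean,
): PreferenceHit | null {
  for (const candidate of candidatePaths(cwd, env)) {
    if (fileExists(candidate.path)) return candidate; // first hit wins
  }
  return null; // nothing found: the caller should run first-time setup
}
```

A `null` result corresponds to the "Not found" branch, where generation stays blocked until first-time setup has saved an EXTEND.md.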
Usage
Minimum working examples — see references/usage-examples.md for the full set including per-provider invocations and batch mode.
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio and high quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9 --quality 2k
# Prompt from files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference image
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider dashscope --model qwen-image-2.0-pro
# OpenAI GPT Image 2
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai --model gpt-image-2
# Batch mode
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4
Options
| Option | Description |
|---|---|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required in single-image mode) |
| `--batchfile <path>` | JSON batch file for multi-image generation |
| `--jobs <count>` | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| `--provider google\|openai\|azure\|openrouter\|dashscope\|zai\|minimax\|jimeng\|seedream\|replicate` | Force provider (default: auto-detect) |
| `--model <id>`, `-m` | Model ID; see provider references for defaults and allowed values |
| `--ar <ratio>` | Aspect ratio (16:9, 1:1, 4:3, …) |
| `--size <WxH>` | Explicit size (e.g., 1024x1024; for gpt-image-2, width/height must be multiples of 16, max edge 3840px, ratio no wider than 3:1) |
| `--quality normal\|2k` | Quality preset (default: 2k) |
| `--imageSize 1K\|2K\|4K` | Image size for Google/OpenRouter (default: from quality) |
| `--imageApiDialect openai-native\|ratio-metadata` | OpenAI-compatible endpoint dialect; use ratio-metadata for gateways that expect aspect-ratio size plus metadata.resolution |
| `--ref <files...>` | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate supported families, MiniMax subject-reference, Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, SeedEdit 3.0 |
| `--n <count>` | Number of images. Replicate requires `--n 1` (single-output save semantics) |
| `--json` | JSON output |
Environment Variables
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key |
| `OPENROUTER_API_KEY` | OpenRouter API key |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope API key |
| `ZAI_API_KEY` (alias `BIGMODEL_API_KEY`) | Z.AI API key |
| `MINIMAX_API_KEY` | MiniMax API key |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `JIMENG_ACCESS_KEY_ID`, `JIMENG_SECRET_ACCESS_KEY` | Jimeng (即梦) Volcengine credentials |
| `ARK_API_KEY` | Seedream (豆包) Volcengine ARK API key |
| `<PROVIDER>_IMAGE_MODEL` | Per-provider model override (`OPENAI_IMAGE_MODEL`, `GOOGLE_IMAGE_MODEL`, `DASHSCOPE_IMAGE_MODEL`, `ZAI_IMAGE_MODEL`/`BIGMODEL_IMAGE_MODEL`, `MINIMAX_IMAGE_MODEL`, `OPENROUTER_IMAGE_MODEL`, `REPLICATE_IMAGE_MODEL`, `JIMENG_IMAGE_MODEL`, `SEEDREAM_IMAGE_MODEL`) |
| `AZURE_OPENAI_DEPLOYMENT` (alias `AZURE_OPENAI_IMAGE_MODEL`) | Azure default deployment |
| `<PROVIDER>_BASE_URL` | Per-provider endpoint override |
| `AZURE_API_VERSION` | Azure image API version (default 2025-04-01-preview) |
| `JIMENG_REGION` | Jimeng region (default cn-north-1) |
| `OPENAI_IMAGE_API_DIALECT` | `openai-native` \| `ratio-metadata` |
| `OPENROUTER_HTTP_REFERER`, `OPENROUTER_TITLE` | Optional OpenRouter attribution |
| `BAOYU_IMAGE_GEN_MAX_WORKERS` | Override batch worker cap |
| `BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY` | Per-provider concurrency (e.g., `BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY`) |
| `BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS` | Per-provider start-gap |
Load priority: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`
Model Resolution
Priority (highest → lowest) applies to every provider:
- CLI flag `--model <id>`
- EXTEND.md `default_model.[provider]`
- Env var `<PROVIDER>_IMAGE_MODEL`
- Built-in default
For OpenAI, the built-in default is gpt-image-2. gpt-image-1.5, gpt-image-1, and GPT Image snapshots remain selectable with --model or OPENAI_IMAGE_MODEL.
For Azure, --model / default_model.azure is the Azure deployment name. AZURE_OPENAI_DEPLOYMENT is the preferred env var; AZURE_OPENAI_IMAGE_MODEL is kept as a backward-compatible alias. If your Azure deployment is named after the underlying model, use gpt-image-2; otherwise use the exact custom deployment name.
EXTEND.md overrides env vars: if EXTEND.md sets default_model.google: "gemini-3-pro-image-preview" and the env var sets GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview, EXTEND.md wins.
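The precedence above can be sketched as a small pure function. This is an illustrative sketch only: `resolveModel` and the `ModelSources` shape are hypothetical names, and a null EXTEND.md value simply falls through here (the interactive "ask model only" prompt from Step 0 is not modeled).

```typescript
// Hypothetical sketch of model resolution: CLI > EXTEND.md > env var > built-in default.
interface ModelSources {
  cliModel?: string;            // --model <id>
  extendModel?: string | null;  // EXTEND.md default_model.[provider]
  envModel?: string;            // <PROVIDER>_IMAGE_MODEL
  builtinDefault: string;       // provider's built-in default
}

type ModelSource = "cli" | "extend" | "env" | "builtin";

function resolveModel(s: ModelSources): { model: string; source: ModelSource } {
  if (s.cliModel) return { model: s.cliModel, source: "cli" };
  if (s.extendModel) return { model: s.extendModel, source: "extend" };
  if (s.envModel) return { model: s.envModel, source: "env" };
  return { model: s.builtinDefault, source: "builtin" };
}

// The EXTEND.md-over-env case from the text; the built-in value is a placeholder.
const googleChoice = resolveModel({
  extendModel: "gemini-3-pro-image-preview",
  envModel: "gemini-3.1-flash-image-preview",
  builtinDefault: "placeholder-default",
});
console.log(`Using google / ${googleChoice.model}`);
```

Returning the winning source alongside the model makes it easy to explain to the user where a choice came from.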
Display model info before each generation:
Using [provider] / [model]
Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL
OpenAI-Compatible Gateway Dialects
provider=openai means the auth and routing entrypoint is OpenAI-compatible. It does not guarantee the upstream image API uses OpenAI native semantics. When a gateway expects a different wire format, set default_image_api_dialect in EXTEND.md, OPENAI_IMAGE_API_DIALECT, or --imageApiDialect:
- `openai-native`: pixel `size` (`1536x1024`) and native OpenAI quality fields
- `ratio-metadata`: aspect-ratio `size` (`16:9`) plus `metadata.resolution` (`1K|2K|4K`) and `metadata.orientation`
Use openai-native for the OpenAI native API or strict clones; try ratio-metadata for compatibility gateways in front of Gemini or similar models. Current limitation: ratio-metadata applies only to text-to-image; reference-image edits still need openai-native or a provider with first-class edit support.
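To make the difference between the two dialects concrete, here is a hedged sketch of how a request body might be shaped for each. Only the field names `size`, `metadata.resolution`, and `metadata.orientation` come from the description above; the `buildRequest` helper, the `ImageOptions` type, and the orientation values (`landscape`, `portrait`, `square`) are assumptions for illustration and may not match what a given gateway expects.

```typescript
// Hypothetical sketch: same user intent, two wire formats.
type Dialect = "openai-native" | "ratio-metadata";

interface ImageOptions {
  prompt: string;
  pixelSize: string;                 // e.g. "1536x1024", used by openai-native
  ratio: string;                     // e.g. "16:9", used by ratio-metadata
  quality: string;                   // native OpenAI quality, e.g. "high"
  resolution: "1K" | "2K" | "4K";    // used by ratio-metadata
}

interface ImageRequest {
  prompt: string;
  size: string;
  quality?: string;
  metadata?: { resolution: "1K" | "2K" | "4K"; orientation: string };
}

// Assumed orientation vocabulary; check the gateway's own documentation.
function orientationOf(ratio: string): string {
  const parts = ratio.split(":").map(Number);
  const w = parts[0] ?? 1;
  const h = parts[1] ?? 1;
  if (w > h) return "landscape";
  if (w < h) return "portrait";
  return "square";
}

function buildRequest(dialect: Dialect, o: ImageOptions): ImageRequest {
  if (dialect === "openai-native") {
    // Pixel size plus the native quality field.
    return { prompt: o.prompt, size: o.pixelSize, quality: o.quality };
  }
  // Aspect ratio in `size`, resolution and orientation carried in metadata.
  return {
    prompt: o.prompt,
    size: o.ratio,
    metadata: { resolution: o.resolution, orientation: orientationOf(o.ratio) },
  };
}
```

The point of the sketch is that `size` means different things in each dialect, which is why a gateway in front of a non-OpenAI model can reject an otherwise valid native request.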
Provider-Specific Guides
Each provider has its own quirks (model families, size rules, ref support, limits). Read these when the user picks that provider or asks for non-default behavior:
| Provider | Reference |
|---|---|
| DashScope (Qwen-Image families, custom sizes) | references/providers/dashscope.md |
| Z.AI (GLM-Image, cogview-4) | references/providers/zai.md |
| MiniMax (image-01, subject-reference) | references/providers/minimax.md |
| OpenRouter (multimodal models, /chat/completions flow) | references/providers/openrouter.md |
| Replicate (nano-banana, Seedream, Wan) | references/providers/replicate.md |
Provider Selection
- `--ref` provided + no `--provider` → auto-select Google → OpenAI → Azure → OpenRouter → Replicate → Seedream → MiniMax (MiniMax's subject reference is more specialized toward character/portrait consistency)
- `--provider` specified → use it (if `--ref`, must be google/openai/azure/openrouter/replicate/seedream/minimax)
- Only one API key present → use that provider
- Multiple keys → default priority: Google → OpenAI → Azure → OpenRouter → DashScope → Z.AI → MiniMax → Replicate → Jimeng → Seedream
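The selection rules above can be sketched as follows. This is a simplified illustration under stated assumptions: `selectProvider` is a hypothetical name, `available` stands for the providers whose credentials were detected, and raising an error when no reference-capable provider is available is an assumption based on the error-handling rules rather than confirmed behavior.

```typescript
// Hypothetical sketch of provider auto-selection.
const REF_PRIORITY = [
  "google", "openai", "azure", "openrouter", "replicate", "seedream", "minimax",
] as const;

const DEFAULT_PRIORITY = [
  "google", "openai", "azure", "openrouter", "dashscope",
  "zai", "minimax", "replicate", "jimeng", "seedream",
] as const;

interface SelectionInput {
  provider?: string;    // --provider, if given
  hasRef: boolean;      // true when --ref was passed
  available: string[];  // providers with credentials present
}

function selectProvider(input: SelectionInput): string {
  const refCapable: readonly string[] = REF_PRIORITY;
  if (input.provider) {
    if (input.hasRef && refCapable.indexOf(input.provider) === -1) {
      throw new Error(`--ref is not supported by provider "${input.provider}"`);
    }
    return input.provider; // an explicit choice always wins
  }
  const order: readonly string[] = input.hasRef ? REF_PRIORITY : DEFAULT_PRIORITY;
  for (const candidate of order) {
    if (input.available.indexOf(candidate) !== -1) return candidate;
  }
  throw new Error("No usable provider: set an API key or pass --provider");
}
```

Note that the two orderings differ: with only Seedream and MiniMax keys present, a `--ref` request would land on Seedream, while a plain text-to-image request would land on MiniMax.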
Quality Presets
| Preset | Google imageSize | OpenAI size | OpenRouter size | Replicate resolution | Use case |
|---|---|---|---|---|---|
| `normal` | 1K | 1024px target | 1K | 1K | Quick previews |
| `2k` (default) | 2K | 2048px target | 2K | 2K | Covers, illustrations, infographics |
Google/OpenRouter imageSize can be overridden with --imageSize 1K|2K|4K.
For OpenAI native gpt-image-2, normal maps to quality=medium and a low-latency valid size near the requested aspect ratio; 2k maps to quality=high and 2048px-class sizes such as 2048x2048, 2048x1152, or 1152x2048. Use explicit --size for valid custom or 4K outputs, e.g. 3840x2160.
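The gpt-image-2 rules listed for `--size` (multiples of 16, max edge 3840px, ratio no wider than 3:1) can be checked before sending a request. The sketch below is a hypothetical pre-flight validator, not the skill's own code; it covers only those three stated rules, assumes the ratio limit applies in both orientations, and the API may enforce further constraints not captured here.

```typescript
// Hypothetical pre-flight check for an explicit gpt-image-2 --size value.
// Returns a list of problems; an empty list means the three documented rules pass.
function validateGptImage2Size(size: string): string[] {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return [`"${size}" is not in WxH form`];

  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width === 0 || height === 0) return ["width and height must be positive"];

  const problems: string[] = [];
  if (width % 16 !== 0 || height % 16 !== 0) {
    problems.push("width and height must be multiples of 16");
  }
  const longEdge = Math.max(width, height);
  const shortEdge = Math.min(width, height);
  if (longEdge > 3840) {
    problems.push("longest edge must not exceed 3840px");
  }
  if (longEdge > 3 * shortEdge) {
    // Assumption: the 3:1 limit is symmetric (also applies to tall images).
    problems.push("aspect ratio must not be wider than 3:1");
  }
  return problems;
}

console.log(validateGptImage2Size("3840x2160")); // []
console.log(validateGptImage2Size("1000x1000")); // one problem: not multiples of 16
```

Collecting every violation instead of stopping at the first one gives the user a single, complete fix hint.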
Aspect Ratios
Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1.
- Google multimodal: `imageConfig.aspectRatio`
- OpenAI: `gpt-image-2` uses the closest valid custom size for the requested ratio; older GPT Image and DALL·E models use their closest supported fixed size
- OpenRouter: `imageGenerationOptions.aspect_ratio`; if only `--size <WxH>` is given, the ratio is inferred
- Replicate: behavior is model-specific: `google/nano-banana*` uses `aspect_ratio`, `bytedance/seedream-*` uses documented Replicate ratios, Wan 2.7 maps `--ar` to a concrete `size`
- MiniMax: official `aspect_ratio` values; if `--size <WxH>` is given without `--ar`, sends `width`/`height` for `image-01`
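Inferring a ratio from `--size <WxH>`, as described for OpenRouter, can be done by reducing the two dimensions by their greatest common divisor. The sketch below shows that one technique with a hypothetical `inferAspectRatio` helper; the actual implementation may instead snap to the nearest supported ratio, which this example does not attempt.

```typescript
// Euclid's algorithm for the greatest common divisor.
function gcd(a: number, b: number): number {
  while (b !== 0) {
    const remainder = a % b;
    a = b;
    b = remainder;
  }
  return a;
}

// Hypothetical helper: reduce "WxH" to its exact "w:h" ratio, or null if unparsable.
function inferAspectRatio(size: string): string | null {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return null;
  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width === 0 || height === 0) return null;
  const divisor = gcd(width, height);
  return `${width / divisor}:${height / divisor}`;
}

console.log(inferAspectRatio("1920x1080")); // "16:9"
console.log(inferAspectRatio("1024x1024")); // "1:1"
```

Exact reduction can yield ratios outside the supported list (for example `1000x600` reduces to `5:3`), so a real caller would still need to decide how to handle those.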
Generation Mode
Default: sequential. Batch parallel: enabled automatically when --batchfile contains 2+ pending tasks.
| Situation | Prefer | Why |
|---|---|---|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead, easier debugging |
| Multiple images with saved prompt files | Batch (`--batchfile`) | Reuses finalized prompts, applies shared throttling/retries, predictable throughput |
| Each image still needs its own reasoning / prompt writing / style exploration | Subagents | Work is still exploratory, each needs independent analysis |
| Input is outline.md + prompts/ (e.g. from baoyu-article-illustrator) | Batch; use scripts/build-batch.ts to assemble the payload | The outline + prompt files already contain everything needed |
Rule of thumb: once prompt files are saved and the task is "generate all of these", prefer batch over subagents. Use subagents only when generation is coupled with per-image thinking or divergent creative exploration.
Parallel behavior:
- Default worker count is automatic, capped by config, built-in default 10
- Provider-specific throttling applies only in batch mode; defaults are tuned for throughput while avoiding RPM bursts
- Override with `--jobs <count>`
- Each image retries up to 3 attempts
- Final output includes success count, failure count, and per-image failure reasons
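The retry and reporting behavior above can be illustrated with a small sketch. It is deliberately simplified: the hypothetical `runWithRetries` runs tasks sequentially and synchronously, whereas the real batch runner works in parallel with provider throttling, and the `BatchSummary` shape is an assumption rather than the tool's actual output format.

```typescript
// Hypothetical sketch: up to 3 attempts per image, then a summary with failure reasons.
interface BatchTask {
  id: string;
  run: () => void; // throws on failure
}

interface BatchSummary {
  succeeded: number;
  failed: number;
  failures: { id: string; reason: string }[];
}

function runWithRetries(tasks: BatchTask[], maxAttempts = 3): BatchSummary {
  const summary: BatchSummary = { succeeded: 0, failed: 0, failures: [] };
  for (const task of tasks) {
    let succeeded = false;
    let lastReason = "";
    for (let attempt = 1; attempt <= maxAttempts && !succeeded; attempt++) {
      try {
        task.run();
        succeeded = true;
      } catch (err) {
        // Remember only the most recent failure reason for the report.
        lastReason = err instanceof Error ? err.message : String(err);
      }
    }
    if (succeeded) {
      summary.succeeded++;
    } else {
      summary.failed++;
      summary.failures.push({ id: task.id, reason: lastReason });
    }
  }
  return summary;
}
```

A task that fails twice and succeeds on the third attempt counts as a success; only tasks that exhaust all attempts appear in `failures`.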
Error Handling
- Missing API key → error with setup instructions
- Generation failure → auto-retry up to 3 attempts per image
- Invalid aspect ratio → warning, proceed with default
- Reference images with unsupported provider/model → error with fix hint
References
| File | Content |
|---|---|
| references/usage-examples.md | Extended CLI examples across providers and batch mode |
| references/providers/dashscope.md | DashScope families, sizes, limits |
| references/providers/zai.md | Z.AI GLM-image / cogview-4 |
| references/providers/minimax.md | MiniMax image-01 + subject reference |
| references/providers/openrouter.md | OpenRouter multimodal flow |
| references/providers/replicate.md | Replicate supported families + guardrails |
| references/config/preferences-schema.md | EXTEND.md schema |
| references/config/first-time-setup.md | First-time setup flow |
Extension Support
Custom configurations via EXTEND.md. See Step 0 for paths and schema.
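- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install baoyu-imagine
- Once installation finishes, invoke the Skill by name or trigger it with /baoyu-imagine
- Provide the inputs required by the Skill's parameter documentation to get structured output
What is Baoyu Imagine?
AI image generation with OpenAI GPT Image 2, Azure OpenAI, Google, OpenRouter, DashScope, Z.AI GLM-Image, MiniMax, Jimeng, Seedream and Replicate APIs. Suppo... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 1004 cumulative downloads to date.
How do I install Baoyu Imagine?
Run the command "/install baoyu-imagine" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is Baoyu Imagine free?
Yes. Baoyu Imagine is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.
Which platforms does Baoyu Imagine support?
Baoyu Imagine is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Baoyu Imagine?
It is developed and maintained by Jim Liu 宝玉 (@jimliu); the current version is v1.104.0.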