Description

Generate or edit images with Volcengine Seedream (Doubao). Use for image creation requests incl. text-to-image, image-to-image, multi-reference fusion, seque...

README (SKILL.md)

Seedream Image Generation

Name: Seedream Volcengine
Author: cyberkurry

Generate and edit images via Volcengine Doubao-Seedream API using {baseDir}/scripts/generate_image.py.

This skill wraps the Volcengine Ark Image Generation API. It supports text-to-image, image editing, multi-reference fusion, sequential/group generation, PNG output, prompt optimization, web search, and streaming output. Works with Seedream 5.0-lite (default), 4.5, and 4.0 models.

Quick start

# Text to image (simplest form)
uv run {baseDir}/scripts/generate_image.py --prompt "一只赛博朋克风格的猫"

# Image editing (with reference image)
uv run {baseDir}/scripts/generate_image.py --prompt "变为水墨画风格" --image "https://example.com/photo.jpg"

# Multi-reference fusion
uv run {baseDir}/scripts/generate_image.py --prompt "融合风格" --image "https://a.jpg" --image "https://b.jpg"

# Group/sequential generation
uv run {baseDir}/scripts/generate_image.py --prompt "四格漫画" --sequential --max-images 4

# List available models
uv run {baseDir}/scripts/generate_image.py --list-models

API key

Set VOLC_API_KEY env var or pass --api-key. Never hardcode keys in scripts.

export VOLC_API_KEY="your-key"
# or
--api-key your-key
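
The lookup described above can be sketched in a few lines of Python. This is an illustration only, not the code in generate_image.py: the helper name `resolve_api_key` is hypothetical, and the precedence (flag first, then environment) is an assumption.

```python
import os


def resolve_api_key(cli_value=None, env=os.environ):
    """Return the API key from --api-key if given, else from VOLC_API_KEY.

    Assumption: the command-line value takes precedence over the environment.
    """
    key = cli_value or env.get("VOLC_API_KEY")
    if not key:
        # Same wording as the documented "API key required" error.
        raise ValueError("API key required")
    return key
```

Passing `env` as a parameter keeps the helper easy to test without touching the real environment.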

Models

Default model: doubao-seedream-5-0-260128 (5.0-lite). Override with --model.

Model comparison

Feature	5.0-lite	4.5	4.0
Model ID	`doubao-seedream-5-0-260128`	`doubao-seedream-4-5-251128`	`doubao-seedream-4-0-250828`
Alias	`doubao-seedream-5-0-lite-260128`	—	—
Preset sizes	2K, 3K, 4K	2K, 4K	1K, 2K, 4K
Output formats	png, jpeg	jpeg only	jpeg only
Prompt optimization	standard	standard	standard, fast
Streaming	✅	✅	✅
Web search	✅ (only model)	❌	❌
Rate limit	500 IPM	500 IPM	500 IPM

When to choose which model

5.0-lite (default) — Best overall quality. Only model supporting PNG output, 3K resolution, and web search. Use for most tasks.
4.5 — High quality, slightly different style. Good alternative to 5.0-lite.
4.0 — Supports --prompt-optimization fast for quicker results. Good for rapid iteration. Also supports 1K for smaller/faster generation.
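
The selection rules above can be expressed as a small lookup. The following is a minimal sketch, not part of the skill: `CAPABILITIES` and `pick_model` are hypothetical names, and the values are copied from the model comparison table.

```python
# Hypothetical helper, not part of generate_image.py.
# Capability values are copied from the model comparison table.
CAPABILITIES = {
    "doubao-seedream-5-0-260128": {
        "presets": {"2K", "3K", "4K"},
        "formats": {"png", "jpeg"},
        "prompt_optimization": {"standard"},
        "web_search": True,
    },
    "doubao-seedream-4-5-251128": {
        "presets": {"2K", "4K"},
        "formats": {"jpeg"},
        "prompt_optimization": {"standard"},
        "web_search": False,
    },
    "doubao-seedream-4-0-250828": {
        "presets": {"1K", "2K", "4K"},
        "formats": {"jpeg"},
        "prompt_optimization": {"standard", "fast"},
        "web_search": False,
    },
}


def pick_model(preset="2K", output_format="jpeg", optimization="standard", web_search=False):
    """Return the first model ID (in table order) that supports every requested option."""
    for model_id, caps in CAPABILITIES.items():
        if preset not in caps["presets"]:
            continue
        if output_format not in caps["formats"]:
            continue
        if optimization not in caps["prompt_optimization"]:
            continue
        if web_search and not caps["web_search"]:
            continue
        return model_id
    raise ValueError("no Seedream model supports this combination of options")
```

With no arguments the sketch returns the 5.0-lite ID, and asking for `optimization="fast"` or `preset="1K"` falls through to 4.0, which matches the guidance above.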

Size and resolution

Two ways to specify size (cannot mix):

Method 1 — Preset (recommended for most cases):

--size 2K    # 2048x2048 (default)
--size 3K    # 3072x3072 (5.0-lite only)
--size 4K    # 4096x4096 (all models)
--size 1K    # 1024x1024 (4.0 only)

Method 2 — Pixel format (for custom aspect ratios):

--size 3840x2160    # 16:9 widescreen
--size 1080x1920    # 9:16 portrait (4.0 only; below the 5.0-lite/4.5 pixel minimum)
--size 1200x1200    # custom square (4.0 only; below the 5.0-lite/4.5 pixel minimum)

Resolution table (preset sizes × aspect ratios)

When using preset sizes, the model picks the best aspect ratio based on your prompt. For precise control, use pixel format.

Preset	1:1	4:3	3:4	16:9	9:16	3:2	2:3	21:9
1K	1024×1024	1152×864	864×1152	1280×720	720×1280	1248×832	832×1248	1512×648
2K	2048×2048	2304×1728	1728×2304	2848×1600	1600×2848	2496×1664	1664×2496	3136×1344
3K	3072×3072	3456×2592	2592×3456	4096×2304	2304×4096	3744×2496	2496×3744	4704×2016
4K	4096×4096	4704×3520	3520×4704	5504×3040	3040×5504	4992×3328	3328×4992	6240×2656
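
To pin an exact aspect ratio, the table can be turned into a lookup that yields a value for `--size`. This is a sketch under the assumption that the table values are accepted as pixel-format sizes. `RATIOS`, `ROWS`, and `size_flag` are hypothetical names, and the 1K row only applies to 4.0.

```python
# Hypothetical lookup built from the resolution table; not part of the skill.
RATIOS = ["1:1", "4:3", "3:4", "16:9", "9:16", "3:2", "2:3", "21:9"]
ROWS = {
    "1K": "1024x1024 1152x864 864x1152 1280x720 720x1280 1248x832 832x1248 1512x648",
    "2K": "2048x2048 2304x1728 1728x2304 2848x1600 1600x2848 2496x1664 1664x2496 3136x1344",
    "3K": "3072x3072 3456x2592 2592x3456 4096x2304 2304x4096 3744x2496 2496x3744 4704x2016",
    "4K": "4096x4096 4704x3520 3520x4704 5504x3040 3040x5504 4992x3328 3328x4992 6240x2656",
}


def size_flag(preset, ratio):
    """Return the WIDTHxHEIGHT string for a preset and aspect ratio from the table."""
    return ROWS[preset].split()[RATIOS.index(ratio)]
```

For example, `size_flag("2K", "16:9")` gives the string to pass as `--size` for a 2K widescreen image.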

Pixel format constraints

5.0-lite: total pixels 3,686,400 to 16,777,216; aspect ratio 1/16 to 16
4.5: total pixels 3,686,400 to 16,777,216; aspect ratio 1/16 to 16
4.0: total pixels 921,600 to 16,777,216; aspect ratio 1/16 to 16

Examples:

✅ 3840x2160 → 8,294,400 pixels, ratio 1.78 → valid for all models
✅ 2160x3840 → same pixels, ratio 0.56 → valid for all models
❌ 1500x1500 → 2,250,000 pixels < 3,686,400 minimum → invalid for 5.0-lite/4.5
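
The constraints above are easy to check before calling the script. The following is a minimal sketch, not the script's own validation: `PIXEL_LIMITS` and `check_pixel_size` are hypothetical names, the short model keys are shorthand for the model IDs, and both bounds are assumed to be inclusive.

```python
# Hypothetical pre-flight check; limits copied from the constraints listed above.
PIXEL_LIMITS = {
    "5.0-lite": (3_686_400, 16_777_216),
    "4.5": (3_686_400, 16_777_216),
    "4.0": (921_600, 16_777_216),
}


def check_pixel_size(size, model="5.0-lite"):
    """Return (ok, reason) for a WIDTHxHEIGHT string under the documented limits."""
    width, height = (int(part) for part in size.lower().split("x"))
    low, high = PIXEL_LIMITS[model]
    total = width * height
    if not low <= total <= high:
        return False, f"{total:,} pixels is outside {low:,} to {high:,}"
    ratio = width / height
    if not 1 / 16 <= ratio <= 16:
        return False, f"aspect ratio {ratio:.2f} is outside 1/16 to 16"
    return True, "ok"
```

Running it on the examples above, `3840x2160` passes for every model, while `1500x1500` passes only with `model="4.0"`.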

Flags reference

Required

--prompt — Text description for image generation. Supports Chinese and English. Recommended under 300 Chinese characters or 600 English words.

Model and size

--model — Model ID (default: doubao-seedream-5-0-260128)
--size — Preset (1K/2K/3K/4K) or pixel (WIDTHxHEIGHT, default: 2K)

Image input

--image — Reference image URL or base64. Repeat for multiple images (max 14 total). Used for image editing and multi-reference fusion.

Supported input formats: jpeg, png, webp, bmp, tiff, gif, heic, heif. Constraints: max 30MB per image, max 60MP (width×height), aspect ratio [1/16, 16].

Output control

--output-format — jpeg (default) or png (5.0-lite only)
--response-format — url (default, 24h valid) or b64_json (base64 encoded)
--watermark / --no-watermark — Enable/disable "AI生成" watermark (default: enabled)

Generation mode

--sequential — Enable group/sequential generation (multiple related images)
--max-images — Number of images in sequential mode (1-15, default: 1)
--stream — Enable streaming output (get results as generated)
--web-search — Enable web search for real-time info (5.0-lite only). Uses tools: [{type: "web_search"}] API parameter.

Prompt enhancement

--prompt-optimization — standard (all models) or fast (4.0 only). Rewrites prompt for better results.

Utilities

--list-models — List available Seedream models and their capabilities
--api-key — Volcengine API key (or set VOLC_API_KEY env)

Use cases with complete examples

1. Text to image (basic)

uv run {baseDir}/scripts/generate_image.py --prompt "充满活力的特写肖像，模特眼神犀利，头戴雕塑感帽子，色彩拼接丰富，Vogue杂志封面美学"

2. Text to image (with specific size)

# 4K landscape
uv run {baseDir}/scripts/generate_image.py --prompt "壮丽的山川日出，金色阳光穿过云层" --size 4K

# Custom widescreen
uv run {baseDir}/scripts/generate_image.py --prompt "赛博朋克城市夜景" --size 3840x2160

# Portrait for phone wallpaper (9:16; 1080x1920 is below the pixel minimum of the default model)
uv run {baseDir}/scripts/generate_image.py --prompt "极简抽象艺术" --size 2160x3840

3. Image editing (single reference)

uv run {baseDir}/scripts/generate_image.py \
  --prompt "保持模特姿势和构图不变，将服装材质从银色金属改为完全透明的清水" \
  --image "https://example.com/original.jpg"

4. Multi-reference fusion

# Combine 2 images
uv run {baseDir}/scripts/generate_image.py \
  --prompt "将图1的服装换为图2的服装" \
  --image "https://example.com/person.jpg" \
  --image "https://example.com/clothing.jpg"

# Combine 3+ images
uv run {baseDir}/scripts/generate_image.py \
  --prompt "融合这三张图的风格特征，生成统一视觉" \
  --image "https://example.com/1.jpg" \
  --image "https://example.com/2.jpg" \
  --image "https://example.com/3.jpg"

5. Sequential/group generation

# Generate 4 related images
uv run {baseDir}/scripts/generate_image.py \
  --prompt "生成一组电影级科幻写实风的4张影视分镜" \
  --sequential --max-images 4

# With reference image
uv run {baseDir}/scripts/generate_image.py \
  --prompt "参考这张图，生成同角色在早晨、中午、晚上的连续画面" \
  --image "https://example.com/character.jpg" \
  --sequential --max-images 3

6. PNG output

uv run {baseDir}/scripts/generate_image.py \
  --prompt "高清产品图，白色背景，专业摄影灯光" \
  --output-format png

7. Prompt optimization

# Standard mode (higher quality, all models)
uv run {baseDir}/scripts/generate_image.py \
  --prompt "一只猫" \
  --prompt-optimization standard

# Fast mode (quicker, 4.0 only)
uv run {baseDir}/scripts/generate_image.py \
  --prompt "一只猫" \
  --model doubao-seedream-4-0-250828 \
  --prompt-optimization fast

8. Web search (real-time info, 5.0-lite only)

# Weather forecast image
uv run {baseDir}/scripts/generate_image.py \
  --prompt "制作一张上海未来5日的天气预报图，扁平化插画风格" \
  --web-search

# Current events
uv run {baseDir}/scripts/generate_image.py \
  --prompt "2026年最流行的时尚趋势海报" \
  --web-search

9. No watermark (commercial use)

uv run {baseDir}/scripts/generate_image.py \
  --prompt "产品宣传图，专业商业摄影" \
  --no-watermark

Prompt writing tips

Language: Supports Chinese and English. English tends to produce slightly better results for complex scenes, but Chinese works well too.
Length: Under 300 Chinese characters or 600 English words recommended. With longer prompts the model may ignore some details.
Specificity: Be specific. "赛博朋克风格的猫，霓虹灯光，雨夜街道，4K，电影感" > "一只猫"
Style keywords: Append style terms for control:
- 水墨画风格 (ink painting)
- 油画风格 (oil painting)
- 写实摄影 (photorealistic)
- 扁平化插画 (flat illustration)
- 电影感 (cinematic)
- 杂志封面 (magazine cover)
Structure: For complex scenes, describe subject → composition → style → lighting → quality
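
The subject → composition → style → lighting → quality ordering can be applied mechanically. This is only a sketch: `build_prompt` is a hypothetical helper, and joining the parts with a full-width comma is an assumption that suits Chinese prompts.

```python
def build_prompt(subject, composition="", style="", lighting="", quality="", sep="，"):
    """Join the non-empty parts in the recommended order: subject first, quality last."""
    parts = [subject, composition, style, lighting, quality]
    return sep.join(part for part in parts if part)
```

For an English prompt, pass `sep=", "` instead.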

Output interpretation

Script stdout contains exactly one of:

MEDIA_URL: <url> — Image download link. Valid for 24 hours. Use markdown: ![description](url)
MEDIA_B64: <base64> — Base64 encoded image data (when --response-format b64_json)
ERROR: <msg> — Error occurred. Check message for details.
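
A caller can separate these three cases with a simple prefix match. The following is a minimal sketch that assumes each result sits on its own line. `parse_output` is a hypothetical helper, and the exact stdout layout in sequential mode is not documented here.

```python
def parse_output(stdout):
    """Split script stdout into (kind, payload) pairs for the documented prefixes."""
    prefixes = {"MEDIA_URL:": "url", "MEDIA_B64:": "b64", "ERROR:": "error"}
    results = []
    for line in stdout.splitlines():
        line = line.strip()
        for prefix, kind in prefixes.items():
            if line.startswith(prefix):
                # Keep only the text after the prefix, without surrounding spaces.
                results.append((kind, line[len(prefix):].strip()))
                break
    return results
```

Lines without a known prefix (for example a `WARNING:` line) are ignored, so the caller only sees media results and errors.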

Error handling

Error	Cause	Fix
`API key required`	No key provided	Set `VOLC_API_KEY` env or pass `--api-key`
`API request failed: 401`	Invalid API key	Check key at Volcengine console
`API request failed: 429`	Rate limited (500 IPM)	Wait and retry
`No image data in response`	API error	Check prompt/params, retry
`WARNING: may not support`	Model+size/format mismatch	Check model capabilities table above
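
The "wait and retry" advice for 429 errors can be automated with exponential backoff. This is a sketch, not behaviour built into the script: `run_with_retry` is a hypothetical helper, `run` is any callable that returns the script's stdout as a string, and the delay values are arbitrary.

```python
import time


def run_with_retry(run, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call run() until its output has no rate-limit error, doubling the wait each time."""
    output = ""
    for attempt in range(attempts):
        output = run()
        if "API request failed: 429" not in output:
            return output
        if attempt < attempts - 1:
            # Wait 2s, 4s, 8s, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return output
```

Other errors are returned immediately, since retrying an invalid key or a bad parameter will not help.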

Workflow

1. Parse the user's image request to determine the mode: text-to-image, editing, fusion, or sequential.
2. Choose an appropriate model (default: 5.0-lite unless there is a specific need).
3. Determine the size: use a preset for general requests, pixel format for a specific aspect ratio.
4. Build and run the command with the appropriate flags.
5. Parse stdout for MEDIA_URL: or MEDIA_B64:.
6. Display the result to the user as a markdown image or file.
7. If the result is a URL, remind the user that it expires in 24 hours so they can save it in time.
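
The command-building step of this workflow can be sketched as a function that returns an argument list. `build_command` is a hypothetical helper that uses only flags documented in the flags reference. `script` stands for the resolved path to scripts/generate_image.py.

```python
def build_command(script, prompt, images=(), size=None, model=None, max_images=1):
    """Assemble an argv list for generate_image.py from a parsed request."""
    cmd = ["uv", "run", script, "--prompt", prompt]
    for image in images:
        # One --image flag per reference image (editing or fusion).
        cmd += ["--image", image]
    if model:
        cmd += ["--model", model]
    if size:
        cmd += ["--size", size]
    if max_images > 1:
        cmd += ["--sequential", "--max-images", str(max_images)]
    return cmd
```

The list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`, which avoids shell quoting problems with prompts that contain spaces or quotes.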

Input image requirements

Formats: jpeg, png, webp, bmp, tiff, gif, heic, heif
Max size: 30MB per image
Max pixels: 60,000,000 (width × height)
Aspect ratio: [1/16, 16]
Min dimension: > 14px per side
Max images: 14 per request
Total: input images + output images ≤ 15
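
These limits can be checked locally before uploading anything. The following is a minimal sketch: `check_inputs` is a hypothetical helper that takes image metadata the caller has already read, and "30MB" is interpreted as 30 × 1024 × 1024 bytes, which is an assumption.

```python
MAX_BYTES = 30 * 1024 * 1024  # assumption: "30MB" read as 30 MiB
MAX_PIXELS = 60_000_000


def check_inputs(images, outputs=1):
    """images: list of (size_bytes, width, height). Return a list of problems found."""
    problems = []
    if len(images) > 14:
        problems.append("more than 14 input images")
    if len(images) + outputs > 15:
        problems.append("input + output images exceed 15")
    for index, (size_bytes, width, height) in enumerate(images, start=1):
        if size_bytes > MAX_BYTES:
            problems.append(f"image {index}: larger than 30MB")
        if width <= 14 or height <= 14:
            problems.append(f"image {index}: a side is 14px or smaller")
            continue  # skip the remaining checks; they assume sane dimensions
        if width * height > MAX_PIXELS:
            problems.append(f"image {index}: more than 60,000,000 pixels")
        if not 1 / 16 <= width / height <= 16:
            problems.append(f"image {index}: aspect ratio outside 1/16 to 16")
    return problems
```

An empty list means the inputs fit the documented limits. File format is not checked here.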

Rate limits

All models: 500 images per minute (IPM). Plan batch operations accordingly.
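
For batch planning, the limit translates into simple arithmetic. This sketch assumes the quota is counted in fixed one-minute windows, which this document does not state. `windows_needed` and `min_interval_seconds` are hypothetical helpers.

```python
import math

IPM_LIMIT = 500  # images per minute, from the rate limit above


def windows_needed(total_images):
    """Number of one-minute windows needed to generate total_images."""
    return math.ceil(total_images / IPM_LIMIT)


def min_interval_seconds():
    """Average spacing between images that stays within the limit."""
    return 60 / IPM_LIMIT
```

A batch of 1,200 images therefore spans at least three windows, and an even pace means roughly one image every 0.12 seconds.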

Full reference

For detailed API parameter docs, see references/REFERENCE.md.

Usage Guidance

Install this if you trust Volcengine for image generation and are comfortable sending the prompts/images you choose to that provider. Use an environment variable for the API key, avoid sensitive reference images, watch usage costs especially with sequential generation, and verify uv/dependency installation through trusted channels.

Capability Analysis

Type: OpenClaw Skill
Name: seedream-volcengine
Version: 1.0.0

The skill is a legitimate wrapper for the Volcengine Seedream (Doubao) image generation API. The core logic in `scripts/generate_image.py` uses the standard `requests` library to interact with official Volcengine endpoints (ark.cn-beijing.volces.com) and handles authentication securely via environment variables. No evidence of data exfiltration, unauthorized execution, or malicious prompt injection was found in the code or documentation.

Capability Tags

requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The documented purpose and visible code align: generate/edit images through Volcengine Seedream, with explicit support for prompts, reference images, sequential generation, web search, and streaming.

✓ Instruction Scope

Instructions are mostly explicit command examples and option references; no artifact-backed prompt override, hidden autonomous behavior, or forced tool-use instruction was found.

ℹ Install Mechanism

There is no install spec, but usage relies on uv and the README suggests a remote uv installer command; this is user-directed setup rather than automatic execution.

ℹ Credentials

The skill requires a Volcengine API key and sends request data to the documented Volcengine endpoint; this is proportionate for the purpose, though registry metadata lists no primary credential/env var.

ℹ Persistence & Privilege

The visible code reads an API key from an argument or environment variable and does not show credential storage, background persistence, or local file indexing.

Version History

v1.0.0

Initial release. Text-to-image/Image-to-image/Multi-image fusion (14 reference images)/Batch image generation (15 output images)/PNG output (5.0-lite)/Prompt optimization (standard+fast)/Web search (5.0-lite)/Streaming output/Watermark control. Models: 5.0-lite/4.5/4.0, 1K-4K+ custom pixels.

Metadata

Slug seedream-volcengine

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Seedream Volcengine?

Generate or edit images with Volcengine Seedream (Doubao). Use for image creation requests incl. text-to-image, image-to-image, multi-reference fusion, seque... It is an AI Agent Skill for Claude Code / OpenClaw, with 52 downloads so far.

How do I install Seedream Volcengine?

Run "/install seedream-volcengine" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Seedream Volcengine free?

Yes, Seedream Volcengine is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Seedream Volcengine support?

Seedream Volcengine is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Seedream Volcengine?

It is built and maintained by CyberKurry (@cyberkurry); the current version is v1.0.0.
