← 返回 Skills 市场
sdk-team

Alibabacloud Bailian Image Creator

作者 alibabacloud-skills-team · GitHub ↗ · v0.0.1 · MIT-0
cross-platform ⚠ suspicious
91
总下载
0
收藏
0
当前安装
1
版本数
在 OpenClaw 中安装
/install alibabacloud-bailian-image-creator
功能描述
Generates, edits, and analyzes images using Alibaba Cloud DashScope API with supported qwen and wan2.7 models via predefined scripts.
使用说明 (SKILL.md)

AI Image Creator

Professional-grade AI image creation skill built on Alibaba Cloud DashScope API.

Warning: Before executing any image task, you must read and follow the "Mandatory Rules" section. Violations will cause task failure.

Mandatory Rules

First Principle: Must Use Existing Scripts

All image generation, editing, and understanding tasks must and can only be completed by running existing scripts in the scripts/ directory.

  • Do NOT write your own API call code -- Do not create new Python scripts to call DashScope API. You must use the scripts specified in the "Task-to-Script Mapping" table below
  • Do NOT use PIL/Pillow or other local libraries for image generation or editing (only allowed for auxiliary operations such as size validation)
  • Do NOT use third-party APIs -- Including but not limited to Pollinations.ai, Stability.ai, DALL-E, Midjourney
  • Do NOT create mock/simulated scripts -- Do not bypass real API calls in any way
  • Do NOT use deprecated models -- wanx-v1, wanx-v2, wan2.6-image and other legacy models are discontinued
  • Do NOT mix APIs -- Qwen models (qwen-) can only use MultiModalConversation.call(), Wanx models (wan2.7-) can only use ImageGeneration.call()
  • Do NOT use curl to call DashScope REST API -- Must call through provided scripts, do not write shell scripts as substitutes
  • Do NOT generate placeholder/blank images -- Do not use hardcoded byte streams, blank canvases, or PIL-scaled images to impersonate API-generated images
  • Do NOT claim success when API calls fail -- Must truthfully report failure reasons, do not generate false success reports

Task-to-Script Mapping (The Only Execution Method)

Based on user request keywords, look up the table to select the script. No other methods are allowed:

User Request Keywords Script API Default Model
Wanx, wan2.7, reference image + generation, style fusion, multi-image fusion, graffiti-on-car, virtual try-on wanx_generate.py ImageGeneration.call() wan2.7-image
4K HD text-to-image wanx_generate.py ImageGeneration.call() wan2.7-image-pro
A set of coherent images, image series generation wanx_generate.py ImageGeneration.call() wan2.7-image-pro
Text-to-image, generate image, draw a picture (no reference image) text_to_image.py MultiModalConversation.call() qwen-image-2.0-pro
Edit image (URL input) image_edit.py MultiModalConversation.call() qwen-image-edit-max
Edit image (local file) image_edit_base64.py MultiModalConversation.call() qwen-image-edit-max
Analyze/understand image content, describe image image_understanding.py chat.completions.create() qwen3.5-plus

Disambiguation Priority Rules (when multiple keywords match simultaneously):

  1. User mentions "Wanx" or "wan2.7" -> Use wanx_generate.py directly, regardless of whether "generate image" or similar words are present
  2. User provides reference image URLs and requests generating a new image -> Use wanx_generate.py (Wanx supports multi-image reference input)
  3. Text description only, no reference image -> Use text_to_image.py

Allowed Models:

Script Allowed Models
text_to_image.py qwen-image-2.0-pro, qwen-image-2.0
wanx_generate.py wan2.7-image-pro, wan2.7-image
image_edit.py / image_edit_base64.py qwen-image-edit-max, qwen-image-2.0-pro, qwen-image-edit-plus, qwen-image-edit
image_understanding.py qwen3.5-plus, qwen-vl-max, qwen-vl-plus

API Key Security Management

Scripts automatically handle key retrieval via api_key.py. The Agent does not need to and should not manually extract, set, or pass API Key values.

  1. Key retrieval is automated: Scripts internally call api_key.py to automatically obtain keys from config files/environment variables, or auto-create via CLI. The Agent only needs to run the script command
  2. Never hardcode any form of key: Including api_key = "sk-...", export DASHSCOPE_API_KEY="sk-...", and assigning keys in shell scripts
  3. Never extract keys from CLI output: The key value returned by aliyun modelstudio create-api-key is automatically saved by api_key.py. The Agent must not write this value into any script, variable, or file
  4. Never expose keys in any output: Including generated scripts, shell commands, log files (summary.md, task_summary.md, execution_log.md, etc.), and terminal output containing strings starting with sk-
  5. Never read or print keys from config files: Do not use cat ~/.aliyun/config.json, jq, python -c, or other commands to read and output API Key values from config files
  6. Never directly call CLI key commands: Do not directly run aliyun modelstudio create-api-key or aliyun modelstudio list-api-keys. These commands are only called internally by api_key.py
  7. Mandatory self-check before task completion: Run grep -rn "sk-" \x3Coutput_directory>/ to check all output files; if any strings starting with sk- are found (excluding "sk-xxx" placeholders), delete the affected files and regenerate

API Call Result Validation

  1. Check status code: API response status_code must be 200, otherwise abort the task
  2. Check output content: Response must contain valid image URLs (starting with http). No valid URL means failure
  3. Never fabricate results: Do not generate simulated data, false success reports, or placeholder image URLs
  4. Never mock/simulate: Do not create mock scripts or simulate API calls to bypass real calls
  5. Correct behavior on failure:
    • Print error information (status code, error code, error message)
    • Stop executing subsequent steps
    • If it's a key issue, api_key.py will automatically retry creation
    • If retries still fail, report the real error reason without concealment

Output Validation (Must execute before reporting task completion)

  1. Check output files: Confirm image files exist in the output directory and have reasonable size (typically > 10KB)
  2. Key leak scan: Run grep -rn "sk-" \x3Coutput_directory>/ to check all output files (including logs, summaries, scripts). If leaks are found, delete and regenerate
  3. No false claims: If API calls actually failed or were not executed, do not claim success in the task summary (e.g., listing checkmarks)
  4. Blank image detection: If output images are blank, solid color, or merely scaled versions of input images, treat as task failure

Output Size Control

Model Family Size Format Supported Values
qwen-image-2.0-pro / qwen-image-2.0 (text-to-image) "width*height" 1024*1024, 720*1280, 1280*720, 1440*720, 720*1440
qwen-image-edit-* / qwen-image-2.0-pro (editing) "width*height" Width and height each [512, 2048], freely combined
wan2.7-image-pro (text-to-image) Preset tier 1K, 2K, 4K
wan2.7-image-pro (editing/series) Preset tier 1K, 2K
wan2.7-image Preset tier 1K, 2K

Mapping Rules:

  1. User requests exact pixels (e.g., 1280x720) -> Choose Qwen series, set size='1280*720'
  2. User requests portrait/landscape -> Use the closest fixed size for Qwen text-to-image (e.g., portrait -> 720*1280)
  3. Wanx models -> size can only be 1K/2K/4K, not pixel values

Execution Flow

Step 1: Install Dependencies

pip install -r scripts/requirements.txt

Step 2: Select and Run Script Based on "Task-to-Script Mapping" Table

Selection Logic: First check if the user mentions "Wanx"/"wan2.7" or provides reference image URLs -> If yes, use wanx_generate.py; otherwise match by other keywords in the mapping table.

API Key is automatically managed by scripts (three-level retrieval: config file -> environment variable -> auto-create). No manual key handling is needed. The Agent only needs to run the commands in the table below. Do not write your own scripts, set environment variables, or use curl to call APIs.

Scenario Command Example
Text-to-image (Qwen) python scripts/text_to_image.py \x3Cprompt> [size] [model] python scripts/text_to_image.py 'An orange cat sitting on a windowsill, 8K' 1024*1024 qwen-image-2.0-pro
Text-to-image / Multi-image fusion (Wanx) python scripts/wanx_generate.py \x3Cprompt> [ref_image_URLs...] python scripts/wanx_generate.py 'Paint the graffiti from image 2 on the car in image 1' url1 url2
Image editing (URL input) python scripts/image_edit.py \x3Cinstruction> \x3Cimage_URL1> [URL2] [URL3] python scripts/image_edit.py 'Change the background to white' https://example.com/photo.png
Image editing (local file) python scripts/image_edit_base64.py \x3Cinstruction> \x3Clocal_image1> [image2] [image3] python scripts/image_edit_base64.py 'Change the background to white' ./photo.png
Image understanding python scripts/image_understanding.py \x3Cimage_URL> [question] python scripts/image_understanding.py https://example.com/photo.jpg 'What is in this image?'

Step 3: Return Results

Scripts will print the list of generated image URLs; text_to_image.py also auto-downloads images to the current directory.


Model Selection Guide

Condition Choice Reason
Need accurate in-image text rendering qwen-image-2.0-pro Qwen series has the strongest text rendering capability
Need 4K ultra-high definition wan2.7-image-pro Only Wanx pro text-to-image supports 4K
Need a set of style-consistent images wan2.7-image-pro + enable_sequential Wanx exclusive image series capability
Product/industrial design editing qwen-image-edit-max Stronger geometric reasoning and character consistency
Multi-image reference + composition wan2.7-image Wanx supports flexible multi-image reference editing
Image content analysis/Q&A qwen3.5-plus General multimodal understanding, cost-effective
Complex visual reasoning qwen-vl-max Stronger visual understanding capability
Low cost priority qwen-image-2.0 / wan2.7-image Standard version at lower cost
No special requirements (default) qwen-image-2.0-pro Most balanced overall capability

Prompt Design Guide

For detailed guide, see Prompt Design Guide

Prompt structure formula: [Subject description] + [Scene/environment] + [Lighting/mood] + [Composition/angle] + [Art style] + [Quality parameters]

Complete example:

A 30-year-old Asian male detective standing on a rainy night Tokyo street, wearing a dark gray trench coat, holding a black umbrella, neon red and blue lights reflecting on his face, raindrops dripping along the umbrella edge, puddles on the ground reflecting brilliant city lights, Rembrandt lighting, volumetric fog, cinematic composition, 35mm anamorphic lens, film grain, moody atmosphere, high detail, photorealistic, 8K

Negative prompt template (Qwen series supports negative_prompt, Wanx series does not):

low quality, blurry, pixelated, oversaturated, deformed hands, extra fingers, distorted face, uncanny valley, text, watermark, logo, signature, cropped, out of frame, worst quality

Reference Information

Environment & Dependencies

Package Version Constraint Purpose
dashscope ==1.25.16 Text-to-image, image editing, Wanx generation
openai ==2.23.0 Image understanding (OpenAI-compatible interface)

API Key Configuration

This skill manages DashScope API Keys through scripts/api_key.py, which automatically validates key format (detects invalid keys like sk-sp-).

Retrieval Priority: ~/.aliyun/config.json > Environment variable DASHSCOPE_API_KEY > Auto-create via Alibaba Cloud CLI

Method 1: Alibaba Cloud CLI Config File (Recommended)

Add a dashscope field to the current profile in ~/.aliyun/config.json:

{
  "current": "default",
  "profiles": [
    {
      "name": "default",
      "mode": "AK",
      "access_key_id": "...",
      "access_key_secret": "...",
      "dashscope": {
        "api_key": "sk-xxx",
        "api_key_id": "4359606"
      }
    }
  ]
}

Method 2: System Environment Variable

export DASHSCOPE_API_KEY=sk-xxx
Item Description
Key Format sk-xxx (standard DashScope API Key)
Not Supported sk-sp-xxx (Coding Plan Key, does not support image generation)
Get Key https://help.aliyun.com/zh/model-studio/get-api-key
Security Never hardcode; use from api_key import get_api_key; never output full API Key values to terminal, logs, or any files

Alibaba Cloud CLI Configuration (API Key Auto-Create/Delete)

scripts/api_key.py creates and deletes API Keys via aliyun modelstudio commands. Complete the following setup before use:

# Enable AI-Mode (allow Agent to call CLI)
aliyun configure ai-mode enable

# Set User-Agent
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-bailian-image-creator"

# Update plugins to latest version
aliyun plugin update

# Install ModelStudio plugin (if not already installed)
aliyun plugin install --names aliyun-cli-modelstudio --enable-pre

# Disable AI-Mode after task completion
aliyun configure ai-mode disable
Command Purpose Called From
aliyun modelstudio list-workspaces Get Bailian Workspace ID api_key.py: generate_api_key()
aliyun modelstudio create-api-key Create DashScope API Key api_key.py: generate_api_key()
aliyun modelstudio list-api-keys List existing API Keys (used for limit recycling) api_key.py: generate_api_key()
aliyun modelstudio delete-api-key Delete cloud API Key api_key.py: delete_api_key() / generate_api_key()

Regions & API Endpoints

Region DashScope API OpenAI Compatible Mode
Beijing (China) https://dashscope.aliyuncs.com/api/v1 https://dashscope.aliyuncs.com/compatible-mode/v1
Singapore https://dashscope-intl.aliyuncs.com/api/v1 https://dashscope-intl.aliyuncs.com/compatible-mode/v1
Virginia (US) https://dashscope-us.aliyuncs.com/api/v1 https://dashscope-us.aliyuncs.com/compatible-mode/v1

Note: API Keys are not interchangeable between regions.

Script Parameter Reference

text_to_image.py -- Text-to-Image (Qwen Series)

Parameter Description
size Only supports 1024*1024, 720*1280, 1280*720, 1440*720, 720*1440
prompt_extend True to auto-optimize prompts
negative_prompt Negative prompt to avoid generating undesired content
watermark Whether to add watermark

image_edit.py / image_edit_base64.py -- Image Editing (Qwen Series)

Parameter Type Description
n int Number of output images, 1-6 (qwen-image-edit fixed at 1)
size str Output resolution, format "width*height", range [512, 2048]
negative_prompt str Negative prompt
prompt_extend bool Smart prompt rewriting, default true
watermark bool Whether to add watermark, default false
seed int Random seed [0, 2147483647], for result reproduction

Editing Model Comparison:

Model Strengths Use Case
qwen-image-edit-max Strong geometric reasoning, character consistency Product design, precise editing
qwen-image-2.0-pro Strong text rendering, realistic quality General editing, text editing
qwen-image-edit-plus Multi-image output and custom resolution General editing
qwen-image-edit Basic version, fixed single image output Simple editing

Common Resolution Recommendations:

Aspect Ratio Recommended Resolution
1:1 1024*1024, 1536*1536
2:3 768*1152, 1024*1536
3:2 1152*768, 1536*1024
3:4 960*1280, 1080*1440
4:3 1280*960, 1440*1080
9:16 720*1280, 1080*1920
16:9 1280*720, 1920*1080
21:9 1344*576, 2048*872

Image Input Methods:

  • HTTP/HTTPS URL: "https://example.com/image.png"
  • Base64 Encoding: "data:image/png;base64,..." -- See scripts/image_edit_base64.py
  • Input: 1-3 reference images; Output: 1-6 images (controlled via n parameter)

wanx_generate.py -- Wanx Image Generation

Parameter Description
size 1K, 2K, 4K (4K only for wan2.7-image-pro text-to-image)
enable_sequential Series mode, generates 1-12 style-consistent images
thinking_mode Default true, enhances reasoning quality (increases latency)
n Normal mode 1-4, series mode 1-12
Model Strengths
wan2.7-image-pro Pro version, text-to-image supports 4K HD
wan2.7-image Standard version, faster generation

image_understanding.py -- Image Understanding

Uses OpenAI-compatible mode to call DashScope. Default model qwen3.5-plus (general understanding), optional qwen-vl-max (complex reasoning).

Pricing

Model Unit Price Notes
qwen-image-2.0-pro CN price Pro version, strongest text rendering and realistic quality
qwen-image-edit-max CN price Pro version, image generation and editing fusion
qwen-image-2.0 CN price Standard version
qwen-image-edit-plus CN price Standard editing
wan2.7-image-pro CN price Wanx pro version
wan2.7-image CN price Wanx standard version

New users get 100 free image credits (valid for 90 days) after activating Bailian. View bills: https://usercenter2.aliyun.com/finance/expense-center/overview

Best Practices

  1. Use prompt_extend=True to let the model auto-optimize prompts
  2. Qwen series supports negative_prompt, Wanx series does not support this parameter
  3. Choose appropriate sizes based on requirements to avoid unnecessary large image generation
  4. Use the n parameter to generate multiple images at once for selection
  5. Use URLs for online images, Base64 for small local files, file:// protocol for large files

Error Handling

Scripts have built-in error handling logic. For errors, refer to: https://help.aliyun.com/zh/model-studio/developer-reference/error-code

Related Resources

Script Function Description
text_to_image.py Text-to-image Generate images from text descriptions, supports auto-download
image_edit.py Image editing Supports URL, local file, and Base64 input methods
image_edit_base64.py Base64 image editing Demonstrates Base64 encoding method
base64_tool.py Base64 tool CLI tool for image-Base64 conversion
wanx_generate.py Wanx generation Uses wan2.7-image model for generation
image_understanding.py Image understanding Analyze image content, answer questions
安全使用建议
Install only if you are comfortable letting the skill use your Alibaba Cloud CLI profile and DashScope billing. Prefer providing a limited, pre-created API key, review the RAM permissions before granting CreateApiKey/DeleteApiKey, manually manage the CLI plugin if possible, and avoid sending sensitive images unless you intend to upload them to Alibaba DashScope.
功能分析
Type: OpenClaw Skill Name: alibabacloud-bailian-image-creator Version: 0.0.1 The skill bundle automates the management of Alibaba Cloud DashScope API keys by interacting with the 'aliyun' CLI, including auto-installing plugins and enabling 'ai-mode' (scripts/api_key.py, SKILL.md). While these actions are documented and serve the stated purpose of providing a seamless image generation experience, the ability to create, list, and delete cloud credentials and modify CLI configurations represents a significant security risk and a broad attack surface. The scripts also perform file I/O on the user's cloud configuration (~/.aliyun/config.json) and make external network calls to download generated images (scripts/text_to_image.py).
能力标签
cryptorequires-sensitive-credentials
能力评估
Purpose & Capability
The provided scripts and references coherently support the stated purpose: text-to-image, image editing, and image understanding through Alibaba DashScope models.
Instruction Scope
The SKILL.md rules route image tasks through the included scripts and prohibit manual key extraction; this is restrictive but aligned with the advertised workflow.
Install Mechanism
Registry metadata says there is no install spec and no required binaries, but scripts/api_key.py can automatically run an Alibaba Cloud CLI plugin install command with --enable-pre.
Credentials
The skill can read and write ~/.aliyun/config.json, use the current Alibaba Cloud CLI profile, and create/recycle DashScope API keys; this is high-impact cloud account access for an image tool.
Persistence & Privilege
The API key module persists DashScope keys into the Alibaba CLI config and the reference policy includes CreateApiKey/DeleteApiKey permissions, so users should explicitly approve and scope those privileges.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install alibabacloud-bailian-image-creator
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /alibabacloud-bailian-image-creator 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v0.0.1
Initial release of alibabacloud-bailian-image-creator skill. - Supports AI image generation, editing, and understanding using Alibaba Cloud DashScope API. - Strictly enforces all tasks via verified scripts; no custom code or third-party services allowed. - Automatically manages API keys and security with zero manual key handling required. - Provides detailed task-to-script and model selection mapping for different user requests. - Implements robust validation for API responses, output files, and security before reporting success.
元数据
Slug alibabacloud-bailian-image-creator
版本 0.0.1
许可证 MIT-0
累计安装 0
当前安装数 0
历史版本数 1
常见问题

Alibabacloud Bailian Image Creator 是什么?

Generates, edits, and analyzes images using Alibaba Cloud DashScope API with supported qwen and wan2.7 models via predefined scripts. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 91 次。

如何安装 Alibabacloud Bailian Image Creator?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install alibabacloud-bailian-image-creator」即可一键安装,无需额外配置。

Alibabacloud Bailian Image Creator 是免费的吗?

是的,Alibabacloud Bailian Image Creator 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

Alibabacloud Bailian Image Creator 支持哪些平台?

Alibabacloud Bailian Image Creator 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 Alibabacloud Bailian Image Creator?

由 alibabacloud-skills-team(@sdk-team)开发并维护,当前版本 v0.0.1。

💬 留言讨论