
ERNIE Image Visual Promptsmith

Name: ERNIE Image Visual Promptsmith
Author: yoimiya66

Author: YOIMIYA66 · v1.0.1 · MIT-0

Cross-platform ✓ Security check passed

Install in OpenClaw

/install ernie-image-visual-promptsmith

Description

Generate ERNIE-Image-Turbo images through Baidu AI Studio and craft ERNIE-Image prompts for posters, comics, infographics, ecommerce images, UI-style visuals...

Usage (SKILL.md)

ERNIE-Image Visual Promptsmith

Use this community skill to craft ERNIE-Image prompts and generate images through the AI Studio ERNIE-Image-Turbo endpoint. It is not official Baidu or ERNIE-Image software.

Decide the Mode

Generate immediately when the user asks to generate, draw, create, make an image, or uses equivalent Chinese generation wording.
Return prompt-only guidance when the user asks to optimize, rewrite, improve, or review a prompt.
Ask one concise question only if an exact visible text string, language, or required aspect ratio is missing and guessing would likely break the result.

API Endpoint

Base: https://aistudio.baidu.com/llm/lmapi/v3
Submit: POST /images/generations
Full URL: https://aistudio.baidu.com/llm/lmapi/v3/images/generations
Auth header: Authorization: bearer <BAIDU_AISTUDIO_API_KEY>
Platform header: X-Client-Platform: aistudio
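
The endpoint details above can be sketched as a request builder. This is a minimal illustration using only the Python standard library, not the bundled script's actual code. The `build_request` name and the `Content-Type` header are assumptions; the URL, method, and the two headers come from the list above. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://aistudio.baidu.com/llm/lmapi/v3"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /images/generations."""
    return urllib.request.Request(
        url=f"{BASE_URL}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"bearer {api_key}",
            "X-Client-Platform": "aistudio",
            # Assumption: a JSON body is declared with this header.
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; keeping construction separate makes it easy to inspect the request first.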

API Key

Required environment variable: BAIDU_AISTUDIO_API_KEY
Get a key: https://aistudio.baidu.com/account/accessToken
If the key is missing, do not call the API. Tell the user to set BAIDU_AISTUDIO_API_KEY.
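
A minimal sketch of the missing-key guard, assuming a hypothetical helper named `require_api_key`. It fails before any network call is attempted and points the user at the key page listed above.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key, or raise before any API call is attempted."""
    key = env.get("BAIDU_AISTUDIO_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "BAIDU_AISTUDIO_API_KEY is not set. Get a key at "
            "https://aistudio.baidu.com/account/accessToken and export it."
        )
    return key
```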

Triggers

Chinese examples: ERNIE image: <prompt>, Wenxin image: <prompt>, generate image: <prompt>, or equivalent Chinese wording for image generation.
English examples: ernie image: <prompt>, generate image: <prompt>, create image: <prompt>.
Treat text after the colon as the raw user prompt, improve it, choose a preset, then generate.
If the user asks to optimize, rewrite, improve, or review a prompt, return prompt-only guidance and do not call the API.
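
The trigger rules can be illustrated with a small router. This is a sketch only: the `route` function, the prefix tuple, and the keyword matching are assumptions for illustration, and the Chinese trigger phrases are not covered. In this sketch an explicit generation prefix takes priority over the prompt-only keywords.

```python
from typing import Tuple

# English trigger prefixes taken from the examples above.
GENERATE_PREFIXES = ("ernie image", "wenxin image", "generate image", "create image")
# Verbs that signal a prompt-only request.
PROMPT_ONLY_VERBS = ("optimize", "rewrite", "improve", "review")


def route(message: str) -> Tuple[str, str]:
    """Return (mode, text), where mode is 'generate', 'prompt-only', or 'unknown'."""
    head, sep, tail = message.partition(":")
    if sep and head.strip().lower() in GENERATE_PREFIXES:
        # Text after the colon is the raw user prompt.
        return ("generate", tail.strip())
    lowered = message.lower()
    if any(verb in lowered for verb in PROMPT_ONLY_VERBS):
        return ("prompt-only", message.strip())
    return ("unknown", message.strip())
```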

Prompt Workflow

Classify the image style: photorealistic, anime/manga, text-in-image, concept art, abstract/artistic, layout/composition, poster, ecommerce, infographic, comic/storyboard, UI screenshot style, or character-consistent visual.
Preserve immutable constraints: exact in-image text, language, subject count, character identity, spatial relationships, size, style, and forbidden elements.
Build the core prompt in five parts: subject -> action/context -> style -> lighting -> quality.
For layout-sensitive requests, append composition -> exact text -> spatial placement.
Keep in-image writing short when possible. Turn paragraphs into titles, labels, badges, or numbered lines.
For text rendering, put exact wording in quotes and specify placement, font weight, alignment, color, background contrast, and whitespace.
Choose a preset from auto, text-poster, infographic, comic, product, ui, photo, concept, or abstract.
Before generation, state:

Final Prompt: <prompt>
Preset: <preset>
use_pe: <true or false>
Size: <size>
Reason: <why these settings fit ERNIE-Image>
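
The five-part prompt structure and the pre-generation statement can be sketched as two helpers. Both function names, the comma joiner, and the `text reading "..."` phrasing are assumptions made for illustration; only the part order and the five summary fields come from the workflow above.

```python
from typing import Optional


def build_prompt(subject: str, action: str, style: str, lighting: str, quality: str,
                 composition: Optional[str] = None,
                 exact_text: Optional[str] = None,
                 placement: Optional[str] = None) -> str:
    """Join the core parts in order, then the optional layout parts."""
    parts = [subject, action, style, lighting, quality]
    if composition:
        parts.append(composition)
    if exact_text:
        # Exact wording goes in quotes so it is preserved verbatim.
        parts.append(f'text reading "{exact_text}"')
    if placement:
        parts.append(placement)
    return ", ".join(p.strip() for p in parts if p and p.strip())


def format_summary(prompt: str, preset: str, use_pe: bool, size: str, reason: str) -> str:
    """Render the five-line statement shown before generation."""
    return "\n".join([
        f"Final Prompt: {prompt}",
        f"Preset: {preset}",
        f"use_pe: {'true' if use_pe else 'false'}",
        f"Size: {size}",
        f"Reason: {reason}",
    ])
```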

Generation Workflow

Use the bundled Python script. Prefer python3; on Windows use python or py if needed.

python3 {baseDir}/scripts/generate.py --prompt "<FINAL_PROMPT>" --preset <preset>

For exact text, bilingual labels, UI, flowcharts, signs, comics, or already detailed prompts, pass --no-use-pe.

python3 {baseDir}/scripts/generate.py --prompt "<FINAL_PROMPT>" --preset text-poster --no-use-pe

The script prints IMAGE_URL:<url> for URL responses and MEDIA:<absolute_path> for each saved image. Return the saved media path to the user.

If BAIDU_AISTUDIO_API_KEY is missing, tell the user to get a key from https://aistudio.baidu.com/account/accessToken and set BAIDU_AISTUDIO_API_KEY.
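
One way to drive the script from Python is to build the command as an argument list, which avoids shell quoting of the prompt, and then read the MEDIA: lines from its output. This is a sketch under assumptions: `build_command` and `parse_media_paths` are hypothetical helpers, and `base_dir` stands in for the `{baseDir}` placeholder. The flags are the ones shown in the commands above.

```python
from typing import List


def build_command(base_dir: str, prompt: str, preset: str = "auto",
                  no_use_pe: bool = False, python: str = "python3") -> List[str]:
    """Build the argv list for the bundled script (no shell involved)."""
    cmd = [python, f"{base_dir}/scripts/generate.py",
           "--prompt", prompt, "--preset", preset]
    if no_use_pe:
        cmd.append("--no-use-pe")
    return cmd


def parse_media_paths(stdout: str) -> List[str]:
    """Collect the saved file paths from lines that start with MEDIA:."""
    prefix = "MEDIA:"
    return [line[len(prefix):].strip()
            for line in stdout.splitlines() if line.startswith(prefix)]
```

The list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`, and the captured `stdout` fed to `parse_media_paths`.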

Submit Payload

{
  "model": "ERNIE-Image-Turbo",
  "prompt": "<FINAL_PROMPT>",
  "n": 1,
  "response_format": "url",
  "size": "1024x1024",
  "seed": 42,
  "use_pe": true,
  "num_inference_steps": 8,
  "guidance_scale": 1.0
}
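
The payload can also be assembled programmatically. In this sketch the `build_payload` name and its choice of adjustable arguments are assumptions; the field names and fixed values mirror the sample payload above.

```python
import json


def build_payload(prompt: str, size: str = "1024x1024",
                  seed: int = 42, use_pe: bool = True) -> dict:
    """Return a request body with the same fields as the sample payload."""
    return {
        "model": "ERNIE-Image-Turbo",
        "prompt": prompt,
        "n": 1,
        "response_format": "url",
        "size": size,
        "seed": seed,
        "use_pe": use_pe,
        "num_inference_steps": 8,
        "guidance_scale": 1.0,
    }


if __name__ == "__main__":
    # Print the JSON body for inspection without sending anything.
    print(json.dumps(build_payload("a red fox in watercolor"), indent=2))
```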

Download and Output

response_format=url returns image URLs in data[]; the script prints IMAGE_URL:<url>.
The script downloads each URL immediately and saves the image locally.
The script prints MEDIA:<absolute_path> for OpenClaw/ClawHub auto-attach.
URLs may expire; the local file remains available after download.
Output names are generated as ernie-image-<timestamp>-<index>.<ext>.
Do not pass user-controlled filenames to shell commands.
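
A rough sketch of the URL extraction and naming steps follows. The response shape beyond `data[]` holding URLs, the Unix-seconds timestamp, the index base, and the `png` fallback are all assumptions; the bundled script may differ. Because the name is derived only from generated values, no user-controlled text reaches the filename.

```python
import time
from typing import List, Optional


def extract_urls(response: dict) -> List[str]:
    """Pull image URLs out of the data[] array, skipping items without one."""
    return [item["url"] for item in response.get("data", []) if item.get("url")]


def output_name(index: int, ext: str = "png", timestamp: Optional[int] = None) -> str:
    """Build ernie-image-<timestamp>-<index>.<ext> from generated values only."""
    ts = int(time.time()) if timestamp is None else timestamp
    # Keep only alphanumerics so the extension cannot smuggle in path parts.
    safe_ext = "".join(c for c in ext if c.isalnum()) or "png"
    return f"ernie-image-{ts}-{index}.{safe_ext}"
```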

Defaults

Model: ERNIE-Image-Turbo
Preset: auto
Count: 1
Response format: url
Seed: 42
text-poster, infographic, comic, product, and ui presets default to use_pe=false.
photo, concept, and abstract presets default to use_pe=true.
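
The per-preset use_pe defaults can be written as a lookup table. This is an illustrative sketch: `resolve_use_pe` is a hypothetical helper, and because the list above does not state a default for the auto preset, the sketch returns None for it rather than guessing.

```python
from typing import Optional

# Defaults as listed above; "auto" is deliberately absent.
USE_PE_DEFAULTS = {
    "text-poster": False,
    "infographic": False,
    "comic": False,
    "product": False,
    "ui": False,
    "photo": True,
    "concept": True,
    "abstract": True,
}


def resolve_use_pe(preset: str, no_use_pe: bool = False) -> Optional[bool]:
    """Return the effective use_pe value, or None when the preset is 'auto'."""
    if no_use_pe:
        # An explicit --no-use-pe always wins.
        return False
    if preset == "auto":
        return None
    return USE_PE_DEFAULTS[preset]
```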

Negative Prompt Rules

When the user wants readable writing in the image, do not add blanket negatives such as text, letters, typography, Chinese text, or English text.
Prefer precise negatives: distorted text, misspelled words, duplicated letters, unreadable typography, warped layout, cropped title, low contrast, blurry details, inconsistent panels, artifacts.
The API does not expose a separate negative prompt field in this skill. Express exclusions as natural language constraints inside the prompt, such as "avoid cluttered background" or "no visible watermark".
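
Since there is no separate negative prompt field, exclusions end up inside the prompt string itself. A minimal sketch, assuming a hypothetical `with_exclusions` helper and a simple comma joiner:

```python
from typing import Sequence


def with_exclusions(prompt: str, exclusions: Sequence[str]) -> str:
    """Append natural-language constraints to the end of the prompt."""
    cleaned = [e.strip() for e in exclusions if e and e.strip()]
    if not cleaned:
        return prompt
    # Trim trailing punctuation so the joined clauses read as one sentence.
    return prompt.rstrip(" .,") + ", " + ", ".join(cleaned)
```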

Retry Strategy

Text errors: reduce the amount of visible text, quote exact words once, add stronger placement and contrast, then use --no-use-pe.
Layout errors: simplify object count, name each region, use grid/split-screen/foreground/background terms, then keep the same seed.
Weak style: add camera/lens, art movement, medium, color temperature, material texture, and lighting direction.
Cluttered image: remove secondary elements, add negative space, use "avoid cluttered background", and switch to a simpler preset if needed.

References

Read references/api.md for parameters, command examples, and endpoint mapping.
Read references/prompt-architecture.md for ERNIE-Image prompt templates.
Read references/examples.md for acceptance-style examples.

Security Recommendations

This skill is internally coherent: it needs your Baidu AI Studio API key and a Python runtime to call ERNIE-Image and save generated images locally. Before installing, confirm you are comfortable giving the skill your BAIDU_AISTUDIO_API_KEY (it will be sent in Authorization headers to aistudio.baidu.com and used for billing/usage). If you want to inspect behavior first, run the bundled script with --dry-run to see the request JSON without sending it. Also review whether the same API key is used elsewhere (avoid sharing highly privileged or long-lived keys). Finally, remember that the script downloads image URLs returned by the service and writes files to disk—verify the output directory and file permissions if that matters to you.

Capability Tags

requires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

Name/description match the implementation: the script and docs call the Baidu AI Studio ERNIE-Image endpoint and include prompt-crafting guidance. Required env var (BAIDU_AISTUDIO_API_KEY) and python runtime are appropriate for this purpose.

ℹ Instruction Scope

SKILL.md instructs the agent to either return prompt guidance or run the bundled script to submit generation requests. The instructions and script only reference the declared env var and the API endpoint. Note: the script downloads returned image URLs and writes files locally (expected for image generation).

✓ Install Mechanism

This is an instruction-only skill with a small bundled Python script and no install spec. No external installers or untrusted download URLs are present.

✓ Credentials

Only BAIDU_AISTUDIO_API_KEY is required and declared as the primary credential. The key is necessary and proportionate for calling the remote image-generation API; no unrelated secrets or config paths are requested.

✓ Persistence & Privilege

always is false and the skill does not request persistent/system-wide changes. It saves generated images to a local output directory (configurable) but does not modify other skills or global agent settings.

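How to Use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install ernie-image-visual-promptsmith
After installation, call the skill by name or trigger it with /ernie-image-visual-promptsmith
Provide the inputs described in the skill's parameter notes to get structured output.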

Version History

v1.0.1

Added explicit API usage details and guidance for ernie-image-visual-promptsmith
- Documented the full API endpoint, method, required headers, and authentication for image generation.
- Provided examples of both Chinese and English triggers for image generation and prompt improvement workflows.
- Added sample JSON payload for submitting image generation requests via API.
- Described how the image URLs are handled, downloaded, saved, and named locally.
- Clarified prohibited shell behaviors (no user-controlled filenames).
- All core prompt crafting and retry strategies remain unchanged from previous version.

v1.0.0

- Initial release of the "ernie-image-visual-promptsmith" skill for generating ERNIE-Image-Turbo images via Baidu AI Studio.
- Supports crafting structured prompts for posters, comics, infographics, ecommerce, UI visuals, bilingual text, and more.
- Includes detailed guidance on managing visible text, image layouts, presets, negative constraints, and appropriate generation settings.
- Requires user-provided Baidu AI Studio API key; not affiliated with Baidu.
- Provides step-by-step prompt and generation workflow, command-line usage, default parameters, and troubleshooting/retry strategies.

Metadata

Slug ernie-image-visual-promptsmith

Version 1.0.1

License MIT-0

Total installs 0

Current installs 0

Version count 2
