
ERNIE Image Visual Promptsmith

by YOIMIYA66 · GitHub ↗ · v1.0.1 · MIT-0
cross-platform · ✓ Security Clean
Downloads: 68 · Stars: 1 · Active Installs: 0 · Versions: 2
Install in OpenClaw: /install ernie-image-visual-promptsmith
Description
Generate ERNIE-Image-Turbo images through Baidu AI Studio and craft ERNIE-Image prompts for posters, comics, infographics, ecommerce images, UI-style visuals...
README (SKILL.md)

ERNIE-Image Visual Promptsmith

Use this community skill to craft ERNIE-Image prompts and generate images through the AI Studio ERNIE-Image-Turbo endpoint. It is not official Baidu or ERNIE-Image software.

Decide the Mode

  • Generate immediately when the user asks to generate, draw, create, make an image, or uses equivalent Chinese generation wording.
  • Return prompt-only guidance when the user asks to optimize, rewrite, improve, or review a prompt.
  • Ask one concise question only if an exact visible text string, language, or required aspect ratio is missing and guessing would likely break the result.
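The mode decision above can be sketched as a small keyword check. This is a minimal illustration, not the skill's actual logic: the keyword tuples, the `decide_mode` name, and the "unclear" fallback are assumptions, and Chinese wording is not covered.

```python
# Hypothetical keyword lists; the skill does not publish an exact vocabulary.
GENERATE_WORDS = ("generate", "draw", "create", "make an image")
PROMPT_ONLY_WORDS = ("optimize", "rewrite", "improve", "review")


def decide_mode(request: str) -> str:
    """Return 'prompt-only', 'generate', or 'unclear' for a user request."""
    text = request.lower()
    # Prompt-review wording is checked first, so a request to improve a
    # prompt never triggers an API call (an assumed precedence).
    if any(word in text for word in PROMPT_ONLY_WORDS):
        return "prompt-only"
    if any(word in text for word in GENERATE_WORDS):
        return "generate"
    return "unclear"
```

Plain substring matching is deliberately naive here; a real agent would likely rely on its own language understanding rather than a fixed word list.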

API Endpoint

  • Base: https://aistudio.baidu.com/llm/lmapi/v3
  • Submit: POST /images/generations
  • Full URL: https://aistudio.baidu.com/llm/lmapi/v3/images/generations
  • Auth header: Authorization: bearer <BAIDU_AISTUDIO_API_KEY>
  • Platform header: X-Client-Platform: aistudio

API Key

  • Required environment variable: BAIDU_AISTUDIO_API_KEY
  • Get a key: https://aistudio.baidu.com/account/accessToken
  • If the key is missing, do not call the API. Tell the user to set BAIDU_AISTUDIO_API_KEY.
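A minimal sketch of the missing-key guard, assuming the check runs before any request is built. The function name `missing_key_message` and the wording of the message are illustrative, not taken from the bundled script.

```python
import os
from typing import Mapping, Optional

KEY_NAME = "BAIDU_AISTUDIO_API_KEY"
KEY_URL = "https://aistudio.baidu.com/account/accessToken"


def missing_key_message(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a user-facing message when the key is absent, else None."""
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value as missing (an assumption).
    if env.get(KEY_NAME, "").strip():
        return None
    return f"Set {KEY_NAME} before generating. Get a key at {KEY_URL}"
```

A caller would skip the API call entirely whenever this returns a message.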

Triggers

  • Chinese examples: ERNIE image: <prompt>, Wenxin image: <prompt>, generate image: <prompt>, or equivalent Chinese wording for image generation.
  • English examples: ernie image: <prompt>, generate image: <prompt>, create image: <prompt>.
  • Treat text after the colon as the raw user prompt, improve it, choose a preset, then generate.
  • If the user asks to optimize, rewrite, improve, or review a prompt, return prompt-only guidance and do not call the API.
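The "text after the colon" rule can be sketched as follows. The `split_trigger` helper and the case-insensitive matching are assumptions for illustration; only the trigger phrases themselves come from the list above, and Chinese wording is not handled.

```python
from typing import Optional, Tuple

# Trigger phrases from the examples above, lower-cased for comparison.
TRIGGERS = ("ernie image", "wenxin image", "generate image", "create image")


def split_trigger(message: str) -> Optional[Tuple[str, str]]:
    """Return (trigger, raw_prompt) if the message starts with a known trigger."""
    # Split on the first colon only, so colons inside the prompt survive.
    head, sep, tail = message.partition(":")
    trigger = head.strip().lower()
    if sep and trigger in TRIGGERS and tail.strip():
        return trigger, tail.strip()
    return None
```

The returned raw prompt is the input to the prompt workflow; it is not meant to be sent to the API unchanged.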

Prompt Workflow

  1. Classify the image style: photorealistic, anime/manga, text-in-image, concept art, abstract/artistic, layout/composition, poster, ecommerce, infographic, comic/storyboard, UI screenshot style, or character-consistent visual.
  2. Preserve immutable constraints: exact in-image text, language, subject count, character identity, spatial relationships, size, style, and forbidden elements.
  3. Build the core prompt in five parts: subject -> action/context -> style -> lighting -> quality.
  4. For layout-sensitive requests, append composition -> exact text -> spatial placement.
  5. Keep in-image writing short when possible. Turn paragraphs into titles, labels, badges, or numbered lines.
  6. For text rendering, put exact wording in quotes and specify placement, font weight, alignment, color, background contrast, and whitespace.
  7. Choose a preset from auto, text-poster, infographic, comic, product, ui, photo, concept, or abstract.
  8. Before generation, state:
     Final Prompt: <prompt>
     Preset: <preset>
     use_pe: <true or false>
     Size: <size>
     Reason: <why these settings fit ERNIE-Image>
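As a rough sketch of steps 3, 4, and 8, the five-part prompt and the pre-generation statement could be assembled like this. The `PromptParts` type, both function names, and the comma separator are hypothetical choices; the skill only fixes the order of the parts and the five labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromptParts:
    subject: str
    context: str
    style: str
    lighting: str
    quality: str
    layout: Optional[str] = None  # composition, exact text, spatial placement


def build_prompt(parts: PromptParts) -> str:
    """Join the five core parts in order, then any layout clause."""
    ordered = [parts.subject, parts.context, parts.style, parts.lighting, parts.quality]
    if parts.layout:
        ordered.append(parts.layout)
    # Skip empty parts so no stray separators appear.
    return ", ".join(p.strip() for p in ordered if p and p.strip())


def pre_generation_statement(prompt: str, preset: str, use_pe: bool,
                             size: str, reason: str) -> str:
    """Format the five lines stated before generation."""
    return "\n".join([
        f"Final Prompt: {prompt}",
        f"Preset: {preset}",
        f"use_pe: {'true' if use_pe else 'false'}",
        f"Size: {size}",
        f"Reason: {reason}",
    ])
```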

Generation Workflow

Use the bundled Python script. Prefer python3; on Windows use python or py if needed.

python3 {baseDir}/scripts/generate.py --prompt "<FINAL_PROMPT>" --preset <preset>

For exact text, bilingual labels, UI, flowcharts, signs, comics, or already detailed prompts, pass --no-use-pe.

python3 {baseDir}/scripts/generate.py --prompt "<FINAL_PROMPT>" --preset text-poster --no-use-pe

The script prints IMAGE_URL:<url> for URL responses and MEDIA:<absolute_path> for each saved image. Return the saved media path to the user.

If BAIDU_AISTUDIO_API_KEY is missing, tell the user to get a key from https://aistudio.baidu.com/account/accessToken and set BAIDU_AISTUDIO_API_KEY.
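One way an agent might assemble the command is as an argument list rather than a shell string, which avoids quoting problems when the prompt contains quotation marks. The sketch below uses only the flags shown above; `build_command` and its parameters are hypothetical names.

```python
from pathlib import Path
from typing import List


def build_command(base_dir: str, prompt: str, preset: str,
                  use_pe: bool = True, python: str = "python3") -> List[str]:
    """Build an argv list for scripts/generate.py using only documented flags."""
    script = str(Path(base_dir) / "scripts" / "generate.py")
    argv = [python, script, "--prompt", prompt, "--preset", preset]
    if not use_pe:
        # Exact text, UI, flowcharts, and similar prompts disable prompt enhancement.
        argv.append("--no-use-pe")
    return argv
```

The list can then be passed to `subprocess.run(argv, capture_output=True, text=True)` without `shell=True`; on Windows, `python="python"` or `python="py"` would replace the default.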

Submit Payload

{
  "model": "ERNIE-Image-Turbo",
  "prompt": "<FINAL_PROMPT>",
  "n": 1,
  "response_format": "url",
  "size": "1024x1024",
  "seed": 42,
  "use_pe": true,
  "num_inference_steps": 8,
  "guidance_scale": 1.0
}
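For readers who want to see how the payload and headers fit together, here is a minimal sketch that prepares the POST request with the standard library without sending it. The function names are illustrative, and the `Content-Type` header is an assumption that the skill text does not state.

```python
import json
import urllib.request

URL = "https://aistudio.baidu.com/llm/lmapi/v3/images/generations"


def build_payload(prompt: str, size: str = "1024x1024",
                  use_pe: bool = True, seed: int = 42) -> dict:
    """Return the JSON body with the values from the sample payload."""
    return {
        "model": "ERNIE-Image-Turbo",
        "prompt": prompt,
        "n": 1,
        "response_format": "url",
        "size": size,
        "seed": seed,
        "use_pe": use_pe,
        "num_inference_steps": 8,
        "guidance_scale": 1.0,
    }


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request."""
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"bearer {api_key}",
            "X-Client-Platform": "aistudio",
            "Content-Type": "application/json",  # assumed, not documented above
        },
        method="POST",
    )
```

Sending would be a separate `urllib.request.urlopen(request)` call, made only once the API key is confirmed to be present.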

Download and Output

  • response_format=url returns image URLs in data[]; the script prints IMAGE_URL:<url>.
  • The script downloads each URL immediately and saves the image locally.
  • The script prints MEDIA:<absolute_path> for OpenClaw/ClawHub auto-attach.
  • URLs may expire; the local file remains available after download.
  • Output names are generated as ernie-image-<timestamp>-<index>.<ext>.
  • Do not pass user-controlled filenames to shell commands.
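A caller that captures the script's standard output could separate the two line types as sketched below. `parse_script_output` is a hypothetical helper, and it assumes each marker starts its own line, which the description above suggests but does not guarantee.

```python
from typing import Dict, List


def parse_script_output(stdout: str) -> Dict[str, List[str]]:
    """Collect IMAGE_URL: and MEDIA: lines printed by the script."""
    result: Dict[str, List[str]] = {"urls": [], "media": []}
    for line in stdout.splitlines():
        line = line.strip()
        if line.startswith("IMAGE_URL:"):
            result["urls"].append(line[len("IMAGE_URL:"):].strip())
        elif line.startswith("MEDIA:"):
            result["media"].append(line[len("MEDIA:"):].strip())
        # Any other line (logs, warnings) is ignored.
    return result
```

The `media` entries are the paths to return to the user, since the URLs may expire.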

Defaults

  • Model: ERNIE-Image-Turbo
  • Preset: auto
  • Count: 1
  • Response format: url
  • Seed: 42
  • text-poster, infographic, comic, product, and ui presets default to use_pe=false.
  • photo, concept, and abstract presets default to use_pe=true.
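The preset defaults above amount to a lookup table. In this sketch the `auto` preset returns `None` because its use_pe default is not stated here; the names `USE_PE_DEFAULTS` and `default_use_pe` are illustrative.

```python
from typing import Optional

USE_PE_DEFAULTS = {
    # Text- and layout-heavy presets keep the prompt as written.
    "text-poster": False,
    "infographic": False,
    "comic": False,
    "product": False,
    "ui": False,
    # Open-ended presets allow prompt enhancement.
    "photo": True,
    "concept": True,
    "abstract": True,
}


def default_use_pe(preset: str) -> Optional[bool]:
    """Return the documented use_pe default, or None when it is not documented."""
    return USE_PE_DEFAULTS.get(preset)
```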

Negative Prompt Rules

  • When the user wants readable writing in the image, do not add blanket negatives such as text, letters, typography, Chinese text, or English text.
  • Prefer precise negatives: distorted text, misspelled words, duplicated letters, unreadable typography, warped layout, cropped title, low contrast, blurry details, inconsistent panels, artifacts.
  • The API does not expose a separate negative prompt field in this skill. Express exclusions as natural language constraints inside the prompt, such as "avoid cluttered background" or "no visible watermark".
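Since exclusions travel inside the prompt itself, a small helper could fold them in as plain-language clauses. This is only a sketch: `with_exclusions` and the comma-joined format are assumptions about one reasonable phrasing.

```python
from typing import Iterable


def with_exclusions(prompt: str, exclusions: Iterable[str]) -> str:
    """Append exclusions as natural-language constraints inside the prompt."""
    clauses = [e.strip() for e in exclusions if e.strip()]
    if not clauses:
        return prompt
    # Trim trailing punctuation so the joined prompt reads as one sentence.
    return ", ".join([prompt.rstrip(" .,")] + clauses)
```

For example, `with_exclusions("A tea poster", ["avoid cluttered background", "no visible watermark"])` yields a single prompt string with both constraints appended.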

Retry Strategy

  • Text errors: reduce the amount of visible text, quote exact words once, add stronger placement and contrast, then use --no-use-pe.
  • Layout errors: simplify object count, name each region, use grid/split-screen/foreground/background terms, then keep the same seed.
  • Weak style: add camera/lens, art movement, medium, color temperature, material texture, and lighting direction.
  • Cluttered image: remove secondary elements, add negative space, use "avoid cluttered background", and switch to a simpler preset if needed.
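The retry rules can be read as a mapping from failure type to the next attempt's adjustments. The sketch below is one hypothetical encoding: the failure keys, the `prompt_edits` field, and `retry_settings` are invented for illustration, and the prompt edits themselves still have to be applied by whoever rewrites the prompt.

```python
from typing import Dict, List

RETRY_STEPS: Dict[str, List[str]] = {
    "text": ["reduce visible text", "quote exact words once",
             "strengthen placement and contrast"],
    "layout": ["simplify object count", "name each region",
               "use grid/split-screen/foreground/background terms"],
    "style": ["add camera/lens, art movement, medium",
              "add color temperature, material texture, lighting direction"],
    "clutter": ["remove secondary elements", "add negative space",
                "add 'avoid cluttered background'"],
}


def retry_settings(settings: dict, failure: str) -> dict:
    """Return a copy of settings adjusted for the next attempt."""
    if failure not in RETRY_STEPS:
        raise ValueError(f"unknown failure type: {failure}")
    updated = dict(settings)  # the seed is carried over unchanged
    if failure == "text":
        updated["use_pe"] = False  # corresponds to --no-use-pe
    updated["prompt_edits"] = list(RETRY_STEPS[failure])
    return updated
```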

References

  • Read references/api.md for parameters, command examples, and endpoint mapping.
  • Read references/prompt-architecture.md for ERNIE-Image prompt templates.
  • Read references/examples.md for acceptance-style examples.
Usage Guidance
This skill is internally coherent: it needs your Baidu AI Studio API key and a Python runtime to call ERNIE-Image and save generated images locally. Before installing, confirm you are comfortable giving the skill your BAIDU_AISTUDIO_API_KEY (it will be sent in Authorization headers to aistudio.baidu.com and used for billing/usage). If you want to inspect behavior first, run the bundled script with --dry-run to see the request JSON without sending it. Also review whether the same API key is used elsewhere (avoid sharing highly privileged or long-lived keys). Finally, remember that the script downloads image URLs returned by the service and writes files to disk—verify the output directory and file permissions if that matters to you.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
Name/description match the implementation: the script and docs call the Baidu AI Studio ERNIE-Image endpoint and include prompt-crafting guidance. Required env var (BAIDU_AISTUDIO_API_KEY) and python runtime are appropriate for this purpose.
Instruction Scope
SKILL.md instructs the agent to either return prompt guidance or run the bundled script to submit generation requests. The instructions and script only reference the declared env var and the API endpoint. Note: the script downloads returned image URLs and writes files locally (expected for image generation).
Install Mechanism
This is an instruction-only skill with a small bundled Python script and no install spec. No external installers or untrusted download URLs are present.
Credentials
Only BAIDU_AISTUDIO_API_KEY is required and declared as the primary credential. The key is necessary and proportionate for calling the remote image-generation API; no unrelated secrets or config paths are requested.
Persistence & Privilege
always is false and the skill does not request persistent/system-wide changes. It saves generated images to a local output directory (configurable) but does not modify other skills or global agent settings.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ernie-image-visual-promptsmith
  3. After installation, invoke the skill by name or use /ernie-image-visual-promptsmith
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
Added explicit API usage details and guidance for ernie-image-visual-promptsmith:
  • Documented the full API endpoint, method, required headers, and authentication for image generation.
  • Provided examples of both Chinese and English triggers for image generation and prompt improvement workflows.
  • Added sample JSON payload for submitting image generation requests via API.
  • Described how the image URLs are handled, downloaded, saved, and named locally.
  • Clarified prohibited shell behaviors (no user-controlled filenames).
  • All core prompt crafting and retry strategies remain unchanged from previous version.
v1.0.0
  • Initial release of the "ernie-image-visual-promptsmith" skill for generating ERNIE-Image-Turbo images via Baidu AI Studio.
  • Supports crafting structured prompts for posters, comics, infographics, ecommerce, UI visuals, bilingual text, and more.
  • Includes detailed guidance on managing visible text, image layouts, presets, negative constraints, and appropriate generation settings.
  • Requires user-provided Baidu AI Studio API key; not affiliated with Baidu.
  • Provides step-by-step prompt and generation workflow, command-line usage, default parameters, and troubleshooting/retry strategies.
Metadata
Slug ernie-image-visual-promptsmith
Version 1.0.1
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 2
Frequently Asked Questions

What is ERNIE Image Visual Promptsmith?

Generate ERNIE-Image-Turbo images through Baidu AI Studio and craft ERNIE-Image prompts for posters, comics, infographics, ecommerce images, UI-style visuals... It is an AI Agent Skill for Claude Code / OpenClaw, with 68 downloads so far.

How do I install ERNIE Image Visual Promptsmith?

Run "/install ernie-image-visual-promptsmith" in the OpenClaw or Claude Code chat to install it in one step. Generating images additionally requires the BAIDU_AISTUDIO_API_KEY environment variable and a Python runtime.

Is ERNIE Image Visual Promptsmith free?

Yes, the skill itself is free and licensed under MIT-0. API calls are made with your own Baidu AI Studio key, so any billing or usage limits on that key still apply.

Which platforms does ERNIE Image Visual Promptsmith support?

ERNIE Image Visual Promptsmith is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created ERNIE Image Visual Promptsmith?

It is built and maintained by YOIMIYA66 (@yoimiya66); the current version is v1.0.1.
