# Image Generation

Use this skill when you need to create one or more image files from a text prompt, or edit one or more existing images with Gemini.

## Requirements

- `~/.openclaw/openclaw.json` must include `$.skills.entries["gemini-image-generation"].enabled` set to `true`.
- `~/.openclaw/openclaw.json` must include `$.skills.entries["gemini-image-generation"].env` with the following keys and values:
  - `GEMINI_API_KEY` required
  - `GEMINI_MODEL_ID` required
  - `GEMINI_BASE_URL` optional
- example `~/.openclaw/openclaw.json`:

```json
{
  ......,
  "skills": {
    "entries": {
      "gemini-image-generation": {
        "enabled": true,
        "env": {
          "GEMINI_API_KEY": "sk-xxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
          "GEMINI_MODEL_ID": "gemini-3.1-flash-image-preview",
          "GEMINI_BASE_URL": "https://custom-endpoint.com"
        }
      }
    }
  },
  ......
}
```
- Node.js must be installed in the workspace environment.
- Install dependencies once with `npm install` from the skill root.

## When To Use

- The user asks to generate a new image from a text prompt.
- The user asks to modify, restyle, extend, or otherwise edit one or more existing images.
- The user wants the generated image saved to a workspace file.
- The task should be handled through a reusable OpenClaw skill instead of ad hoc SDK code.

## Procedure

1. Convert the user request into a single clear image prompt.
2. If the user supplied source images, choose or confirm the input file path or paths inside the workspace.
3. If the user specified a target aspect ratio or size, pass them through as `--aspectRatio` and `--imageSize`.
4. Choose an output path inside the workspace unless the user already provided one.
5. For text-to-image, run [generate-image.mjs](./scripts/generate-image.mjs) with `--prompt`, `--output`, and optional image config arguments.
6. For image editing, run [edit-image.mjs](./scripts/edit-image.mjs) with `--prompt`, one or more `--input` values, `--output`, and optional image config arguments.
7. Read the API key from `GEMINI_API_KEY` and the model ID from `GEMINI_MODEL_ID` in the environment.
8. Optionally, read the base URL from `GEMINI_BASE_URL` in the environment for custom endpoints.
9. Return the saved image path or paths to the user.
10. After returning each image path, also output `MEDIA:<image_path>` (e.g. `MEDIA:outputs/gemini-native-image.png`) so the image is displayed inline in the conversation.
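
The last two steps can be sketched as a small transformation over the script output. This is an illustrative example only: `toMediaLines` is a hypothetical helper, and it assumes each saved file appears on its own stdout line as `IMAGE:` followed directly by the path. The exact line format is not specified beyond the `IMAGE:` prefix, so adjust the parsing if the real output differs.

```javascript
// Hypothetical helper: turns script stdout into MEDIA: lines for the conversation.
// Assumption: every saved file is printed on its own line as "IMAGE:<path>".
function toMediaLines(stdout) {
  return stdout
    .split(/\r?\n/)
    .filter((line) => line.startsWith("IMAGE:"))
    .map((line) => line.slice("IMAGE:".length).trim())
    .filter((imagePath) => imagePath.length > 0)
    .map((imagePath) => `MEDIA:${imagePath}`);
}

const exampleStdout = [
  "TEXT:Here is your image.",
  "IMAGE:outputs/gemini-native-image.png",
].join("\n");

console.log(toMediaLines(exampleStdout).join("\n"));
// prints MEDIA:outputs/gemini-native-image.png
```

`TEXT:` lines are ignored here because only saved image paths need a `MEDIA:` line.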

## Commands

```powershell
node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme" --output "outputs/gemini-native-image.png"
```

```powershell
node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a wide cinematic food photo of a nano banana dish in a fancy restaurant with a Gemini theme" --output "outputs/gemini-wide.png" --aspectRatio "16:9" --imageSize "2K"
```

```powershell
node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Turn this cat into a watercolor illustration eating a nano-banana in a fancy restaurant under the Gemini constellation" --input "inputs/cat.png" --output "outputs/cat-watercolor.png" --aspectRatio "5:4" --imageSize "2K"
```

```powershell
node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Create an office group photo of these people making funny faces" --input "inputs/person-1.jpg" --input "inputs/person-2.jpg" --input "inputs/person-3.jpg" --output "outputs/group-photo.png"
```
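
When the commands above are assembled programmatically, building the argument list as an array keeps prompts with spaces or quotes intact. The sketch below is one possible way to do that. `buildCommand` and its option names are hypothetical; only the script paths and flags are taken from the commands shown above.

```javascript
// Hypothetical helper: builds the argument array for the two documented scripts.
const SCRIPT_DIR = "./skills/gemini-image-generation/scripts";

function buildCommand({ prompt, output, inputs = [], aspectRatio, imageSize }) {
  // Source images present means an edit; otherwise plain text-to-image.
  const script = inputs.length > 0 ? "edit-image.mjs" : "generate-image.mjs";
  const args = [`${SCRIPT_DIR}/${script}`, "--prompt", prompt];
  for (const input of inputs) {
    args.push("--input", input);
  }
  args.push("--output", output);
  // Image config flags are optional and only added when supplied.
  if (aspectRatio) args.push("--aspectRatio", aspectRatio);
  if (imageSize) args.push("--imageSize", imageSize);
  return args;
}

const editArgs = buildCommand({
  prompt: "Turn this cat into a watercolor illustration",
  inputs: ["inputs/cat.png"],
  output: "outputs/cat-watercolor.png",
  aspectRatio: "5:4",
  imageSize: "2K",
});
console.log(editArgs);
```

The resulting array can be handed to Node's `child_process.execFile("node", args, callback)`, which passes each element as a separate argument without going through a shell.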

## Notes

- The script prints `TEXT:` lines for model text and `IMAGE:` lines for each saved file.
- After the skill finishes, always present every generated image to the user by outputting `MEDIA:<path>` for each saved image path. This ensures the image is rendered inline in the conversation alongside the file path.
- The final JSON summary only includes generated image paths and optional image config, so prompts, model IDs, and source image paths are not echoed back into logs.
- Saved file extensions follow the returned image mime type. If the requested output path uses a different suffix, the scripts keep the base name and write the file with the returned type instead.
- If the model returns multiple images, the scripts save them as `name-1.png`, `name-2.png`, and so on.
- `edit-image.mjs` supports repeated `--input` flags. You can also pass a comma-separated list to a single `--input` value.
- `edit-image.mjs` infers the source mime type from `.png`, `.jpg`, `.jpeg`, or `.webp`. Use one `--mime-type` for all inputs, or repeat `--mime-type` so it lines up with each `--input`.
- Both scripts accept `--aspectRatio` and `--imageSize`. They also accept the kebab-case forms `--aspect-ratio` and `--image-size`.
- The scripts only send `config.imageConfig` when at least one of those parameters is provided.
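
Several of the notes above describe small rules that are easier to follow in code. The sketch below restates them as standalone functions. It is not the scripts' actual source: every function name, the extension tables, and the `.png` fallback are assumptions made for illustration, and the `imageConfig` field names simply mirror the flag names.

```javascript
// Assumed mime/extension tables; the real scripts may handle other types.
const EXT_BY_MIME = { "image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp" };
const MIME_BY_EXT = { ".png": "image/png", ".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".webp": "image/webp" };

// Keep the requested base name, use the returned type's suffix,
// and number the files when more than one image comes back.
function resolveOutputPath(requested, mimeType, index, total) {
  const ext = EXT_BY_MIME[mimeType] ?? ".png"; // fallback is an assumption
  const slash = Math.max(requested.lastIndexOf("/"), requested.lastIndexOf("\\"));
  const dot = requested.lastIndexOf(".");
  const base = dot > slash + 1 ? requested.slice(0, dot) : requested;
  const suffix = total > 1 ? `-${index + 1}` : "";
  return `${base}${suffix}${ext}`;
}

// Accept repeated --input values as well as comma-separated lists.
function normalizeInputs(values) {
  return values
    .flatMap((value) => value.split(","))
    .map((value) => value.trim())
    .filter(Boolean);
}

// Infer the source mime type from the file suffix, or undefined if unknown.
function inferMimeType(file) {
  const dot = file.lastIndexOf(".");
  return dot === -1 ? undefined : MIME_BY_EXT[file.slice(dot).toLowerCase()];
}

// Only attach imageConfig when at least one option was supplied.
function buildConfig({ aspectRatio, imageSize } = {}) {
  const imageConfig = {};
  if (aspectRatio) imageConfig.aspectRatio = aspectRatio;
  if (imageSize) imageConfig.imageSize = imageSize;
  return Object.keys(imageConfig).length > 0 ? { imageConfig } : {};
}

console.log(resolveOutputPath("outputs/name.jpg", "image/png", 1, 2));
// prints outputs/name-2.png
console.log(normalizeInputs(["inputs/a.png,inputs/b.jpg", "inputs/c.webp"]));
console.log(buildConfig({ aspectRatio: "16:9" }));
```

With a single returned image, `resolveOutputPath("outputs/cat.jpg", "image/png", 0, 1)` yields `outputs/cat.png`: the base name is kept and only the suffix changes.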

## What is gemini-image-generation?

Generate or edit images with Gemini using the Google GenAI SDK. Use when the user asks to create, transform, render, or save one or more images in an OpenCla... It is an AI Agent Skill for Claude Code / OpenClaw, with 626 downloads so far.

## How do I install gemini-image-generation?

Run `/install gemini-image-generation` in the OpenClaw or Claude Code chat to install it in one step. Before first use, complete the configuration listed under Requirements.

## Is gemini-image-generation free?

Yes. gemini-image-generation is free to download, install, and use, and is licensed under MIT-0.

## Which platforms does gemini-image-generation support?

gemini-image-generation is cross-platform and runs anywhere OpenClaw / Claude Code is available.

## Who created gemini-image-generation?

It is built and maintained by Joe (@ztj7728); the current version is v1.0.10.