Description

Generate or edit images with Gemini using the Google GenAI SDK. Use when the user asks to create, transform, render, or save one or more images in an OpenCla...

Usage (SKILL.md)


# Image Generation

Name: gemini-image-generation
Author: ztj7728

Use this skill when you need to create one or more image files from a text prompt, or edit one or more existing images with Gemini.

## Requirements

- `~/.openclaw/openclaw.json` must include `$.skills.entries["gemini-image-generation"].enabled` set to `true`.
- `~/.openclaw/openclaw.json` must include `$.skills.entries["gemini-image-generation"].env` with the following keys and values:
  - `GEMINI_API_KEY` (required)
  - `GEMINI_MODEL_ID` (required)
  - `GEMINI_BASE_URL` (optional)
Example `~/.openclaw/openclaw.json`:

```json
{
  ......,
  "skills": {
    "entries": {
      "gemini-image-generation": {
        "enabled": true,
        "env": {
          "GEMINI_API_KEY": "sk-xxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
          "GEMINI_MODEL_ID": "gemini-3.1-flash-image-preview",
          "GEMINI_BASE_URL": "https://custom-endpoint.com"
        }
      }
    }
  },
  ......
}
```
- Node.js must be installed in the workspace environment.
- Install dependencies once with `npm install` from the skill root.

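The environment requirements above can be checked before calling either script. The following is a minimal sketch, not the skill's own code: `readGeminiEnv` is a hypothetical helper, and it assumes the `env` entries from `openclaw.json` reach the process as ordinary environment variables.

```javascript
// Hypothetical helper (not part of the skill): validates the three
// variables named in the Requirements list.
function readGeminiEnv(env) {
  const missing = ["GEMINI_API_KEY", "GEMINI_MODEL_ID"].filter(
    (name) => !env[name] || env[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    apiKey: env.GEMINI_API_KEY,
    modelId: env.GEMINI_MODEL_ID,
    // GEMINI_BASE_URL is optional. Assumption: leaving it undefined means
    // the default endpoint is used.
    baseUrl: env.GEMINI_BASE_URL || undefined,
  };
}

// A plain object stands in for process.env in this example.
const settings = readGeminiEnv({
  GEMINI_API_KEY: "sk-xxxx",
  GEMINI_MODEL_ID: "gemini-3.1-flash-image-preview",
});
console.log(settings.modelId, settings.baseUrl);
```

In a real run the argument would be `process.env`. Failing early with the names of the missing variables is usually easier to diagnose than an authentication error from the API.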
## When To Use

- The user asks to generate a new image from a text prompt.
- The user asks to modify, restyle, extend, or otherwise edit one or more existing images.
- The user wants the generated image saved to a workspace file.
- The task should be handled through a reusable OpenClaw skill instead of ad hoc SDK code.

## Procedure

1. Convert the user request into a single clear image prompt.
2. If the user supplied source images, choose or confirm the input file path or paths inside the workspace.
3. If the user specified a target aspect ratio or size, pass them through as `--aspectRatio` and `--imageSize`.
4. Choose an output path inside the workspace unless the user already provided one.
5. For text-to-image, run [generate-image.mjs](./scripts/generate-image.mjs) with `--prompt`, `--output`, and optional image config arguments.
6. For image editing, run [edit-image.mjs](./scripts/edit-image.mjs) with `--prompt`, one or more `--input` values, `--output`, and optional image config arguments.
7. Read the API key from `GEMINI_API_KEY` and the model ID from `GEMINI_MODEL_ID` in the environment.
8. Optionally, read the base URL from `GEMINI_BASE_URL` in the environment for custom endpoints.
9. Return the saved image path or paths to the user.
10. After returning each image path, also output `MEDIA:<image_path>` (e.g. `MEDIA:outputs/gemini-native-image.png`) so the image is displayed inline in the conversation.

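Steps 2 to 6 amount to choosing a script and assembling its arguments. As a rough sketch under stated assumptions, the hypothetical `buildScriptArgs` helper below (not part of the skill) picks `edit-image.mjs` when source images are supplied and `generate-image.mjs` otherwise, using only the flags named in this document.

```javascript
// Hypothetical helper: maps a request object onto the documented CLI flags.
function buildScriptArgs({ prompt, output, inputs = [], aspectRatio, imageSize }) {
  // Editing is selected when at least one source image is given.
  const script = inputs.length > 0 ? "edit-image.mjs" : "generate-image.mjs";
  const args = [`./skills/gemini-image-generation/scripts/${script}`, "--prompt", prompt];
  for (const input of inputs) {
    args.push("--input", input); // one --input flag per source image
  }
  args.push("--output", output);
  // The optional image config flags are added only when the user asked for them.
  if (aspectRatio) args.push("--aspectRatio", aspectRatio);
  if (imageSize) args.push("--imageSize", imageSize);
  return args;
}

console.log(
  buildScriptArgs({
    prompt: "Turn this cat into a watercolor illustration",
    inputs: ["inputs/cat.png"],
    output: "outputs/cat-watercolor.png",
    aspectRatio: "5:4",
  })
);
```

Passing the resulting array to something like `execFile("node", args)` from `node:child_process` avoids shell quoting problems with prompts that contain spaces or quotes.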
## Commands

```powershell
node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme" --output "outputs/gemini-native-image.png"
```

```powershell
node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a wide cinematic food photo of a nano banana dish in a fancy restaurant with a Gemini theme" --output "outputs/gemini-wide.png" --aspectRatio "16:9" --imageSize "2K"
```

```powershell
node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Turn this cat into a watercolor illustration eating a nano-banana in a fancy restaurant under the Gemini constellation" --input "inputs/cat.png" --output "outputs/cat-watercolor.png" --aspectRatio "5:4" --imageSize "2K"
```

```powershell
node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Create an office group photo of these people making funny faces" --input "inputs/person-1.jpg" --input "inputs/person-2.jpg" --input "inputs/person-3.jpg" --output "outputs/group-photo.png"
```

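After one of these commands finishes, the saved paths have to be echoed back as `MEDIA:` lines. The sketch below shows one way to derive them from the script's standard output. It assumes that each `IMAGE:` line carries the saved path directly after the prefix; the exact output format is not shown in this document, and the sample text is made up for illustration. `toMediaLines` is a hypothetical name.

```javascript
// Hypothetical helper: converts the scripts' IMAGE: lines into MEDIA: lines.
function toMediaLines(stdout) {
  return stdout
    .split(/\r?\n/)
    .filter((line) => line.startsWith("IMAGE:"))
    // Assumption: everything after the prefix is the saved file path.
    .map((line) => `MEDIA:${line.slice("IMAGE:".length).trim()}`);
}

// Illustrative sample only, not captured from a real run.
const sampleStdout = "TEXT: Here is your image.\nIMAGE: outputs/gemini-native-image.png\n";
console.log(toMediaLines(sampleStdout).join("\n"));
// prints MEDIA:outputs/gemini-native-image.png
```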
## Notes

- The script prints `TEXT:` lines for model text and `IMAGE:` lines for each saved file.
- After the skill finishes, always present every generated image to the user by outputting `MEDIA:<path>` for each saved image path. This ensures the image is rendered inline in the conversation alongside the file path.
- The final JSON summary only includes generated image paths and optional image config so prompts, model IDs, and source image paths are not echoed back into logs.
- Saved file extensions follow the returned image mime type. If the requested output path uses a different suffix, the scripts keep the base name and write the file with the returned type instead.
- If the model returns multiple images, the scripts save them as `name-1.png`, `name-2.png`, and so on.
- `edit-image.mjs` supports repeated `--input` flags. You can also pass a comma-separated list to a single `--input` value.
- `edit-image.mjs` infers the source mime type from `.png`, `.jpg`, `.jpeg`, or `.webp`. Use one `--mime-type` for all inputs, or repeat `--mime-type` so it lines up with each `--input`.
- Both scripts accept `--aspectRatio` and `--imageSize`. They also accept the kebab-case forms `--aspect-ratio` and `--image-size`.
- The scripts only send `config.imageConfig` when at least one of those parameters is provided.
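The file naming rules in the notes above (extension taken from the returned mime type, numeric suffixes for multiple images) can be illustrated as follows. This is a sketch of the described behavior, not the scripts' actual code: `outputPaths`, the mime-to-extension table, and the `.bin` fallback are all assumptions.

```javascript
// Assumed mapping; the skill's real table may differ (e.g. ".jpeg" vs ".jpg").
const EXTENSION_BY_MIME = {
  "image/png": ".png",
  "image/jpeg": ".jpg",
  "image/webp": ".webp",
};

// Hypothetical helper: derives the saved paths from the requested output
// path and the mime types of the returned images.
function outputPaths(requestedPath, mimeTypes) {
  // Drop the requested extension but keep the directory and base name.
  const base = requestedPath.replace(/\.[^./\\]+$/, "");
  return mimeTypes.map((mime, index) => {
    const ext = EXTENSION_BY_MIME[mime] ?? ".bin"; // fallback is an assumption
    const suffix = mimeTypes.length > 1 ? `-${index + 1}` : "";
    return `${base}${suffix}${ext}`;
  });
}

console.log(outputPaths("outputs/cat.png", ["image/jpeg"]));
// returns ["outputs/cat.jpg"]
console.log(outputPaths("outputs/name.png", ["image/png", "image/png"]));
// returns ["outputs/name-1.png", "outputs/name-2.png"]
```

Because the final path can differ from the requested one, callers should report the paths the scripts actually print rather than the `--output` value they passed in.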

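The last two notes (camelCase and kebab-case flags, and `config.imageConfig` being sent only when needed) can be sketched like this. `buildImageConfig` is a hypothetical function, and the `flags` argument is assumed to be an already parsed object keyed by flag name; the skill's real argument parser is not shown in this document.

```javascript
// Hypothetical helper: builds the optional `config` object for a request.
function buildImageConfig(flags) {
  // Accept either spelling of each flag.
  const aspectRatio = flags.aspectRatio ?? flags["aspect-ratio"];
  const imageSize = flags.imageSize ?? flags["image-size"];

  const imageConfig = {};
  if (aspectRatio) imageConfig.aspectRatio = aspectRatio;
  if (imageSize) imageConfig.imageSize = imageSize;

  // Return undefined when neither parameter was provided, so that no
  // imageConfig is sent at all.
  return Object.keys(imageConfig).length > 0 ? { imageConfig } : undefined;
}

console.log(buildImageConfig({ "aspect-ratio": "16:9", imageSize: "2K" }));
console.log(buildImageConfig({}));
```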
Security Recommendations

This skill appears coherent and implements image generation/editing via Google GenAI. Before installing:

1. Only enable it if you trust the skill source and are comfortable sending prompts and any source images to Gemini (the skill base64-encodes and uploads input images to the API).
2. Keep `GEMINI_API_KEY` secret (store it in your OpenClaw skill config as instructed).
3. If you use `GEMINI_BASE_URL`, ensure it points to a trusted endpoint (a custom base URL could redirect requests to a non-Google host).
4. Run `npm install` in the skill directory to install `@google/genai`, and review that dependency if you have concerns.
5. Be mindful of privacy: do not send PII or sensitive images unless you accept they will be processed by the configured GenAI endpoint.
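The advice about `GEMINI_BASE_URL` could be enforced with a small pre-flight check. The sketch below is illustrative only and is not something the skill is documented to do: `checkBaseUrl` and the allow-list are hypothetical, and the host `generativelanguage.googleapis.com` is used on the assumption that it is the default Gemini API host.

```javascript
// Hypothetical pre-flight check for a custom base URL.
function checkBaseUrl(baseUrl, trustedHosts) {
  if (!baseUrl) return { ok: true, reason: "no custom base URL set" };

  let url;
  try {
    url = new URL(baseUrl);
  } catch {
    return { ok: false, reason: "not a valid URL" };
  }
  // The API key travels with every request, so plain HTTP is rejected.
  if (url.protocol !== "https:") return { ok: false, reason: "not HTTPS" };
  if (!trustedHosts.includes(url.hostname)) {
    return { ok: false, reason: `untrusted host ${url.hostname}` };
  }
  return { ok: true, reason: "host is on the allow-list" };
}

// Assumed allow-list; adjust it to the endpoints you actually trust.
const trusted = ["generativelanguage.googleapis.com"];
console.log(checkBaseUrl("https://custom-endpoint.com", trusted));
```

With this allow-list the example URL from the configuration block is reported as untrusted, which is the intended outcome until the host is reviewed and added deliberately.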

Functional Analysis

Type: OpenClaw Skill
Name: gemini-image-generation
Version: 1.0.10

The skill provides image generation and editing capabilities but includes high-risk behaviors and suspicious formatting. It allows arbitrary file read and write access via the `--input` and `--output` arguments in `edit-image.mjs` and `generate-image.mjs`, which lack path sanitization. Furthermore, it supports a `GEMINI_BASE_URL` environment variable that can redirect sensitive data, including the `GEMINI_API_KEY` and local file content, to an arbitrary external endpoint. The `README.md` file also utilizes character-level spacing obfuscation, a common technique for evading simple static analysis filters.
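The missing path sanitization noted above is commonly addressed with a containment check before any file is read or written. The following is a minimal sketch of that general technique, not a patch to the skill: `resolveInsideWorkspace` is a hypothetical helper, and it does not resolve symbolic links, which a stricter check would also need to handle.

```javascript
import path from "node:path";

// Hypothetical helper: resolves a user-supplied path and rejects it if the
// result lies outside the workspace root.
function resolveInsideWorkspace(workspaceRoot, userPath) {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, userPath);
  const relative = path.relative(root, target);

  // A relative path that climbs out ("..") or is absolute (for example a
  // different drive on Windows) means the target is outside the root.
  const escapes =
    relative === ".." ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes the workspace: ${userPath}`);
  }
  return target;
}

console.log(resolveInsideWorkspace("/tmp/workspace", "outputs/cat.png"));
```

A caller would run each `--input` and `--output` value through such a check and stop on the first error, so that a request like `--output "../../.ssh/config"` is refused before anything touches the disk.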

Capability Assessment

✓ Purpose & Capability

Name/description, required binaries (node, npm), and required env vars (GEMINI_API_KEY, GEMINI_MODEL_ID) align with the declared purpose of calling Google GenAI (Gemini) to generate/edit images. The package.json depends on @google/genai which is appropriate for this functionality.

✓ Instruction Scope

SKILL.md and the scripts only instruct reading workspace image files, reading GEMINI_* environment variables, invoking the GoogleGenAI client, and saving returned images to workspace. There are no instructions to read unrelated system files, other credentials, or to send data to unexpected endpoints. The skill will of course transmit prompts and any provided source images to the Gemini API (expected for image editing).

ℹ Install Mechanism

No formal install spec is included (instruction-only install), but package.json and SKILL.md instruct the user to run 'npm install' in the skill root. This is expected for a Node-based skill; there is no third-party binary download or untrusted URL referenced.

✓ Credentials

Requested env vars are limited and appropriate: GEMINI_API_KEY (primary) and GEMINI_MODEL_ID are required; GEMINI_BASE_URL is optional for custom endpoints. No unrelated credentials or broad system config paths are requested.

✓ Persistence & Privilege

The skill does not request always:true, does not modify other skills, and requires explicit enabling in ~/.openclaw/openclaw.json. Autonomous invocation is allowed (platform default) but not combined with elevated persistence or unrelated credential access.

Version History

v1.0.10

No user-facing changes in this release.

- Version bump to 1.0.10 with no file modifications or content updates detected.

v1.0.9

- Added instructions to output `MEDIA:<image_path>` for each generated image so they appear inline in conversations.
- Clarified that every generated image path must be followed by a MEDIA line after task completion.
- Updated example commands to reflect the new folder structure (`./skills/gemini-image-generation/`).
- No code changes; SKILL.md documentation updated for improved user experience.

v1.0.8

- Clarified skill activation requirements: now documents explicit settings needed in `~/.openclaw/openclaw.json`.
- Added example configuration block for enabling the skill and setting required environment variables.
- Installation instructions updated to specify running `npm install` from the skill root.
- No changes to core commands, features, or supported parameters.
- General documentation clarification and improved onboarding steps.

v1.0.7

- Added new script: `scripts/gemini-image-runtime.mjs`.
- File extension of saved images now follows the returned image mime type; output uses the requested base name but will match returned type.
- Minor update to documentation to clarify file extension handling.

v1.0.6

- Added npm as a required binary for the skill environment.
- Updated documentation metadata to specify both node and npm as required.

v1.0.5

- Updated metadata to move `optionalEnv` into the `openclaw` object and adjust its format.
- No changes to code or functionality.

v1.0.4

- Declared `GEMINI_BASE_URL` as optional in the skill metadata and clarified its usage in the environment.
- Moved `GEMINI_BASE_URL` from required to optional environment variables in the OpenClaw metadata.
- No functional changes to the skill's behavior.

v1.0.3

Initial release providing Gemini-based image generation and editing.

- Added scripts for generating images from text prompts and editing images using Gemini (`generate-image.mjs`, `edit-image.mjs`).
- Includes documentation in README.md and usage guidelines in SKILL.md.
- Supports options for custom aspect ratio, image size, and multiple inputs.
- Outputs saved images and a JSON summary with paths and config details.
- Environment variables required: `GEMINI_API_KEY`, `GEMINI_MODEL_ID`, and optional `GEMINI_BASE_URL`.

v1.0.2

- Removed sample scripts and documentation files: README.md, package.json, `scripts/edit-image.mjs`, `scripts/generate-image.mjs`.
- The skill now provides only metadata and usage documentation; no execution scripts or supporting files are included.
- Metadata updated in SKILL.md to new OpenClaw format with emoji and revised environment variable requirements.

v1.0.1

- Added a metadata block detailing environment and dependency requirements.
- The final JSON summary now includes only generated image paths and optional image config, omitting prompts, model IDs, and source image paths from logs.

v1.0.0

- Initial release of the Gemini image generation skill.
- Supports generating images from text prompts and editing existing images using the Google GenAI SDK.
- Handles multiple input images, aspect ratio, and image size options.
- Saves generated or edited images to specified output paths.
- Requires environment variables for API key and model ID.

Metadata

Slug gemini-image-generation

Version 1.0.10

License MIT-0

Total installs 3

Current installs 3

Number of versions 11

FAQ

What is gemini-image-generation?

Generate or edit images with Gemini using the Google GenAI SDK. Use when the user asks to create, transform, render, or save one or more images in an OpenCla... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 626 cumulative downloads to date.

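How do I install gemini-image-generation?

In the OpenClaw or Claude Code chat box, run the command `/install gemini-image-generation` to install the skill in one step. The skill still has to be enabled and given its environment variables in `~/.openclaw/openclaw.json`, as listed under Requirements.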

Is gemini-image-generation free?

Yes. gemini-image-generation is completely free and is released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does gemini-image-generation support?

gemini-image-generation is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed gemini-image-generation?

It is developed and maintained by Joe (@ztj7728). The current version is v1.0.10.
