Nano Banana

Author: dyagil · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 36 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install dyagil-nano-banana
Description
Generate or edit images using Google's "Nano Banana" image models (Gemini 2.5 / 3.x image previews). Use this when the user explicitly asks for Gemini / Nano...
Usage (SKILL.md)

Nano Banana — Gemini Image Generation

A CLI skill for generating and editing images using Google's Gemini image models, marketed as "Nano Banana" and "Nano Banana Pro".

When to Use

Use this skill when the user explicitly asks for:

  • "Nano Banana" / "Nano Banana Pro"
  • "Generate the image with Gemini"
  • Image edits / composition using one or more reference images
  • Iterative edits in conversation ("now make her smile", "swap the background")

For generic "make me an image" requests with no provider hint, prefer your platform's built-in image generation unless the user has set Nano Banana as the default.

Model Map (Nano Banana names → Gemini model IDs)

| Nickname | Model ID | Notes |
| --- | --- | --- |
| Nano Banana Pro (default) | gemini-3-pro-image-preview | Highest quality, slower, multi-image edits |
| Nano Banana 3.1 Flash | gemini-3.1-flash-image-preview | Faster, newer, good for iteration |
| Nano Banana (classic) | gemini-2.5-flash-image | Original, cheapest, still solid |

If unspecified, default to Pro.
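
The mapping and its default can be sketched as a small lookup. This is a minimal illustration, not the CLI's actual code: the function name resolveModel and the short keys pro, flash, and classic are assumptions, while the model IDs are the ones listed above.

```javascript
// Hypothetical helper: turns a nickname (or nothing) into a Gemini model ID.
// The IDs come from the model map; the function itself is an illustration.
const MODEL_IDS = {
  pro: "gemini-3-pro-image-preview",
  flash: "gemini-3.1-flash-image-preview",
  classic: "gemini-2.5-flash-image",
};

function resolveModel(name) {
  // Nothing specified: fall back to Pro, as the text above says.
  if (name === undefined || name === null || name === "") return MODEL_IDS.pro;
  const key = String(name).toLowerCase();
  // Own-property check so words like "constructor" are not treated as nicknames.
  if (Object.prototype.hasOwnProperty.call(MODEL_IDS, key)) return MODEL_IDS[key];
  // Anything else is assumed to already be a full model ID and is passed through.
  return String(name);
}

console.log(resolveModel());
console.log(resolveModel("flash"));
```

Passing unknown names through unchanged keeps the sketch usable with full model IDs, at the cost of not catching typos locally.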

Auth

Store the Gemini API key at ~/.openclaw/credentials/google/gemini_api_key (chmod 600).

Load it before invoking the CLI:

export GEMINI_API_KEY="$(cat ~/.openclaw/credentials/google/gemini_api_key)"

⚠️ Never log or echo the key. Never paste it into chat. If rotated, replace the file contents.
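
A loader could refuse a key file that is readable by anyone other than its owner. The sketch below only shows the permission test; it assumes a POSIX system and that mode is obtained from something like fs.statSync(keyPath).mode. The name isOwnerOnly is hypothetical and is not part of the CLI.

```javascript
// Hypothetical check: true when no group or other permission bits are set,
// which is what chmod 600 (or 400) produces on a POSIX file system.
function isOwnerOnly(mode) {
  return (mode & 0o077) === 0;
}

console.log(isOwnerOnly(0o100600)); // true: rw for the owner only
console.log(isOwnerOnly(0o100644)); // false: group and others can read
```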

Get a key at: https://aistudio.google.com/apikey

CLI

Binary: ~/bin/nano-banana (Node.js). Source lives at <your-tools-dir>/nano-banana.js.

Generate from prompt

~/bin/nano-banana "a person eating a red apple, photorealistic, warm light"
# Writes: ~/.../nano-banana-output/YYYY-MM-DD_HHMMSS.png
# Prints the absolute path on success.
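
The YYYY-MM-DD_HHMMSS.png pattern can be produced with a few lines of date formatting. This is a sketch under assumptions: the function name outputFileName is hypothetical, and local time is assumed because the source does not say whether the CLI uses local time or UTC.

```javascript
// Hypothetical formatter for the output file name pattern shown above.
// Assumption: the timestamp uses the machine's local time zone.
function outputFileName(date = new Date()) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${day}_${time}.png`;
}

// Months are zero-based in the Date constructor, so 4 means May.
console.log(outputFileName(new Date(2026, 4, 11, 14, 18, 15))); // 2026-05-11_141815.png
```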

Choose model / aspect / count

~/bin/nano-banana --model pro "logo for a modern insurance agency, olive green"
~/bin/nano-banana --model flash --aspect 16:9 "morning marathon in a coastal city"
~/bin/nano-banana --model classic --count 4 "red pepper on a wooden table"
~/bin/nano-banana --aspect 1:1 --out /tmp/x.png "..."

Flags:

  • --model pro|flash|classic|<full-id> (default: pro)
  • --aspect 1:1|4:3|3:4|16:9|9:16|3:2|2:3 (best-effort prompt hint)
  • --count N (1–4, default 1)
  • --out <path> (only valid with --count 1)
  • --out-dir <path> (default: <tools-dir>/nano-banana-output/)
  • --quiet (print only resulting paths, one per line)
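
The constraints in this list (count between 1 and 4, --out only with a single image, a fixed set of aspect ratios) can be checked up front. The parser below is an illustrative sketch, not the CLI's real implementation: parseArgs and the option object shape are assumptions, and it covers only the six flags listed here.

```javascript
// Hypothetical argument parser for the six flags listed above.
const ASPECTS = ["1:1", "4:3", "3:4", "16:9", "9:16", "3:2", "2:3"];
const VALUE_FLAGS = new Map([
  ["--model", "model"],
  ["--aspect", "aspect"],
  ["--count", "count"],
  ["--out", "out"],
  ["--out-dir", "outDir"],
]);

function parseArgs(argv) {
  const opts = { model: "pro", aspect: null, count: 1, out: null, outDir: null, quiet: false, prompt: null };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--quiet") {
      opts.quiet = true;
    } else if (VALUE_FLAGS.has(arg)) {
      if (i + 1 >= argv.length) throw new Error(`${arg} needs a value`);
      const value = argv[++i];
      opts[VALUE_FLAGS.get(arg)] = arg === "--count" ? Number(value) : value;
    } else if (arg.startsWith("--")) {
      throw new Error(`unknown flag: ${arg}`);
    } else {
      opts.prompt = arg; // the last positional argument is taken as the prompt
    }
  }
  if (!Number.isInteger(opts.count) || opts.count < 1 || opts.count > 4) {
    throw new Error("--count must be an integer from 1 to 4");
  }
  if (opts.aspect !== null && !ASPECTS.includes(opts.aspect)) {
    throw new Error(`unsupported --aspect: ${opts.aspect}`);
  }
  if (opts.out !== null && opts.count !== 1) {
    throw new Error("--out is only valid with --count 1");
  }
  if (!opts.prompt) throw new Error("missing prompt");
  return opts;
}

console.log(parseArgs(["--model", "flash", "--aspect", "16:9", "morning marathon in a coastal city"]));
```

Rejecting bad combinations before any network call avoids spending quota on a request whose output could not be written where the user asked.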

Edit with reference images

~/bin/nano-banana --ref photo.jpg "put a red baseball cap on the person"
~/bin/nano-banana --ref a.png --ref b.png "merge the two images, library background"

How Your Agent Should Use This

  1. Read the request and pick a model. "Pro" → pro. "Fast" / "Flash" → flash. Otherwise → pro.
  2. Write the prompt in English. Gemini image models work best in English even when the chat is in another language. Translate any non-English description into a clear, descriptive English prompt.
  3. Run the CLI via exec. Capture the output path.
  4. Deliver the image by adding MEDIA:<path> on its own line in the reply (or use your platform's attachment convention).
  5. Reply with a short caption + which model was used.

Example agent flow

User: "Draw a person eating a red apple."

~/bin/nano-banana --aspect 4:3 \
  "A realistic portrait of a person taking a bite from a bright red apple, natural daylight, soft shadows, sharp focus, casual modern clothing, slight motion blur on the hand, warm color grading"
# → /home/<user>/.../nano-banana-output/2026-05-11_141815.png

Reply:

Here's the apple shot, generated with Nano Banana Pro
MEDIA:/home/<user>/.../nano-banana-output/2026-05-11_141815.png
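
Steps 4 and 5 of the agent flow amount to a short caption that names the model, followed by one MEDIA line per image. A minimal sketch, assuming a hypothetical helper named buildReply and an example path that is purely illustrative:

```javascript
// Hypothetical helper: formats the agent's reply for one or more image paths.
// Each MEDIA:<path> entry goes on its own line, as step 4 requires.
function buildReply(caption, modelNickname, imagePaths) {
  const lines = [`${caption} (${modelNickname})`];
  for (const imagePath of imagePaths) {
    lines.push(`MEDIA:${imagePath}`);
  }
  return lines.join("\n");
}

console.log(buildReply("Here's the apple shot", "Nano Banana Pro", ["/tmp/example.png"]));
```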

API Contract (what the CLI does)

Under the hood, the CLI POSTs to:

POST https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={KEY}

Body:

{
  "contents": [{"role": "user", "parts": [{"text": "<prompt>"}]}],
  "generationConfig": {
    "responseModalities": ["IMAGE", "TEXT"],
    "candidateCount": <N>
  }
}

Response candidates contain parts[].inlineData.{mimeType, data} (base64). The CLI decodes and writes to disk.
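
Decoding the response can be sketched as a walk over every candidate's parts that keeps only the ones carrying inlineData. The function name extractImages is hypothetical, and the sketch assumes the parts sit under candidates[].content.parts, as they do in generateContent responses.

```javascript
// Hypothetical decoder: collects every inline image from a parsed response body.
// Text-only parts are skipped; base64 payloads are turned into Buffers.
function extractImages(response) {
  const images = [];
  for (const candidate of response.candidates ?? []) {
    for (const part of candidate.content?.parts ?? []) {
      if (part.inlineData?.data) {
        images.push({
          mimeType: part.inlineData.mimeType,
          bytes: Buffer.from(part.inlineData.data, "base64"),
        });
      }
    }
  }
  return images;
}

// Stand-in response with one text part and one fake "image" payload.
const sampleResponse = {
  candidates: [
    {
      content: {
        parts: [
          { text: "Here is your image." },
          { inlineData: { mimeType: "image/png", data: Buffer.from("not a real png").toString("base64") } },
        ],
      },
    },
  ],
};
console.log(extractImages(sampleResponse).length);
```

An empty result is the "No image in response" case; the caller can then decide whether to retry with a more explicit prompt.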

For reference images, additional parts with inlineData are prepended before the text prompt.
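
Putting the body and the reference-image rule together, a request builder might look like the sketch below. buildRequestBody is a hypothetical name, and each reference is assumed to be an object holding a MIME type and base64 data that the caller has already read from disk.

```javascript
// Hypothetical builder for the request body described above.
// refs: array of { mimeType, data } where data is base64-encoded image bytes.
function buildRequestBody(prompt, refs = [], count = 1) {
  const imageParts = refs.map((ref) => ({
    inlineData: { mimeType: ref.mimeType, data: ref.data },
  }));
  return {
    // Reference images come first, the text prompt last.
    contents: [{ role: "user", parts: [...imageParts, { text: prompt }] }],
    generationConfig: {
      responseModalities: ["IMAGE", "TEXT"],
      candidateCount: count,
    },
  };
}

const exampleBody = buildRequestBody(
  "put a red baseball cap on the person",
  [{ mimeType: "image/jpeg", data: Buffer.from("fake jpeg bytes").toString("base64") }]
);
console.log(JSON.stringify(exampleBody, null, 2));
```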

Errors & Gotchas

  • PERMISSION_DENIED / API_KEY_INVALID → the key is invalid, revoked, or was rotated. Update the credentials file.
  • RESOURCE_EXHAUSTED → free-tier quota hit. Wait, or switch to flash / classic.
  • SAFETY block → response has promptFeedback.blockReason and no image. Rewrite the prompt and retry.
  • No image in response → model returned only text. Usually means the prompt was ambiguous or refused. Add "generate an image of..." explicitly.
  • Aspect ratio is best-effort: the CLI does not send a structured aspect-ratio field and instead appends a textual hint to the prompt, so the model may not honor it exactly.
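
The first four cases above can be folded into one triage function over the parsed JSON body. This is a sketch under assumptions: the name diagnose and the returned action strings are hypothetical, and the error envelope is assumed to follow the usual Google API shape, with error.status holding the status name and error.details[].reason holding values such as API_KEY_INVALID.

```javascript
// Hypothetical triage: maps a parsed response body to a suggested next action.
function diagnose(body) {
  const err = body.error;
  if (err) {
    const reasons = (err.details ?? []).map((detail) => detail.reason);
    if (err.status === "PERMISSION_DENIED" || reasons.includes("API_KEY_INVALID")) {
      return "update-key";
    }
    if (err.status === "RESOURCE_EXHAUSTED") return "wait-or-switch-model";
    return "unknown-error";
  }
  // A safety block reports a blockReason and carries no image.
  if (body.promptFeedback?.blockReason) return "rewrite-prompt";
  const hasImage = (body.candidates ?? []).some((candidate) =>
    (candidate.content?.parts ?? []).some((part) => part.inlineData?.data)
  );
  return hasImage ? "ok" : "ask-for-image-explicitly";
}

console.log(diagnose({ error: { status: "RESOURCE_EXHAUSTED" } }));
console.log(diagnose({ promptFeedback: { blockReason: "SAFETY" } }));
```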

File Layout

<your-skills-dir>/nano-banana/
  SKILL.md                              ← this file
<your-tools-dir>/
  nano-banana.js                        ← the CLI implementation
  nano-banana-output/                   ← generated images
~/bin/nano-banana                       ← symlink to nano-banana.js
~/.openclaw/credentials/google/
  gemini_api_key                        ← the secret (chmod 600)
Security Recommendations
Before installing, make sure you trust and have reviewed the local nano-banana CLI that this skill will execute. Use a dedicated Gemini API key, keep it chmod 600, and do not send sensitive reference images unless you are comfortable sharing them with Google.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The stated purpose—generating and editing images with Google Gemini/Nano Banana models—matches the described API use, local image outputs, and optional reference-image inputs.
Instruction Scope
The skill tells the agent to use it when the user explicitly asks for Nano Banana/Gemini image generation and to prefer the platform image tool for generic requests, but it also instructs the agent to run a local CLI via exec.
Install Mechanism
The registry says this is instruction-only with no install spec or code files, while SKILL.md depends on a local Node.js binary/source outside the reviewed artifact set.
Credentials
Using a Gemini API key and sending prompts/reference images to Google is expected for this provider integration, but the registry metadata lists no primary credential or required environment variables.
Persistence & Privilege
The skill stores a Gemini API key under ~/.openclaw and generated images in an output directory; it recommends chmod 600 and does not describe background persistence.
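How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install dyagil-nano-banana
  3. Once installed, invoke the skill by name or trigger it with /dyagil-nano-banana.
  4. Provide the inputs described in the skill's parameter documentation to get the skill's output.
Version History
v1.0.0
Initial release
Metadata
Slug: dyagil-nano-banana
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1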
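FAQ

What is Nano Banana?

Generate or edit images using Google's "Nano Banana" image models (Gemini 2.5 / 3.x image previews). Use this when the user explicitly asks for Gemini / Nano... It is an AI agent skill plugin for Claude Code / OpenClaw, with 36 total downloads so far.

How do I install Nano Banana?

Run the command "/install dyagil-nano-banana" in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no extra configuration; running the skill still requires the Gemini API key and the local nano-banana CLI described in SKILL.md.

Is Nano Banana free?

Yes. The skill itself is free and released under the MIT-0 license, so it can be downloaded, installed, and used freely. Calls to the Gemini API remain subject to the quota of the key you supply.

Which platforms does Nano Banana support?

Nano Banana is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Nano Banana?

It is developed and maintained by dyagil (@dyagil); the current version is v1.0.0.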
