Description

Generate/edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K...

README (SKILL.md)

Nano Banana Pro Image Generation & Editing

Name: Embedding Strategies
Author: icesumer-lgtm

Generate new images or edit existing ones using Google's Nano Banana Pro API (Gemini 3 Pro Image).

Usage

Run the script using its absolute path (do NOT cd to the skill directory first):

Generate new image:

uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "your image description" --filename "output-name.png" [--resolution 1K|2K|4K] [--api-key KEY]

Edit existing image:

uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "editing instructions" --filename "output-name.png" --input-image "path/to/input.png" [--resolution 1K|2K|4K] [--api-key KEY]

Important: Always run from the user's current working directory so images are saved where the user is working, not in the skill directory.

Default Workflow (draft → iterate → final)

Goal: fast iteration without burning time on 4K until the prompt is correct.

Draft (1K): quick feedback loop
- uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "<draft prompt>" --filename "yyyy-mm-dd-hh-mm-ss-draft.png" --resolution 1K
Iterate: adjust prompt in small diffs; keep filename new per run
- If editing: keep the same --input-image for every iteration until you’re happy.
Final (4K): only when prompt is locked
- uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "<final prompt>" --filename "yyyy-mm-dd-hh-mm-ss-final.png" --resolution 4K
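The draft and final runs above differ only in the filename label and the resolution, so the command line can be assembled by one helper. The following is a minimal sketch, not part of the skill: `build_command` and its parameters are hypothetical names, and only the script path and the flags come from the usage lines above.

```python
import os
from datetime import datetime

# Script path copied from the usage lines; expanded here so "~" works without a shell.
SCRIPT = os.path.expanduser("~/.codex/skills/nano-banana-pro/scripts/generate_image.py")


def build_command(prompt, stage, resolution="1K", input_image=None, now=None):
    """Return the argv list for one run; `stage` is a label such as "draft" or "final"."""
    if resolution not in ("1K", "2K", "4K"):
        raise ValueError(f"unsupported resolution: {resolution!r}")
    # A fresh timestamp per call gives every iteration a new filename.
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H-%M-%S")
    cmd = [
        "uv", "run", SCRIPT,
        "--prompt", prompt,
        "--filename", f"{stamp}-{stage}.png",
        "--resolution", resolution,
    ]
    if input_image is not None:
        # Editing mode: reuse the same input image across iterations.
        cmd += ["--input-image", input_image]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the user's working directory, for example `build_command("a red fox", "draft")` while iterating and `build_command("a red fox", "final", resolution="4K")` once the prompt is locked.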

Resolution Options

The Gemini 3 Pro Image API supports three resolutions (uppercase K required):

1K (default) - ~1024px resolution
2K - ~2048px resolution
4K - ~4096px resolution

Map user requests to API parameters:

No mention of resolution → 1K
"low resolution", "1080", "1080p", "1K" → 1K
"2K", "2048", "normal", "medium resolution" → 2K
"high resolution", "high-res", "hi-res", "4K", "ultra" → 4K

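The phrase-to-resolution mapping above can be expressed as a small lookup. This is an illustrative sketch only: `pick_resolution` is a hypothetical helper, the keyword lists are copied from the mapping, and whole-word matching plus the 4K-before-2K-before-1K check order are assumptions made here.

```python
import re

# Keyword table copied from the mapping above; checked in the order 4K, 2K, 1K.
_KEYWORDS = [
    ("4K", ("high resolution", "high-res", "hi-res", "4k", "ultra")),
    ("2K", ("2k", "2048", "normal", "medium resolution")),
    ("1K", ("low resolution", "1080", "1080p", "1k")),
]


def pick_resolution(request: str) -> str:
    """Return "1K", "2K", or "4K" for a free-text user request."""
    text = request.lower()
    for resolution, keywords in _KEYWORDS:
        for keyword in keywords:
            # Match whole words only, so "ultramarine" does not count as "ultra".
            if re.search(rf"(?<!\w){re.escape(keyword)}(?!\w)", text):
                return resolution
    return "1K"  # no mention of resolution
```

Because matching is case-insensitive on the input side only, the function always returns the uppercase form that the API requires.
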
API Key

The script checks for an API key in this order:

--api-key argument (use if user provided key in chat)
GEMINI_API_KEY environment variable

If neither is available, the script exits with an error message.
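The lookup order described above can be sketched as follows. This is not the script's actual code, which is not shown here; `resolve_api_key` and its `env` parameter are hypothetical, and the error text is the one quoted under the common failures.

```python
import os
import sys


def resolve_api_key(cli_key=None, env=None):
    """Return the key from --api-key if given, else from GEMINI_API_KEY."""
    env = os.environ if env is None else env
    # The command-line argument takes precedence over the environment variable.
    key = cli_key or env.get("GEMINI_API_KEY")
    if not key:
        # Exits with a non-zero status and prints the message to stderr.
        sys.exit("Error: No API key provided.")
    return key
```

Passing `env` explicitly makes the precedence easy to check without touching the real environment.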

Preflight + Common Failures (fast fixes)

Preflight:
- command -v uv (must exist)
- test -n "$GEMINI_API_KEY" (or pass --api-key)
- If editing: test -f "path/to/input.png"
Common failures:
- Error: No API key provided. → set GEMINI_API_KEY or pass --api-key
- Error loading input image: → wrong path / unreadable file; verify --input-image points to a real image
- “quota/permission/403” style API errors → wrong key, no access, or quota exceeded; try a different key/account
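The three preflight checks can also be run from Python before invoking the script. The sketch below is one possible equivalent of the shell tests, assuming a hypothetical `preflight` helper that returns a list of problems instead of exiting.

```python
import os
import shutil


def preflight(api_key=None, input_image=None, env=None):
    """Return a list of human-readable problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    # Equivalent of `command -v uv`.
    if shutil.which("uv") is None:
        problems.append("uv not found on PATH")
    # Equivalent of `test -n "$GEMINI_API_KEY"`, unless a key is passed explicitly.
    if not (api_key or env.get("GEMINI_API_KEY")):
        problems.append("set GEMINI_API_KEY or pass --api-key")
    # Equivalent of `test -f path/to/input.png`, only relevant when editing.
    if input_image is not None and not os.path.isfile(input_image):
        problems.append(f"input image not found: {input_image}")
    return problems
```

This catches the first two common failures locally. Quota and permission errors only surface once the API is called, so they cannot be checked this way.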

Filename Generation

Generate filenames with the pattern: yyyy-mm-dd-hh-mm-ss-name.png

Format: {timestamp}-{descriptive-name}.png

Timestamp: Current date/time in format yyyy-mm-dd-hh-mm-ss (24-hour format)
Name: Descriptive lowercase text with hyphens
Keep the descriptive part concise (1-5 words typically)
Use context from user's prompt or conversation
If unclear, use random identifier (e.g., x9k2, a7b3)

Examples:

Prompt "A serene Japanese garden" → 2025-11-23-14-23-05-japanese-garden.png
Prompt "sunset over mountains" → 2025-11-23-15-30-12-sunset-mountains.png
Prompt "create an image of a robot" → 2025-11-23-16-45-33-robot.png
Unclear context → 2025-11-23-17-12-48-x9k2.png
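The naming pattern above can be sketched as a small helper. `make_filename` is a hypothetical name, and the fallback identifier is four random hex characters, which is an assumption; the examples above use arbitrary alphanumerics. The caller is expected to pass an already condensed description (for example "Japanese garden"), since choosing the descriptive words is a judgment call this sketch does not attempt.

```python
import re
import secrets
from datetime import datetime


def make_filename(description="", now=None, max_words=5):
    """Build {timestamp}-{descriptive-name}.png from a short description."""
    # 24-hour timestamp in yyyy-mm-dd-hh-mm-ss form.
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H-%M-%S")
    # Lowercase, keep alphanumeric runs only, and cap the number of words.
    words = re.findall(r"[a-z0-9]+", description.lower())[:max_words]
    # Fall back to a short random identifier when no usable words remain.
    name = "-".join(words) if words else secrets.token_hex(2)
    return f"{stamp}-{name}.png"
```

With a fixed timestamp of 2025-11-23 14:23:05, `make_filename("Japanese garden", now)` yields `2025-11-23-14-23-05-japanese-garden.png`, matching the first example above.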

Image Editing

When the user wants to modify an existing image:

Check if they provide an image path or reference an image in the current directory
Use --input-image parameter with the path to the image
The prompt should contain editing instructions (e.g., "make the sky more dramatic", "remove the person", "change to cartoon style")
Common editing tasks: add/remove elements, change style, adjust colors, blur background, etc.

Prompt Handling

For generation: Pass user's image description as-is to --prompt. Only rework if clearly insufficient.

For editing: Pass editing instructions in --prompt (e.g., "add a rainbow in the sky", "make it look like a watercolor painting")

Preserve user's creative intent in both cases.

Prompt Templates (high hit-rate)

Use templates when the user is vague or when edits must be precise.

Generation template:
- “Create an image of: <subject>. Style: <style>. Composition: <camera/shot>. Lighting: <lighting>. Background: <background>. Color palette: <palette>. Avoid: <list>.”
Editing template (preserve everything else):
- “Change ONLY: <single change>. Keep identical: subject, composition/crop, pose, lighting, color palette, background, text, and overall style. Do not add new objects. If text exists, keep it unchanged.”
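Both templates can be filled in programmatically when the user's request is vague. The sketch below copies the template wording from above; the function and parameter names (`generation_prompt`, `edit_prompt`, and so on) are hypothetical.

```python
GENERATION_TEMPLATE = (
    "Create an image of: {subject}. Style: {style}. Composition: {composition}. "
    "Lighting: {lighting}. Background: {background}. "
    "Color palette: {palette}. Avoid: {avoid}."
)

EDIT_TEMPLATE = (
    "Change ONLY: {change}. Keep identical: subject, composition/crop, pose, "
    "lighting, color palette, background, text, and overall style. "
    "Do not add new objects. If text exists, keep it unchanged."
)


def generation_prompt(subject, style, composition, lighting, background, palette, avoid):
    """Fill the generation template; `avoid` is a list of things to leave out."""
    return GENERATION_TEMPLATE.format(
        subject=subject, style=style, composition=composition,
        lighting=lighting, background=background, palette=palette,
        avoid=", ".join(avoid),
    )


def edit_prompt(change):
    """Fill the editing template with exactly one requested change."""
    return EDIT_TEMPLATE.format(change=change)
```

The result is passed to `--prompt` unchanged. `edit_prompt` takes a single change on purpose, which mirrors the "Change ONLY" wording of the template.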

Output

Saves PNG to current directory (or specified path if filename includes directory)
Script outputs the full path to the generated image
Do not read the image back - just inform the user of the saved path

Examples

Generate new image:

uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "A serene Japanese garden with cherry blossoms" --filename "2025-11-23-14-23-05-japanese-garden.png" --resolution 4K

Edit existing image:

uv run ~/.codex/skills/nano-banana-pro/scripts/generate_image.py --prompt "make the sky more dramatic with storm clouds" --filename "2025-11-23-14-25-30-dramatic-sky.png" --input-image "original-photo.jpg" --resolution 2K

Usage Guidance

Plain-language checklist before installing or running this skill:
- Do not run the skill's script blindly. Inspect scripts/generate_image.py for outgoing network calls, hardcoded endpoints, or embedded credentials before executing.
- The SKILL.md expects GEMINI_API_KEY (or --api-key) but the skill metadata lists no required env vars; this is inconsistent. Provide a dedicated, minimal API key (preferably short-lived) if you proceed.
- The package includes many unrelated files and several config dumps that embed API keys, app secrets, and tokens (e.g., feishu app secrets, gateway tokens). Treat this as sensitive: do not share or allow the skill to access those files. Remove or sanitize unrelated files before use.
- The repository appears to be a full workspace snapshot rather than a standalone skill. Ask the author for a minimal package that contains only the image script and documentation, or extract only the generate_image.py and its minimal dependencies.
- Because pre-scan found prompt-injection and obfuscation patterns, search the package for base64 blobs, unicode control characters, or instructions that try to override agent policies. If found, do not enable autonomous execution until resolved.
- If you must test, run in an isolated environment (sandbox / container / VM) with no access to your real credentials or sensitive local files, and monitor network traffic.

If you want, I can:
- Inspect the generate_image.py source for network endpoints and suspicious code (if you provide its contents),
- Scan the repo for hardcoded secrets and list the files that contain them,
- Produce a minimal safe package (extract necessary files) or a checklist to sanitize this skill before installation.

Capability Analysis

Type: OpenClaw Skill
Name: embedding-strategies
Version: 1.0.0

The skill bundle is a highly customized environment containing numerous scripts and configuration files that exhibit significant security vulnerabilities. The most critical indicator is the extensive presence of hardcoded sensitive credentials, including Aliyun API keys and Feishu (Lark) APP_SECRET tokens, found in files such as 'fetch_feishu_docs.py', 'scripts/debug-search-step.py', 'openclaw.json', and '2026-3-10afu的js备份.txt'. Additionally, several scripts ('hooks/gateway-restart-protection/handler.js' and 'scripts/triple-line-sync.js') utilize 'execSync' for shell command execution (e.g., running 'robocopy' or 'Start-Process'), which poses a high risk of command injection. While these capabilities appear intended for the author's personal automation and RAG (Retrieval-Augmented Generation) workflows, the lack of proper secret management and the use of powerful system calls make the bundle inherently risky for general use.

Capability Assessment

⚠ Purpose & Capability

SKILL.md describes an image-generation/editing helper that uses an API key (GEMINI_API_KEY or --api-key). However the registry metadata declares no required env vars/credentials. The repository contains the expected generate_image.py, but it also bundles a large unrelated workspace (hundreds of files) including multiple service credentials and platform config files — far beyond what an image helper legitimately needs.

⚠ Instruction Scope

The runtime instructions are narrowly scoped to invoking the generate_image.py script and passing an API key or using GEMINI_API_KEY. However the package contains many other documents (AGENTS.md, MEMORY.md, config dumps) that instruct agents to read broad workspace context and files. That expands the effective scope if the skill or agent uses other files in the bundle. Also pre-scan flagged prompt-injection patterns in SKILL.md content, which could attempt to manipulate an agent's behavior.

ℹ Install Mechanism

There is no formal install spec (instruction-only), which minimizes automatic installation risk. But the artifact nonetheless includes 93 code files and a 615-file manifest (full workspace). That indicates a packaged workspace rather than a minimal skill; running the provided script will execute code from that package — inspect code before executing.

⚠ Credentials

SKILL.md explicitly relies on GEMINI_API_KEY (or --api-key) but the skill metadata lists no required env vars. More seriously, several files in the bundle (e.g., the 2026-3-10afu's js backup and other config files) contain many API keys, app secrets, tokens and gateway auth values unrelated to image generation. Packaging unrelated secrets with a skill is a high-risk mismatch.

ℹ Persistence & Privilege

The skill is not flagged as always:true and uses normal autonomous invocation defaults. That is expected. However the included workspace files contain agent policies (AGENTS.md) that encourage reading many local files (e.g., 'read MEMORY.md', 'read SOUL.md') which, combined with autonomous invocation, increases potential blast radius if the agent follows those broader instructions. No explicit attempt to modify other skills or system-wide settings was observed in the provided SKILL.md.

Version History

v1.0.0

- Initial release of nano-banana-pro image generation and editing skill using Gemini 3 Pro Image API.
- Supports both text-to-image and image-to-image workflows at 1K, 2K, and 4K resolutions.
- Detailed usage instructions for generating or editing images via command line, including API key handling.
- Includes best practices for iterative workflow: quick 1K drafts, precise prompt edits, and final 4K outputs.
- Automatic, context-aware filename generation with timestamp and description.
- Covers preflight checks, common errors, prompt handling, editing guidance, and precise prompt templates.

Metadata

Slug embedding-strategies

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Embedding Strategies?

Generate/edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K... It is an AI Agent Skill for Claude Code / OpenClaw, with 556 downloads so far.

How do I install Embedding Strategies?

Run "/install embedding-strategies" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Embedding Strategies free?

Yes, Embedding Strategies is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Embedding Strategies support?

Embedding Strategies is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Embedding Strategies?

It is built and maintained by icesumer-lgtm (@icesumer-lgtm); the current version is v1.0.0.
