GPT Image 2 API
/install gpt-image-2-api
gpt-image-2
Generate and edit images via OpenAI's gpt-image-2 model. Agent-agnostic — designed to work with any AI agent or standalone from the command line.
Quick Start
# 1. Initialize config (one-time)
python3 gpt_image2.py config --init
# 2. Edit the config to set your API key
# ~/.config/gpt-image-2/config.json
# 3. Generate
python3 gpt_image2.py generate "A cute cat on a windowsill" -o ~/cat.png --quality low
# 4. Edit
python3 gpt_image2.py edit input.png "Change the sofa to green" -o ~/output.png
Configuration
Config priority: --config flag > --base-url/--api-key flags > config file > environment variables > defaults.
Config File Locations (in priority order)
| Priority | Path | Notes |
|---|---|---|
| 1 | $XDG_CONFIG_HOME/gpt-image-2/config.json | XDG standard (recommended) |
| 2 | ~/.config/gpt-image-2/config.json | Default XDG fallback |
| 3 | ~/.gpt-image-2-config.json | Single-file fallback |
| 4 | ~/.hermes/gpt-image-2-config.json | Legacy Hermes compat |
Use python3 gpt_image2.py config --show to see which config is active.
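The lookup order in the table above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper names candidate_config_paths and find_config_file are hypothetical, and the existence check is injectable so the order can be exercised without touching the filesystem.

```python
import os
from pathlib import Path
from typing import Callable, List, Mapping, Optional


def candidate_config_paths(env: Mapping[str, str], home: Path) -> List[Path]:
    """Return candidate config paths in the documented priority order."""
    paths: List[Path] = []
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:  # priority 1 only applies when the variable is set
        paths.append(Path(xdg) / "gpt-image-2" / "config.json")
    paths.append(home / ".config" / "gpt-image-2" / "config.json")
    paths.append(home / ".gpt-image-2-config.json")
    paths.append(home / ".hermes" / "gpt-image-2-config.json")
    return paths


def find_config_file(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
    exists: Callable[[Path], bool] = Path.is_file,
) -> Optional[Path]:
    """Return the first candidate that exists, or None if none is found."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    for path in candidate_config_paths(env, home):
        if exists(path):
            return path
    return None
```

Calling find_config_file() with no arguments checks the real environment and home directory; config --show remains the authoritative way to see what the script itself picked.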
Config File Format
{
"base_url": "https://api.openai.com/v1",
"api_key_env": "OPENAI_API_KEY"
}
| Field | Type | Description |
|---|---|---|
| base_url | string | API base URL. Default: https://api.openai.com/v1 |
| api_key | string | Plaintext API key (not recommended: visible in file) |
| api_key_env | string | Environment variable name holding the key (recommended) |
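Because a plaintext api_key is readable by anyone who can read the file, the script is described as warning when the config has group/other read permissions. A rough sketch of such a check, assuming POSIX permission bits; the function names are hypothetical and the script's real check may differ:

```python
import os
import stat


def is_group_or_other_readable(mode: int) -> bool:
    """True if the group or other read bit is set in a file mode."""
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


def config_needs_chmod(path: str) -> bool:
    """True if the file at `path` should be tightened with chmod 600."""
    return is_group_or_other_readable(os.stat(path).st_mode)
```

A file with mode 0644 would be flagged, while one with mode 0600 would pass.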
Environment Variables (fallback when no config file)
| Variable | Purpose |
|---|---|
| GPT_IMAGE2_API_KEY | API key |
| GPT_IMAGE2_BASE_URL | API base URL |
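To illustrate the stated priority (flags, then config file, then environment variables, then defaults), here is a minimal sketch. resolve_settings is a hypothetical helper, and the choice to prefer api_key_env over a plaintext api_key when both are present is an assumption, not documented behavior.

```python
from typing import Mapping, Optional, Tuple

DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_settings(
    flags: Mapping[str, str],
    config: Mapping[str, str],
    env: Mapping[str, str],
) -> Tuple[str, Optional[str]]:
    """Return (base_url, api_key) using flags > config file > env > defaults."""
    # Key from the config file. Assumption: api_key_env wins over api_key.
    config_key = None
    env_name = config.get("api_key_env")
    if env_name:
        config_key = env.get(env_name)
    if not config_key:
        config_key = config.get("api_key")

    base_url = (
        flags.get("base_url")
        or config.get("base_url")
        or env.get("GPT_IMAGE2_BASE_URL")
        or DEFAULT_BASE_URL
    )
    api_key = flags.get("api_key") or config_key or env.get("GPT_IMAGE2_API_KEY")
    return base_url, api_key
```

Passing an empty dict for config models the "no config file" case, in which the GPT_IMAGE2_* variables act as the fallback.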
Config Management Commands
# Create template config
python3 gpt_image2.py config --init
# Show active config (keys are masked)
python3 gpt_image2.py config --show
# Overwrite config
python3 gpt_image2.py config --init --force
CLI Reference
generate — Text-to-Image
python3 gpt_image2.py generate "prompt" [options]
| Option | Default | Description |
|---|---|---|
| -o, --output | ~/gpt-image2-output.png | Output file path |
| --quality | auto | low (~70s), medium (~120s), high (~276s) |
| --size | auto | 1024x1024, 1536x1024, 1024x1536 |
| --format | png | png, jpeg, webp |
| --n | 1 | Number of images (1-10) |
| --timeout | 600 | curl timeout in seconds |
| --config | auto-detect | Explicit config file path |
| --base-url | from config | Override API base URL |
| --api-key | from config | Override API key (visible in ps!) |
edit — Image-to-Image
python3 gpt_image2.py edit <image_path> "edit prompt" [options]
| Option | Default | Description |
|---|---|---|
| --mask | none | PNG mask (transparent=edit area) |
| --moderation | auto | low or auto |
| (all generate options also apply) | | |
config — Manage Configuration
python3 gpt_image2.py config [--init] [--show] [--force] [--config PATH]
Script Location
The script is at scripts/gpt_image2.py relative to this skill directory.
To find it programmatically from any agent:
# If installed as a Hermes skill:
SCRIPT="$(dirname "$(readlink -f "$0")")/../skills/creative/gpt-image-2/scripts/gpt_image2.py"
# Or copy/symlink it anywhere — it's self-contained with zero dependencies beyond stdlib + curl
cp scripts/gpt_image2.py /usr/local/bin/gpt-image2
The script has zero pip dependencies — only Python 3.8+ stdlib and curl.
API Reference
Generations (Text-to-Image)
| Item | Value |
|---|---|
| Endpoint | POST {base_url}/images/generations |
| Auth | Authorization: Bearer {api_key} |
| Content-Type | application/json |
Edits (Image-to-Image)
| Item | Value |
|---|---|
| Endpoint | POST {base_url}/images/edits |
| Auth | Authorization: Bearer {api_key} |
| Content-Type | multipart/form-data |
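The pitfalls listed in this document note that the edits endpoint always returns b64_json. A minimal sketch of decoding such a response follows. It assumes the body uses the OpenAI Images API envelope (a top-level data array whose items carry b64_json); relay endpoints may differ, and decode_images is a hypothetical helper rather than part of gpt_image2.py.

```python
import base64
import json
from typing import List


def decode_images(response_text: str) -> List[bytes]:
    """Decode every b64_json entry in an images response into raw bytes."""
    payload = json.loads(response_text)
    # Assumption: images live under payload["data"][i]["b64_json"].
    return [base64.b64decode(item["b64_json"]) for item in payload.get("data", [])]
```

Each returned bytes object can then be written to disk, for example with pathlib.Path(output_path).write_bytes(image).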
Parameters
Generations (JSON body):
| Param | Type | Required | Description |
|---|---|---|---|
| model | string | yes | gpt-image-2 |
| prompt | string | yes | Text description |
| n | int | no | Number of images (default 1) |
| size | string | no | 1024x1024, 1536x1024, 1024x1536 |
| quality | string | no | low, medium, high (default auto) |
| format | string | no | png, jpg, webp (default png) |
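As a rough illustration of the table above, the JSON body for a generations request could be assembled like this. build_generation_body is a hypothetical helper. Omitting the optional fields so the server applies its defaults is an assumption, and the format value is passed through without validation.

```python
import json
from typing import Optional

ALLOWED_SIZES = {"1024x1024", "1536x1024", "1024x1536"}
ALLOWED_QUALITIES = {"low", "medium", "high"}


def build_generation_body(
    prompt: str,
    n: int = 1,
    size: Optional[str] = None,
    quality: Optional[str] = None,
    image_format: Optional[str] = None,
) -> str:
    """Return the JSON request body for POST {base_url}/images/generations."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    body = {"model": "gpt-image-2", "prompt": prompt, "n": n}
    if size is not None:
        if size not in ALLOWED_SIZES:
            raise ValueError(f"unsupported size: {size}")
        body["size"] = size
    if quality is not None:
        if quality not in ALLOWED_QUALITIES:
            raise ValueError(f"unsupported quality: {quality}")
        body["quality"] = quality
    if image_format is not None:
        body["format"] = image_format  # passed through as documented, unvalidated
    # ensure_ascii=False keeps non-ASCII prompts readable in the payload
    return json.dumps(body, ensure_ascii=False)
```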
Edits (form-data):
| Param | Type | Required | Description |
|---|---|---|---|
| model | string | yes | gpt-image-2 |
| prompt | string | yes | Edit instruction |
| image | file | yes | Source image (PNG, max 4 images) |
| n | int | no | Number of outputs (default 1) |
| size | string | no | 1024x1024, 1536x1024, 1024x1536, or auto |
| quality | string | no | low, medium, high (default auto) |
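The form fields above map naturally onto curl's multipart options. The sketch below builds only the form arguments and rests on several assumptions: build_edit_form_args is a hypothetical helper, multiple images are sent by repeating the image field, and text fields use curl's --form-string (which sends the value literally) rather than the temp-file approach the script is described as using.

```python
from typing import List, Optional, Sequence


def build_edit_form_args(
    prompt: str,
    images: Sequence[str],
    n: int = 1,
    size: str = "auto",
    quality: Optional[str] = None,
) -> List[str]:
    """Return curl arguments describing the multipart body for /images/edits."""
    if not 1 <= len(images) <= 4:
        raise ValueError("between 1 and 4 source images are required")
    # --form-string avoids curl interpreting a leading '@' or '<' in the text
    args = ["--form-string", "model=gpt-image-2", "--form-string", f"prompt={prompt}"]
    for path in images:
        # -F with '@' uploads the file contents; paths containing ',' or ';'
        # would need extra quoting, which this sketch does not handle
        args += ["-F", f"image=@{path}"]
    args += ["--form-string", f"n={n}", "--form-string", f"size={size}"]
    if quality is not None:
        args += ["--form-string", f"quality={quality}"]
    return args
```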
Agent Integration Guide
This skill is designed to be agent-agnostic. Any AI agent can use it by:
- Locate the script: Find gpt_image2.py in the skill's scripts/ directory
- Call via shell: python3 <path>/gpt_image2.py generate "prompt" -o output.png
- Parse stdout: The script prints Saved: <path> (<size> KB) on success
Integration Examples
Hermes / Claude Code / Codex / OpenClaw:
python3 /path/to/gpt-image-2/scripts/gpt_image2.py generate "prompt" -o output.png --quality low
From Python (any agent):
import subprocess

result = subprocess.run(
    ["python3", script_path, "generate", prompt, "-o", output_path, "--quality", "low"],
    capture_output=True, text=True, timeout=600
)
# Parse result.stdout for "Saved: <path>"
From Node.js / TypeScript:
const { execFileSync } = require('child_process');
// Pass arguments as an array so the prompt is never interpreted by a shell
const output = execFileSync('python3', [scriptPath, 'generate', prompt, '-o', outputPath]);
// Parse output.toString() for "Saved: ..."
Workflow: Agent Generates Images
- Always use the CLI script: it handles config resolution, auth security, and response parsing
- Use low quality for drafts, high quality for final output
- For edits: --size auto preserves original dimensions (recommended)
- The script outputs: HTTP status, time elapsed, output file path and size
- Parse the output: look for Saved: <path> lines to find generated files
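Parsing the Saved: lines mentioned above might look like the following sketch. It assumes each line has exactly the form Saved: <path> (<size> KB) with a numeric size; parse_saved is a hypothetical helper, and the other output lines in the usage note are made up for illustration.

```python
import re
from typing import List, Tuple

# Assumed line shape: "Saved: <path> (<size> KB)"
SAVED_LINE = re.compile(r"^Saved: (?P<path>.+) \((?P<size>\d+(?:\.\d+)?) KB\)$")


def parse_saved(stdout: str) -> List[Tuple[str, float]]:
    """Return (path, size_in_kb) for every 'Saved:' line in the script output."""
    results = []
    for line in stdout.splitlines():
        match = SAVED_LINE.match(line.strip())
        if match:
            results.append((match.group("path"), float(match.group("size"))))
    return results
```

For example, parse_saved("HTTP 200\nSaved: /home/u/cat.png (1532.4 KB)\n") yields a single entry for /home/u/cat.png, and lines that do not match are ignored.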
Workflow: Agent Edits Existing Images
- Save or locate the source image path
- Call gpt_image2.py edit <image_path> "<edit_prompt>" --output <output_path>
- Edit endpoint can accept up to 4 images via repeated --image flags
- Use --size auto to preserve original dimensions
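The edit workflow above can be wrapped in a small helper that builds the argument list for subprocess. This is a sketch using only options listed in the CLI reference (--output, --size, --mask); build_edit_command is a hypothetical name.

```python
from typing import List, Optional


def build_edit_command(
    script_path: str,
    image_path: str,
    prompt: str,
    output_path: str,
    mask_path: Optional[str] = None,
    size: str = "auto",
) -> List[str]:
    """Return an argv list for the edit subcommand, suitable for subprocess.run."""
    cmd = [
        "python3", script_path, "edit", image_path, prompt,
        "--output", output_path,
        "--size", size,  # "auto" preserves the original dimensions
    ]
    if mask_path is not None:
        cmd += ["--mask", mask_path]
    return cmd
```

Because the command is a list rather than a shell string, the prompt needs no quoting: subprocess.run(cmd, capture_output=True, text=True, timeout=600) passes it through unchanged.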
Important Pitfalls
- --api-key flag is visible in shell history and ps aux; prefer the config file (api_key_env) or environment variables.
- The edits endpoint does NOT support response_format: it always returns b64_json regardless.
- gpt-image-2 generations may time out on some relay endpoints; use the --timeout flag (default 600s).
- Prompts with special characters: the script writes prompts to temp files internally, avoiding shell escaping issues. No need to worry about quoting.
- Authorization header is never passed via -H: the script uses a curl -K temp config file, deleted immediately after use. Keys never appear in ps aux.
- Config file permissions: the script warns if the config has group/other read permissions. Run chmod 600 <config> to fix.
- Zero pip dependencies: the script only requires Python 3.8+ stdlib and curl. No installation step needed.
- Chinese text in prompts may not render correctly: gpt-image-2's Chinese rendering is unstable; it often ignores Chinese constraints and outputs English text in images. Consider using Gemini for Chinese text rendering.
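The curl -K technique mentioned in the pitfalls can be sketched as below. This illustrates the general approach, not the script's own implementation: the helper names are hypothetical, and the temp file comes from tempfile.mkstemp, which creates it readable and writable only by the current user.

```python
import os
import subprocess
import tempfile
from typing import List


def curl_config_text(api_key: str) -> str:
    """Render a curl config line that sets the Authorization header."""
    # Inside a double-quoted curl config value, backslash and quote need escaping
    escaped = api_key.replace("\\", "\\\\").replace('"', '\\"')
    return f'header = "Authorization: Bearer {escaped}"\n'


def run_curl_with_secret(api_key: str, curl_args: List[str]) -> subprocess.CompletedProcess:
    """Run curl with the key supplied via a -K config file instead of argv."""
    fd, path = tempfile.mkstemp(prefix="curl-auth-", suffix=".conf")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(curl_config_text(api_key))
        # Only the config file path appears in the process list, not the key
        return subprocess.run(
            ["curl", "-K", path, *curl_args], capture_output=True, text=True
        )
    finally:
        os.remove(path)  # delete the secret as soon as curl has finished
```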
Installation
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install gpt-image-2-api
- After installation, invoke the skill by name or use /gpt-image-2-api
- Provide required inputs per the skill's parameter spec and get structured output
What is GPT Image 2 API?
Generate and edit images via OpenAI gpt-image-2 model. Agent-agnostic CLI — works with any AI agent (Hermes, Claude Code, Codex, OpenClaw, etc.). Supports co... It is an AI Agent Skill for Claude Code / OpenClaw, with 84 downloads so far.
How do I install GPT Image 2 API?
Run "/install gpt-image-2-api" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is GPT Image 2 API free?
Yes, GPT Image 2 API is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does GPT Image 2 API support?
GPT Image 2 API is cross-platform and runs anywhere OpenClaw or Claude Code is available.
Who created GPT Image 2 API?
It is built and maintained by Cong Pendy (@jancong); the current version is v2.0.0.