jancong

GPT Image 2 API

by Cong Pendy · GitHub ↗ · v2.0.0 · MIT-0
cross-platform ⚠ suspicious
84 Downloads · 0 Stars · 0 Active Installs · 2 Versions
Install in OpenClaw
/install gpt-image-2-api
Description
Generate and edit images via OpenAI gpt-image-2 model. Agent-agnostic CLI — works with any AI agent (Hermes, Claude Code, Codex, OpenClaw, etc.). Supports co...
README (SKILL.md)

gpt-image-2

Generate and edit images via OpenAI's gpt-image-2 model. Agent-agnostic — designed to work with any AI agent or standalone from the command line.

Quick Start

# 1. Initialize config (one-time)
python3 gpt_image2.py config --init

# 2. Edit the config to set your API key
#    ~/.config/gpt-image-2/config.json

# 3. Generate
python3 gpt_image2.py generate "A cute cat on a windowsill" -o ~/cat.png --quality low

# 4. Edit
python3 gpt_image2.py edit input.png "Change the sofa to green" -o ~/output.png

Configuration

Config priority: --config flag > --base-url/--api-key flags > config file > environment variables > defaults.

Config File Locations (in priority order)

| Priority | Path | Notes |
|---|---|---|
| 1 | $XDG_CONFIG_HOME/gpt-image-2/config.json | XDG standard (recommended) |
| 2 | ~/.config/gpt-image-2/config.json | Default XDG fallback |
| 3 | ~/.gpt-image-2-config.json | Single-file fallback |
| 4 | ~/.hermes/gpt-image-2-config.json | Legacy Hermes compat |

Use python3 gpt_image2.py config --show to see which config is active.
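
The lookup order above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names `candidate_config_paths` and `find_active_config` are hypothetical, and the sketch assumes the first existing file wins.

```python
from pathlib import Path
from typing import List, Mapping, Optional

APP_DIR = "gpt-image-2"


def candidate_config_paths(env: Mapping[str, str], home: Path) -> List[Path]:
    """Return the documented config locations in priority order."""
    paths: List[Path] = []
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:
        # Priority 1 applies only when XDG_CONFIG_HOME is set.
        paths.append(Path(xdg) / APP_DIR / "config.json")
    paths.append(home / ".config" / APP_DIR / "config.json")
    paths.append(home / ".gpt-image-2-config.json")
    paths.append(home / ".hermes" / "gpt-image-2-config.json")
    return paths


def find_active_config(env: Mapping[str, str], home: Path) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None."""
    for path in candidate_config_paths(env, home):
        if path.is_file():
            return path
    return None
```

Calling `find_active_config(os.environ, Path.home())` would give a result comparable to what `config --show` reports, assuming the script resolves paths the same way.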

Config File Format

{
  "base_url": "https://api.openai.com/v1",
  "api_key_env": "OPENAI_API_KEY"
}

| Field | Type | Description |
|---|---|---|
| base_url | string | API base URL. Default: https://api.openai.com/v1 |
| api_key | string | Plaintext API key (not recommended: visible in file) |
| api_key_env | string | Environment variable name holding the key (recommended) |

Environment Variables (fallback when no config file)

| Variable | Purpose |
|---|---|
| GPT_IMAGE2_API_KEY | API key |
| GPT_IMAGE2_BASE_URL | API base URL |
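
To make the precedence concrete, here is a rough sketch of how flags, config file, environment variables, and defaults might be combined. The function `resolve_settings` is hypothetical, and one detail is an assumption rather than documented behavior: when a config file supplies both `api_key_env` and `api_key`, the sketch tries the environment variable first.

```python
from typing import Mapping, Optional, Tuple

DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_settings(
    config: Optional[Mapping[str, str]],
    env: Mapping[str, str],
    flag_base_url: Optional[str] = None,
    flag_api_key: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (base_url, api_key): flags > config file > env vars > defaults."""
    base_url = flag_base_url
    api_key = flag_api_key

    if config is not None:
        base_url = base_url or config.get("base_url")
        if api_key is None:
            # Assumption: api_key_env is tried before a plaintext api_key.
            env_name = config.get("api_key_env")
            from_env = env.get(env_name) if env_name else None
            api_key = from_env or config.get("api_key")
    else:
        # Documented fallback variables, used when no config file exists.
        base_url = base_url or env.get("GPT_IMAGE2_BASE_URL")
        api_key = api_key or env.get("GPT_IMAGE2_API_KEY")

    return base_url or DEFAULT_BASE_URL, api_key
```

Passing `env` explicitly instead of reading `os.environ` inside the function keeps the resolution logic easy to test.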

Config Management Commands

# Create template config
python3 gpt_image2.py config --init

# Show active config (keys are masked)
python3 gpt_image2.py config --show

# Overwrite config
python3 gpt_image2.py config --init --force

CLI Reference

generate — Text-to-Image

python3 gpt_image2.py generate "prompt" [options]

| Option | Default | Description |
|---|---|---|
| -o, --output | ~/gpt-image2-output.png | Output file path |
| --quality | auto | low (~70s), medium (~120s), high (~276s) |
| --size | auto | 1024x1024, 1536x1024, 1024x1536 |
| --format | png | png, jpeg, webp |
| --n | 1 | Number of images (1-10) |
| --timeout | 600 | curl timeout in seconds |
| --config | auto-detect | Explicit config file path |
| --base-url | from config | Override API base URL |
| --api-key | from config | Override API key (visible in ps!) |
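
For agents that assemble the command programmatically, a small helper can validate option values before launching the script. The sketch below uses only the flags listed in the table; the helper name `build_generate_argv` is hypothetical, and it assumes `auto` is accepted as an explicit value for `--quality` and `--size` because it is the documented default.

```python
from typing import List, Optional

QUALITIES = ("auto", "low", "medium", "high")
SIZES = ("auto", "1024x1024", "1536x1024", "1024x1536")
FORMATS = ("png", "jpeg", "webp")


def build_generate_argv(
    script_path: str,
    prompt: str,
    output: Optional[str] = None,
    quality: str = "auto",
    size: str = "auto",
    fmt: str = "png",
    n: int = 1,
) -> List[str]:
    """Build an argv list for the generate subcommand, checking documented values."""
    if quality not in QUALITIES:
        raise ValueError(f"quality must be one of {QUALITIES}")
    if size not in SIZES:
        raise ValueError(f"size must be one of {SIZES}")
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    argv = [
        "python3", script_path, "generate", prompt,
        "--quality", quality, "--size", size, "--format", fmt, "--n", str(n),
    ]
    if output is not None:
        argv += ["-o", output]
    return argv
```

Because the result is a list, it can be handed to `subprocess.run` directly, and the prompt needs no shell quoting.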

edit — Image-to-Image

python3 gpt_image2.py edit <image_path> "edit prompt" [options]

| Option | Default | Description |
|---|---|---|
| --mask | none | PNG mask (transparent=edit area) |
| --moderation | auto | low or auto |

(all generate options also apply)

config — Manage Configuration

python3 gpt_image2.py config [--init] [--show] [--force] [--config PATH]

Script Location

The script is at scripts/gpt_image2.py relative to this skill directory.

To find it programmatically from any agent:

# If installed as a Hermes skill:
SCRIPT="$(dirname "$(readlink -f "$0")")/../skills/creative/gpt-image-2/scripts/gpt_image2.py"

# Or copy/symlink it anywhere — it's self-contained with zero dependencies beyond stdlib + curl
cp scripts/gpt_image2.py /usr/local/bin/gpt-image2

The script has zero pip dependencies — only Python 3.8+ stdlib and curl.

API Reference

Generations (Text-to-Image)

| Item | Value |
|---|---|
| Endpoint | POST {base_url}/images/generations |
| Auth | Authorization: Bearer {api_key} |
| Content-Type | application/json |

Edits (Image-to-Image)

| Item | Value |
|---|---|
| Endpoint | POST {base_url}/images/edits |
| Auth | Authorization: Bearer {api_key} |
| Content-Type | multipart/form-data |

Parameters

Generations (JSON body):

| Param | Type | Required | Description |
|---|---|---|---|
| model | string | yes | gpt-image-2 |
| prompt | string | yes | Text description |
| n | int | no | Number of images (default 1) |
| size | string | no | 1024x1024, 1536x1024, 1024x1536 |
| quality | string | no | low, medium, high (default auto) |
| format | string | no | png, jpeg, webp (default png) |
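
As an illustration of the generations request, the following sketch assembles the URL, headers, and JSON body from the parameters listed above without sending anything. The function name `build_generation_request` is hypothetical. The Authorization header is deliberately left out, on the assumption that the caller adds it through a safer channel such as the `curl -K` file described under Important Pitfalls.

```python
import json
from typing import Any, Dict, Optional, Tuple


def build_generation_request(
    base_url: str,
    prompt: str,
    n: int = 1,
    size: Optional[str] = None,
    quality: Optional[str] = None,
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for POST {base_url}/images/generations."""
    url = base_url.rstrip("/") + "/images/generations"
    payload: Dict[str, Any] = {"model": "gpt-image-2", "prompt": prompt, "n": n}
    # Optional fields are included only when the caller sets them.
    if size is not None:
        payload["size"] = size
    if quality is not None:
        payload["quality"] = quality
    headers = {"Content-Type": "application/json"}
    return url, headers, json.dumps(payload).encode("utf-8")
```

Serializing with `json.dumps` avoids hand-escaping quotes or newlines in the prompt.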

Edits (form-data):

| Param | Type | Required | Description |
|---|---|---|---|
| model | string | yes | gpt-image-2 |
| prompt | string | yes | Edit instruction |
| image | file | yes | Source image (PNG, max 4 images) |
| n | int | no | Number of outputs (default 1) |
| size | string | no | 1024x1024, 1536x1024, 1024x1536, or auto |
| quality | string | no | low, medium, high (default auto) |

Agent Integration Guide

This skill is designed to be agent-agnostic. Any AI agent can use it by:

  1. Locate the script: Find gpt_image2.py in the skill's scripts/ directory
  2. Call via shell: python3 <path>/gpt_image2.py generate "prompt" -o output.png
  3. Parse stdout: The script prints Saved: <path> (<size> KB) on success
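
The parsing step can be sketched with a regular expression. This assumes the success line looks exactly like `Saved: <path> (<size> KB)` as described above; the real output may differ in spacing or number formatting, so treat the pattern and the name `parse_saved_lines` as illustrative.

```python
import re
from typing import List, Tuple

# Assumed line shape: "Saved: <path> (<size> KB)"
SAVED_LINE = re.compile(r"^Saved:\s+(?P<path>.+?)\s+\((?P<size>[\d.,]+)\s*KB\)\s*$")


def parse_saved_lines(stdout: str) -> List[Tuple[str, str]]:
    """Return (path, size_in_kb_as_text) for every Saved: line in stdout."""
    results: List[Tuple[str, str]] = []
    for line in stdout.splitlines():
        match = SAVED_LINE.match(line.strip())
        if match:
            results.append((match.group("path"), match.group("size")))
    return results
```

The size is kept as text because the source does not say whether it is printed as an integer or a decimal.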

Integration Examples

Hermes / Claude Code / Codex / OpenClaw:

python3 /path/to/gpt-image-2/scripts/gpt_image2.py generate "prompt" -o output.png --quality low

From Python (any agent):

import subprocess

# script_path, prompt, and output_path are supplied by the calling agent
result = subprocess.run(
    ["python3", script_path, "generate", prompt, "-o", output_path, "--quality", "low"],
    capture_output=True, text=True, timeout=600
)
# Parse result.stdout for "Saved: <path>"

From Node.js / TypeScript:

const { execFileSync } = require('child_process');
// Pass arguments as an array so the prompt is never interpreted by a shell
const output = execFileSync('python3', [scriptPath, 'generate', prompt, '-o', outputPath]);
// Parse output.toString() for "Saved: ..."

Workflow: Agent Generates Images

  1. Always use the CLI script — handles config resolution, auth security, and response parsing
  2. Use low quality for drafts, high quality for final output
  3. For edits: --size auto preserves original dimensions (recommended)
  4. The script outputs: HTTP status, time elapsed, output file path and size
  5. Parse the output: look for Saved: <path> lines to find generated files

Workflow: Agent Edits Existing Images

  1. Save or locate the source image path
  2. Call gpt_image2.py edit <image_path> "<edit_prompt>" --output <output_path>
  3. The edit endpoint accepts up to 4 images via repeated --image flags
  4. Use --size auto to preserve original dimensions

Important Pitfalls

  1. --api-key flag is visible in shell history and ps aux — prefer config file (api_key_env) or environment variables.
  2. The edits endpoint does NOT support response_format — always returns b64_json regardless.
  3. gpt-image-2 generations may time out on some relay endpoints — use --timeout flag (default 600s).
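  4. Prompts with special characters: the script writes prompts to temp files internally, so they are not re-interpreted when passed on to curl. The prompt still has to reach the script as a single argument, so quote it for your own shell or pass it as one argv element.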
  5. Authorization header is never passed via -H — the script uses curl -K temp config file, deleted immediately after use. Keys never appear in ps aux.
  6. Config file permissions: the script warns if the config has group/other read permissions. Run chmod 600 <config> to fix.
  7. Zero pip dependencies — the script only requires Python 3.8+ stdlib and curl. No installation step needed.
  8. Chinese text in prompts may not render correctly — gpt-image-2's Chinese rendering is unstable; it often ignores Chinese constraints and outputs English text in images. Consider using Gemini for Chinese text rendering.
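
Pitfalls 5 and 6 describe two techniques that are easy to sketch: passing the Authorization header through a `curl -K` config file, and checking a file for group/other read bits. The code below is an illustration under assumptions, not the script's implementation; the function names are hypothetical, and it assumes the key contains no double quotes or backslashes.

```python
import os
import stat
import tempfile


def has_loose_permissions(path: str) -> bool:
    """True if group or other can read the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


def write_curl_auth_config(api_key: str) -> str:
    """Write a curl config file carrying the Authorization header.

    tempfile.mkstemp creates the file readable and writable only by the
    current user. The caller is responsible for deleting it after use.
    """
    fd, path = tempfile.mkstemp(prefix="curl-auth-", suffix=".conf")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        # curl config syntax: long option name without dashes, then the value.
        fh.write(f'header = "Authorization: Bearer {api_key}"\n')
    return path
```

A caller would pass the returned path to `curl -K <path>` and remove the file in a `finally` block, so the key never appears in the process argument list.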
Usage Guidance
This script is generally coherent with its stated purpose, but review these points before installing or using it:

  1. The script can read config files (~/.config..., ~/.gpt-image-2-config.json, and legacy ~/.hermes/*) and can read an environment variable whose name is configured, so only put API keys you trust the tool to use.
  2. The --base-url option and config.base_url let callers send requests to any HTTP endpoint; only use a base_url you trust (the default is api.openai.com). Malicious base_url settings would cause your API key to be sent to a third party.
  3. Avoid storing sensitive keys in plaintext config files; prefer a restricted environment variable and secure file permissions (the tool warns about loose perms).
  4. Because the registry metadata lists no required env vars even though the tool documents them, verify the key name and config before handing over credentials.

If you don't trust the skill source, inspect the full script locally (especially where base_url and api_key are read and used) and restrict file permissions if you save config files.
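
One way an integrator might act on the base_url warning is to check the host against an allowlist before invoking the skill. The sketch below is an optional guard that lives outside the skill. The name `is_trusted_base_url` and the default allowlist are assumptions for illustration.

```python
from urllib.parse import urlparse

# Assumed allowlist: extend it with relay hosts you have vetted yourself.
TRUSTED_HOSTS = frozenset({"api.openai.com"})


def is_trusted_base_url(base_url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Accept only https URLs whose exact hostname is on the allowlist."""
    parsed = urlparse(base_url)
    return parsed.scheme == "https" and parsed.hostname in trusted_hosts
```

Matching the exact hostname, rather than a prefix or substring, rejects lookalike hosts such as `api.openai.com.evil.example`.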
Capability Analysis
Type: OpenClaw Skill
Name: gpt-image-2-api
Version: 2.0.0

The gpt-image-2 skill is a well-implemented utility for image generation and editing via OpenAI-compatible APIs. The core script (scripts/gpt_image2.py) demonstrates security-conscious design by using temporary configuration files for curl headers to prevent API keys from appearing in process listings (ps aux) and by enforcing restrictive file permissions (0o600) on its configuration files. It avoids shell injection by using subprocess.run with argument lists and contains no evidence of data exfiltration, persistence mechanisms, or malicious prompt injection.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The name/description match the included script: a CLI that posts to an images endpoint and saves base64-encoded images. However the registry metadata declares no required environment variables while both SKILL.md and the script document/accept environment variables (GPT_IMAGE2_API_KEY, GPT_IMAGE2_BASE_URL and configurable api_key_env). This is a mild metadata inconsistency but not by itself malicious.
Instruction Scope
Runtime instructions and the script are narrowly scoped to image generation/editing via HTTP calls. They reference config files (XDG path, ~/.config, ~/.gpt-image-2-config.json, and legacy ~/.hermes/*) which is expected for config discovery. A notable capability in the instructions: the user (or an integrator) can override base_url to any endpoint, so API keys supplied to the tool may be sent to a non-OpenAI server if a malicious base_url is used.
Install Mechanism
No install spec; the skill ships a self-contained Python script that uses only stdlib + curl. No downloads from arbitrary URLs or package installs are required by the skill itself.
Credentials
The skill accepts an API key via CLI, config file (plaintext), named env var (GPT_IMAGE2_API_KEY) or an arbitrary env var name set in config.api_key_env. Requiring or reading an env var is reasonable for an API client, but the metadata did not declare any required env vars. The ability to set arbitrary base_url + arbitrary api_key_env increases the risk that credentials could be directed to a third-party service or that an unexpected env var name will be read; the script also documents storing plaintext keys in config files (not recommended).
Persistence & Privilege
The skill does not request always:true, does not auto-enable itself, and does not modify other skills. It only suggests copying the script to PATH (user action). No elevated or persistent system privileges are requested.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install gpt-image-2-api
  3. After installation, invoke the skill by name or use /gpt-image-2-api
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.0.0
v2.0: agent-agnostic redesign, XDG config paths, config --init/--show commands, zero pip deps, multi-agent integration guide, removed Hermes-specific references
v1.1.0
Initial publish: text-to-image and image-to-image editing via OpenAI gpt-image-2 API, configurable base_url/api_key, security-first config, curl -K auth safety
Metadata
Slug gpt-image-2-api
Version 2.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 2
Frequently Asked Questions

What is GPT Image 2 API?

Generate and edit images via OpenAI gpt-image-2 model. Agent-agnostic CLI — works with any AI agent (Hermes, Claude Code, Codex, OpenClaw, etc.). Supports co... It is an AI Agent Skill for Claude Code / OpenClaw, with 84 downloads so far.

How do I install GPT Image 2 API?

Run "/install gpt-image-2-api" in the OpenClaw or Claude Code chat to install it in one step. Before the first request, configure an API key as described under Configuration.

Is GPT Image 2 API free?

Yes, GPT Image 2 API is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does GPT Image 2 API support?

GPT Image 2 API is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created GPT Image 2 API?

It is built and maintained by Cong Pendy (@jancong); the current version is v2.0.0.
