Description

Use a text model such as gpt-5.4 with the image_generation tool over an OpenAI-compatible /v1/responses endpoint, matching the CPA blog example. Do not call...

Usage (SKILL.md)

cpa-gpt-image-2

Name: CPA GPT Image 2
Author: shiftshen

Use this skill when image generation should go through the CPA blog pattern: a normal text model invokes the image_generation tool over a compatible /v1/responses endpoint.

What this skill does

sends a request to an OpenAI-compatible /v1/responses endpoint
matches the CPA blog example request shape closely
uses the image_generation tool
defaults to model gpt-5.4
prefers non-streaming mode by default for simpler and more stable parsing
can switch to streaming mode when needed
automatically retries image rate-limit responses that carry a short retry delay
automatically retries transient gateway fallbacks where the response comes back with tools: [] and plain text instead of an image
keeps credentials in environment variables, never in the skill files
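
The two retry behaviors listed above could be sketched roughly as follows. This is a minimal illustration, not the bundled script's logic: the function names, the `retry_after` field, the 30-second cap, and the exact response shapes are all assumptions made for the example.

```python
import time
from typing import Callable, Optional

MAX_RETRY_DELAY = 30.0  # assumption: only "short" delays are worth waiting for


def retry_delay(response: dict) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry applies.

    Assumed (hypothetical) response shapes:
      rate limit:    {"error": {"code": "rate_limit_exceeded", "retry_after": 2.5}}
      tool fallback: {"tools": [], "output": [...text only...]}
    """
    error = response.get("error") or {}
    if error.get("code") == "rate_limit_exceeded":
        delay = error.get("retry_after")
        if isinstance(delay, (int, float)) and 0 <= delay <= MAX_RETRY_DELAY:
            return float(delay)
        return None  # no delay given, or too long to wait
    if response.get("tools") == []:
        return 0.0  # gateway dropped the tool list; retry immediately
    return None


def call_with_retries(send: Callable[[], dict], attempts: int = 3,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call send() up to `attempts` times, retrying only on the cases above."""
    response = send()
    for _ in range(attempts - 1):
        delay = retry_delay(response)
        if delay is None:
            break
        sleep(delay)
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; a real gateway may report the delay in a header or in the error message instead.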

Important rule

Do not treat gpt-image-2 as the direct model on gateways that do not expose that model.

Correct pattern:

use a normal model such as gpt-5.4
pass tools: [{"type": "image_generation", "output_format": "png"}]
let the gateway/tool layer decide whether image generation is available

Wrong pattern for this gateway:

directly calling model gpt-image-2 when the provider does not publish that model
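
As a rough illustration of the correct pattern, the request body can be built in Python as below. The field values mirror the curl examples in this document; the function name `build_payload` and the guard against `gpt-image-*` model names are hypothetical additions for the sketch, not something the bundled script is known to do.

```python
import json


def build_payload(prompt: str, model: str = "gpt-5.4",
                  output_format: str = "png", stream: bool = False) -> dict:
    """Build a /v1/responses body that asks a text model to use image_generation."""
    if model.startswith("gpt-image"):
        # Hypothetical guard: on this gateway the image model is not published,
        # so the request should name a normal text model instead.
        raise ValueError(
            f"use a text model with the image_generation tool, not {model!r}"
        )
    return {
        "model": model,
        "input": prompt,
        "tools": [{"type": "image_generation", "output_format": output_format}],
        "instructions": "you are a helpful assistant",
        "tool_choice": "auto",
        "stream": stream,
        "store": False,
    }


# Show the JSON that would be sent for a simple prompt.
print(json.dumps(build_payload("a cute squirrel"), indent=2))
```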

Default environment resolution

The script resolves each setting from the first source in its list that is set, in this order.

Base URL:

IMAGE_GEN_BASE_URL
OTCBOT_BASE_URL
CPA_BASE_URL
OPENAI_BASE_URL
fallback to OpenClaw models.json otcbot provider baseUrl

API key:

IMAGE_GEN_KEY
OTCBOT_API_KEY
CPA_API_KEY
OPENAI_API_KEY
fallback to OpenClaw models.json otcbot provider apiKey

Model default:

IMAGE_GEN_MODEL
OTCBOT_IMAGE_MODEL
CPA_MODEL
fallback to current OpenClaw image/default model
final fallback: gpt-5.4

Optional:

IMAGE_GEN_OUTPUT_FORMAT — default png
CPA_SESSION_ID — session id header value, default test-session
CPA_USER_AGENT — custom user-agent header
CPA_VERSION — request header version, default 0.122.0
CPA_ORIGINATOR — request header originator, default codex_cli_rs
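
The lookup order above amounts to "first non-empty variable wins". A minimal sketch of that idea follows; it covers only the environment variables and omits the OpenClaw models.json and CLI fallbacks. The helper names, and the choice to treat an empty string as unset, are assumptions for illustration.

```python
import os
from typing import Mapping, Optional, Sequence

BASE_URL_VARS = ("IMAGE_GEN_BASE_URL", "OTCBOT_BASE_URL", "CPA_BASE_URL", "OPENAI_BASE_URL")
API_KEY_VARS = ("IMAGE_GEN_KEY", "OTCBOT_API_KEY", "CPA_API_KEY", "OPENAI_API_KEY")
MODEL_VARS = ("IMAGE_GEN_MODEL", "OTCBOT_IMAGE_MODEL", "CPA_MODEL")


def first_env(names: Sequence[str], env: Mapping[str, str],
              fallback: Optional[str] = None) -> Optional[str]:
    """Return the first non-empty variable in `names`, else `fallback`."""
    for name in names:
        value = env.get(name, "").strip()
        if value:  # assumption: an empty value counts as unset
            return value
    return fallback


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve settings from the environment only (OpenClaw fallbacks omitted)."""
    return {
        "base_url": first_env(BASE_URL_VARS, env),
        "api_key": first_env(API_KEY_VARS, env),
        "model": first_env(MODEL_VARS, env, "gpt-5.4"),
        "output_format": first_env(("IMAGE_GEN_OUTPUT_FORMAT",), env, "png"),
        "session_id": first_env(("CPA_SESSION_ID",), env, "test-session"),
        "version": first_env(("CPA_VERSION",), env, "0.122.0"),
        "originator": first_env(("CPA_ORIGINATOR",), env, "codex_cli_rs"),
    }
```

Taking the environment as a parameter makes it easy to check the precedence with a plain dict instead of mutating `os.environ`.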

The script calls:

${BASE_URL%/}/v1/responses
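
For reference, the shell expression above strips one trailing slash and appends /v1/responses. Note that the recommended IMAGE_GEN_BASE_URL in this document already ends in /v1 and the curl examples append only /responses, so a literal reading of the expression would double the /v1 segment. Whether the bundled script normalizes this is not stated here; the second function below is a hypothetical tolerant variant, shown only as one way to handle both forms (requires Python 3.9+ for `str.removesuffix`).

```python
def responses_url(base_url: str) -> str:
    """Python equivalent of the shell expression ${BASE_URL%/}/v1/responses."""
    base = base_url.removesuffix("/")  # strip at most one trailing slash, like %/
    return f"{base}/v1/responses"


def responses_url_tolerant(base_url: str) -> str:
    """Hypothetical variant that also accepts a base URL already ending in /v1."""
    base = base_url.rstrip("/")
    if base.endswith("/v1"):
        return f"{base}/responses"
    return f"{base}/v1/responses"
```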

Default execution path

Use the bundled script:

python3 skills/cpa-gpt-image-2/scripts/generate_image.py \
  --prompt "画一只可爱的松鼠" \
  --output /tmp/squirrel.png \
  --model gpt-5.4

Recommended env contract:

export IMAGE_GEN_BASE_URL='http://192.168.10.8:8317/v1'
export IMAGE_GEN_KEY='sk-xxxx'
export IMAGE_GEN_MODEL='gpt-5.4'

Override model when needed:

python3 skills/cpa-gpt-image-2/scripts/generate_image.py \
  --prompt "a cinematic fox detective in Bangkok neon rain" \
  --output /tmp/fox.png \
  --model gpt-5.4 \
  --format png

Expected behavior

The script:

reads credentials from env or OpenClaw otcbot defaults
POSTs to /v1/responses
sends codex-style headers: user-agent, version, originator, session_id
requests the image_generation tool, defaulting to stream: false
parses normal JSON, or SSE data: payloads when streaming is enabled
auto-retries short rate_limit_exceeded image responses when the server provides a retry delay
auto-retries transient gateway fallbacks where the tool list comes back empty and the response degrades to text
extracts the first base64 image from the response
writes the file to the requested output path
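
The last two steps (extract the first base64 image, write it to disk) might look like the sketch below. It assumes the response follows the OpenAI Responses API shape, where an output item of type `image_generation_call` carries the base64 data in `result`; a compatible gateway may use a different shape, and the function names are illustrative only.

```python
import base64
from pathlib import Path
from typing import Optional


def first_image_b64(response: dict) -> Optional[str]:
    """Return the first base64 image payload found in the response, if any.

    Assumes output items shaped like
    {"type": "image_generation_call", "result": "<base64>"}.
    """
    for item in response.get("output", []):
        if item.get("type") == "image_generation_call" and item.get("result"):
            return item["result"]
    return None


def save_image(response: dict, output_path: str) -> Path:
    """Decode the first image in the response and write it to output_path."""
    encoded = first_image_b64(response)
    if encoded is None:
        raise ValueError("response contains no image_generation_call result")
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create missing directories
    path.write_bytes(base64.b64decode(encoded))
    return path
```

Raising when no image is present makes the text-only fallback case visible to the caller instead of silently writing an empty file.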

Fallback curl patterns

Preferred non-streaming version:

curl --location "$IMAGE_GEN_BASE_URL/responses" \
  --header "Authorization: Bearer $IMAGE_GEN_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "gpt-5.4",
    "input": "画一只可爱的松鼠",
    "tools": [
      {
        "type": "image_generation",
        "output_format": "png"
      }
    ],
    "instructions": "you are a helpful assistant",
    "tool_choice": "auto",
    "stream": false,
    "store": false
  }'
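
For callers that prefer Python over curl, the same non-streaming request can be sketched with the standard library. Like the curl command, this assumes the base URL already ends in /v1. The function names and the 300-second timeout are assumptions for the example, not values taken from the bundled script.

```python
import json
import urllib.request


def build_request(base_url: str, api_key: str, prompt: str,
                  model: str = "gpt-5.4") -> urllib.request.Request:
    """Build (but do not send) the non-streaming POST from the curl example."""
    body = {
        "model": model,
        "input": prompt,
        "tools": [{"type": "image_generation", "output_format": "png"}],
        "instructions": "you are a helpful assistant",
        "tool_choice": "auto",
        "stream": False,
        "store": False,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(base_url: str, api_key: str, prompt: str,
             timeout: float = 300.0) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_request(base_url, api_key, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Separating `build_request` from `generate` lets the request shape be inspected without touching the network.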

Streaming version when needed:

curl --location "$IMAGE_GEN_BASE_URL/responses" \
  --header "Authorization: Bearer $IMAGE_GEN_KEY" \
  --header "user-agent: ${CPA_USER_AGENT:-codex-tui/0.122.0 (Manjaro 26.1.0-pre; x86_64) vscode/3.0.12 (codex-tui; 0.122.0)}" \
  --header "version: ${CPA_VERSION:-0.122.0}" \
  --header "originator: ${CPA_ORIGINATOR:-codex_cli_rs}" \
  --header "session_id: ${CPA_SESSION_ID:-test-session}" \
  --header 'accept: text/event-stream' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "gpt-5.4",
    "input": "画一只可爱的松鼠",
    "tools": [
      {
        "type": "image_generation",
        "output_format": "png"
      }
    ],
    "instructions": "you are a helpful assistant",
    "tool_choice": "auto",
    "stream": true,
    "store": false
  }'
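
When streaming is enabled, the body arrives as server-sent events whose `data:` lines carry JSON. A minimal parser for that framing is sketched below. It assumes OpenAI-style events, where the final `response.completed` event embeds the full response object, and it skips a `[DONE]` sentinel defensively in case the gateway sends one; both are assumptions about the gateway, and the function names are illustrative (requires Python 3.9+ for `str.removeprefix`).

```python
import json
from typing import Iterable, Iterator, List, Optional


def _decode(buffer: List[str]) -> Optional[dict]:
    """Join buffered data: lines and decode them, ignoring a [DONE] sentinel."""
    payload = "\n".join(buffer)
    if payload == "[DONE]":
        return None
    return json.loads(payload)


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON object carried by each SSE event in `lines`.

    Consecutive data: lines belong to one event; a blank line ends the event.
    Other fields (event:, id:, comments) are ignored in this sketch.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            buffer.append(line[5:].removeprefix(" "))
        elif line == "" and buffer:
            event = _decode(buffer)
            buffer = []
            if event is not None:
                yield event
    if buffer:  # stream ended without a trailing blank line
        event = _decode(buffer)
        if event is not None:
            yield event


def final_response(events: Iterable[dict]) -> Optional[dict]:
    """Return the response object from the response.completed event, if seen."""
    for event in events:
        if event.get("type") == "response.completed":
            return event.get("response")
    return None
```

Provider-specific event types can then be handled by adding branches that inspect `event["type"]`, leaving the request shape unchanged.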

Notes

Prefer the bundled script for repeatability.
Do not hardcode live keys, base URLs, or session ids into workspace docs.
This skill intentionally mirrors the CPA blog example request shape as closely as practical.
On this gateway, prefer gpt-5.4 plus image_generation tool instead of direct gpt-image-2 model calls.
For the known local otcbot endpoint, prefer setting OTCBOT_BASE_URL and OTCBOT_API_KEY explicitly when testing.
If the endpoint returns provider-specific SSE events, extend the SSE parser instead of changing the whole request shape.
If the user asks to send the generated file back in the current chat, use the normal file-delivery flow after generation.

Security recommendations

The skill appears to do what it claims: it calls a /v1/responses endpoint requesting the image_generation tool and writes the returned base64 image to disk. Before installing or running it, confirm these points:

1) Supply a trusted IMAGE_GEN_BASE_URL and IMAGE_GEN_KEY (or equivalent env vars); the script will send those credentials to that endpoint.
2) Be aware that the script will try to read ~/openclaw/agents/main/agent/models.json and may run 'openclaw models status --json' to infer defaults; if you don't want that, provide explicit IMAGE_GEN_MODEL or IMAGE_GEN_BASE_URL/KEY.
3) The registry metadata does not list required env vars or the openclaw CLI dependency; treat that as an omission, ensure your environment is prepared, and check that models.json does not contain sensitive tokens you don't want used.
4) Run the script in a controlled environment first (no privileged context), and inspect or run the bundled generate_image.py manually to verify behavior.

Providing explicit metadata (declared env vars/binaries) or removing the openclaw file/CLI fallback would increase confidence.

Functional analysis

Type: OpenClaw Skill
Name: cpa-gpt-image-2
Version: 1.0.0

The skill is a specialized utility for generating images via an OpenAI-compatible API, specifically designed to follow a 'CPA blog pattern' using a text model and image tools. The Python script (`generate_image.py`) retrieves credentials from environment variables or the local OpenClaw configuration file (`models.json`) and uses a hardcoded shell command (`openclaw models status`) to infer model defaults. These behaviors are clearly documented in `SKILL.md` and are aligned with the tool's purpose within the OpenClaw ecosystem, with no evidence of data exfiltration, malicious execution, or harmful prompt injection.

Capability tags

requires-sensitive-credentials

Capability assessment

ℹ Purpose & Capability

The SKILL.md description and the included Python script implement the advertised pattern (POST /v1/responses with an image_generation tool, retries, SSE parsing, etc.). The required network access and API keys (IMAGE_GEN_BASE_URL / IMAGE_GEN_KEY or equivalent) are coherent with image-generation functionality. However, the skill also attempts to read OpenClaw provider defaults from a file in the user's home (openclaw/agents/main/agent/models.json) and runs 'openclaw models status --json' to infer a default model. Those file/CLI accesses are not declared in the registry metadata, creating a capability/dependency mismatch.

ℹ Instruction Scope

Runtime instructions stay within the expected domain (construct request, handle SSE/JSON, extract base64 image, write file). The bundled script reads environment variables and a local models.json and invokes the 'openclaw' CLI via os.popen for defaults — actions the SKILL.md documents in places but which broaden scope to filesystem and local CLI access. The skill does not instruct exfiltration beyond posting to the configured BASE_URL, nor does it send secrets to unexpected endpoints.

✓ Install Mechanism

No install spec; this is an instruction-only skill with a bundled script. Nothing is downloaded at install time, so there is no high-risk install mechanism present.

ℹ Credentials

The environment variables the skill uses (BASE_URL/API_KEY variants, model selection vars, and optional header vars) are appropriate for contacting an image-generation endpoint. There are no unrelated credentials requested. That said, the skill reads a local OpenClaw models.json which may contain provider apiKey values — reading that file is a reasonable convenience for defaults but increases sensitivity and should be disclosed/understood by the user.

✓ Persistence & Privilege

The skill does not request persistent 'always' inclusion, does not modify other skills, and writes only the generated image file to the requested output path. It does read an OpenClaw config file but does not persist or alter agent configuration.

Version history

v1.0.0

Stable default non-stream path, retries for rate limits and tool fallbacks, verified real PNG generation and Telegram delivery.

Metadata

Slug cpa-gpt-image-2

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

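FAQ

What is CPA GPT Image 2?

Use a text model such as gpt-5.4 with the image_generation tool over an OpenAI-compatible /v1/responses endpoint, matching the CPA blog example. Do not call... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 36 cumulative downloads to date.

How do I install CPA GPT Image 2?

Run the command "/install cpa-gpt-image-2" in an OpenClaw or Claude Code chat to install it in one step; installation itself needs no extra configuration.

Is CPA GPT Image 2 free?

Yes. CPA GPT Image 2 is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does CPA GPT Image 2 support?

CPA GPT Image 2 is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops CPA GPT Image 2?

It is developed and maintained by shiftshen (@shiftshen); the current version is v1.0.0.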
