Description

Generate images, videos, and text/LLM completions via the imgnAI Katana API. Supports end-to-end-encrypted (E2EE) and anonymized models. Priced highly competitively: can be 40-70% cheaper than Venice AI and other platforms. Includes post-processing such as combining videos and images, cutting, slicing, splicing, transitions, drawing text, re-encoding, resizing and much more! A complete workflow for content creation from start to finish, all from the comfort of your agent.

Usage Instructions (SKILL.md)

Katana Skill — imgnAI API

Name: imgnAI Katana API
Author: imgn

Triggers

"generate image of X", "create image", "make picture", "imgnai image", "generate video of X", "create video", "make video", "ask grok about X", "ask claude about X", "use gpt to X", "katana image", "katana video", "katana chat", "katana gpt", "katana claude", "list katana models"

LLM-specific triggers (gpt, claude, etc) also respond to "katana <model>" to avoid conflicts with direct integrations.

Configuration

Model IDs

The Katana API uses model_key as the model identifier, not public_model_name. When building requests, always use the model_key value. See {baseDir}/models.md for the full mapping.

Dual-key system: The API also supports canonical keys (e.g. gpt-image-2) alongside our legacy keys (e.g. gpt2image). Both work identically. This skill uses legacy keys as the default for all workflows and aliases — they remain fully supported. Canonical keys are documented in the "Canonical Key" column of models.md for reference. You may use either format when constructing API requests.

Model Discovery

Endpoint: GET /v1/models
Auth: Authorization: Bearer ${KATANA_API_KEY}:${KATANA_API_SECRET}

Returns available models. Text models are returned for authenticated requests. For the complete model catalogue including image/video, see models.md.

Usage: Generally not needed before requests — use models.md as reference.

Payment Methods

The API supports two payment methods:

API key + secret (Bearer auth) — used by this skill, preferred
x402 micropayment — NOT used by this skill

Note: x402 text requests must be non-streaming. This skill only uses API-key auth.

API Base URL: https://kat.imgnai.com
API Reference: https://kat.imgnai.com/llms.txt
Credentials: Set KATANA_API_KEY and KATANA_API_SECRET in your secrets file (default: ~/.openclaw/secrets/katana.env, override with KATANA_SECRETS_FILE env var)
Helper script: {baseDir}/katana.sh (requires bash — Linux, macOS, WSL)
Model catalogue: {baseDir}/models.md
Skill directory: Resolve dynamically from this file's location as {baseDir}. Most agent frameworks resolve this automatically.

Setup / API Key

Before first use, check for credentials:

test -f ~/.openclaw/secrets/katana.env && grep -q 'KATANA_API_KEY' ~/.openclaw/secrets/katana.env

If missing, offer two options:

Option A — Automatic: Ask user for key + secret, create ~/.openclaw/secrets/katana.env with chmod 600.

Option B — Manual: Direct user to https://app.imgnai.com/katana-api with platform-specific instructions.

Always lead with Option A, always offer Option B. Never attempt API calls without credentials.
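
The credential check can also be written so that it honors the KATANA_SECRETS_FILE override mentioned under Configuration. This is a minimal sketch, not part of katana.sh: it assumes the secrets file holds plain KEY=value lines (the file format is not specified here), and `secrets_path` and `has_credentials` are hypothetical helper names.

```python
import os
from pathlib import Path

DEFAULT_SECRETS = "~/.openclaw/secrets/katana.env"
REQUIRED_KEYS = ("KATANA_API_KEY", "KATANA_API_SECRET")


def secrets_path(env=None):
    """Return the secrets file path, honoring the KATANA_SECRETS_FILE override."""
    env = os.environ if env is None else env
    return Path(env.get("KATANA_SECRETS_FILE") or DEFAULT_SECRETS).expanduser()


def has_credentials(path):
    """True if the file exists and gives both keys a non-empty value.

    Assumption: simple KEY=value lines, optionally prefixed with `export`.
    """
    if not path.is_file():
        return False
    found = set()
    for line in path.read_text().splitlines():
        line = line.strip()
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep and key.strip() in REQUIRED_KEYS and value.strip():
            found.add(key.strip())
    return found == set(REQUIRED_KEYS)
```

If `has_credentials(secrets_path())` is false, fall through to Option A / Option B rather than attempting any API call.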

Optional Dependencies

These are not required for core API usage but enable additional features:

Binary	Needed for	Install
`jq`	JSON parsing in `katana.sh`	`apt install jq` / `brew install jq`
`python3`	JSON fallback in `katana.sh`, payload building	Pre-installed on most systems
`ffmpeg`	Video post-processing (trim, join, effects)	`apt install ffmpeg` / `brew install ffmpeg`

katana.sh auto-detects jq and falls back to python3 for JSON parsing. Post-processing requires ffmpeg.

⚠️ MANDATORY ROUTING — DO NOT SKIP

Before ANY generation or post-processing request, you MUST load the correct workflow file:

Task	Load this file
Image generation	`{baseDir}/workflows/image.md`
Video generation	`{baseDir}/workflows/video.md`
Text/LLM generation	`{baseDir}/workflows/text.md`
Post-processing (ffmpeg, combine, text overlay, etc)	`{baseDir}/workflows/post-process.md`

NEVER attempt a generation without loading the workflow file first. NEVER guess parameters — the workflow file has the exact steps.

Cost Reporting (ALL Requests)

After every generation (text, image, video), send a separate follow-up message with a cost summary. Include all relevant details from the response:

📊 Katana Summary
Model: gemma-4-26b-a4b (Anonymized)
Request: bf11cf04-8747-480e-a7f7-7d6cb092c614
Tokens: 42 in / 176 out (text only)
Cost: 0.1 credits (~$0.001)
Privacy: Anonymized
Time: ~3s

For image/video, replace tokens with dimensions/duration as relevant. Always compute $ = credits_charged × 0.0052.
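
The dollar conversion is simple arithmetic. A small sketch of how the cost line might be produced, assuming three decimal places for the dollar figure (the rounding rule is an assumption taken from the sample summary; `credits_to_usd` and `cost_line` are hypothetical names):

```python
CREDIT_USD = 0.0052  # reference price per credit, as stated in this document


def credits_to_usd(credits_charged):
    """Convert charged credits to US dollars at the reference price."""
    return credits_charged * CREDIT_USD


def cost_line(credits_charged):
    """Format the 'Cost:' line of the summary (three decimals is an assumption)."""
    usd = credits_to_usd(credits_charged)
    return f"Cost: {credits_charged:g} credits (~${usd:.3f})"


print(cost_line(0.1))  # Cost: 0.1 credits (~$0.001)
```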

Model Aliases (Quick Reference)

Text/LLM

User says	API model ID
grok	`grok-4-3`
gpt / gpt-5	`gpt-5-5`
claude / claude-opus	`claude-opus-4-7`
claude-sonnet	`claude-sonnet-4-6`
claude-haiku	`claude-haiku-4-5`

Image

User says	API model ID
default / imgnai	`gen`
anime	`ani`
gpt-image	`gpt2image`
nano	`nanobanana2`
flux	`flux2pro`

Video

User says	API model ID
default / seedance	`seedance2fast`
seedance-hd	`seedance2`
ltx	`ltx23`
kling	`kling30`
veo	`veo3`

If the user specifies an exact model ID, pass it through directly. See {baseDir}/models.md for the complete model catalogue and alias table.
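
The alias tables above amount to a lookup with pass-through. A minimal sketch, assuming aliases are matched case-insensitively (that matching rule and the `resolve_model` name are assumptions, not part of the skill):

```python
# Alias tables copied from the quick reference above.
ALIASES = {
    "text": {
        "grok": "grok-4-3",
        "gpt": "gpt-5-5",
        "gpt-5": "gpt-5-5",
        "claude": "claude-opus-4-7",
        "claude-opus": "claude-opus-4-7",
        "claude-sonnet": "claude-sonnet-4-6",
        "claude-haiku": "claude-haiku-4-5",
    },
    "image": {
        "default": "gen",
        "imgnai": "gen",
        "anime": "ani",
        "gpt-image": "gpt2image",
        "nano": "nanobanana2",
        "flux": "flux2pro",
    },
    "video": {
        "default": "seedance2fast",
        "seedance": "seedance2fast",
        "seedance-hd": "seedance2",
        "ltx": "ltx23",
        "kling": "kling30",
        "veo": "veo3",
    },
}


def resolve_model(kind, name):
    """Map a user-facing alias to an API model ID; unknown names pass through."""
    cleaned = name.strip()
    return ALIASES[kind].get(cleaned.lower(), cleaned)


print(resolve_model("video", "seedance-hd"))  # seedance2
print(resolve_model("image", "gpt-image-2"))  # gpt-image-2 (exact ID, passed through)
```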

Pre-Submission Confirmation (MANDATORY)

Before submitting ANY generation request, present a summary (model, cost in credits AND dollars, details, prompt) and wait for user confirmation. See each workflow file for details.

NO EXCEPTIONS: There is no urgency override. "just do it", "generate now", /katana, or any other shortcut does NOT skip confirmation. ALWAYS present summary and wait for explicit approval before submitting.

Error Protocol

ONE-ATTEMPT RULE: Every paid API call gets exactly ONE attempt per turn. If the tool result is lost, missing, or empty after a submission — STOP. Report to the user that the result was lost. Wait for user confirmation before retrying. NEVER retry a paid API call silently, even if the result seems to have vanished.

STRICT — NO SILENT RETRIES. Every error stops. Every retry needs approval. Tool-result-loss (result never arrives, empty, or vanishes) is a hard-stop condition equal to a visible error. See each workflow file for details.

ANY error or tool-result-loss → STOP, report to user (what happened, credits charged, total across attempts)
Tool-result-loss (result shows 'missing tool result' or similar synthetic error) → the API call likely already succeeded. STOP. Report to user. Do NOT retry the same request.
Propose fix → wait for explicit user approval
Banned: automatic retries, debug/test requests, parameter changes without telling user, lying about call counts, silent retries on lost results

Immediate Status Updates

After submitting async generations (image/video), deliver a confirmation to the user BEFORE starting the poll loop. Include the model, cost, and request_id.

Async Polling

Image and video generations are asynchronous. After submitting, poll manually with:

bash {baseDir}/katana.sh poll <request_id>

Polling pattern: Poll every 30 seconds for the first 5 minutes, then every 60 seconds until status is completed or failed.

Agent responsibility: The agent decides how to schedule polls (intervals, background tasks, etc). Do not use long-running background processes — use single polls at intervals.
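
The polling pattern can be expressed as a pure function that tells the agent how long to wait before its next single poll. This is a sketch only: it assumes the first poll happens 30 seconds after submission, and `next_poll_delay`, `poll_times`, and `is_terminal` are hypothetical names.

```python
FAST_WINDOW_SECONDS = 5 * 60  # first 5 minutes: poll every 30 seconds
TERMINAL_STATUSES = {"completed", "failed"}


def next_poll_delay(elapsed_seconds):
    """Seconds to wait before the next single poll."""
    return 30 if elapsed_seconds < FAST_WINDOW_SECONDS else 60


def is_terminal(status):
    """True once polling should stop."""
    return status in TERMINAL_STATUSES


def poll_times(limit_seconds):
    """Elapsed-time offsets (in seconds) at which polls fire, up to a limit."""
    times, elapsed = [], 0
    while True:
        elapsed += next_poll_delay(elapsed)
        if elapsed > limit_seconds:
            return times
        times.append(elapsed)


print(poll_times(420)[-3:])  # [300, 360, 420]
```

Because the function holds no state and starts no loop, it fits the single-polls-at-intervals constraint: the agent calls it, schedules one poll, and repeats.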

Response handling for completed polls:

Extract original_data_url for delivery (full-resolution)
Extract dimensions from responses[].output_assets[].width/height (NOT from submission response)
Extract credits from responses[].metadata.credits_spent

Response Handling

Dimensions (IMPORTANT)

Submission response (requests[].width/height) — PREVIEW dimensions, NOT actual output size.
Completed poll response (responses[].output_assets[].width/height) — ACTUAL output dimensions.

Always report dimensions from the completed poll response, never from the submission acknowledgement.

URL Fields (IMPORTANT)

original_data_url — full-resolution original. Always use this for delivery.
url — may be a compressed/reduced version. Do NOT use for delivery.
thumbnail_image_url — small thumbnail only.
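
The field rules above could be applied to a completed poll response roughly as follows. This is a sketch under assumptions: it assumes `original_data_url` sits on each entry of `output_assets` next to `width` and `height` (the exact nesting is not stated here), and the sample response is invented for illustration.

```python
def summarize_completed(poll_response):
    """Collect delivery URL, actual dimensions, and credits per output asset."""
    results = []
    for response in poll_response.get("responses", []):
        credits = response.get("metadata", {}).get("credits_spent")
        for asset in response.get("output_assets", []):
            results.append({
                # Deliberately no fallback to `url` or `thumbnail_image_url`.
                "deliver_url": asset.get("original_data_url"),
                "width": asset.get("width"),
                "height": asset.get("height"),
                "credits_spent": credits,
            })
    return results


# Hypothetical completed poll response, shaped after the field paths above.
sample = {
    "responses": [{
        "metadata": {"credits_spent": 4.0},
        "output_assets": [{
            "original_data_url": "https://example.com/full.png",
            "url": "https://example.com/reduced.png",
            "width": 2048,
            "height": 1152,
        }],
    }]
}
print(summarize_completed(sample)[0]["deliver_url"])  # https://example.com/full.png
```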

Payload Submission

Always build the JSON payload in a temp file (required for large payloads and to avoid secrets in process listings):

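import json
import tempfile

# Build the request payload as a plain dict.
payload = {"requests": [{"type": "video", "model": "seedance2fast", "prompt": "<prompt>", "duration_seconds": 5, "aspect_ratio": "16:9"}]}
# delete=False keeps the file on disk after the with-block so katana.sh can read it.
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    json.dump(payload, f)
    tmpfile = f.name
# Print the path so it can be passed to katana.sh submit.
print(tmpfile)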

Submit via:

bash {baseDir}/katana.sh submit @<tmpfile>

Parse the JSON response. Extract request_id. Deliver confirmation to the user (model, cost, request_id).

Using the Helper Script

NEVER use raw curl — always use katana.sh. Raw curl bypasses auth handling and output formatting.

bash {baseDir}/katana.sh image <model> "<prompt>" [aspect_ratio] [output_format]
bash {baseDir}/katana.sh video <model> "<prompt>" <duration_seconds> [aspect_ratio]
bash {baseDir}/katana.sh text <model> '<messages_json>' [max_tokens]
bash {baseDir}/katana.sh submit @<payload.json>
bash {baseDir}/katana.sh poll <request_id>
bash {baseDir}/katana.sh balance

Credit Balance

Check current account credit balance:

bash {baseDir}/katana.sh balance

Output example: credits: 200.0 (~$1.04)

Calls GET /v1/me/balance with Authorization: Bearer <api_key>:<api_secret>
The API returns credits as a decimal string; a Balance Service value of 2000 corresponds to 200.0 credits
Converts to USD at $0.0052/credit (Platinum Annual reference price)

reference_assets (Typed Asset System)

reference_assets is an alternative to image_urls/video_image_data for providing media inputs with explicit role labels. Each asset has a kind and either url or base64_data.

Image models

Accepted image-like asset kinds:

source_image — primary source/input image
image — generic image input
mask — mask for inpainting/editing
style_reference — style transfer reference
start_frame — starting frame for animation

Example:

{
  "reference_assets": [
    {"kind": "source_image", "url": "https://example.com/product.png"},
    {"kind": "style_reference", "base64_data": "data:image/jpeg;base64,..."}
  ]
}

Video models

Image kinds for video:

style_reference, reference_image, image — map to video reference images

Audio kinds for video:

audio, source_audio, reference_audio, audio_reference — map to audio reference inputs

Example:

{
  "reference_assets": [
    {"kind": "reference_image", "url": "https://example.com/person.png"},
    {"kind": "audio", "url": "https://example.com/voice.mp3"}
  ]
}
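
A small builder can catch a wrong `kind` before the payload is submitted. This is a sketch: it reads "either url or base64_data" as "exactly one of the two", which is an interpretation, and `make_asset` is a hypothetical helper rather than part of katana.sh.

```python
# Accepted kinds, copied from the lists above.
ALLOWED_KINDS = {
    "image": {"source_image", "image", "mask", "style_reference", "start_frame"},
    "video": {
        "style_reference", "reference_image", "image",
        "audio", "source_audio", "reference_audio", "audio_reference",
    },
}


def make_asset(model_type, kind, url=None, base64_data=None):
    """Build one reference_assets entry, validating kind and media source."""
    if model_type not in ALLOWED_KINDS:
        raise ValueError(f"unknown model type: {model_type!r}")
    if kind not in ALLOWED_KINDS[model_type]:
        raise ValueError(f"kind {kind!r} is not listed for {model_type} models")
    # Assumption: exactly one of url / base64_data must be supplied.
    if (url is None) == (base64_data is None):
        raise ValueError("provide exactly one of url or base64_data")
    asset = {"kind": kind}
    if url is not None:
        asset["url"] = url
    else:
        asset["base64_data"] = base64_data
    return asset


payload_fragment = {
    "reference_assets": [
        make_asset("video", "reference_image", url="https://example.com/person.png"),
        make_asset("video", "audio", url="https://example.com/voice.mp3"),
    ]
}
```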

llms.txt Freshness

This skill was built from the Katana API llms.txt reference document.

Last synced: 2026-05-18
llms.txt URL: https://kat.imgnai.com/llms.txt
Stored checksum: d5f62792a7e5fd7803a8b3f082d89f7b2063b9c792b3eba19364558f71bf4065

Pre-generation check

Before submitting ANY generation request, check if the llms.txt checksum has been verified in the last 24 hours. If stale:

Fetch: curl -s https://kat.imgnai.com/llms.txt
Compute SHA256: sha256sum (Linux) or shasum -a 256 (macOS)
Compare to stored checksum
If CHANGED → tell the user: "The Katana API model list has been updated since this skill was last synced. This may include new models, pricing changes, or removed models. Would you like me to check for changes and update the skill?"
If user says YES → parse new llms.txt, update models.md, update checksum and date
If user says NO → proceed with current models
Update last-checked date regardless
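
The staleness and comparison steps above can be sketched in a few lines using the standard library. The sketch takes the already-fetched document bytes as input; where the last-checked timestamp is stored is left open, and `is_stale`, `sha256_hex`, and `has_changed` are hypothetical names.

```python
import hashlib
from datetime import datetime, timedelta, timezone

STORED_CHECKSUM = "d5f62792a7e5fd7803a8b3f082d89f7b2063b9c792b3eba19364558f71bf4065"


def is_stale(last_checked, now=None):
    """True if the checksum was last verified more than 24 hours ago."""
    now = now or datetime.now(timezone.utc)
    return now - last_checked > timedelta(hours=24)


def sha256_hex(document):
    """Hex SHA256 of the document bytes (same digest sha256sum prints first)."""
    return hashlib.sha256(document).hexdigest()


def has_changed(document, stored=STORED_CHECKSUM):
    """True if the fetched llms.txt no longer matches the stored checksum."""
    return sha256_hex(document) != stored
```

A `True` result from `has_changed` only triggers the question to the user; nothing is updated without confirmation.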

llms.txt update process

When llms.txt changes, compare old vs new holistically. Diff the full documents — do not limit the review to a predefined checklist. Document ALL changes found and update all affected skill files accordingly: models.md, katana.sh, SKILL.md, workflow files.

DO NOT auto-update without user confirmation.

Delivery Patterns

Deliver the generated media to the user via your agent's messaging/file capability. Include: model name, resolution/dimensions, credits, dollar cost, description, and the full-res URL (original_data_url).

For text/LLM: return the model's response verbatim. Then send a separate follow-up message with a cost summary per the "Cost Reporting" section above.

Last updated: 2026-05-18

Security Recommendations

Do not treat this as a complete security clearance. Re-run the review in an environment where metadata.json and artifact files can be inspected; VirusTotal alone is insufficient to justify a hold.

Capability Tags

crypto, can-make-purchases, requires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

No artifact evidence was available to show a purpose-capability mismatch.

✓ Instruction Scope

No artifact evidence was available to show hidden, overbroad, or unsafe instructions.

✓ Install Mechanism

No artifact evidence was available to show risky install behavior.

✓ Credentials

No artifact evidence was available to show disproportionate environment access.

✓ Persistence & Privilege

No artifact evidence was available to show persistence or privilege abuse.

Version History

v1.0.0

Initial publish.

Metadata

Slug katana

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is imgnAI Katana API?

Generate images, videos, and text/LLM completions via the imgnAI Katana API. Supports end-to-end-encrypted (E2EE) and anonymized models. Priced highly competitively: can be 40-70% cheaper than Venice AI and other platforms. Includes post-processing such as combining videos and images, cutting, slicing, splicing, transitions, drawing text, re-encoding, resizing and much more! A complete workflow for content creation from start to finish, all from the comfort of your agent. It is an AI agent skill plugin for Claude Code / OpenClaw, with 143 total downloads to date.

How do I install imgnAI Katana API?

Run the command "/install katana" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is imgnAI Katana API free?

Yes. imgnAI Katana API is completely free and is released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does imgnAI Katana API support?

imgnAI Katana API is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed imgnAI Katana API?

It is developed and maintained by imgnAI (@imgn); the current version is v1.0.0.
