/install katana
Katana Skill — imgnAI API
Generate images, videos, and text/LLM completions via the imgnAI Katana API. Supports end-to-end-encrypted (E2EE) and anonymized models. Priced highly competitively: can be 40-70% cheaper than Venice AI and other platforms.
Includes post-processing such as combining videos and images, cutting, slicing, splicing, transitions, drawing text, re-encoding, resizing and much more!
A complete workflow for content creation from start to finish, all from the comfort of your agent.
Triggers
"generate image of X", "create image", "make picture", "imgnai image", "generate video of X", "create video", "make video", "ask grok about X", "ask claude about X", "use gpt to X", "katana image", "katana video", "katana chat", "katana gpt", "katana claude", "list katana models"
LLM-specific triggers (gpt, claude, etc.) also respond to "katana <model>" to avoid conflicts with direct integrations.
Configuration
Model IDs
The Katana API uses model_key as the model identifier, not public_model_name. When building requests, always use the model_key value. See {baseDir}/models.md for the full mapping.
Dual-key system: The API also supports canonical keys (e.g. gpt-image-2) alongside our legacy keys (e.g. gpt2image). Both work identically. This skill uses legacy keys as the default for all workflows and aliases — they remain fully supported. Canonical keys are documented in the "Canonical Key" column of models.md for reference. You may use either format when constructing API requests.
Model Discovery
Endpoint: GET /v1/models
Auth: Authorization: Bearer ${KATANA_API_KEY}:${KATANA_API_SECRET}
Returns available models. Text models are returned for authenticated requests. For the complete model catalogue including image/video, see models.md.
Usage: Generally not needed before requests — use models.md as reference.
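As an illustration of the auth format above, the following minimal sketch builds the `Authorization` header and describes the `GET /v1/models` call without sending it. The function names `build_auth_header` and `models_request` are hypothetical and not part of the skill; real calls should still go through katana.sh.

```python
import os


def build_auth_header(api_key: str, api_secret: str) -> dict:
    """Return the Authorization header in the documented Bearer key:secret form."""
    if not api_key or not api_secret:
        raise ValueError("both KATANA_API_KEY and KATANA_API_SECRET are required")
    return {"Authorization": f"Bearer {api_key}:{api_secret}"}


def models_request(env=os.environ) -> tuple:
    """Describe the model discovery call (method, URL, headers) without sending it.

    Hypothetical helper for illustration only; katana.sh handles real requests.
    """
    headers = build_auth_header(
        env.get("KATANA_API_KEY", ""), env.get("KATANA_API_SECRET", "")
    )
    return ("GET", "https://kat.imgnai.com/v1/models", headers)
```

Keeping the header construction in one place makes it harder to leak the secret into logs or to send a request with only half of the credential pair.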
Payment Methods
The API supports two payment methods:
- API key + secret (Bearer auth) — used by this skill, preferred
- x402 micropayment — NOT used by this skill
Note: x402 text requests must be non-streaming. This skill only uses API-key auth.
- API Base URL: https://kat.imgnai.com
- API Reference: https://kat.imgnai.com/llms.txt
- Credentials: Set KATANA_API_KEY and KATANA_API_SECRET in your secrets file (default: ~/.openclaw/secrets/katana.env, override with KATANA_SECRETS_FILE env var)
- Helper script: {baseDir}/katana.sh (requires bash: Linux, macOS, WSL)
- Model catalogue: {baseDir}/models.md
- Skill directory: Resolve dynamically from this file's location as {baseDir}. Most agent frameworks resolve this automatically.
Setup / API Key
Before first use, check for credentials:
test -f ~/.openclaw/secrets/katana.env && grep -q 'KATANA_API_KEY' ~/.openclaw/secrets/katana.env
If missing, offer two options:
Option A — Automatic: Ask user for key + secret, create ~/.openclaw/secrets/katana.env with chmod 600.
Option B — Manual: Direct user to https://app.imgnai.com/katana-api with platform-specific instructions.
Always lead with Option A, always offer Option B. Never attempt API calls without credentials.
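The credential check and Option A can be sketched in Python as follows. This is a sketch under assumptions: the `KEY=value` file layout is a guess at what katana.sh expects, and the helper names (`secrets_path`, `has_credentials`, `write_credentials`) are hypothetical.

```python
import os
from pathlib import Path

DEFAULT_SECRETS = Path.home() / ".openclaw" / "secrets" / "katana.env"


def secrets_path() -> Path:
    """Honour the KATANA_SECRETS_FILE override, else use the default path."""
    override = os.environ.get("KATANA_SECRETS_FILE")
    return Path(override) if override else DEFAULT_SECRETS


def has_credentials(path: Path) -> bool:
    """Mirror the `test -f ... && grep -q 'KATANA_API_KEY'` shell check."""
    return path.is_file() and "KATANA_API_KEY" in path.read_text()


def write_credentials(path: Path, api_key: str, api_secret: str) -> None:
    """Option A: create the secrets file, readable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create with mode 0o600 so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        # Assumed layout: one KEY=value pair per line.
        f.write(f"KATANA_API_KEY={api_key}\nKATANA_API_SECRET={api_secret}\n")
    os.chmod(path, 0o600)  # also tightens a file that already existed
```

Opening the file with restrictive permissions from the start, rather than writing first and running `chmod` afterwards, avoids a window in which other local users could read the secret.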
Optional Dependencies
These are not required for core API usage but enable additional features:
| Binary | Needed for | Install |
|---|---|---|
| jq | JSON parsing in katana.sh | apt install jq / brew install jq |
| python3 | JSON fallback in katana.sh, payload building | Pre-installed on most systems |
| ffmpeg | Video post-processing (trim, join, effects) | apt install ffmpeg / brew install ffmpeg |
katana.sh auto-detects jq and falls back to python3 for JSON parsing. Post-processing requires ffmpeg.
⚠️ MANDATORY ROUTING — DO NOT SKIP
Before ANY generation or post-processing request, you MUST load the correct workflow file:
| Task | Load this file |
|---|---|
| Image generation | {baseDir}/workflows/image.md |
| Video generation | {baseDir}/workflows/video.md |
| Text/LLM generation | {baseDir}/workflows/text.md |
| Post-processing (ffmpeg, combine, text overlay, etc) | {baseDir}/workflows/post-process.md |
NEVER attempt a generation without loading the workflow file first. NEVER guess parameters — the workflow file has the exact steps.
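The routing table above amounts to a fixed lookup. A minimal sketch, assuming short task labels (`image`, `video`, `text`, `post-process`) that are hypothetical and chosen here only for illustration:

```python
# Task label -> workflow file, relative to the skill directory.
WORKFLOW_FILES = {
    "image": "workflows/image.md",
    "video": "workflows/video.md",
    "text": "workflows/text.md",
    "post-process": "workflows/post-process.md",
}


def workflow_for(task: str, base_dir: str) -> str:
    """Return the workflow file that must be loaded before handling `task`."""
    try:
        return f"{base_dir}/{WORKFLOW_FILES[task]}"
    except KeyError:
        # Unknown task: fail loudly rather than guessing parameters.
        raise ValueError(f"no workflow file for task {task!r}") from None
```

Raising on an unknown task keeps the "never guess" rule explicit: there is no default workflow to fall back to.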
Cost Reporting (ALL Requests)
After every generation (text, image, video), send a separate follow-up message with a cost summary. Include all relevant details from the response:
📊 Katana Summary
Model: gemma-4-26b-a4b (Anonymized)
Request: bf11cf04-8747-480e-a7f7-7d6cb092c614
Tokens: 42 in / 176 out (text only)
Cost: 0.1 credits (~$0.001)
Privacy: Anonymized
Time: ~3s
For image/video, replace tokens with dimensions/duration as relevant. Always compute $ = credits_charged × 0.0052.
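The summary and the dollar conversion can be sketched as below. The function names and parameters are hypothetical; only the 0.0052 rate and the layout of the summary come from this document.

```python
USD_PER_CREDIT = 0.0052  # conversion rate stated above


def credits_to_usd(credits_charged: float) -> float:
    """Dollar cost = credits_charged x 0.0052."""
    return credits_charged * USD_PER_CREDIT


def cost_summary(model, request_id, credits_charged, privacy, seconds,
                 tokens_in=None, tokens_out=None) -> str:
    """Render the follow-up cost message; the token line is for text requests only."""
    lines = ["📊 Katana Summary", f"Model: {model}", f"Request: {request_id}"]
    if tokens_in is not None and tokens_out is not None:
        lines.append(f"Tokens: {tokens_in} in / {tokens_out} out (text only)")
    usd = credits_to_usd(credits_charged)
    lines.append(f"Cost: {credits_charged} credits (~${usd:.3f})")
    lines.append(f"Privacy: {privacy}")
    lines.append(f"Time: ~{seconds}s")
    return "\n".join(lines)
```

For image or video requests, a caller would omit the token arguments and add a dimensions or duration line instead.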
Model Aliases (Quick Reference)
Text/LLM
| User says | API model ID |
|---|---|
| grok | grok-4-3 |
| gpt / gpt-5 | gpt-5-5 |
| claude / claude-opus | claude-opus-4-7 |
| claude-sonnet | claude-sonnet-4-6 |
| claude-haiku | claude-haiku-4-5 |
Image
| User says | API model ID |
|---|---|
| default / imgnai | gen |
| anime | ani |
| gpt-image | gpt2image |
| nano | nanobanana2 |
| flux | flux2pro |
Video
| User says | API model ID |
|---|---|
| default / seedance | seedance2fast |
| seedance-hd | seedance2 |
| ltx | ltx23 |
| kling | kling30 |
| veo | veo3 |
If the user specifies an exact model ID, pass it through directly. See {baseDir}/models.md for the complete model catalogue and alias table.
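Alias resolution with pass-through of exact model IDs might look like the following sketch. The table contents are copied from the quick-reference tables above; the `resolve_model` name and the case-insensitive matching are assumptions.

```python
ALIASES = {
    "text": {
        "grok": "grok-4-3",
        "gpt": "gpt-5-5",
        "gpt-5": "gpt-5-5",
        "claude": "claude-opus-4-7",
        "claude-opus": "claude-opus-4-7",
        "claude-sonnet": "claude-sonnet-4-6",
        "claude-haiku": "claude-haiku-4-5",
    },
    "image": {
        "default": "gen",
        "imgnai": "gen",
        "anime": "ani",
        "gpt-image": "gpt2image",
        "nano": "nanobanana2",
        "flux": "flux2pro",
    },
    "video": {
        "default": "seedance2fast",
        "seedance": "seedance2fast",
        "seedance-hd": "seedance2",
        "ltx": "ltx23",
        "kling": "kling30",
        "veo": "veo3",
    },
}


def resolve_model(kind: str, name: str) -> str:
    """Map a user-facing alias to its API model ID; pass unknown names through."""
    cleaned = name.strip()
    # Assumption: aliases are matched case-insensitively.
    return ALIASES[kind].get(cleaned.lower(), cleaned)
```

For example, `resolve_model("video", "kling")` returns `kling30`, while an exact ID such as `seedance2` is returned unchanged.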
Pre-Submission Confirmation (MANDATORY)
Before submitting ANY generation request, present a summary (model, cost in credits AND dollars, details, prompt) and wait for user confirmation. See each workflow file for details.
NO EXCEPTIONS: There is no urgency override. "just do it", "generate now", /katana, or any other shortcut does NOT skip confirmation. ALWAYS present summary and wait for explicit approval before submitting.
Error Protocol
ONE-ATTEMPT RULE: Every paid API call gets exactly ONE attempt per turn. If the tool result is lost, missing, or empty after a submission — STOP. Report to the user that the result was lost. Wait for user confirmation before retrying. NEVER retry a paid API call silently, even if the result seems to have vanished.
STRICT — NO SILENT RETRIES. Every error stops. Every retry needs approval. Tool-result-loss (result never arrives, empty, or vanishes) is a hard-stop condition equal to a visible error. See each workflow file for details.
- ANY error or tool-result-loss → STOP, report to user (what happened, credits charged, total across attempts)
- Tool-result-loss (result shows 'missing tool result' or similar synthetic error) → the API call likely already succeeded. STOP. Report to user. Do NOT retry the same request.
- Propose fix → wait for explicit user approval
- Banned: automatic retries, debug/test requests, parameter changes without telling user, lying about call counts, silent retries on lost results
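One way to make the one-attempt rule hard to violate is a small guard object that refuses a second paid submission until the user has approved it. This is a sketch only: `PaidCallGuard`, `is_hard_stop`, and the `"error"` key in the result are hypothetical and do not describe how katana.sh or the API reports failures.

```python
def is_hard_stop(result) -> bool:
    """A lost, empty, or error result stops the flow.

    Assumption: errors appear under an "error" key in the parsed result.
    """
    return not result or "error" in result


class PaidCallGuard:
    """Tracks paid submissions within one turn and blocks unapproved retries."""

    def __init__(self):
        self.attempts = 0
        self.credits_charged = 0.0
        self._retry_approved = False

    def approve_retry(self) -> None:
        """Call only after the user has explicitly approved another attempt."""
        self._retry_approved = True

    def before_submit(self) -> None:
        if self.attempts >= 1 and not self._retry_approved:
            raise PermissionError("paid call already attempted; report and wait for approval")
        self._retry_approved = False  # each approval covers exactly one attempt
        self.attempts += 1

    def after_submit(self, result, credits: float = 0.0) -> bool:
        """Record credits and return True when the agent must stop and report."""
        self.credits_charged += credits
        return is_hard_stop(result)
```

The running `attempts` and `credits_charged` totals are what the error report to the user should quote.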
Immediate Status Updates
After submitting async generations (image/video), deliver a confirmation to the user BEFORE starting the poll loop. Include the model, cost, and request_id.
Async Polling
Image and video generations are asynchronous. After submitting, poll manually with:
bash {baseDir}/katana.sh poll <request_id>
Polling pattern: Poll every 30 seconds for the first 5 minutes, then every 60 seconds until status is completed or failed.
Agent responsibility: The agent decides how to schedule polls (intervals, background tasks, etc). Do not use long-running background processes — use single polls at intervals.
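The polling cadence can be expressed as a generator of wait times, leaving the actual scheduling to the agent. A minimal sketch with hypothetical names; it only computes delays and does not sleep or run in the background.

```python
from itertools import islice

TERMINAL_STATUSES = {"completed", "failed"}


def poll_delays(fast_interval=30, fast_window=300, slow_interval=60):
    """Yield the seconds to wait before each poll.

    30 s between polls while less than 5 minutes have elapsed, then 60 s.
    """
    elapsed = 0
    while True:
        delay = fast_interval if elapsed < fast_window else slow_interval
        yield delay
        elapsed += delay


def is_terminal(status: str) -> bool:
    """Polling stops once the status is completed or failed."""
    return status in TERMINAL_STATUSES


# First twelve waits: ten fast polls covering 5 minutes, then slow polls.
first_waits = list(islice(poll_delays(), 12))
```

Each value is the wait before one single `katana.sh poll` call, which fits the rule against long-running background processes.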
Response handling for completed polls:
- Extract original_data_url for delivery (full-resolution)
- Extract dimensions from responses[].output_assets[].width/height (NOT from submission response)
- Extract credits from responses[].metadata.credits_spent
Response Handling
Dimensions (IMPORTANT)
- Submission response (requests[].width/height): PREVIEW dimensions, NOT actual output size.
- Completed poll response (responses[].output_assets[].width/height): ACTUAL output dimensions.
Always report dimensions from the completed poll response, never from the submission acknowledgement.
URL Fields (IMPORTANT)
- original_data_url: full-resolution original. Always use this for delivery.
- url: may be a compressed/reduced version. Do NOT use for delivery.
- thumbnail_image_url: small thumbnail only.
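Putting the dimension and URL rules together, a completed poll response could be reduced to delivery data as in this sketch. It assumes that `original_data_url` sits on each `output_assets` entry, which this document does not state; the function name is hypothetical.

```python
def summarize_completed_poll(poll: dict) -> list:
    """Collect delivery URL, actual dimensions, and credits per output asset."""
    results = []
    for response in poll.get("responses", []):
        credits = response.get("metadata", {}).get("credits_spent")
        for asset in response.get("output_assets", []):
            results.append({
                # Assumption: the full-resolution URL lives on the asset itself.
                "delivery_url": asset.get("original_data_url"),
                # Actual output size, not the preview size from the submission.
                "width": asset.get("width"),
                "height": asset.get("height"),
                "credits_spent": credits,
            })
    return results
```

The `url` and `thumbnail_image_url` fields are deliberately never read, so a reduced version cannot be delivered by accident.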
Payload Submission
Always build the JSON payload in a temp file (required for large payloads and to avoid secrets in process listings):
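import json, tempfile

# Build the request payload as a plain dict.
payload = {"requests": [{"type": "video", "model": "seedance2fast", "prompt": "<prompt>", "duration_seconds": 5, "aspect_ratio": "16:9"}]}
# delete=False keeps the file on disk so katana.sh can read it afterwards.
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    json.dump(payload, f)
    tmpfile = f.name
# Print the path for use in the submit command.
print(tmpfile)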
Submit via:
bash {baseDir}/katana.sh submit @<tmpfile>
Parse the JSON response. Extract request_id. Deliver confirmation to the user (model, cost, request_id).
Using the Helper Script
NEVER use raw curl — always use katana.sh. Raw curl bypasses auth handling and output formatting.
bash {baseDir}/katana.sh image <model> "<prompt>" [aspect_ratio] [output_format]
bash {baseDir}/katana.sh video <model> "<prompt>" <duration_seconds> [aspect_ratio]
bash {baseDir}/katana.sh text <model> '<messages_json>' [max_tokens]
bash {baseDir}/katana.sh submit @<payload.json>
bash {baseDir}/katana.sh poll <request_id>
bash {baseDir}/katana.sh balance
Credit Balance
Check current account credit balance:
bash {baseDir}/katana.sh balance
Output example: credits: 200.0 (~$1.04)
- Calls GET /v1/me/balance with Authorization: Bearer <api_key>:<api_secret>
- The API returns credits as a decimal string where Balance Service value of 2000 means 200.0 credits
- Converts to USD at $0.0052/credit (Platinum Annual reference price)
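The conversion behind the example output can be sketched with `decimal`, which avoids float rounding on a decimal string. The `format_balance` name is hypothetical, and the sketch assumes the API string is already in credits (for example "200.0").

```python
from decimal import Decimal, ROUND_HALF_UP

USD_PER_CREDIT = Decimal("0.0052")  # Platinum Annual reference price


def format_balance(credits: str) -> str:
    """Format a credits string from the API as the balance line shown above."""
    amount = Decimal(credits)
    usd = (amount * USD_PER_CREDIT).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"credits: {amount} (~${usd})"


print(format_balance("200.0"))  # credits: 200.0 (~$1.04)
```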
reference_assets (Typed Asset System)
reference_assets is an alternative to image_urls/video_image_data for providing media inputs with explicit role labels. Each asset has a kind and either url or base64_data.
Image models
Accepted image-like asset kinds:
- source_image: primary source/input image
- image: generic image input
- mask: mask for inpainting/editing
- style_reference: style transfer reference
- start_frame: starting frame for animation
Example:
{
  "reference_assets": [
    {"kind": "source_image", "url": "https://example.com/product.png"},
    {"kind": "style_reference", "base64_data": "data:image/jpeg;base64,..."}
  ]
}
Video models
Image kinds for video:
- style_reference, reference_image, image: map to video reference images
Audio kinds for video:
- audio, source_audio, reference_audio, audio_reference: map to audio reference inputs
Example:
{
  "reference_assets": [
    {"kind": "reference_image", "url": "https://example.com/person.png"},
    {"kind": "audio", "url": "https://example.com/voice.mp3"}
  ]
}
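A small builder can catch a mistyped `kind` before a paid request is submitted. This is a sketch under assumptions: the kind sets are taken from the lists above and may not be exhaustive for every model, the "exactly one of url or base64_data" check is an interpretation, and `make_asset` is a hypothetical name.

```python
IMAGE_MODEL_KINDS = {"source_image", "image", "mask", "style_reference", "start_frame"}
VIDEO_IMAGE_KINDS = {"style_reference", "reference_image", "image"}
VIDEO_AUDIO_KINDS = {"audio", "source_audio", "reference_audio", "audio_reference"}


def make_asset(kind: str, *, url=None, base64_data=None, model_type="image") -> dict:
    """Build one reference_assets entry, validating kind and media source."""
    if model_type == "image":
        allowed = IMAGE_MODEL_KINDS
    elif model_type == "video":
        allowed = VIDEO_IMAGE_KINDS | VIDEO_AUDIO_KINDS
    else:
        raise ValueError(f"unknown model_type {model_type!r}")
    if kind not in allowed:
        raise ValueError(f"kind {kind!r} is not listed for {model_type} models")
    # Assumption: an asset carries exactly one of url or base64_data.
    if (url is None) == (base64_data is None):
        raise ValueError("provide exactly one of url or base64_data")
    if url is not None:
        return {"kind": kind, "url": url}
    return {"kind": kind, "base64_data": base64_data}


payload_fragment = {
    "reference_assets": [
        make_asset("reference_image", url="https://example.com/person.png", model_type="video"),
        make_asset("audio", url="https://example.com/voice.mp3", model_type="video"),
    ]
}
```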
llms.txt Freshness
This skill was built from the Katana API llms.txt reference document.
Last synced: 2026-05-18
llms.txt URL: https://kat.imgnai.com/llms.txt
Stored checksum: d5f62792a7e5fd7803a8b3f082d89f7b2063b9c792b3eba19364558f71bf4065
Pre-generation check
Before submitting ANY generation request, check if the llms.txt checksum has been verified in the last 24 hours. If stale:
- Fetch: curl -s https://kat.imgnai.com/llms.txt
- Compute SHA256: sha256sum (Linux) or shasum -a 256 (macOS)
- Compare to stored checksum
- If CHANGED → tell the user: "The Katana API model list has been updated since this skill was last synced. This may include new models, pricing changes, or removed models. Would you like me to check for changes and update the skill?"
- If user says YES → parse new llms.txt, update models.md, update checksum and date
- If user says NO → proceed with current models
- Update last-checked date regardless
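The staleness and checksum comparison can be sketched with the standard library. The function names are hypothetical, and the document bytes are assumed to have been fetched already (the fetch itself is not shown).

```python
import hashlib
from datetime import datetime, timedelta, timezone

STORED_CHECKSUM = "d5f62792a7e5fd7803a8b3f082d89f7b2063b9c792b3eba19364558f71bf4065"


def is_stale(last_checked: datetime, now: datetime = None,
             max_age: timedelta = timedelta(hours=24)) -> bool:
    """True when the last verification is more than 24 hours old."""
    now = now or datetime.now(timezone.utc)
    return now - last_checked > max_age


def has_changed(document: bytes, stored_checksum: str = STORED_CHECKSUM) -> bool:
    """Compare the SHA256 of the fetched llms.txt bytes to the stored checksum."""
    return hashlib.sha256(document).hexdigest() != stored_checksum
```

Hashing the raw bytes, not decoded text, keeps the result identical to what `sha256sum` or `shasum -a 256` prints for the same download.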
llms.txt update process
When llms.txt changes, compare old vs new holistically. Diff the full documents — do not limit the review to a predefined checklist. Document ALL changes found and update all affected skill files accordingly: models.md, katana.sh, SKILL.md, workflow files.
DO NOT auto-update without user confirmation.
Delivery Patterns
Deliver the generated media to the user via your agent's messaging/file capability. Include: model name, resolution/dimensions, credits, dollar cost, description, and the full-res URL (original_data_url).
For text/LLM: return the model's response verbatim. Then send a separate follow-up message with a cost summary per the "Cost Reporting" section above.
Last updated: 2026-05-18
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install katana
- After installation, invoke the skill by name or use /katana
- Provide required inputs per the skill's parameter spec and get structured output
What is imgnAI Katana API?
imgnAI Katana API is an AI Agent Skill for Claude Code / OpenClaw that generates images, videos, and text/LLM completions via the imgnAI Katana API, with post-processing for combining, cutting, and re-encoding media. It has 143 downloads so far.
How do I install imgnAI Katana API?
Run "/install katana" in the OpenClaw or Claude Code chat to install it in one step. Before first use, the skill needs a Katana API key and secret (see "Setup / API Key").
Is imgnAI Katana API free?
The skill itself is free and licensed under MIT-0: you can download, install, and use it at no cost. Generation requests sent through the Katana API are billed in credits.
Which platforms does imgnAI Katana API support?
imgnAI Katana API runs anywhere OpenClaw / Claude Code is available. The katana.sh helper script requires bash (Linux, macOS, or WSL).
Who created imgnAI Katana API?
It is built and maintained by imgnAI (@imgn); the current version is v1.0.0.