Media Gen

by bibaofeng · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
48 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install media-gen-aisa-api
Description
Generate images and videos with AIsa. Supports four image models (Google Gemini 3 Pro Image, Alibaba Wan 2.7 image + image-pro, ByteDance Seedream) and four...
README (SKILL.md)

Media Gen 🎬

Generate images and videos with a single AIsa API key.

This skill covers the AIsa media-generation routes exposed across three image endpoints and one async video endpoint. The bundled client in scripts/media_gen_client.py picks the correct request shape for each supported model, including the schema differences between Wan video variants.

Use when

  • You want one neutral skill for AIsa image and video generation
  • You need to switch between Gemini image, Wan image, Seedream, and Wan video models without rewriting requests
  • You want a simple CLI for creating images, submitting async video jobs, polling task status, and downloading finished video output

Compatibility

Works with any agentskills.io-compatible harness, including:

  • Claude Code and Claude
  • OpenAI Codex
  • Cursor
  • Gemini CLI
  • OpenCode, Goose, OpenClaw, Hermes
  • and other tools that implement the Agent Skills specification

Requires Python 3, a POSIX shell, and AISA_API_KEY from aisa.one.

What you can do

Image — Gemini (base64 inline)

"Generate a cyberpunk-style city nightscape, neon lights, rainy night, cinematic feel"

Image — Wan 2.7 (URL in chat response)

"Generate an ultra-detailed product shot of a red panda, studio lighting, sharp focus"

Image — Seedream (OpenAI-compatible, large format)

"Generate a 2048×2048 magazine cover: neo-noir detective portrait, film grain"

Video — text-to-video (Wan t2v)

"Sweeping establishing shot of a neon cyberpunk skyline at dusk, 5 seconds"

Video — image-to-video (Wan i2v)

"Starting from this reference image, gentle camera push-in with parallax"

Supported models

Image generation — 4 models, 3 endpoints

| Model | Developer | Endpoint | Notes |
| --- | --- | --- | --- |
| gemini-3-pro-image-preview | Google | POST /v1/models/{model}:generateContent | Images returned as base64 in candidates[].parts[].inline_data |
| wan2.7-image | Alibaba | POST /v1/chat/completions | Images returned as URL parts in choices[].message.content[] (type=image) |
| wan2.7-image-pro | Alibaba | POST /v1/chat/completions | Higher fidelity |
| seedream-4-5-251128 | ByteDance | POST /v1/images/generations | OpenAI-compatible; minimum 3,686,400 pixels |
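
The model-to-endpoint mapping listed for the four image models can be sketched as a plain lookup. This is a minimal illustration and not the bundled client's actual code; the names `IMAGE_ROUTES` and `image_endpoint` are hypothetical.

```python
# Hypothetical sketch of model-to-endpoint routing; the bundled client may differ.
IMAGE_ROUTES = {
    "gemini-3-pro-image-preview": "/v1/models/{model}:generateContent",
    "wan2.7-image": "/v1/chat/completions",
    "wan2.7-image-pro": "/v1/chat/completions",
    "seedream-4-5-251128": "/v1/images/generations",
}


def image_endpoint(model):
    """Return the endpoint path for a supported image model."""
    try:
        template = IMAGE_ROUTES[model]
    except KeyError:
        raise ValueError(f"unsupported image model: {model}") from None
    # Only the Gemini route embeds the model name in the path;
    # format() leaves the other templates unchanged.
    return template.format(model=model)
```

Keeping the mapping in one dictionary means that adding a model is a one-line change, and an unknown model fails before any request is sent.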

Video generation — 4 Wan variants, 1 endpoint

| Model | Kind | Image field | Output SR |
| --- | --- | --- | --- |
| wan2.6-t2v | text-to-video | none | 1080 |
| wan2.6-i2v | image-to-video | input.img_url (string) | 720 |
| wan2.7-t2v | text-to-video | none | 720 |
| wan2.7-i2v | image-to-video | input.media (array) | 720 |

Important: wan2.7-i2v expects the reference image in input.media as an array of URLs, not input.img_url like wan2.6-i2v. The bundled client handles this automatically when you pass --img-url.
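
The schema difference between the two i2v variants can be sketched as follows. This is an illustrative helper under stated assumptions, not the bundled client's code: the function name `build_video_input` is hypothetical, and rejecting an image on a t2v model (rather than silently dropping it) is a design choice made for this sketch.

```python
# Hypothetical helper; field names follow the README, everything else is illustrative.
def build_video_input(model, prompt, img_url=None):
    """Build the `input` object for a Wan video request."""
    payload = {"prompt": prompt}
    if model.endswith("-i2v"):
        if not img_url:
            raise ValueError(f"{model} requires a reference image URL")
        if model == "wan2.7-i2v":
            payload["media"] = [img_url]   # wan2.7-i2v: array of URLs
        else:
            payload["img_url"] = img_url   # wan2.6-i2v: single string
    elif img_url:
        # Assumption for this sketch: fail loudly instead of ignoring the image.
        raise ValueError(f"{model} is text-to-video and takes no image")
    return payload
```

For example, `build_video_input("wan2.7-i2v", "gentle zoom", "https://example.com/reference.jpg")` places the URL under `media` as a one-element list, while the same call with `wan2.6-i2v` places it under `img_url` as a string.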

Quick start

export AISA_API_KEY="your-key"

# Any image model — the client routes to the right endpoint
python3 scripts/media_gen_client.py image \
  --model gemini-3-pro-image-preview \
  --prompt "A cute red panda, cinematic lighting" \
  --out out.png

python3 scripts/media_gen_client.py image \
  --model wan2.7-image-pro \
  --prompt "Ultra-detailed product shot of a red panda" \
  --out out.png

python3 scripts/media_gen_client.py image \
  --model seedream-4-5-251128 \
  --prompt "Neo-noir detective portrait, film grain" \
  --size 2048x2048 \
  --out out.png

# Video — text-to-video
python3 scripts/media_gen_client.py video-create \
  --model wan2.7-t2v \
  --prompt "Sweeping shot of a neon cyberpunk skyline"

# Video — image-to-video on wan2.7-i2v
python3 scripts/media_gen_client.py video-create \
  --model wan2.7-i2v \
  --prompt "gentle zoom with parallax" \
  --img-url "https://example.com/reference.jpg" \
  --duration 5

# Wait and download
python3 scripts/media_gen_client.py video-wait \
  --task-id <task_id> --download --out out.mp4

Image generation — endpoint reference

Gemini family → POST /v1/models/{model}:generateContent

Documentation: Google Gemini Chat.

curl -X POST "https://api.aisa.one/v1/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents":[
      {"role":"user","parts":[{"text":"A cute red panda, cinematic lighting"}]}
    ]
  }'

Response contains candidates[].parts[].inline_data with {mime_type, data}, where data is a base64 PNG.
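
Decoding the inline image could look roughly like this. The sketch follows the candidates[].parts[].inline_data path exactly as described here; if the live response nests parts differently, the traversal would need adjusting. The function name `extract_inline_images` is hypothetical.

```python
import base64


def extract_inline_images(response):
    """Collect decoded image bytes from candidates[].parts[].inline_data."""
    images = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("parts", []):
            inline = part.get("inline_data")
            # Text parts carry no inline_data and are skipped.
            if inline and "data" in inline:
                images.append(base64.b64decode(inline["data"]))
    return images
```

Each returned item is raw image bytes, so saving the first image is a matter of opening `out.png` in binary mode and writing `images[0]`.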

Wan 2.7 family → POST /v1/chat/completions

Documentation: Image Generation via Chat.

Critical rule: messages[].content must be an array of typed parts. A plain string returns HTTP 400 invalid_parameter_error.

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan2.7-image",
    "messages": [
      {"role":"user","content":[
        {"type":"text","text":"A cute red panda, ultra-detailed, cinematic lighting"}
      ]}
    ],
    "n": 1
  }'

Images come back as {type: "image", image: "<url>"} parts inside choices[].message.content[].
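
A minimal sketch of both halves of this exchange: building a request body whose content is an array of typed parts, and pulling image URLs out of the response. The function names are hypothetical, and the response handling assumes only the shape described in this section.

```python
def wan_image_request(model, prompt, n=1):
    """Build a chat-completions body with content as an array of typed parts."""
    return {
        "model": model,
        "messages": [
            # A plain string here would be rejected; content must be a list.
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
        "n": n,
    }


def wan_image_urls(response):
    """Return the URLs of all image parts in choices[].message.content[]."""
    urls = []
    for choice in response.get("choices", []):
        content = choice.get("message", {}).get("content", [])
        if not isinstance(content, list):
            continue  # defensive: ignore non-list content
        for part in content:
            if isinstance(part, dict) and part.get("type") == "image" and part.get("image"):
                urls.append(part["image"])
    return urls
```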

Seedream → POST /v1/images/generations

Documentation: OpenAI-Compatible Image Generations.

curl -X POST "https://api.aisa.one/v1/images/generations" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedream-4-5-251128",
    "prompt": "A cute red panda, ultra-detailed, cinematic lighting",
    "n": 1,
    "size": "2048x2048"
  }'

Response: data[].url or data[].b64_json. Upstream enforces a minimum of 3,686,400 pixels. 1024×1024 and 1536×1536 are rejected. Any aspect ratio works as long as width × height ≥ 3,686,400.
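
The pixel floor can be checked locally before sending a request. This is a small sketch, not part of the bundled client; `check_seedream_size` is a hypothetical name, and the only rule it encodes is the width × height minimum stated above.

```python
SEEDREAM_MIN_PIXELS = 3_686_400  # minimum stated above (for example 1920 * 1920)


def check_seedream_size(size):
    """Parse 'WIDTHxHEIGHT' and return the pixel count, or raise if too small."""
    width_text, _, height_text = size.lower().partition("x")
    width, height = int(width_text), int(height_text)  # ValueError if malformed
    pixels = width * height
    if pixels < SEEDREAM_MIN_PIXELS:
        raise ValueError(
            f"{size} has {pixels:,} pixels; need at least {SEEDREAM_MIN_PIXELS:,}"
        )
    return pixels
```

With this check, `2048x2048` (4,194,304 pixels) and `2560x1440` (exactly 3,686,400 pixels) pass, while `1536x1536` (2,359,296 pixels) raises before any API call is made.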


Video generation — endpoint reference

Create task → POST /apis/v1/services/aigc/video-generation/video-synthesis

Documentation: Create video generation task. Header X-DashScope-Async: enable is required.

# wan2.6-t2v — text-to-video
curl -X POST "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-Async: enable" \
  -d '{
    "model":"wan2.6-t2v",
    "input":{"prompt":"cinematic close-up, slow push-in"},
    "parameters":{"resolution":"720P","duration":5}
  }'

# wan2.7-i2v — image-to-video (input.media, not input.img_url)
curl -X POST "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-Async: enable" \
  -d '{
    "model":"wan2.7-i2v",
    "input":{
      "prompt":"gentle zoom with parallax",
      "media":["https://example.com/reference.jpg"]
    },
    "parameters":{"resolution":"720P","duration":5}
  }'
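
The same create-task call can be assembled in Python with only the standard library. This sketch builds the request object without sending it; the function name `build_video_create_request` is hypothetical, and the defaults simply mirror the curl examples above.

```python
import json
import urllib.request

VIDEO_CREATE_URL = (
    "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis"
)


def build_video_create_request(api_key, model, input_obj,
                               resolution="720P", duration=5):
    """Return an unsent POST request for the async video endpoint."""
    body = {
        "model": model,
        "input": input_obj,
        "parameters": {"resolution": resolution, "duration": duration},
    }
    return urllib.request.Request(
        VIDEO_CREATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-DashScope-Async": "enable",  # required header for this endpoint
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the job. The response schema is not described in this README, so the sketch stops before parsing it.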

Poll task → GET /apis/v1/services/aigc/tasks/{task_id}

Documentation: Get video generation task result.

task_id is a path parameter. The query-string form ?task_id=... returns HTTP 500 unsupported uri.

curl "https://api.aisa.one/apis/v1/services/aigc/tasks/YOUR_TASK_ID" \
  -H "Authorization: Bearer $AISA_API_KEY"
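
A polling loop around this endpoint might be sketched as below. The task-status fields are not documented in this README, so the sketch takes the fetch function and the completion test as parameters instead of guessing them; `task_url` and `wait_for_task` are hypothetical names, and the 10 second interval and 600 second timeout are illustrative defaults.

```python
import time
from urllib.parse import quote

TASKS_BASE = "https://api.aisa.one/apis/v1/services/aigc/tasks"


def task_url(task_id):
    """Build the status URL with task_id in the path, not the query string."""
    return f"{TASKS_BASE}/{quote(task_id, safe='')}"


def wait_for_task(fetch, is_finished, poll=10.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until is_finished(result) is true or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_finished(result):
            return result
        # Stop if another full poll interval would overrun the deadline.
        if clock() + poll > deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(poll)
```

Here `fetch` would perform the GET against `task_url(task_id)` with the Authorization header, and `is_finished` would inspect whichever status field the live response actually carries.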

Python client

The bundled client at scripts/media_gen_client.py auto-routes each image model to the correct endpoint and normalizes the response to a saved file.

# Image — model selects the endpoint
python3 scripts/media_gen_client.py image \
  --model <gemini-3-pro-image-preview | wan2.7-image | wan2.7-image-pro | seedream-4-5-251128> \
  --prompt "..." \
  --out out.png

# Video — create task
python3 scripts/media_gen_client.py video-create \
  --model <wan2.6-t2v | wan2.6-i2v | wan2.7-t2v | wan2.7-i2v> \
  --prompt "..." \
  [--img-url https://... (required for -i2v models)] \
  [--duration 5|10] \
  [--resolution 720P|1080P]

# Video — poll / wait / download
python3 scripts/media_gen_client.py video-status --task-id <id>
python3 scripts/media_gen_client.py video-wait --task-id <id> --poll 10 --timeout 600
python3 scripts/media_gen_client.py video-wait --task-id <id> --download --out out.mp4

API reference

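This skill calls the following AIsa endpoints directly:

  • POST /v1/models/{model}:generateContent
  • POST /v1/chat/completions
  • POST /v1/images/generations
  • POST /apis/v1/services/aigc/video-generation/video-synthesis
  • GET /apis/v1/services/aigc/tasks/{task_id}

See the full AIsa API Reference for the complete catalog.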

License

MIT — see LICENSE at the repo root.

Usage Guidance
Install only if you are comfortable sending media-generation prompts and any reference image URLs to AIsa and using an AISA_API_KEY. Store the key securely, monitor usage/quota, and review the full bundled Python script if deploying in a sensitive environment.
Capability Analysis
Type: OpenClaw Skill
Name: media-gen-aisa-api
Version: 1.0.0

The media-gen skill is a legitimate tool for interacting with the AIsa API (aisa.one) to generate images and videos. The bundled Python script (scripts/media_gen_client.py) uses standard libraries to perform API requests and handle file downloads, with no evidence of data exfiltration, unauthorized execution, or persistence mechanisms. The SKILL.md instructions are well-documented and align strictly with the stated purpose of providing a unified interface for various AI models like Gemini, Wan, and Seedream.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The stated purpose is image/video generation through AIsa, and the visible client code routes supported models to AIsa image and video endpoints.
Instruction Scope
The instructions are user-directed CLI/API examples for generating, polling, and downloading media; these can incur API usage and create output files, but they are aligned with the skill purpose.
Install Mechanism
There is no install script or package-install step; it runs a bundled Python script with python3. The provided SKILL.md and script text are marked truncated, so confidence is limited to the visible artifacts plus the clean static scan.
Credentials
The skill requires AISA_API_KEY, which is expected for AIsa API access but is a sensitive credential that should be stored securely.
Persistence & Privilege
No background persistence or privilege escalation is shown. The client can write/download generated media to user-specified output paths, which is expected for this type of skill.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install media-gen-aisa-api
  3. After installation, invoke the skill by name or use /media-gen-aisa-api
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of the media-gen skill: a unified client for AIsa image and video generation.

  • Supports four image models (Gemini, Wan 2.7, Wan 2.7 Pro, Seedream) and four Wan video variants (2.6/2.7 × t2v/i2v) with a single API key.
  • Automatically routes each model to the correct endpoint and adapts request schemas.
  • CLI lets you create images, submit async video jobs, poll for status, and download video output.
  • Compatible with all agentskills.io harnesses (Claude Code, OpenClaw, Hermes, etc.).
  • Requires Python 3 and `AISA_API_KEY`.
Metadata
Slug media-gen-aisa-api
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Media Gen?

Generate images and videos with AIsa. Supports four image models (Google Gemini 3 Pro Image, Alibaba Wan 2.7 image + image-pro, ByteDance Seedream) and four... It is an AI Agent Skill for Claude Code / OpenClaw, with 48 downloads so far.

How do I install Media Gen?

Run "/install media-gen-aisa-api" in the OpenClaw or Claude Code chat to install it in one step. To run the skill you still need Python 3 and an AISA_API_KEY from aisa.one.

Is Media Gen free?

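Yes, the skill itself is free and licensed under MIT-0, so you can download, install and use it at no cost. Generation requests go through the AIsa API with your own AISA_API_KEY, so monitor usage and quota on that account.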

Which platforms does Media Gen support?

Media Gen is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Media Gen?

It is built and maintained by bibaofeng (@bibaofeng); the current version is v1.0.0.
