Description

Generate images and videos using Vivago AI (智小象) platform. Supports text-to-image, image-to-image, image-to-video, and keyframe-to-video generation. Use when...

README (SKILL.md)

Vivago AI Skill

Name: hidream-model-gen
Author: zhy2015

Integration with Vivago AI (智小象) platform for AI-powered image and video generation.

Supported Features

Image Generation

Text to Image (txt2img): Generate images from text descriptions
Image to Image (img2img): Transform existing images based on prompts, including style transfer, image editing, and multi-image fusion

Video Generation

Text to Video (txt2vid): Generate videos from text descriptions
Image to Video (img2vid): Generate videos from static images
Keyframe to Video (keyframe_to_video): Generate transition videos from start and end keyframes
Video Templates (template_to_video): 181 pre-defined video effects
Supports multiple model versions (v3Pro, v3L, kling-video-o1)

Additional Features

Image upload to Vivago storage
Batch generation (up to 4 images)
Multiple aspect ratios (1:1, 4:3, 3:4, 16:9, 9:16)
Automatic retry with polling

Architecture

Core Modules

scripts/
├── vivago_client.py       # Main API client
├── template_manager.py    # Template management
├── config_loader.py       # Configuration loading
├── enums.py              # Type enums (TaskStatus, AspectRatio, etc.)
├── exceptions.py         # Structured exceptions
└── config/               # Modular configuration files

Code Quality

Type Safety: Complete type annotations and enums
Exception Handling: Structured exception hierarchy
CI/CD: GitHub Actions for automated testing
Modular Config: Split configuration files for maintainability

Setup

Prerequisites

Before using this skill, you need to obtain a Vivago.ai API Token:

Step 1: Log in to Vivago.ai

Visit https://vivago.ai/ and log in to your account
Check your remaining credits and consider subscribing to a suitable plan if needed

Step 2: Obtain Your Token

After logging in, visit https://vivago.ai/prod-api/user/token
The page will return your API Token (in JWT format)
Copy this Token for configuration

Security Note: The Token is your credential for accessing the API. Please keep it secure and do not share it with others.

Environment Variables

Security Note: For secure deployments and AI Agents, the system requires the token to be passed strictly via the HIDREAM_AUTHORIZATION environment variable.

Export it securely in your current session:

export HIDREAM_AUTHORIZATION="your_vivago_api_token"

Note: STORAGE_AK and STORAGE_SK are deprecated and removed. The image upload uses secure pre-signed URLs provided by the Vivago API.
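
As a rough illustration of how a script might pick up the token, the sketch below reads HIDREAM_AUTHORIZATION and builds a request header. The helper name build_auth_headers and the exact "Bearer <token>" header layout are assumptions made for this example; the bundled client may assemble its headers differently.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "HIDREAM_AUTHORIZATION"


def build_auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return an Authorization header built from the environment token.

    Hypothetical helper: assumes the token is sent as "Bearer <token>".
    """
    env = os.environ if env is None else env
    token = env.get(ENV_VAR, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty header.
        raise RuntimeError(f"{ENV_VAR} is not set; export it before running.")
    return {"Authorization": f"Bearer {token}"}
```

Passing a mapping explicitly (instead of relying on os.environ) keeps the helper easy to test without touching the real environment.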

File Output Configuration

Important: By default, all generated resources (JSON results, downloaded images, and videos) will be output to the assets/ directory within the current working folder. Ensure this directory exists or the system has permission to create it.
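
The directory requirement above can be handled in a few lines. This is a minimal sketch, not the skill's own code: save_results is a hypothetical helper, and the JSON layout of the results is whatever the caller passes in.

```python
import json
from pathlib import Path


def save_results(results, filename="results.json", out_dir="assets"):
    """Write results as JSON under out_dir, creating the directory if needed.

    Hypothetical helper for illustration only.
    """
    out_path = Path(out_dir)
    # parents=True / exist_ok=True make the call safe to repeat.
    out_path.mkdir(parents=True, exist_ok=True)
    target = out_path / filename
    target.write_text(
        json.dumps(results, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return target
```

Creating the directory up front gives a clear permission error at the start of a run rather than after a long generation task has finished.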

Installation

pip install -r requirements.txt

Usage

Python API

from scripts import create_client, VivagoClient
from scripts.enums import AspectRatio, PortName, TaskStatus
from scripts.exceptions import TaskFailedError, TaskTimeoutError

# Create client
client = create_client()

# Text to image
results = client.text_to_image(
    prompt="a beautiful sunset over mountains",
    port=PortName.KLING_IMAGE,  # or PortName.NANO_BANANA
    wh_ratio=AspectRatio.RATIO_16_9,
    batch_size=2
)

# Image to video (using local image)
results = client.image_to_video(
    prompt="camera slowly zooming out",
    image_uuid=client.upload_image("/path/to/image.jpg"),
    port=PortName.V3PRO,
    wh_ratio=AspectRatio.RATIO_16_9,
    duration=5
)

# Keyframe to video (using start and end images)
results = client.keyframe_to_video(
    prompt="smooth transition from start to end",
    start_image_uuid=client.upload_image("/path/to/start.jpg"),
    end_image_uuid=client.upload_image("/path/to/end.jpg"),
    port=PortName.V3PRO,
    wh_ratio=AspectRatio.RATIO_16_9,
    duration=5
)

# Video Templates - use pre-defined effects
results = client.template_to_video(
    image_uuid=client.upload_image("/path/to/image.jpg"),
    template="ghibli",  # See available templates below
    wh_ratio=AspectRatio.RATIO_9_16
)

Error Handling

from scripts.exceptions import (
    TaskFailedError,
    TaskRejectedError,
    TaskTimeoutError,
    InvalidPortError
)

try:
    results = client.image_to_video(...)
except TaskFailedError as e:
    print(f"Task failed: {e.task_id}")
except TaskRejectedError as e:
    print(f"Content rejected: {e.reason}")
except TaskTimeoutError as e:
    print(f"Timeout after {e.timeout_seconds}s")
except InvalidPortError as e:
    print(f"Invalid port: {e.port}, available: {e.available}")

Command Line (Best for AI Agents)

For AI Agents: The easiest way to use this skill is through the provided CLI scripts. They automatically handle API communication, polling, and result parsing. By default, they use HiDream's native models.

Text to Image:

python3 scripts/txt2img.py \
  --prompt "a futuristic city" \
  --wh-ratio 16:9 \
  --batch-size 2 \
  --output ./assets/results.json

Note: This defaults to the hidream-txt2img model.

Text to Video:

python3 scripts/txt2vid.py \
  --prompt "a cybernetic dragon flying over a futuristic city" \
  --wh-ratio 16:9 \
  --duration 5 \
  --output ./assets/video_results.json

Note: This defaults to the v3Pro model.

Image to Video:

python3 scripts/img2video.py \
  --prompt "slow motion falling leaves" \
  --image ./assets/source_image.jpg \
  --duration 5 \
  --output ./assets/video.json
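
For agents that drive the CLI from Python, the invocation above could be wrapped roughly as follows. The flag names are taken from the text-to-image example; the helper names and the assumption that the --output file contains JSON are illustrative, not confirmed behaviour of the scripts.

```python
import json
import subprocess
from pathlib import Path


def build_txt2img_command(prompt, wh_ratio="16:9", batch_size=1,
                          output="./assets/results.json"):
    """Build the argv list for scripts/txt2img.py (hypothetical wrapper)."""
    return [
        "python3", "scripts/txt2img.py",
        "--prompt", prompt,
        "--wh-ratio", wh_ratio,
        "--batch-size", str(batch_size),
        "--output", output,
    ]


def run_and_load(command):
    """Run the command and load the file passed to --output.

    Assumes the script writes JSON to that path.
    """
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    output = command[command.index("--output") + 1]
    return json.loads(Path(output).read_text(encoding="utf-8"))
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.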

API Reference

Enums

from scripts.enums import (
    TaskStatus,      # PENDING, COMPLETED, PROCESSING, FAILED, REJECTED
    AspectRatio,     # RATIO_1_1, RATIO_4_3, RATIO_16_9, etc.
    PortCategory,    # TEXT_TO_IMAGE, IMAGE_TO_VIDEO, etc.
    PortName         # KLING_IMAGE, V3PRO, NANO_BANANA, etc.
)

Models

Feature	Available Versions	Default
Text to Image	v3L (HiDream), kling-image-o1	v3L (via port `hidream-txt2img`)
Image to Video	v3Pro, v3L, kling-video-o1	v3Pro
Keyframe to Video	v3Pro, v3L	v3Pro

Note for AI Agents: By default, all CLI tools (txt2img.py, txt2vid.py) are pre-configured to use HiDream's native models (hidream-txt2img for images, v3Pro for videos). You don't need to specify the model unless explicitly requested by the user.

Aspect Ratios

1:1 - Square
4:3 - Standard
3:4 - Portrait
16:9 - Widescreen
9:16 - Mobile/Vertical
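
A caller can check these values before submitting a request. The sketch below validates the ratio against the five listed above and the image batch size against the documented 1-4 range; validate_image_request is a hypothetical name, and the real client performs its own validation.

```python
SUPPORTED_RATIOS = ("1:1", "4:3", "3:4", "16:9", "9:16")
MAX_IMAGE_BATCH = 4  # documented upper bound for image batches


def validate_image_request(wh_ratio: str, batch_size: int) -> None:
    """Raise ValueError if the ratio or batch size is outside the documented set.

    Hypothetical pre-flight check, independent of the bundled client.
    """
    if wh_ratio not in SUPPORTED_RATIOS:
        raise ValueError(
            f"Unsupported aspect ratio {wh_ratio!r}; expected one of {SUPPORTED_RATIOS}"
        )
    if not 1 <= batch_size <= MAX_IMAGE_BATCH:
        raise ValueError(f"batch_size must be between 1 and {MAX_IMAGE_BATCH}")
```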

Task Status Codes

from scripts.enums import TaskStatus

TaskStatus.PENDING     # 0 - Pending
TaskStatus.COMPLETED   # 1 - Completed
TaskStatus.PROCESSING  # 2 - Processing
TaskStatus.FAILED      # 3 - Failed
TaskStatus.REJECTED    # 4 - Rejected (content review)
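
To make the numeric codes concrete, here is a stand-in enum that mirrors the values listed above. It is not the contents of scripts/enums.py; the is_terminal property in particular is an assumption added for illustration.

```python
from enum import IntEnum


class TaskStatus(IntEnum):
    """Stand-in mirroring the documented status codes."""

    PENDING = 0
    COMPLETED = 1
    PROCESSING = 2
    FAILED = 3
    REJECTED = 4

    @property
    def is_terminal(self) -> bool:
        # Hypothetical helper: a task in one of these states will not change again.
        return self in (TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.REJECTED)
```

Because it is an IntEnum, a raw code from an API response can be converted with TaskStatus(code) and compared directly.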

File Structure

vivago-ai-skill/
├── scripts/
│   ├── __init__.py         # Package exports
│   ├── vivago_client.py    # Core API client
│   ├── template_manager.py # Template management
│   ├── config_loader.py    # Configuration loader
│   ├── enums.py            # Type enums
│   ├── exceptions.py       # Exception classes
│   ├── logging_config.py   # Logging configuration
│   └── config/             # Modular config files
│       ├── base.json
│       ├── text_to_image.json
│       ├── image_to_video.json
│       └── ...
├── tests/
│   ├── conftest.py         # Pytest configuration
│   ├── archive/            # Archived tests
│   └── ...
├── docs/                   # Documentation
├── .github/workflows/      # CI configuration
├── requirements.txt
├── README.md
└── SKILL.md               # This file

Important Notes

Feishu Channel Messaging Guidelines

When sending generated content through Feishu (飞书) channel:

Content Type	Send Method	Example
Images	✅ Direct file upload	Attach image file directly
Videos	❌ Must send as link	`https://media.vivago.ai/{video_uuid}.mp4`

⚠️ Critical: Videos CANNOT be sent as file attachments in Feishu. Always construct and send the direct media URL:

https://media.vivago.ai/b1268f08-ac32-4b83-863f-a419797d768e.mp4

Why: Feishu does not support playable video attachments. Sending video files directly will result in delivery failure or unplayable content.
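
Constructing the link can be sketched as below. The helper name build_video_url is hypothetical, and the .mp4 suffix is an assumption based on the example URL above.

```python
MEDIA_BASE_URL = "https://media.vivago.ai"


def build_video_url(video_uuid: str, extension: str = "mp4") -> str:
    """Return the direct media URL for a generated video.

    Hypothetical helper; assumes the file is served as <uuid>.<extension>.
    """
    name = video_uuid.strip()
    if not name:
        raise ValueError("video_uuid must not be empty")
    # Avoid doubling the suffix if the caller already included it.
    if not name.endswith(f".{extension}"):
        name = f"{name}.{extension}"
    return f"{MEDIA_BASE_URL}/{name}"
```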

Image Download

Images can be downloaded using the following URL format:

https://storage.vivago.ai/image/{image_name}.jpg

Example:

from scripts import create_client
import requests

client = create_client()

# Generate image
results = client.text_to_image(prompt="a cute cat")
image_name = results[0].get('image', '')

# Download image (fail on HTTP errors instead of saving an error page)
image_url = f"https://storage.vivago.ai/image/{image_name}.jpg"
response = requests.get(image_url, timeout=60)
response.raise_for_status()
with open("output.jpg", "wb") as f:
    f.write(response.content)

Sending via Feishu:

# Download and send through Feishu
image_data = requests.get(image_url).content
# Then send image_data as file attachment via Feishu API

Asynchronous Processing

API calls are asynchronous with automatic polling
Images are automatically resized to max 1024px on longest side before upload
Video generation supports 5 or 10 second durations
Batch size for images: 1-4, for videos: 1
All API calls include automatic retry logic
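
The polling behaviour described above can be pictured with a small loop. This is a sketch under assumptions, not the client's implementation: poll_until_done, its default interval and timeout, and the injected sleep and clock parameters are all hypothetical; the terminal codes follow the Task Status Codes section.

```python
import time

# COMPLETED, FAILED, REJECTED per the documented status codes.
TERMINAL_STATUSES = {1, 3, 4}


def poll_until_done(fetch_status, interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal code or time runs out.

    fetch_status is any zero-argument callable returning a status code.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Task still running after {timeout}s")
        sleep(interval)
```

Injecting sleep and clock keeps the loop testable without real waiting.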

Error Handling

The client handles common errors:

Network timeouts (with retry)
Rate limiting (with exponential backoff)
Invalid parameters (validation before API call)
Task failures (structured exceptions)
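
Retry with exponential backoff, as mentioned in the list above, generally looks like the following. The function name, the default delays, the small jitter, and the choice of retryable exception types are assumptions for illustration; the client's actual retry policy is not documented here.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run call(), retrying on the given exceptions with doubling delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 10% jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

The jitter spreads out retries from concurrent callers so they do not all hit a rate-limited endpoint at the same moment.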

Exception Hierarchy

VivagoError (base)
├── VivagoAPIError
├── MissingCredentialError
├── InvalidPortError
├── ImageUploadError
├── TemplateNotFoundError
└── TaskError
    ├── TaskFailedError
    ├── TaskRejectedError
    └── TaskTimeoutError
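
In Python terms, the tree above corresponds to a class hierarchy along these lines. The class names and the attributes (task_id, reason, timeout_seconds, port, available) come from this document; the constructor signatures are assumptions and may not match scripts/exceptions.py.

```python
class VivagoError(Exception):
    """Base class; catching it catches every error in the hierarchy."""


class VivagoAPIError(VivagoError):
    """The API returned an error response."""


class MissingCredentialError(VivagoError):
    """No API token was available."""


class InvalidPortError(VivagoError):
    def __init__(self, port, available=()):
        super().__init__(f"Invalid port: {port}")
        self.port = port
        self.available = list(available)


class ImageUploadError(VivagoError):
    """Uploading an image failed."""


class TemplateNotFoundError(VivagoError):
    """The requested template ID does not exist."""


class TaskError(VivagoError):
    def __init__(self, message, task_id=None):
        super().__init__(message)
        self.task_id = task_id


class TaskFailedError(TaskError):
    """The task ended in the FAILED state."""


class TaskRejectedError(TaskError):
    def __init__(self, message, task_id=None, reason=""):
        super().__init__(message, task_id)
        self.reason = reason


class TaskTimeoutError(TaskError):
    def __init__(self, message, task_id=None, timeout_seconds=0.0):
        super().__init__(message, task_id)
        self.timeout_seconds = timeout_seconds
```

With this layout, `except TaskError` handles all three task outcomes at once, while `except VivagoError` acts as a catch-all for anything raised by the client.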

Video Templates Reference

The following 181 video templates are available via template_to_video():

Quick Categories

Category	Count	Example Templates
Style Transfer	20+	ghibli, 1930s-2000s vintage styles
Harry Potter	4	magic_reveal_ravenclaw, gryffindor, hufflepuff, slytherin
Wings/Fantasy	10+	angel_wings, phoenix_wings, crystal_wings, fire_wings
Superheroes	5+	iron_man, cat_woman, ghost_rider
Dance	10+	apt, dadada, dance, limbo_dance
Effects	15+	ash_out, metallic_liquid, flash_flood
Thanksgiving	10+	turkey_chasing, autumn_feast, gratitude_photo
Comics/Cartoon	8+	gta_star, anime_figure, bring_comics_to_life
Products	8+	glasses_display, music_box, food_product_display
Scenes	20+	romantic_kiss, graduation, starship_chef

Popular Templates

Template ID	Description
`ghibli` / `ghibli2`	Studio Ghibli animation style
`magic_reveal_ravenclaw`	Harry Potter Ravenclaw transformation
`magic_reveal_gryffindor`	Harry Potter Gryffindor transformation
`magic_reveal_hufflepuff`	Harry Potter Hufflepuff transformation
`magic_reveal_slytherin`	Harry Potter Slytherin transformation
`iron_man`	Iron Man armor assembly
`angel_wings` / `phoenix_wings` / `crystal_wings` / `fire_wings`	Wing transformations
`cat_woman`	Cat Woman style
`ghost_rider`	Ghost Rider flaming skull
`joker`	Joker villain style
`mermaid`	Mermaid underwater scene
`snow_white`	Snow White princess
`barbie`	Barbie princess transformation
`me_in_hand`	Miniature figure in hand
`music_box`	Rotating figure on music box
`anime_figure`	Transform into anime figure
`gta_star`	GTA game style transformation
`apt` / `dadada` / `dance`	Dance templates
`ash_out`	Disintegrate into ashes
`eye_of_the_storm`	Thunder god awakening
`metallic_liquid`	Metal mask transformation
`flash_flood`	Water/flood effect
`turkey_chasing` / `turkey_away` / `turkey_giant`	Thanksgiving turkey scenes
`autumn_feast` / `autumn_stroll`	Autumn scenes
`renovation_of_old_photos`	Colorize B&W photos
`graduation`	Graduation ceremony
`glasses` / `glasses_display`	Glasses/eyewear showcase
`bikini` / `sexy_man` / `sexy_pants`	Fashion/beach
`romantic_kiss` / `boyfriends_rose` / `girlfriends_rose`	Romantic scenes
`ai_archaeologist` / `starship_chef` / `cyber_cooker`	Sci-fi characters
`jungle_reign` / `panther_queen` / `roar_of_the_dustlands` / `tiger_snuggle`	Animal companions
`instant_sadness` / `headphone_vibe` / `relax`	Emotion/reaction
`frost_alert`	Cold/freeze effect
`bald_me`	Bald transformation
`boom_hair` / `curl_pop` / `long_hair`	Hair transformations
`muscles`	Muscle transformation
`face_punch` / `gun_point`	Action effects
`static_shot` / `tracking_shot` / `orbit_shot` / `push_in` / `zoom_out` / `handheld_shot`	Camera movements
`earth_zoom_in` / `earth_zoom_out`	Earth zoom effects

View All Templates

from scripts.template_manager import get_template_manager

manager = get_template_manager()
templates = manager.list_templates()

print(f"Total templates: {len(templates)}")
for tid, name in sorted(templates.items()):
    print(f"  {tid}: {name}")

Usage Example

from scripts import create_client

client = create_client()

# Upload image
image_uuid = client.upload_image("/path/to/photo.jpg")

# Apply Ghibli style template
results = client.template_to_video(
    image_uuid=image_uuid,
    template="ghibli",
    wh_ratio="9:16"
)

# Harry Potter transformation
results = client.template_to_video(
    image_uuid=image_uuid,
    template="magic_reveal_ravenclaw",
    wh_ratio="9:16"
)

Changelog

v0.9.0 (2026-03-09)

✅ Code review complete (P0-P3)
✅ Added GitHub Actions CI
✅ Added type safety module (enums.py)
✅ Added structured exceptions (exceptions.py)
✅ Split configuration into modular files
✅ Archived redundant code and tests
✅ Pinned dependency versions

v0.8.2 (2026-03-08)

✅ Template testing: 44 templates, 40 passed (90.9%)
✅ Fixed metallic_liquid naming issue
✅ Marked long_hair as deprecated

v0.8.0 (2026-03-07)

✅ Completed Tier 1-4 testing
✅ Established smart test optimization system

Usage Guidance

This skill appears to do what it says: it uploads images and requests generation from Vivago (vivago.ai) using the HIDREAM_AUTHORIZATION bearer token. Before installing or running it:

1) Only provide a Vivago API token you control and understand (keep it secret).
2) Be aware that any images you pass will be uploaded to Vivago's servers; avoid sending sensitive or private images.
3) Verify you trust the Vivago service and check its terms/privacy (retention and reuse of images).
4) Because the package includes executable Python code, inspect the code if you run it in sensitive environments; consider running it in an isolated environment (container/VM) and review network egress policies.
5) Note minor inconsistencies: the registry lists no install spec even though requirements.txt and source files are included, and some scripts reference an alternate env var (HIDREAM_TOKEN) and deprecated STORAGE_AK/STORAGE_SK. You can ignore those or set HIDREAM_AUTHORIZATION as instructed.
6) If you need higher assurance, ask the publisher for provenance (homepage or repository); the skill's source/homepage is not provided in the metadata.

Capability Analysis

Type: OpenClaw Skill
Name: hidream-model-gen
Version: 1.0.5

The skill bundle is a legitimate and well-structured integration for the Vivago AI platform (vivago.ai). It provides a comprehensive API client (vivago_client.py) and several CLI utilities (txt2img.py, txt2vid.py, img2video.py) for generating AI images and videos. The code uses standard security practices, such as requiring authentication tokens via environment variables (HIDREAM_AUTHORIZATION) and performing image processing locally using the Pillow library. No indicators of data exfiltration, malicious execution, or prompt injection were found; all network activity is directed toward official Vivago AI endpoints.

Capability Assessment

✓ Purpose & Capability

Name/description match the code and manifest: the code implements text->image, img->video, keyframe->video, template handling, and uploads to vivago.ai endpoints. The single required env var (HIDREAM_AUTHORIZATION) is exactly the API bearer token the client uses.

✓ Instruction Scope

SKILL.md and the CLI scripts instruct the agent to call the packaged Python scripts, upload local images to Vivago (via pre-signed URLs), poll for results, and save outputs to an assets/ directory. The runtime instructions and code operate within that scope and do not request or read unrelated system secrets or remote endpoints outside vivago.ai and its storage domains.

ℹ Install Mechanism

No registry install spec is declared (skill is instruction-only in registry), but a requirements.txt and full Python source are included in the package. Installation is the normal pip-based flow (pip install -r requirements.txt). No downloads from arbitrary URLs or extract/install steps were found.

ℹ Credentials

The skill requires one environment variable (HIDREAM_AUTHORIZATION) which is used as a Bearer token. Some scripts reference HIDREAM_TOKEN as a fallback and mention deprecated STORAGE_AK/STORAGE_SK — these are not required for normal operation but appear as backward-compatible fallbacks. No unrelated credentials (e.g., AWS keys, GitHub tokens) are requested.

✓ Persistence & Privilege

The skill does not request always: true and does not modify other skills or system-wide agent settings. It writes generated assets to a local assets/ directory (and /tmp for intermediate files) which is expected behavior for a generator tool.

Version History

v1.0.5

- Changed CLI script instructions to use python3 instead of python.
- No other user-facing changes.

v1.0.4

- No code or documentation changes detected in this release.
- Functionality, dependencies, and documentation remain unchanged from the previous version.

v1.0.3

- Added explicit dependencies on "requests" and "pillow".
- Now requires environment variable HIDREAM_AUTHORIZATION for authentication (no .env fallback).
- Security and setup documentation updated: strictly use environment variables for secrets; removed .env file mention.
- Updated CLI usage examples to use the ./assets directory for output paths.
- Minor documentation clarifications and improved security notes regarding credentials.

v1.0.2

No user-facing changes in this release; documentation and configuration remain the same as the previous version.

v1.0.1

- Removed the unused script file `scripts/verify_fix.py`.
- SKILL.md explicitly lists the required environment variable `HIDREAM_AUTHORIZATION`.
- Updated documentation to clarify that only `HIDREAM_AUTHORIZATION` is needed, and deprecated variables (`STORAGE_AK`, `STORAGE_SK`) are no longer supported.
- Added a security note to ensure `.env` does not contain unrelated or sensitive credentials.

v1.0.0

Initial release of hidream-model-gen: Vivago AI image & video generation skill
- Enables AI-powered text-to-image, image-to-image (incl. style transfer), text-to-video, image-to-video, and keyframe-to-video generation via Vivago AI.
- Supports multiple model versions (v3Pro, v3L, kling-video-o1) and 181 video template effects.
- Handles batch generation, multiple aspect ratios, and secure image upload with pre-signed URLs.
- Provides Python API, CLI scripts, and robust error handling.
- Fully typed codebase with modular config, structured exceptions, and CI/CD integration.
- Outputs results to an assets/ folder by default for easy access.

Metadata

Slug hidream-model-gen

Version 1.0.5

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 6

Frequently Asked Questions

What is hidream-model-gen?

Generate images and videos using Vivago AI (智小象) platform. Supports text-to-image, image-to-image, image-to-video, and keyframe-to-video generation. Use when... It is an AI Agent Skill for Claude Code / OpenClaw, with 305 downloads so far.

How do I install hidream-model-gen?

Run "/install hidream-model-gen" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is hidream-model-gen free?

Yes, hidream-model-gen is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does hidream-model-gen support?

hidream-model-gen is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created hidream-model-gen?

It is built and maintained by harry zhu (@zhy2015); the current version is v1.0.5.
