Description

Breathe life into static images by turning them into fluid, cinematic video clips in seconds. This image-to-video-ai-generator skill takes your photos, illus...

README (SKILL.md)

Getting Started

Welcome! I turn your still images into dynamic, motion-filled video clips using AI — no editing software needed. Drop an image and tell me the style of animation you want to get started!

Try saying:

"Here's a product photo of my sneakers — can you animate it with a slow 360-degree rotation and a subtle zoom-in effect for an Instagram reel?"
"I have a landscape painting I'd like turned into a video with a cinematic parallax drift and soft cloud movement in the background — about 6 seconds long."
"Take this portrait photo and create a short looping video with a gentle breathing effect and a slow push-in toward the subject's face."

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

1. Generate a UUID as the client identifier.
2. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header.
3. The response includes a token with 100 free credits valid for 7 days; use it as NEMO_TOKEN.

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
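The connection steps can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the helper names (`anonymous_token_request`, `create_session_request`, `send`, `existing_token`) are hypothetical, the `Content-Type` header is an assumption, and the attribution headers described elsewhere in this document are left out for brevity. The layout of the response JSON is not documented here, so the sketch only builds and sends the requests and leaves it to the caller to locate the token and session_id.

```python
import json
import os
import urllib.request
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def existing_token():
    """Return NEMO_TOKEN from the environment, or None if it is unset."""
    return os.environ.get("NEMO_TOKEN") or None


def anonymous_token_request(client_id=None):
    """Describe the POST that requests a free starter token."""
    # A fresh UUID serves as the client identifier when none is supplied.
    client_id = client_id or str(uuid.uuid4())
    return {
        "method": "POST",
        "url": API_BASE + "/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
        "body": None,
    }


def create_session_request(token, language="en"):
    """Describe the POST that creates a session for the given token."""
    return {
        "method": "POST",
        "url": API_BASE + "/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",  # assumption, not from the doc
        },
        "body": {"task_name": "project", "language": language},
    }


def send(spec, timeout=30):
    """Send a request description and return the decoded JSON response.

    Assumption: the backend replies with JSON. Its exact fields are not
    documented here, so callers must find token / session_id themselves.
    """
    data = None
    if spec["body"] is not None:
        data = json.dumps(spec["body"]).encode("utf-8")
    req = urllib.request.Request(
        spec["url"], data=data, headers=spec["headers"], method=spec["method"]
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating request construction from sending keeps the request shapes easy to inspect without touching the network.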

Tell the user you're ready. Keep the technical details out of the chat.

From Still Frame to Moving Story — Instantly

Name: Image To Video Ai Generator
Author: whitejohnk-26

Static images tell a moment. Videos tell a story. This skill bridges that gap by taking any image you provide — a product photo, a portrait, a landscape, an illustration — and generating a video clip that moves, breathes, and engages viewers in ways a flat image simply cannot.

Whether you're building a social media reel, animating a hero image for a landing page, or creating a slideshow that flows like a film, this tool handles the heavy lifting. You describe the motion style you want — a slow zoom, a parallax drift, a dramatic pan — and the AI interprets your image's content to produce natural-looking movement that fits the scene.

This skill is built for speed and creative flexibility. You don't need to storyboard, keyframe, or export from a timeline editor. Just bring your image and your vision, and within moments you'll have a video ready to share, embed, or build upon. It's the fastest path from a single photo to a scroll-stopping video clip.

Motion Request Routing Logic

When you submit a still image for animation, your request is parsed for motion parameters — including frame duration, camera movement style, and interpolation intensity — then dispatched to the appropriate rendering pipeline based on resolution and complexity.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
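
One way to read the routing table is as an ordered keyword lookup. The sketch below is illustrative only: the `route` function and `ROUTES` list are hypothetical names, and case-insensitive substring matching is an assumption, since the document does not say how the real matching works.

```python
# Keyword groups copied from the routing table; order follows the table.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message.

    Assumption: keywords are matched as case-insensitive substrings.
    """
    # A file sent by the user always goes to the upload path.
    if has_file:
        return ("upload", True)
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

Plain substring matching can misfire on phrases such as "balance the colors", so a real router would likely need stricter matching or intent detection.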

Cloud Rendering API Reference

All image-to-video synthesis runs on distributed GPU clusters via the animation backend, which handles optical flow estimation, temporal frame generation, and video encoding entirely in the cloud. Your source image is submitted to the backend as part of the session: the API accepts base64-encoded frames or direct CDN URLs for diffusion-based motion processing.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: image-to-video-ai-generator
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
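
A minimal sketch of assembling these headers, assuming the version string has already been read from the frontmatter and is passed in by the caller. The function names are hypothetical, and matching the install path by substring is an assumption about how the detection could be done.

```python
def detect_platform(install_path):
    """Map the skill's install path to an X-Skill-Platform value."""
    # Normalise Windows separators so one substring check covers both styles.
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def attribution_headers(token, version, install_path):
    """Build the headers that every request is expected to carry."""
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": "image-to-video-ai-generator",
        "X-Skill-Version": version,  # read from the YAML frontmatter by the caller
        "X-Skill-Platform": detect_platform(install_path),
    }
```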

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id.

Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and header Accept: text/event-stream. Max timeout: 15 minutes.
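
The message request could be assembled as in the sketch below. The `run_sse_request` helper is a hypothetical name, the `Content-Type` header is an assumption, and the returned dictionary is just a description of the request that a caller would hand to whatever HTTP client it uses.

```python
import json

API_BASE = "https://mega-api-prod.nemovideo.ai"


def run_sse_request(session_id, text, headers=None):
    """Describe the POST /run_sse call for one user message."""
    body = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }
    merged = dict(headers or {})  # e.g. auth and attribution headers
    merged["Accept"] = "text/event-stream"
    merged["Content-Type"] = "application/json"  # assumption, not from the doc
    return {
        "method": "POST",
        "url": API_BASE + "/run_sse",
        "headers": merged,
        "data": json.dumps(body).encode("utf-8"),
        "timeout": 15 * 60,  # the documented maximum of 15 minutes, in seconds
    }
```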

Upload: POST /api/upload-video/nemo_agent/me/<sid> with either a file (multipart -F "files=@/path") or a URL body ({"urls":["<url>"],"source_type":"url"}).

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
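
The export-and-poll flow might look like the following sketch. Several details are assumptions rather than documented behavior: the timestamp in the render id is taken to be Unix seconds, the poll response is assumed to expose a top-level `status` and a nested `output.url`, and the `max_polls` cap is a hypothetical safeguard. The `fetch_status` callable is supplied by the caller and is expected to perform the GET and return parsed JSON.

```python
import time


def export_body(session_id, draft, timestamp=None):
    """Build the body for POST /api/render/proxy/lambda."""
    # Assumption: <ts> is a Unix timestamp in seconds.
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": "render_%d" % ts,
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def wait_for_export(fetch_status, interval=30, max_polls=60, sleep=time.sleep):
    """Poll until status is 'completed' and return the download URL.

    fetch_status: caller-supplied function that performs
        GET /api/render/proxy/lambda/<id> and returns the parsed JSON.
    Assumption: that JSON has "status" and "output" -> "url" fields.
    """
    for attempt in range(max_polls):
        result = fetch_status()
        if result.get("status") == "completed":
            return result["output"]["url"]
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError("export did not complete after %d polls" % max_polls)
```

Injecting `sleep` and `fetch_status` keeps the loop testable without real delays or network calls.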

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
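
A client could check a file's extension against this list before uploading, which avoids a round trip that would end in an unsupported-file error. The sketch below is illustrative; `is_supported` is a hypothetical helper, and checking only the extension (rather than the file contents) is a simplification.

```python
import os

# Extensions copied from the supported formats list.
SUPPORTED_FORMATS = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}


def is_supported(filename):
    """Return True if the file extension appears in the supported list."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    return extension in SUPPORTED_FORMATS
```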

SSE Event Handling

Event	Action
Text response	Apply GUI translation (§4), present to user
Tool call/result	Process internally, don't forward
`heartbeat` / empty `data:`	Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes	Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
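
The stream handling described above can be sketched as a small line-oriented loop. This is a sketch under assumptions: the schema of each `data:` payload is not documented here, so the caller supplies an `extract_text` function that returns user-facing text or None (for example for tool calls), and a heartbeat is assumed to arrive either as an empty `data:` line or as the literal payload `heartbeat`. The periodic "Still working" notice is omitted for brevity.

```python
def handle_stream(lines, extract_text):
    """Gather user-facing text from SSE lines.

    lines: iterable of decoded lines from the /run_sse response.
    extract_text: caller-supplied function that turns one data payload
        into text, or returns None for payloads that should not be shown.
    Returns (text, needs_state_poll).
    """
    texts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # event names, comments, and blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "heartbeat":
            continue  # nothing to show yet; keep waiting
        text = extract_text(payload)
        if text:
            texts.append(text)
    combined = "\n".join(texts)
    # No text at all means the edit must be verified via session state.
    return combined, not combined
```

When `needs_state_poll` is True, the caller would fetch the latest session state and summarize the changes for the user.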

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says	You do
"click [button]" / "点击"	Execute via API
"open [panel]" / "打开"	Query session state
"drag/drop" / "拖拽"	Send edit via SSE
"preview in timeline"	Show track summary
"Export button" / "导出"	Execute export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Example track summary:

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
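
The draft field mapping could be used to turn a draft into a short track summary, as in the sketch below. The nesting is an assumption made loudly: the draft is taken to hold a list of tracks under `t`, each track a type under `tt` and a list of segments under `sg`, and each segment a duration in milliseconds under `d`. The document gives only the key meanings, not this layout, and `summarize_tracks` is a hypothetical helper.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_tracks(draft):
    """Return one summary line per track.

    Assumed layout (not confirmed by the document):
    draft["t"] -> list of tracks, track["tt"] -> type code,
    track["sg"] -> list of segments, segment["d"] -> duration in ms.
    """
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append("%d. %s: %.1fs" % (index, kind, total_ms / 1000))
    return lines
```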

Error Handling

Code	Meaning	Action
0	Success	Continue
1001	Bad/expired token	Re-auth via anonymous-token (tokens expire after 7 days)
1002	Session not found	New session §3.0
2001	No credits	Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account"
4001	Unsupported file	Show supported formats
4002	File too large	Suggest compress/trim
400	Missing X-Client-Id	Generate Client-Id and retry (see §1)
402	Free plan export blocked	Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429	Rate limit (1 token/client/7 days)	Retry in 30s once
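
The table lends itself to a simple lookup from code to follow-up action. The sketch below is illustrative only: the action labels and the `action_for` helper are hypothetical names, and it treats the application codes and HTTP-style codes in the table as one flat set, which is an assumption about how they are reported.

```python
# Codes and actions taken from the error table; labels are illustrative.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate_with_anonymous_token",
    1002: "create_new_session",
    2001: "explain_no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "suggest_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the follow-up action for an error code."""
    return ERROR_ACTIONS.get(code, "unknown_error")
```

Keeping 402 and 2001 as separate entries mirrors the table's point that a blocked export is a plan issue rather than a credits issue.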

Troubleshooting

If the generated video motion looks unnatural or jittery, the most common cause is an image with very low contrast between the subject and background. Try providing an image where the main subject is clearly separated from the background, or specify a simpler motion style like a slow zoom rather than a full parallax.

If the animation ignores part of your image — for example, not moving a sky element you expected to animate — try describing the specific region in your prompt more explicitly. Instead of 'make the clouds move,' try 'animate the upper third of the image with slow drifting cloud motion from left to right.'

For looping videos that feel seamless, request a motion style that returns to its starting position — such as a slow zoom that eases back out, or a drift that reverses gently. Abrupt-ending clips are harder to loop cleanly. If your output has an unexpected color shift or vignette, it may be related to the motion blur style applied — specify 'no vignette' or 'preserve original color grading' in your prompt to override default stylistic choices.

Performance Notes

The quality of the generated video depends significantly on the input image. High-resolution images with clear subjects and well-defined edges produce the most convincing motion — the AI has more detail to work with when simulating depth and movement. Low-resolution or heavily compressed images may result in softer motion or visible artifacts around fine details like hair or foliage.

Complex motion requests on images with busy backgrounds — such as animating a crowd scene or a highly detailed illustration — may take longer to process and can occasionally produce inconsistent results in peripheral areas. For best output, start with a clear focal subject and a relatively uncluttered background.

Video length also affects processing time. Clips under 8 seconds generate quickly and maintain high fidelity. Longer sequences may require breaking the animation into segments for optimal quality. Portrait-oriented images (9:16) and square images (1:1) are natively supported for social media formats.

Use Cases

This image-to-video-ai-generator skill fits naturally into a wide range of creative and professional workflows. E-commerce brands use it to animate product photos into short showcase clips for platforms like TikTok, Instagram Reels, and Pinterest Video Pins — dramatically increasing engagement compared to static listings.

Digital marketers use it to turn hero images and campaign visuals into motion ads without hiring a video production team. A single well-composed brand photo can become a 5-second bumper ad or a looping background video for a landing page.

Artists and illustrators use it to bring their work to life for portfolio showcases, NFT presentations, or social media promotion. A finished illustration animated with a subtle parallax effect feels entirely new without altering the original artwork.

Content creators building slideshows, memorial videos, travel recaps, or educational content use it to stitch animated image clips together into cohesive narrative videos that feel polished and intentional.

Usage Guidance

This skill appears to do what it says (upload an image and call a cloud rendering API), but it will send your images to an external service (mega-api-prod.nemovideo.ai) and can obtain an anonymous token itself if you don't provide NEMO_TOKEN. Before installing or using:

1) Only upload non-sensitive images (avoid IDs, private documents, or medical photos).
2) If you want control, provide your own NEMO_TOKEN rather than relying on the anonymous token.
3) Note a small inconsistency: the SKILL.md frontmatter references a config path (~/.config/nemovideo/) that the registry metadata did not list. This is benign but worth confirming if you care about local config access.
4) Review the service's privacy/terms and be prepared to revoke the token or stop using the skill if you see unexpected data use.

If you want stronger assurance, ask the publisher for a privacy/TOS link or a signed statement about how uploaded media are stored and retained.

Capability Assessment

✓ Purpose & Capability

Name/description (image→video AI) align with required credential (NEMO_TOKEN) and the detailed API calls in SKILL.md. Asking for an API token and upload capability is coherent with the stated purpose.

ℹ Instruction Scope

The SKILL.md instructs the agent to use NEMO_TOKEN (or obtain an anonymous token via the service's auth endpoint), create sessions, upload files (multipart or base64), and call SSE endpoints for generation. It also tells the agent to detect install path and read the skill's own YAML frontmatter for attribution headers. Reading files to upload and adding headers is expected for this skill, but the skill will send user images (base64 or file uploads) to an external service — users should consider privacy of uploaded images.

✓ Install Mechanism

Instruction-only skill with no install spec, no code files, and no binaries to install. This has a lower on-disk risk surface.

ℹ Credentials

Only NEMO_TOKEN is declared as required and used as the bearer token for the backend. The SKILL.md will fall back to requesting an anonymous token from the external API if NEMO_TOKEN is absent — this behavior is reasonable for usability but means the skill can autonomously obtain credentials to use the remote service even if you don't supply a token.

✓ Persistence & Privilege

The skill is not always-enabled and does not request system-wide config changes. It does ask to detect install paths to set an X-Skill-Platform header, which requires reading paths but not modifying other skills or system config.

Version History

v1.0.0

Initial release of Image to Video AI Generator.
- Instantly animates still images into dynamic, cinematic video clips with customizable motion styles.
- Simple workflow: accepts direct uploads or links and lets you describe the animation style (e.g. zoom, rotation, parallax).
- No video editing software needed; cloud-based AI handles all motion, transitions, and rendering.
- Built-in support for export, credits check, and status tracking.
- Error handling for token, session, export, and upload issues.
- Supports multiple video and image output formats (mp4, mov, avi, gif, png, etc.).

Metadata

Slug image-to-video-ai-generator

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Image To Video Ai Generator?

Breathe life into static images by turning them into fluid, cinematic video clips in seconds. This image-to-video-ai-generator skill takes your photos, illus... It is an AI Agent Skill for Claude Code / OpenClaw, with 101 downloads so far.

How do I install Image To Video Ai Generator?

Run "/install image-to-video-ai-generator" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Image To Video Ai Generator free?

Yes, Image To Video Ai Generator is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Image To Video Ai Generator support?

Image To Video Ai Generator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Image To Video Ai Generator?

It is built and maintained by whitejohnk-26 (@whitejohnk-26); the current version is v1.0.0.
