Description

Skip the learning curve of professional editing software. Describe what you want — animate this image into a 10-second cinematic video with smooth motion — a...

README (SKILL.md)

Getting Started

Share your still images and I'll get started on deep AI video generation. Or just tell me what you're thinking.

Try saying:

"convert my still images"
"export 1080p MP4"
"animate this image into a 10-second"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

Generate a UUID as client identifier
POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
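The connection steps above can be sketched as request builders. This is a minimal illustration, not the skill's actual implementation: the helper names are hypothetical, the `Content-Type: application/json` header is an assumption, and the attribution headers the README requires on every call are omitted for brevity. Nothing is sent over the network here; the functions only build the requests.

```python
import json
import os
import urllib.request
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def resolve_token(env=None):
    """Return NEMO_TOKEN from the environment, or None if unset or empty."""
    env = os.environ if env is None else env
    return env.get("NEMO_TOKEN") or None


def new_client_id() -> str:
    """Generate a UUID to send as the X-Client-Id header."""
    return str(uuid.uuid4())


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str = "en") -> urllib.request.Request:
    """Build (but do not send) the create-session request."""
    body = json.dumps({"task_name": "project", "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the README does not state the content type.
            "Content-Type": "application/json",
        },
    )
```

A caller would try `resolve_token()` first and fall back to the anonymous-token request only when it returns `None`. The JSON field names in the responses are not assumed here beyond what the README states (a token, and a `session_id`).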

Tell the user you're ready. Keep the technical details out of the chat.

AI Image to Video Deep — Convert Images into Animated Videos

Name: Ai Image To Video Deep
Author: susan4731-wilfordf

This tool takes your still images and runs deep AI video generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a single high-resolution product photo and want to animate this image into a 10-second cinematic video with smooth motion — the backend processes it in about 1-3 minutes and hands you a 1080p MP4.

Tip: images with clear subjects and simple backgrounds animate more consistently.

Matching Input to Actions

User prompts referencing ai image to video deep, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
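The routing table above could be approximated with plain keyword matching. This sketch covers only the keyword half of "keyword and intent classification"; the function name, the action labels, and the substring-matching strategy are all assumptions made for illustration.

```python
from typing import Tuple

# Checked in the same order as the table rows.
KEYWORD_ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
]
UPLOAD_KEYWORDS = ("upload", "上传")


def route(message: str, has_attachment: bool = False) -> Tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    text = message.lower()
    for keywords, action in KEYWORD_ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    if has_attachment or any(keyword in text for keyword in UPLOAD_KEYWORDS):
        return "upload", True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

Substring matching is deliberately naive: a message such as "don't export yet" would still route to export, which is where real intent classification would be needed.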

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: ai-image-to-video-deep
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
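As a rough sketch of the header rules above, the function below assembles the Authorization header and the three attribution headers. The function names are hypothetical, and the skill version is passed in by the caller because reading YAML frontmatter is outside this example.

```python
from pathlib import Path
from typing import Dict


def _is_under(path: Path, parent: Path) -> bool:
    """True if `path` is `parent` or lies somewhere below it."""
    try:
        path.relative_to(parent)
        return True
    except ValueError:
        return False


def detect_platform(install_path: str) -> str:
    """Map the install path to a platform label, per the README's rule."""
    path = Path(install_path).expanduser()
    home = Path.home()
    if _is_under(path, home / ".clawhub"):
        return "clawhub"
    if _is_under(path, home / ".cursor" / "skills"):
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> Dict[str, str]:
    """Headers the README says every API call needs."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-image-to-video-deep",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Passing the install path as an argument keeps the filesystem check explicit and easy to audit, since the function inspects only the path string it is given.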

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
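The export-and-poll flow described above might look like the following sketch. Several details are assumptions: the `<ts>` placeholder is taken to be a Unix timestamp in seconds, the cap of 30 polls is arbitrary, and the status fetcher is injected as a callable so the example stays independent of any HTTP client. Failure statuses are not handled because the README does not name any.

```python
import time
from typing import Any, Callable, Dict, Optional


def build_export_body(session_id: str, draft: Any, now: Optional[float] = None) -> Dict[str, Any]:
    """Build the render request body described in the README."""
    ts = int(time.time() if now is None else now)  # assumption: seconds
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(
    fetch_status: Callable[[str], Dict[str, Any]],
    render_id: str,
    interval_s: float = 30.0,
    max_polls: int = 30,  # assumption: the README gives no upper bound
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status is 'completed', then return output.url."""
    for attempt in range(max_polls):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Here `fetch_status` would wrap a GET to `/api/render/proxy/lambda/<id>` and return the decoded JSON body.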

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

Event	Action
Text response	Apply GUI translation (§4), present to user
Tool call/result	Process internally, don't forward
`heartbeat` / empty `data:`	Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes	Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
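One way to apply the event rules above is to collect user-facing text from the stream and flag the case where none arrived. The README does not document the event payload, so the shape used here (JSON with `content.parts[].text`, mirroring the request body) is purely an assumption, as is the function name.

```python
import json
from typing import Iterable, List, Tuple


def collect_text(lines: Iterable[str]) -> Tuple[List[str], bool]:
    """Return (texts, needs_state_check) for a finished SSE stream.

    needs_state_check is True when no text arrived, which is the case
    where the README says to poll session state instead.
    """
    texts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # event names, comments, blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data: line, keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # non-JSON payloads such as a bare heartbeat marker
        if not isinstance(event, dict):
            continue
        content = event.get("content")
        parts = content.get("parts") if isinstance(content, dict) else None
        for part in parts or []:
            # Tool calls and results carry no "text" key and are skipped.
            if isinstance(part, dict) and part.get("text"):
                texts.append(part["text"])
    return texts, not texts
```

When the second return value is `True`, the caller would fetch the latest session state and summarize the changes itself.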

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says	You do
"click [button]" / "点击"	Execute via API
"open [panel]" / "打开"	Query session state
"drag/drop" / "拖拽"	Send edit via SSE
"preview in timeline"	Show track summary
"Export button" / "导出"	Execute export workflow

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
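A track summary like the one above could be derived from the draft's short keys roughly as follows. The nesting is an assumption (tracks under `t`, each with a `tt` type and an `sg` list whose segments carry `d` in milliseconds); segment names, start offsets, and volume are left out because the README does not say where they live.

```python
from typing import Any, Dict

TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_timeline(draft: Dict[str, Any]) -> str:
    """Render a one-line track summary from a short-key draft."""
    tracks = draft.get("t", [])
    items = []
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        items.append(f"{index}. {kind} ({total_ms / 1000:g}s)")
    noun = "track" if len(tracks) == 1 else "tracks"
    return f"Timeline ({len(tracks)} {noun}): " + " ".join(items)
```

For a draft with one 10-second video track and one 3-second text track, this yields `Timeline (2 tracks): 1. Video (10s) 2. Text (3s)`.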

Error Handling

Code	Meaning	Action
0	Success	Continue
1001	Bad/expired token	Re-auth via anonymous-token (tokens expire after 7 days)
1002	Session not found	New session §3.0
2001	No credits	Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account"
4001	Unsupported file	Show supported formats
4002	File too large	Suggest compress/trim
400	Missing X-Client-Id	Generate Client-Id and retry (see §1)
402	Free plan export blocked	Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429	Rate limit (1 token/client/7 days)	Retry in 30s once
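The error table above maps naturally onto a lookup. In this sketch the action labels are hypothetical names chosen for illustration; only the codes and the 30-second single retry for 429 come from the table.

```python
from typing import Optional

ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_session",
    2001: "prompt_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_delay",
}


def action_for(code: int) -> str:
    """Look up the handling action; unknown codes get a generic label."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def retry_delay_seconds(code: int, retries_so_far: int) -> Optional[float]:
    """Return the wait before a timed retry, or None if none applies.

    Only 429 has a timed retry in the table, and it is retried once.
    """
    if code == 429 and retries_so_far == 0:
        return 30.0
    return None
```

Keeping 402 and 2001 as separate entries preserves the table's distinction between a subscription-tier block and a credit shortage.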

Common Workflows

Quick edit: Upload → "animate this image into a 10-second cinematic video with smooth motion" → Download MP4. Takes 1-3 minutes for a 10-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "animate this image into a 10-second cinematic video with smooth motion" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.

Export as MP4 for widest compatibility.
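The size and format tips above can be checked before uploading. This is a sketch under two assumptions: 200MB is read as 200 × 1024 × 1024 bytes, and the recommended formats are matched by file extension only. The function name is hypothetical.

```python
from pathlib import Path
from typing import List

MAX_BYTES = 200 * 1024 * 1024  # assumption: binary megabytes
RECOMMENDED_EXTENSIONS = {".jpg", ".png", ".webp", ".heic"}


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of warnings; an empty list means the file looks fine."""
    warnings = []
    if size_bytes > MAX_BYTES:
        warnings.append("File exceeds 200MB; compress or trim it first.")
    if Path(filename).suffix.lower() not in RECOMMENDED_EXTENSIONS:
        warnings.append("Format is outside the recommended JPG, PNG, WEBP, HEIC set.")
    return warnings
```

The extension check produces a warning rather than a rejection, since the README recommends these formats without saying that others are refused.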

Usage Guidance

This skill appears to be a wrapper for a hosted NEMO video-generation API and only needs a single API token (NEMO_TOKEN). Before installing, consider: (1) Trust the external domain (mega-api-prod.nemovideo.ai) — uploaded images and generated videos will be sent there; (2) If you supply a real NEMO_TOKEN, the skill will use it for API calls — avoid providing a token with broad or unrelated permissions; (3) The skill will try to auto-acquire an anonymous token if none is present (it will make network calls to mint and use that token) and instructs the agent to keep these technical steps out of the chat — expect reduced transparency; (4) The runtime asks the agent to probe common install paths (~/.clawhub/, ~/.cursor/skills/) to set an attribution header — this is not strictly necessary for the conversion task and involves reading host paths. If you need higher assurance, ask the maintainer for: an explicit privacy/data-retention policy for uploaded images, justification for probing install paths, and confirmation that no additional environment variables or filesystem locations will be accessed. If you do proceed, prefer using a scoped/limited token or the anonymous token and avoid sending sensitive images.

Capability Analysis

Type: OpenClaw Skill
Name: ai-image-to-video-deep
Version: 1.0.0

The skill provides a functional interface for an image-to-video generation service hosted at nemovideo.ai. It contains detailed instructions for the AI agent to manage authentication (including anonymous token acquisition), session state, and file uploads. While it instructs the agent to perform environment discovery by checking installation paths (e.g., ~/.cursor/skills/) to populate attribution headers, these actions are directly tied to the service's operational requirements and lack any indicators of malicious intent, data exfiltration, or unauthorized command execution.

Capability Assessment

ℹ Purpose & Capability

The declared purpose (convert images to animated video) aligns with the required credential (NEMO_TOKEN) and the API endpoints described. However, the metadata lists a config path (~/.config/nemovideo/) while runtime instructions also instruct the agent to detect its install platform by probing other install paths (~/.clawhub/, ~/.cursor/skills/). That filesystem probing is not justified by the stated conversion purpose and is an inconsistency.

⚠ Instruction Scope

SKILL.md tells the agent to: use a NEMO_TOKEN if present or obtain an anonymous token by POSTing to an external endpoint; create sessions; perform SSE and multipart uploads; and 'keep the technical details out of the chat.' Two concerns: (1) instructions require contacting an external service (expected) but also instruct the agent to read install-path locations on the host (probing ~/.clawhub/ and ~/.cursor/skills/), which is outside the declared configPaths and not necessary for core functionality, and (2) the directive to hide technical actions from the user reduces transparency about token acquisition and requests.

✓ Install Mechanism

Instruction-only skill with no install spec and no code files — lowest-risk install surface. No downloads or archive extraction are requested.

ℹ Credentials

Only one environment variable is required (NEMO_TOKEN), which is appropriate for a hosted API. The skill also instructs that if no token exists it will obtain an anonymous token from the service — this is logical for anonymous usage but means the agent will perform network calls to mint/use tokens. No unrelated credentials are requested.

✓ Persistence & Privilege

always:false and no install-time persistence is requested. The skill does not request system-wide changes or access to other skills' configurations.

Version History

v1.0.0

AI Image to Video Deep version 1.0.0 initial release:

- Instantly animates uploaded still images into cinematic video clips using deep AI video generation.
- Supports JPG, PNG, WEBP, and HEIC files up to 200MB.
- Simple onboarding: automatic free token retrieval and seamless cloud session setup.
- Export high-quality 1080p MP4 videos; download links delivered within minutes.
- Intuitive commands for uploading, editing, adding text/audio, checking credits, and exporting videos.
- Robust error handling and tips for best results included.

Metadata

Slug ai-image-to-video-deep

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Ai Image To Video Deep?

Skip the learning curve of professional editing software. Describe what you want — animate this image into a 10-second cinematic video with smooth motion — a... It is an AI Agent Skill for Claude Code / OpenClaw, with 62 downloads so far.

How do I install Ai Image To Video Deep?

Run "/install ai-image-to-video-deep" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ai Image To Video Deep free?

Yes, Ai Image To Video Deep is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ai Image To Video Deep support?

Ai Image To Video Deep is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Ai Image To Video Deep?

It is built and maintained by susan4731-wilfordf (@susan4731-wilfordf); the current version is v1.0.0.
