
Ai Image To Video Deep

by susan4731-wilfordf · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 62 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install ai-image-to-video-deep
Description
Skip the learning curve of professional editing software. Describe what you want — animate this image into a 10-second cinematic video with smooth motion — a...
README (SKILL.md)

Getting Started

Share your still images and I'll get started on deep AI video generation. Or just tell me what you're thinking.

Try saying:

  • "convert my still images"
  • "export 1080p MP4"
  • "animate this image into a 10-second"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

  • Generate a UUID as client identifier
  • POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
  • The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
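
A hedged sketch of the session request described above. The function names are hypothetical, the `Content-Type` header is an assumption, and `session_id` is assumed to sit at the top level of the response; the attribution headers this README requires elsewhere are left out for brevity.

```python
import json

SESSION_URL = "https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent"


def create_session_request(token, language="en"):
    """Describe the POST that opens a session, without sending it."""
    body = {"task_name": "project", "language": language}
    return {
        "method": "POST",
        "url": SESSION_URL,
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption: JSON body
        },
        "body": json.dumps(body),
    }


def session_id_from(response):
    """Pull session_id out of a decoded response (top-level key is assumed)."""
    return response["session_id"]
```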

Tell the user you're ready. Keep the technical details out of the chat.

AI Image to Video Deep — Convert Images into Animated Videos

This tool takes your still images and runs deep AI video generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a single high-resolution product photo and want to animate this image into a 10-second cinematic video with smooth motion — the backend processes it in about 1-3 minutes and hands you a 1080p MP4.

Tip: images with clear subjects and simple backgrounds animate more consistently.

Matching Input to Actions

User prompts referencing ai image to video deep, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says... Action Skip SSE?
"export" / "导出" / "download" / "send me the video" → §3.5 Export
"credits" / "积分" / "balance" / "余额" → §3.3 Credits
"status" / "状态" / "show tracks" → §3.4 State
"upload" / "上传" / user sends file → §3.2 Upload
Everything else (generate, edit, add BGM…) → §3.1 SSE
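
The routing table above could be approximated with a plain keyword lookup. This sketch covers only the keyword half; the intent classification the README mentions is not modeled, and the action labels and the `has_file` flag are illustrative names, not part of the skill.

```python
# Keyword groups copied from the routing table; first match wins.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_file=False):
    """Return the action label for a user message."""
    if has_file:
        # "user sends file" row: an attachment always means upload.
        return "upload"
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse"
```

For example, `route("Export 1080p MP4")` yields `"export"`, while `route("add BGM to the clip")` falls through to `"sse"`.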

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: ai-image-to-video-deep
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
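
One way to assemble these four headers is sketched below. It assumes that an install path under `.clawhub` maps to `clawhub` and one under `.cursor/skills` maps to `cursor`; `detect_platform` and `build_headers` are hypothetical names, and the path check is a simple segment match rather than anything the skill specifies.

```python
from pathlib import PurePosixPath


def detect_platform(install_path):
    """Guess the host platform from the directory names in the install path."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    if ".cursor" in parts and "skills" in parts:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Return the Authorization header plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-image-to-video-deep",
        "X-Skill-Version": version,  # read from the frontmatter by the caller
        "X-Skill-Platform": detect_platform(install_path),
    }
```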

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id.

Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and Accept: text/event-stream. Max timeout: 15 minutes.
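
As a rough sketch, the message body and the 15-minute ceiling might be packaged like this. `run_sse_request` is a hypothetical helper, the `Content-Type` header is an assumption, and nothing is sent over the network.

```python
import json

SSE_TIMEOUT_SECONDS = 15 * 60  # the stated maximum of 15 minutes


def run_sse_request(session_id, text):
    """Describe the POST /run_sse call for one user message."""
    body = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }
    return {
        "method": "POST",
        "url": "https://mega-api-prod.nemovideo.ai/run_sse",
        "headers": {
            "Accept": "text/event-stream",
            "Content-Type": "application/json",  # assumption: JSON body
        },
        "body": json.dumps(body),
        "timeout": SSE_TIMEOUT_SECONDS,
    }
```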

Upload: POST /api/upload-video/nemo_agent/me/<sid> with either a file (multipart -F "files=@/path") or a URL body ({"urls":["<url>"],"source_type":"url"}).

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
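
The export-and-poll loop could look roughly like the sketch below. Several details are assumptions: the timestamp in the render id is taken to be Unix seconds, `max_polls` is an invented safety cap, failure statuses are not handled because the source does not name any, and `fetch_status` is a caller-supplied function that performs the GET and returns the decoded JSON.

```python
import time


def export_body(session_id, draft, now=None):
    """Build the export request body with a timestamp-based render id."""
    ts = int(now if now is not None else time.time())
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def wait_for_render(fetch_status, render_id, interval=30, max_polls=30,
                    sleep=time.sleep):
    """Poll until status is 'completed', then return output.url."""
    for attempt in range(max_polls):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)  # wait 30s between polls by default
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without a network connection or real delays.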

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

| Event | Action |
| --- | --- |
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
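
A minimal sketch of reading the stream and spotting the no-text case. It follows the generic server-sent-events line format (`data:` fields, blank line ends an event, `:` starts a comment); the backend's event schema is not documented here, so extracting user-facing text from a payload is left to a caller-supplied `extract_text` function, which is a hypothetical hook.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, commonly used as a heartbeat
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format strips one leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        yield "\n".join(buffer)


def needs_state_poll(payloads, extract_text):
    """True when no payload produced text, so session state should be polled."""
    return not any(extract_text(payload) for payload in payloads)
```

When `needs_state_poll` returns `True` after the stream closes, the agent would fetch the session state and summarize the changes itself.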

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
| --- | --- |
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

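The GUI-phrase translation above can be illustrated with a small lookup. This is a naive substring match under stated assumptions: the action labels are invented for the example, and the more specific phrases ("Export button", "preview in timeline") are checked first so that "click the Export button" resolves to the export workflow, an ordering the source does not specify.

```python
# (phrases, action) pairs; earlier rules take priority.
GUI_RULES = [
    (("export button", "导出"), "run_export_workflow"),
    (("preview in timeline",), "show_track_summary"),
    (("click", "点击"), "execute_via_api"),
    (("open", "打开"), "query_session_state"),
    (("drag", "drop", "拖拽"), "send_edit_via_sse"),
]


def translate(backend_text):
    """Map a GUI-style backend instruction to an API action, or None."""
    text = backend_text.lower()
    for phrases, action in GUI_RULES:
        if any(phrase in text for phrase in phrases):
            return action
    return None  # ordinary text: present it to the user as-is
```
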
Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
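
A summary like this could be derived from the draft's short keys roughly as follows. The nesting is an assumption: the sketch supposes `t` is a list of tracks, each with a `tt` type and an `sg` list whose segments carry `d` in milliseconds. Clip names and volume levels are omitted because the layout of `m` is not described.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft):
    """Render a short, numbered track summary from a draft dictionary."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Other")
        # Sum segment durations (ms) and report the total in seconds.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {label}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```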

Error Handling

| Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
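
The error table lends itself to a dispatch map. In this sketch the action labels are illustrative names chosen for the example, and the fallback for codes not listed in the table is an assumption.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",            # bad or expired token
    1002: "create_new_session",
    2001: "prompt_for_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",  # plan tier issue, not credits
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the follow-up action for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```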

Common Workflows

Quick edit: Upload → "animate this image into a 10-second cinematic video with smooth motion" → Download MP4. Takes 1-3 minutes for a 10-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "animate this image into a 10-second cinematic video with smooth motion" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.

Export as MP4 for widest compatibility.

Usage Guidance
This skill appears to be a wrapper for a hosted NEMO video-generation API and only needs a single API token (NEMO_TOKEN). Before installing, consider:
  1. Trust the external domain (mega-api-prod.nemovideo.ai): uploaded images and generated videos will be sent there.
  2. If you supply a real NEMO_TOKEN, the skill will use it for API calls; avoid providing a token with broad or unrelated permissions.
  3. The skill will try to auto-acquire an anonymous token if none is present (it will make network calls to mint and use that token) and instructs the agent to keep these technical steps out of the chat, so expect reduced transparency.
  4. The runtime asks the agent to probe common install paths (~/.clawhub/, ~/.cursor/skills/) to set an attribution header; this is not strictly necessary for the conversion task and involves reading host paths.
If you need higher assurance, ask the maintainer for: an explicit privacy/data-retention policy for uploaded images, justification for probing install paths, and confirmation that no additional environment variables or filesystem locations will be accessed. If you do proceed, prefer using a scoped/limited token or the anonymous token and avoid sending sensitive images.
Capability Analysis
Type: OpenClaw Skill · Name: ai-image-to-video-deep · Version: 1.0.0
The skill provides a functional interface for an image-to-video generation service hosted at nemovideo.ai. It contains detailed instructions for the AI agent to manage authentication (including anonymous token acquisition), session state, and file uploads. While it instructs the agent to perform environment discovery by checking installation paths (e.g., ~/.cursor/skills/) to populate attribution headers, these actions are directly tied to the service's operational requirements and lack any indicators of malicious intent, data exfiltration, or unauthorized command execution.
Capability Assessment
Purpose & Capability
The declared purpose (convert images to animated video) aligns with the required credential (NEMO_TOKEN) and the API endpoints described. However, the metadata lists a config path (~/.config/nemovideo/) while runtime instructions also instruct the agent to detect its install platform by probing other install paths (~/.clawhub/, ~/.cursor/skills/). That filesystem probing is not justified by the stated conversion purpose and is an inconsistency.
Instruction Scope
SKILL.md tells the agent to: use a NEMO_TOKEN if present or obtain an anonymous token by POSTing to an external endpoint; create sessions; perform SSE and multipart uploads; and 'keep the technical details out of the chat.' Two concerns: (1) the instructions require contacting an external service (expected) but also tell the agent to read install-path locations on the host (probing ~/.clawhub/ and ~/.cursor/skills/), which is outside the declared configPaths and not necessary for core functionality, and (2) the directive to hide technical actions from the user reduces transparency about token acquisition and requests.
Install Mechanism
Instruction-only skill with no install spec and no code files — lowest-risk install surface. No downloads or archive extraction are requested.
Credentials
Only one environment variable is required (NEMO_TOKEN), which is appropriate for a hosted API. The skill also instructs that if no token exists it will obtain an anonymous token from the service — this is logical for anonymous usage but means the agent will perform network calls to mint/use tokens. No unrelated credentials are requested.
Persistence & Privilege
always:false and no install-time persistence is requested. The skill does not request system-wide changes or access to other skills' configurations.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ai-image-to-video-deep
  3. After installation, invoke the skill by name or use /ai-image-to-video-deep
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
AI Image to Video Deep version 1.0.0 initial release:
  • Instantly animates uploaded still images into cinematic video clips using deep AI video generation.
  • Supports JPG, PNG, WEBP, and HEIC files up to 200MB.
  • Simple onboarding: automatic free token retrieval and seamless cloud session setup.
  • Export high-quality 1080p MP4 videos; download links delivered within minutes.
  • Intuitive commands for uploading, editing, adding text/audio, checking credits, and exporting videos.
  • Robust error handling and tips for best results included.
Metadata
Slug ai-image-to-video-deep
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Ai Image To Video Deep?

Skip the learning curve of professional editing software. Describe what you want — animate this image into a 10-second cinematic video with smooth motion — a... It is an AI Agent Skill for Claude Code / OpenClaw, with 62 downloads so far.

How do I install Ai Image To Video Deep?

Run "/install ai-image-to-video-deep" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ai Image To Video Deep free?

The skill itself is free to download and install, licensed under MIT-0. The backend service it calls meters usage in credits: an anonymous token comes with 100 free credits valid for 7 days.

Which platforms does Ai Image To Video Deep support?

Ai Image To Video Deep is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ai Image To Video Deep?

It is built and maintained by susan4731-wilfordf (@susan4731-wilfordf); the current version is v1.0.0.
