Image To Video Create

by dsewell-583h0 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ⚠ suspicious
99 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw: /install image-to-video-create
Description
Turn three product photos in JPG format into 1080p animated video clips just by typing what you need. Whether it's turning still photos into shareable video...
README (SKILL.md)

Getting Started

Share your images and I'll get started on AI video creation. Or just tell me what you're thinking.

Try saying:

  • "convert my images"
  • "export 1080p MP4"
  • "turn these photos into a 30-second video"

Quick Start Setup

This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").

Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:

  • Generate a UUID as client identifier
  • POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with X-Client-Id header
  • Extract data.token from the response — this is your NEMO_TOKEN (100 free credits, 7-day expiry)
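The token check above could be sketched roughly as follows. This is a minimal sketch rather than the skill's actual implementation: the helper names `build_anonymous_token_request`, `extract_token`, and `get_token` are hypothetical, and the empty request body is an assumption because the steps do not document one. Only the URL, the `X-Client-Id` header, and the `data.token` field come from the steps above.

```python
import json
import os
import urllib.request
import uuid

ANON_TOKEN_URL = "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token"


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token POST request."""
    return urllib.request.Request(
        ANON_TOKEN_URL,
        data=b"",  # assumption: no request body is documented
        headers={"X-Client-Id": client_id},
        method="POST",
    )


def extract_token(payload: dict) -> str:
    """Pull data.token out of the decoded JSON response."""
    return payload["data"]["token"]


def get_token() -> str:
    """Return NEMO_TOKEN from the environment, else request an anonymous one."""
    token = os.environ.get("NEMO_TOKEN")
    if token:
        return token
    request = build_anonymous_token_request(str(uuid.uuid4()))
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_token(json.load(response))
```

Keeping request construction separate from the network call makes the flow easy to check without contacting the service.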

Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
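As an illustration of the session step, the request might be assembled like this. The function names are hypothetical, the `Content-Type` header is an assumption, and so is the lookup of `session_id` either at the top level or under `data`, since the text only says the value is returned.

```python
import json
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_session_request(token: str, task_name: str = "project") -> urllib.request.Request:
    """Build (but do not send) the create-session POST request."""
    body = json.dumps({"task_name": task_name}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption: JSON body
        },
        method="POST",
    )


def extract_session_id(payload: dict) -> str:
    """Read session_id from the decoded response.

    Assumption: the field sits at the top level or under "data".
    """
    data = payload.get("data", payload)
    return data["session_id"]
```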

Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.

Image to Video Create — Convert Images into Video Clips

This tool takes your images and runs AI video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have three product photos in JPG format and want to turn these photos into a 30-second video with transitions and background music — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: using images with consistent aspect ratios produces smoother transitions.

Matching Input to Actions

User prompts that mention image-to-video creation, aspect ratio, text overlays, or audio tracks are routed to the corresponding action by keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | §3.5 Export | Yes |
| "credits" / "积分" / "balance" / "余额" | §3.3 Credits | Yes |
| "status" / "状态" / "show tracks" | §3.4 State | Yes |
| "upload" / "上传" / user sends file | §3.2 Upload | Yes |
| Everything else (generate, edit, add BGM…) | §3.1 SSE | No |
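The keyword half of this routing can be sketched as a simple lookup. This is only an approximation under stated assumptions: the `route` function and the action labels are hypothetical, plain substring matching stands in for the intent classification mentioned above, and rows are checked in table order.

```python
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_attachment: bool = False) -> str:
    """Pick an action label for a user message; anything unmatched goes to SSE."""
    if has_attachment:  # "user sends file" row
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

For example, `route("Please export the video")` yields `"export"`, while `route("add BGM")` falls through to `"sse"`.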

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-specific compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: image-to-video-create
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
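One way to assemble these headers is sketched below. The helpers `detect_platform` and `build_headers` are hypothetical. The sketch assumes the version string has already been read from the frontmatter, and it matches directory names anywhere in a POSIX-style path instead of resolving the home directory.

```python
from pathlib import PurePosixPath


def detect_platform(install_path: str) -> str:
    """Map an install path to a platform label, defaulting to "unknown"."""
    parts = PurePosixPath(install_path).parts
    pairs = list(zip(parts, parts[1:]))
    if ".clawhub" in parts:
        return "clawhub"
    if (".cursor", "skills") in pairs:
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict:
    """Return the auth header plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "image-to-video-create",
        "X-Skill-Version": version,  # assumed to come from the frontmatter
        "X-Skill-Platform": detect_platform(install_path),
    }
```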

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}. Returns task_id, session_id.

Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and header Accept: text/event-stream. Max timeout: 15 minutes.
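The message body can be built as in the sketch below. The name `build_sse_message` and the two constants are hypothetical, and the `Content-Type` header is an assumption. The field names and the 15-minute ceiling come from the line above.

```python
import json

SSE_HEADERS = {
    "Accept": "text/event-stream",
    "Content-Type": "application/json",  # assumption: JSON body
}
SSE_TIMEOUT_SECONDS = 15 * 60  # documented maximum timeout


def build_sse_message(session_id: str, text: str) -> bytes:
    """Encode the /run_sse request body for one user message."""
    body = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }
    # ensure_ascii=False keeps Chinese prompts readable in the payload
    return json.dumps(body, ensure_ascii=False).encode("utf-8")
```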

Upload: POST /api/upload-video/nemo_agent/me/<sid>. Send a file as multipart (-F "files=@/path") or a URL as JSON: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. The download URL is at output.url.
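The export-and-poll loop might look like the sketch below. Several details are assumptions and are marked as such: the helper names are hypothetical, `<ts>` is treated as a Unix timestamp in seconds, the poll cap of 40 attempts is arbitrary, and no failure status is handled because none is documented. `fetch_status` stands for any callable that performs the GET and returns the decoded JSON.

```python
import time
from typing import Callable, Optional


def build_export_body(session_id: str, draft: dict, timestamp: Optional[int] = None) -> dict:
    """Build the render request body; <ts> is assumed to be epoch seconds."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_until_completed(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 30.0,
    max_polls: int = 40,  # assumption: arbitrary cap, not from the source
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status is "completed", then return output.url."""
    for _ in range(max_polls):
        job = fetch_status()
        if job.get("status") == "completed":
            return job["output"]["url"]
        sleep(interval_seconds)
    raise TimeoutError("render did not complete within the polling budget")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real waiting.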

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
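A rough sketch of this handling follows. The event payload shape is not documented here, so the code assumes each `data:` line carries JSON with text under `content.parts[].text`, and that a heartbeat arrives either as an empty `data:` line or the literal word `heartbeat`. Both are assumptions, as is the function name `collect_text`. A return value of `None` signals the no-text case, where the caller would poll session state instead.

```python
import json
from typing import Iterable, Optional


def collect_text(lines: Iterable[str]) -> Optional[str]:
    """Gather user-facing text from SSE lines; None means no text arrived."""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, event names
        data = line[len("data:"):].strip()
        if not data or data == "heartbeat":
            continue  # keep waiting
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        parts = (event.get("content") or {}).get("parts") or []
        for part in parts:
            # Parts without text (tool calls/results) are not forwarded.
            if isinstance(part, dict) and part.get("text"):
                chunks.append(part["text"])
    return "".join(chunks) or None
```

The two-minute "Still working" notice is left out of the sketch; it would need a timer around the read loop.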

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
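To illustrate how the short keys could be turned into such a summary, here is a minimal sketch. The nesting is an assumption (tracks under `t`, each with `tt` and a list of segments under `sg`, each segment carrying `d`), and `summarize_draft` is a hypothetical helper. It reports only track type and total duration, because where clip names and volume levels live inside `m` is not specified.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> str:
    """Render a text summary of tracks from a short-key draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Unknown")
        # d is in milliseconds; sum across the track's segments
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {label}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```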

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
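The table above maps naturally onto a dispatch dictionary. The sketch below uses hypothetical action labels and a hypothetical fallback for codes the table does not list; it shows the lookup only, not the recovery logic itself.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_session",
    2001: "show_credit_options",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Return the action label for a code, with a fallback for unknown codes."""
    # Assumption: unlisted codes are surfaced to the user as unknown errors.
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```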

Common Workflows

Quick edit: Upload → "turn these photos into a 30-second video with transitions and background music" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn these photos into a 30-second video with transitions and background music" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.
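A pre-upload check along these lines could catch both limits before any data leaves the device. This is a sketch with assumptions: `check_upload` is a hypothetical helper, "200MB" is read as 200 × 1024 × 1024 bytes, and `.jpeg` is treated as an alias of JPG.

```python
from pathlib import PurePath
from typing import List

MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # assumption: "200MB" read as 200 MiB
# ".jpeg" is an assumed alias for JPG
PREFERRED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".heic"}


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks fine."""
    problems = []
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 200MB; compress or trim it first")
    if PurePath(filename).suffix.lower() not in PREFERRED_EXTENSIONS:
        problems.append("format is outside the preferred list (JPG, PNG, WEBP, HEIC)")
    return problems
```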

Export as MP4 for widest compatibility across social platforms.

Usage Guidance
This skill will upload images and call a third‑party service (mega-api-prod.nemovideo.ai). If you install it, expect your photos to be transmitted off your device and processed on that service. The skill can create an anonymous token if you don't provide NEMO_TOKEN and may read a local config path (~/.config/nemovideo/) and detect install folders — if you have sensitive images or private config files, do not use it until you verify the service's privacy/security posture. Prefer supplying your own NEMO_TOKEN from a trusted account if you proceed, review network requests/logging policies, and avoid uploading personally identifiable or regulated data. Because the source and homepage are unknown, treat this as unverified third‑party functionality.
Capability Analysis
Type: OpenClaw Skill Name: image-to-video-create Version: 1.0.0 The skill is a legitimate integration for the 'nemovideo.ai' service, allowing users to convert images to video via a cloud API. It provides clear instructions for the AI agent to handle authentication (including an anonymous token flow), session management, and file uploads to 'mega-api-prod.nemovideo.ai'. The skill includes security-conscious instructions to avoid exposing tokens to the user and follows standard patterns for media processing services without any signs of malicious intent, data exfiltration, or unauthorized execution.
Capability Assessment
Purpose & Capability
Name and description match the declared requirement (NEMO_TOKEN) and the SKILL.md describes APIs for uploading images, creating sessions, and exporting video — these are expected for a cloud-based image→video service. The skill source and homepage are unknown, though, which reduces transparency.
Instruction Scope
Instructions stay largely within the stated purpose (upload images, create session, SSE for edits, export results). However, the skill tells the agent to: (1) auto-request an anonymous token if NEMO_TOKEN is missing, (2) read YAML frontmatter at runtime, and (3) detect install paths and a config directory (~/.config/nemovideo/ and ~/.clawhub/ or ~/.cursor/skills/) — these steps involve filesystem checks and outbound network calls to a third party and should be considered when deciding what data will be sent off-device.
Install Mechanism
Instruction-only (no install spec, no code files). This minimizes disk writes and installation-time risk; the runtime behavior is entirely in the SKILL.md instructions and network calls.
Credentials
Only one env var is declared (NEMO_TOKEN), which is appropriate for a cloud API. But the skill will auto-provision an anonymous token if none is present and metadata references a config path (~/.config/nemovideo/) — the latter suggests the skill may try to read local config files beyond just checking for NEMO_TOKEN. No unrelated credentials are requested.
Persistence & Privilege
always:false and no install steps. The skill asks agents to keep session_id for operations (expected). It does not request permanent platform privileges or attempt to modify other skills.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install image-to-video-create
  3. After installation, invoke the skill by name or use /image-to-video-create
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Image to Video Create.
  • Upload three product photos and generate 1080p animated video clips in under a minute.
  • Automatic cloud setup with easy token handling: first-time users are guided through connection.
  • Simple workflow: upload images, describe the video you want, and download the final MP4.
  • No timeline editing or export settings required; everything is processed via prompts.
  • Supports preview, text overlays, background music, and aspect ratio adjustments.
  • Built-in error handling for tokens, sessions, file formats, export limits, and rate limiting.
Metadata
Slug image-to-video-create
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Image To Video Create?

Turn three product photos in JPG format into 1080p animated video clips just by typing what you need. Whether it's turning still photos into shareable video... It is an AI Agent Skill for Claude Code / OpenClaw, with 99 downloads so far.

How do I install Image To Video Create?

Run "/install image-to-video-create" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Image To Video Create free?

Yes, Image To Video Create is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Image To Video Create support?

Image To Video Create is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Image To Video Create?

It is built and maintained by dsewell-583h0 (@dsewell-583h0); the current version is v1.0.0.
