Description

Skip the learning curve of professional editing software. Describe what you want — convert this image to video and benchmark quality against other AI tools —...

Usage Instructions (SKILL.md)

Getting Started

Ready when you are. Drop your still images here or describe what you want to make.

Try saying:

"convert a single product photo or landscape image into a 1080p MP4"
"convert this image to video and benchmark quality against other AI tools"
"compare AI-generated video quality from static images for AI researchers, content creators, and marketers"

Quick Start Setup

This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").

Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:

1. Generate a UUID as the client identifier.
2. POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header.
3. Extract data.token from the response. This is your NEMO_TOKEN (100 free credits, 7-day expiry).
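
The token check above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function names and the `send` callable (something that performs the HTTP call and returns the response body as text) are hypothetical, and the request is only built here, not sent.

```python
import json
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def extract_token(response_body: str) -> str:
    """Read data.token from the JSON response body."""
    return json.loads(response_body)["data"]["token"]


def resolve_token(env: dict, send) -> str:
    """Use NEMO_TOKEN from the environment if present, else request an anonymous one.

    `send` is a hypothetical callable: it takes a Request and returns the body text.
    """
    token = env.get("NEMO_TOKEN")
    if token:
        return token
    client_id = str(uuid.uuid4())
    return extract_token(send(build_anonymous_token_request(client_id)))
```

Passing `os.environ` as `env` and a thin wrapper around `urllib.request.urlopen` as `send` would be one way to wire this up; injecting both keeps the logic testable without network access.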

Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
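
As a rough sketch, the session request could be assembled like this. The function name is hypothetical and the Content-Type header is an assumption (the source only specifies Bearer auth and the JSON body). The attribution headers described later in this document are omitted for brevity.

```python
import json
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_session_request(token: str, task_name: str = "project") -> urllib.request.Request:
    """Build (but do not send) the create-session request."""
    body = json.dumps({"task_name": task_name}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the backend expects a JSON content type.
            "Content-Type": "application/json",
        },
    )
```

The sketch stops at building the request because the exact location of session_id in the response JSON is not stated here.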

Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.

AI Image to Video Benchmark — Compare image to video results

Name: Ai Image To Video Benchmark
Author: vynbosserman65

This tool takes your still images and runs AI video generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a single product photo or landscape image and want to convert it to video and benchmark the quality against other AI tools. The backend processes it in about 30-90 seconds and returns a 1080p MP4.

Tip: high-contrast images with clear subjects tend to produce the most consistent motion results across benchmark tests.

Matching Input to Actions

User prompts referencing ai image to video benchmark, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
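
The routing table above could be approximated with plain keyword matching, as in the sketch below. This covers only the keyword half of "keyword and intent classification"; the action labels and the `route` function are hypothetical names chosen for illustration.

```python
# Keyword groups taken from the table above; action labels are illustrative.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A file sent by the user always goes to the upload action.
        return ("upload", True)
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM, ...) goes through SSE.
    return ("sse", False)
```

Substring matching is deliberately naive: a request such as "add BGM" matches no keyword and falls through to the SSE path.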

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: ai-image-to-video-benchmark
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
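
One possible way to assemble these headers is sketched below. The function names are hypothetical, and the version string is assumed to have been read from the frontmatter elsewhere; only the path patterns and header names come from the text above.

```python
def detect_platform(install_path: str) -> str:
    """Map an install path to the X-Skill-Platform value."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict:
    """Return the auth header plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-image-to-video-benchmark",
        "X-Skill-Version": version,  # assumed to come from the YAML frontmatter
        "X-Skill-Platform": detect_platform(install_path),
    }
```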

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}. Returns task_id, session_id.

Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and Accept: text/event-stream. Max timeout: 15 minutes.
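
A minimal sketch of the message body and of reading `data:` lines from the stream follows. The function names are hypothetical, and the parser handles only the basic single-line `data:` form of server-sent events, which is an assumption about how this backend frames its events.

```python
def build_sse_message(session_id: str, text: str) -> dict:
    """Build the JSON body for POST /run_sse."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }


def iter_sse_data(lines):
    """Yield the payload of each non-empty `data:` line from an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:  # empty data lines act as keep-alives
                yield payload
```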

Upload: POST /api/upload-video/nemo_agent/me/<sid>. File: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}
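
For the URL variant, the path and body might be built as below. This is a sketch with a hypothetical helper name; the multipart file variant is not shown.

```python
import json


def build_url_upload(session_id: str, urls: list[str]) -> tuple[str, bytes]:
    """Return (path, JSON body) for uploading media by URL."""
    path = f"/api/upload-video/nemo_agent/me/{session_id}"
    body = json.dumps({"urls": urls, "source_type": "url"}).encode("utf-8")
    return path, body
```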

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
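
The export-and-poll flow could look roughly like this. Several details are assumptions: `<ts>` is treated as a Unix timestamp, `status` and `output.url` are read from the top level of the poll response, and the `max_polls` cap is not specified by the source. `fetch_status` is a hypothetical callable that performs the GET and returns the parsed JSON.

```python
import time


def build_export_body(session_id: str, draft: dict, timestamp=None) -> dict:
    """Build the JSON body for POST /api/render/proxy/lambda."""
    ts = int(time.time()) if timestamp is None else timestamp  # assumption: <ts> is a Unix timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_until_completed(fetch_status, render_id, interval_s=30, max_polls=30, sleep=time.sleep):
    """Poll the render job until status is 'completed', then return output.url."""
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Injecting `sleep` and `fetch_status` keeps the loop testable without waiting or touching the network.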

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

Event	Action
Text response	Apply GUI translation (§4), present to user
Tool call/result	Process internally, don't forward
`heartbeat` / empty `data:`	Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes	Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
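
The event-handling rules above, including the fallback for silent edits, can be sketched as a small reducer over the stream. The event tuples `(elapsed_seconds, kind, payload)` and the kind labels are a hypothetical representation, not the backend's wire format.

```python
STILL_WORKING_INTERVAL_S = 120  # "Every 2 min" from the table above


def handle_stream(events):
    """Collect user-facing text and decide whether to fall back to polling state.

    `events` is an iterable of (elapsed_seconds, kind, payload) tuples, where
    kind is one of "text", "tool", or "heartbeat" (hypothetical labels).
    """
    texts = []
    notices = 0
    last_notice = 0.0
    for elapsed_s, kind, payload in events:
        if kind == "text" and payload:
            texts.append(payload)
        elif kind == "heartbeat" and elapsed_s - last_notice >= STILL_WORKING_INTERVAL_S:
            notices += 1  # here the agent would say "Still working..."
            last_notice = elapsed_s
        # Tool calls and results are processed internally and never forwarded.
    return {"texts": texts, "notices": notices, "poll_state": not texts}
```

When `poll_state` is true, the stream closed without any text, so the caller would query the session state endpoint and summarize what changed.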

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says	You do
"click [button]" / "点击"	Execute via API
"open [panel]" / "打开"	Query session state
"drag/drop" / "拖拽"	Send edit via SSE
"preview in timeline"	Show track summary
"Export button" / "导出"	Execute export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
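
A track summary like the one above might be derived from the draft using the short field names. The nesting is an assumption: this sketch supposes that `t` is a list of tracks, each with a `tt` type and an `sg` list of segments that each carry a `d` duration in milliseconds. The real draft layout may differ.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> list[str]:
    """Return one summary line per track (assumed layout: t -> [{tt, sg: [{d}]}])."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(f"{index}. {label}: {len(segments)} segment(s), {total_ms / 1000:g}s")
    return lines
```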

Error Handling

Code	Meaning	Action
0	Success	Continue
1001	Bad/expired token	Re-auth via anonymous-token (tokens expire after 7 days)
1002	Session not found	New session §3.0
2001	No credits	Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account"
4001	Unsupported file	Show supported formats
4002	File too large	Suggest compress/trim
400	Missing X-Client-Id	Generate Client-Id and retry (see §1)
402	Free plan export blocked	Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429	Rate limit (1 token/client/7 days)	Retry in 30s once
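
The table above lends itself to a simple lookup. In this sketch the action labels are hypothetical names for the behaviors listed, and unknown codes fall through to a generic label.

```python
# Codes from the table above; action labels are illustrative.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauth_anonymous_token",
    1002: "create_new_session",
    2001: "prompt_register_or_top_up",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade_plan",
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Return the handling action for a backend or HTTP error code."""
    return ERROR_ACTIONS.get(code, "unknown")
```

Keeping 2001 (no credits) and 402 (plan restriction) as separate entries mirrors the table's point that they call for different messages.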

Common Workflows

Quick edit: Upload → "convert this image to video and benchmark quality against other AI tools" → Download MP4. Takes 30-90 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "convert this image to video and benchmark quality against other AI tools" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.
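
A pre-upload check along these lines could catch problems before any bytes are sent. It is a sketch under two assumptions: "200MB" is read as 200 × 1024 × 1024 bytes, and `.jpeg` is accepted as an alias for JPG. The function name is hypothetical.

```python
import os

MAX_BYTES = 200 * 1024 * 1024  # assumption: "200MB" means 200 MiB
PREFERRED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".heic"}  # .jpeg added as an alias


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file looks fine."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in PREFERRED_EXTENSIONS:
        problems.append(f"{extension or 'no extension'} is not one of the preferred image formats")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the 200MB limit; compress or trim it first")
    return problems
```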

Export as MP4 with H.264 codec for easy sharing and side-by-side benchmark comparisons.

Security Recommendations

This skill appears to do what it claims (upload images to a cloud backend to generate videos), but there are a few things to check before installing or using it:
- Clarify the NEMO_TOKEN requirement: the registry marks it required but the SKILL.md can obtain an anonymous token automatically. Ask the author whether you must supply your own token or whether the anonymous token is used by default.
- Be aware the skill will send your images and session data to https://mega-api-prod.nemovideo.ai. Do not upload sensitive images or data you wouldn't want sent to a third party.
- The skill will read local metadata to set an X-Skill-Platform header (it may inspect install paths and this file's frontmatter). If you are uncomfortable with any filesystem inspection, do not enable the skill or request the author remove that behavior.
- Confirm the backend domain (nemovideo.ai) is legitimate for your use case. If you cannot verify the service, avoid supplying long-lived credentials; prefer using ephemeral or anonymous tokens where possible.

If you decide to proceed, prefer running it with an ephemeral/limited token and test with non-sensitive images first. If you need higher assurance, ask the skill publisher to resolve the metadata inconsistencies and provide a privacy/security statement describing what data is retained and for how long.

Functional Analysis

Type: OpenClaw Skill
Name: ai-image-to-video-benchmark
Version: 1.0.0

The ai-image-to-video-benchmark skill is a legitimate integration for an AI video generation service (nemovideo.ai). The SKILL.md file provides clear instructions for the agent to handle authentication via anonymous tokens, manage user sessions, and process media uploads/exports. While the skill performs environment checks to identify the host platform (e.g., checking for ~/.clawhub/ or ~/.cursor/skills/) and communicates with a remote API, these actions are transparently documented and directly support the stated functionality of generating and benchmarking AI videos without any evidence of malicious intent or data exfiltration.

Capability Assessment

⚠ Purpose & Capability

The skill claims to require a single credential (NEMO_TOKEN), which matches the described cloud API usage, but the SKILL.md also describes an automatic anonymous-token flow if NEMO_TOKEN is absent. Registry metadata listed no config paths, while the SKILL.md frontmatter declares ~/.config/nemovideo/ as a required config path. These inconsistencies (required env var vs optional self-provisioning, differing config-path declarations) should be clarified by the author.

ℹ Instruction Scope

Instructions are focused on uploading images, creating sessions, SSE handling, and downloads — all coherent for an image→video service. However the skill instructs the agent to read this file's YAML frontmatter and to detect the agent's install path (~/.clawhub, ~/.cursor/skills/) to set X-Skill-Platform header. That requires inspecting local filesystem/metadata which isn't declared in the registry. The SKILL.md also instructs the agent to avoid exposing tokens but still requires sending Authorization headers to the third-party API. No other unrelated data collection is specified.

✓ Install Mechanism

Instruction-only skill with no install spec and no code files — lowest install risk. Nothing will be automatically downloaded or written to disk by an installer step described here.

⚠ Credentials

Only NEMO_TOKEN is listed as required, which is proportional to the described API usage. But the registry declaring NEMO_TOKEN as required conflicts with SKILL.md's explicit anonymous-token fallback (it will POST to /api/auth/anonymous-token and use returned token). This mismatch could be benign (author allowed both) or misleading (declaring a required secret when one is not strictly needed). The skill will transmit the token (Authorization: Bearer) to an external domain — users should treat that token as sensitive.

✓ Persistence & Privilege

The skill does not set always: true and does not request persistent system-wide privileges. It does create and hold session tokens for cloud jobs (normal for a cloud-render skill), but there is no instruction to modify other skills or system settings.

Version History

v1.0.0

AI Image to Video Benchmark skill, initial release:
- Instantly converts still images to video clips using AI, benchmarking results against other tools.
- Supports JPG, PNG, WEBP, HEIC formats up to 200MB.
- Automated session and token setup; 100 free credits for new users.
- Streamlined cloud processing with easy download of 1080p MP4 videos.
- Simple workflows for uploading, editing, and exporting videos for researchers, creators, and marketers.

Metadata

Slug ai-image-to-video-benchmark

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is Ai Image To Video Benchmark?

Skip the learning curve of professional editing software. Describe what you want — convert this image to video and benchmark quality against other AI tools —... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 68 total downloads so far.

How do I install Ai Image To Video Benchmark?

Run the command "/install ai-image-to-video-benchmark" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Ai Image To Video Benchmark free?

Yes. Ai Image To Video Benchmark is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Ai Image To Video Benchmark support?

Ai Image To Video Benchmark is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Ai Image To Video Benchmark?

Developed and maintained by vynbosserman65 (@vynbosserman65); the current version is v1.0.0.
