Description

Turn a 3-minute raw recording with multiple takes into 1080p compiled highlight clips just by typing what you need. Whether it's automatically selecting the...

README (SKILL.md)

Getting Started

Got raw video footage to work with? Send it over and tell me what you need — I'll take care of the AI video selection.

Try saying:

"turn a 3-minute raw recording with multiple takes into a 1080p MP4"
"pick the best clips and compile them into a highlight video"
"automatically select the best moments from raw footage for content creators"

Quick Start Setup

This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").

Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:

1. Generate a UUID as the client identifier.
2. POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header.
3. Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry).
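The token check and the anonymous-token steps above can be sketched in Python as follows. This is a minimal sketch, not the skill's actual implementation. The helper names (`build_anonymous_token_request`, `extract_token`, `resolve_token`) and the `send` hook are hypothetical, and the request is assumed to need no body. The only response detail taken from the text is that the token sits at data.token.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id):
    """Build (but do not send) the anonymous-token request."""
    return urllib.request.Request(
        BASE_URL + "/api/auth/anonymous-token",
        data=b"",  # assumption: the endpoint needs no request body
        headers={"X-Client-Id": client_id},
        method="POST",
    )


def extract_token(response_body):
    """Pull data.token out of the JSON response body."""
    return json.loads(response_body)["data"]["token"]


def resolve_token(env=None, send=None):
    """Return NEMO_TOKEN from the environment, else request an anonymous one.

    `send` is a hypothetical hook that takes the Request and returns the raw
    response body; when omitted, a real HTTP call is made with urllib.
    """
    env = os.environ if env is None else env
    if env.get("NEMO_TOKEN"):
        return env["NEMO_TOKEN"]
    request = build_anonymous_token_request(str(uuid.uuid4()))
    if send is None:
        with urllib.request.urlopen(request) as response:
            return extract_token(response.read())
    return extract_token(send(request))
```

Passing `env` and `send` explicitly keeps the flow testable without touching the network or the real environment.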

Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
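Session creation could look roughly like the sketch below. The function names are hypothetical, and the location of session_id in the response is an assumption: the text only says it is returned, so the helper checks the top level and a "data" wrapper.

```python
import json
import urllib.request

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_session_request(token, task_name="project"):
    """Build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": task_name}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        data=body,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def find_session_id(payload):
    """Assumption: session_id is at the top level or nested under "data"."""
    if "session_id" in payload:
        return payload["session_id"]
    return (payload.get("data") or {}).get("session_id")
```

Keeping request construction separate from sending makes it easy to check the headers and body before any network call happens.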

Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.

VPick AI Video — Pick and Compile Best Clips

Name: Vpick Ai Video
Author: mhogan2013-9

This tool takes your raw video footage and runs AI video selection through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 3-minute raw recording with multiple takes and want to pick the best clips and compile them into a highlight video — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: shorter source videos yield faster and more accurate clip selection.

Matching Input to Actions

User prompts referencing vpick ai video, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
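The keyword half of this routing table can be sketched as a small lookup. This is an illustrative sketch only: `ROUTES` and `route` are hypothetical names, and plain substring matching stands in for the intent classification the text mentions, so a phrase like "color balance" would be misrouted to Credits.

```python
# Checked in table order; the first matching row wins.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_attachment=False):
    """Return (action, skip_sse) for a user message."""
    if has_attachment:
        # "user sends file" row of the table.
        return "upload", True
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```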

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
Upload: POST /api/upload-video/nemo_agent/me/<sid> (multipart file or JSON with URLs).
Credits: GET /api/credits/balance/simple, which returns available, frozen, total.
State: GET /api/state/nemo_agent/me/<sid>/latest, which returns the current draft and media info.
Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
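The export polling step can be sketched as a loop with an injectable fetch function. The response field names `status` and `download_url` are assumptions, since the text only mentions a completed status and a download URL. `poll_render`, `fetch_status`, and the attempt limit are likewise hypothetical.

```python
import time


def poll_render(fetch_status, render_id, interval_s=30, max_attempts=30,
                sleep=time.sleep):
    """Poll until the render reports completion, then return the download URL.

    fetch_status(render_id) should return the decoded JSON from
    GET /api/render/proxy/lambda/<id>. The "status" and "download_url"
    keys are assumed names, not confirmed by the source.
    """
    for attempt in range(max_attempts):
        payload = fetch_status(render_id)
        if payload.get("status") == "completed":
            return payload.get("download_url")
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait 30s between polls by default
    raise TimeoutError("render %s did not complete in time" % render_id)
```

Injecting `fetch_status` and `sleep` lets the loop be exercised without real waiting or network access.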

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is vpick-ai-video, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request; omitting them triggers a 402 on export.
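The header rules above can be sketched as two small helpers. The function names are hypothetical, and the install path is matched by simple substring checks, which is one possible reading of "detected from the install path".

```python
def detect_platform(install_path):
    """Map the install path to an X-Skill-Platform value."""
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Authorization plus the three attribution headers for every request."""
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": "vpick-ai-video",
        "X-Skill-Version": version,  # taken from the frontmatter version field
        "X-Skill-Platform": detect_platform(install_path),
    }
```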

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
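A text summary like this could be derived from the draft's short keys roughly as follows. The nesting is an assumption: the sketch treats `t` as a list of tracks, each with a `tt` type and an `sg` list of segments carrying `d` in milliseconds. The real draft may be shaped differently, and titles or volume levels (presumably under `m`) are not modeled here.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft):
    """Build a short text summary of tracks from an (assumed) draft layout."""
    tracks = draft.get("t", [])
    lines = ["Timeline (%d tracks):" % len(tracks)]
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Unknown")
        # Sum segment durations (ms) and report the track length in seconds.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append("%d. %s: %gs" % (index, label, total_ms / 1000))
    return "\n".join(lines)
```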

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"Export" or "导出" → run the export workflow

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
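The chat request body and the heartbeat handling can be sketched as below. Only the session_id and new_message.parts[0].text fields come from the endpoint description. How heartbeats arrive (as `event: heartbeat` or as a literal `heartbeat` payload) is an assumption, and the sketch does not try to tell text responses from tool calls, because the payload format is not documented here.

```python
def build_chat_payload(session_id, text):
    """Body for POST /run_sse, with the message in new_message.parts[0].text."""
    return {"session_id": session_id, "new_message": {"parts": [{"text": text}]}}


def collect_sse_data(lines):
    """Return non-empty data payloads from SSE lines, skipping heartbeats."""
    payloads = []
    event = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:  # a blank line ends the current event
            event = None
        elif line.startswith(":"):  # SSE comment line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
            # Assumption: heartbeats show up as an event name or as the payload.
            if data and event != "heartbeat" and data != "heartbeat":
                payloads.append(data)
    return payloads


def needs_state_poll(payloads):
    """True when the stream closed without text, so the edit must be verified."""
    return not payloads
```

When `needs_state_poll` returns True, the next step is the state query, followed by a summary of what changed.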

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
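One way to keep these rules in one place is a dispatch table keyed by code. The action names below are hypothetical labels for the behaviors in the table, not identifiers from the backend, and the fallback for unlisted codes is an assumption.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",            # bad or expired token
    1002: "create_session",            # session not found
    2001: "show_credit_options",       # no credits
    4001: "show_supported_formats",    # unsupported file
    4002: "suggest_compress_or_trim",  # file too large
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade", # subscription tier, not credits
    429: "retry_once_after_30s",
}


def action_for(code):
    """Look up the handling for a code; unknown codes get a generic fallback."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```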

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "pick the best clips and compile them into a highlight video" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
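A local pre-upload check can catch the two file errors (4001 and 4002) before any bytes are sent. This is a sketch under assumptions: "500MB" is read as 500 MiB, and `check_upload` and its return values are hypothetical.

```python
import os

MAX_BYTES = 500 * 1024 * 1024  # assumption: "500MB" means 500 MiB
SUPPORTED = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}


def check_upload(filename, size_bytes):
    """Return "ok", "unsupported_format" (cf. 4001), or "too_large" (cf. 4002)."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED:
        return "unsupported_format"
    if size_bytes > MAX_BYTES:
        return "too_large"
    return "ok"
```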

Export as MP4 for widest compatibility.

Common Workflows

Quick edit: Upload → "pick the best clips and compile them into a highlight video" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Usage Guidance

This skill is reasonable for cloud video editing, but install it only if you are comfortable sending selected video files and prompts to mega-api-prod.nemovideo.ai and using or creating a NEMO_TOKEN. Avoid uploading highly sensitive footage unless you have verified the provider's privacy, retention, and credit/billing behavior.

Capability Analysis

Type: OpenClaw Skill
Name: vpick-ai-video
Version: 1.0.0

The vpick-ai-video skill is a legitimate integration for the VPick AI video editing service, facilitating automated video selection and compilation via the nemovideo.ai cloud API. The SKILL.md provides clear instructions for the agent to manage authentication (including anonymous token generation), session handling, and file uploads. No indicators of data exfiltration, malicious execution, or harmful prompt injection were found; the behavior is entirely consistent with the stated purpose of a cloud-based video processing tool.

Capability Assessment

ℹ Purpose & Capability

The stated purpose and behavior are coherent: the skill uploads user-provided video to a cloud rendering API and returns edited/exported video. This is expected for the feature, but users should treat uploaded footage as leaving the local environment.

ℹ Instruction Scope

The instructions define token setup, session creation, uploads, SSE editing, export, polling, and download workflows. These are scoped to video processing and are disclosed; no hidden destructive or unrelated actions are shown.

ℹ Install Mechanism

There is no install spec and no code files, reducing local execution risk. However, the registry lists the source as unknown and has no homepage, so provenance for the external service is limited.

ℹ Credentials

Use of NEMO_TOKEN, external network calls, and user-directed media upload is proportionate to a cloud rendering skill, but it is sensitive and should be understood before use.

ℹ Persistence & Privilege

The artifacts describe a 7-day anonymous token and cloud render jobs tied to a session; no local background persistence or privileged system modification is shown.

Version History

v1.0.0

- Initial release of VPick AI Video skill.
- Instantly compile the best clips from a 3-minute raw recording into 1080p highlight videos, based on simple text instructions.
- Fully cloud-based: automatic connection, session, and token management with 100 free credits (7-day expiry per token).
- Supports uploads, timeline editing via chat, preview summaries, and effortless export to MP4 and other formats.
- Includes session state monitoring, event-driven operations, and robust error handling for common issues.
- Guides users with concise prompts, quick start, and tips for optimal AI video selection results.

Metadata

Slug vpick-ai-video

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Vpick Ai Video?

Turn a 3-minute raw recording with multiple takes into 1080p compiled highlight clips just by typing what you need. Whether it's automatically selecting the... It is an AI Agent Skill for Claude Code / OpenClaw, with 42 downloads so far.

How do I install Vpick Ai Video?

Run "/install vpick-ai-video" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Vpick Ai Video free?

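Yes, the Vpick Ai Video skill itself is free and licensed under MIT-0, so you can download, install, and use it at no cost. Processing runs on the cloud backend, which is metered in credits; an anonymous token starts with 100 free credits and expires after 7 days.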

Which platforms does Vpick Ai Video support?

Vpick Ai Video is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Vpick Ai Video?

It is built and maintained by mhogan2013-9 (@mhogan2013-9); the current version is v1.0.0.
