Description

Get captioned video files ready to post, without touching a single slider. Upload your video clips (MP4, MOV, AVI, WebM, up to 500MB), say something like "ad...

README (SKILL.md)

Getting Started

Got video clips to work with? Send them over and tell me what you need, and I'll take care of the AI caption generation.

Try saying:

"turn a 3-minute talking-head YouTube video into a captioned 1080p MP4"
"add captions in English and Spanish with auto-sync"
"add subtitles to my YouTube and social media videos"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

1. Obtain a free token: Generate a random UUID as the client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response field data.token is your NEMO_TOKEN; it carries 100 free credits and is valid for 7 days.
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
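
The two setup steps above can be sketched as request builders. This is a minimal sketch, not the skill's actual client: the helper names (`existing_token`, `build_token_request`, `build_session_request`, `extract_token`) are hypothetical, the requests are only constructed and never sent, and the attribution headers are left out for brevity.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def existing_token(env=os.environ):
    """Return NEMO_TOKEN if it is already set, else None."""
    return env.get("NEMO_TOKEN") or None


def build_token_request(client_id=None):
    """Step 1: anonymous-token request with a random UUID as X-Client-Id."""
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def extract_token(payload):
    """Read data.token from the decoded JSON response."""
    return payload["data"]["token"]


def build_session_request(token, language):
    """Step 2: session request; `language` is the detected user language."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing either request to `urllib.request.urlopen` would send it; keeping construction separate from sending makes it easier to avoid logging token values.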

Keep setup communication brief. Don't display raw API responses or token values to the user.

AI Video Editor for Captions — Auto-Generate and Burn In Captions

Name: Ai Video Editor For Captions
Author: francemichaell-15

Drop your video clips in the chat and tell me what you need. I'll handle the AI caption generation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send a 3-minute talking-head YouTube video, ask me to add captions in English and Spanish with auto-sync, and about 30-60 seconds later you've got an MP4 file ready to download. The whole thing runs at 1080p by default.

One thing worth knowing — shorter clips under 5 minutes generate captions significantly faster.

Matching Input to Actions

User prompts referencing ai video editor for captions, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
| --- | --- | --- |
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
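
The routing table can be approximated with plain keyword matching. This is a rough sketch only: the source mentions intent classification as well, which this does not attempt, and the `route` function and its action labels are hypothetical.

```python
# Keyword groups taken from the routing table; first match wins.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_attachment=False):
    """Return (action, skip_sse) for a user message."""
    if has_attachment:
        # A file sent by the user goes straight to the upload action.
        return ("upload", True)
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

Substring matching is deliberately naive here; a message such as "what is the status of my export" would match "export" first, so a real router would need the intent step the source describes.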

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Base URL: https://mega-api-prod.nemovideo.ai

| Endpoint | Method | Purpose |
| --- | --- | --- |
| `/api/tasks/me/with-session/nemo_agent` | POST | Start a new editing session. Body: `{"task_name":"project","language":"<lang>"}`. Returns `session_id`. |
| `/run_sse` | POST | Send a user message. Body includes `app_name`, `session_id`, `new_message`. Stream response with `Accept: text/event-stream`. Timeout: 15 min. |
| `/api/upload-video/nemo_agent/me/<sid>` | POST | Upload a file (multipart) or URL. |
| `/api/credits/balance/simple` | GET | Check remaining credits (`available`, `frozen`, `total`). |
| `/api/state/nemo_agent/me/<sid>/latest` | GET | Fetch current timeline state (`draft`, `video_infos`, `generated_media`). |
| `/api/render/proxy/lambda` | POST | Start export. Body: `{"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}`. Poll status every 30s. |
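
As an illustration of the export row, the render body could be assembled as below. This is a sketch under assumptions: `build_render_body` is a hypothetical helper, and treating the `ts` placeholder in the id as a Unix timestamp in seconds is a guess, since the source does not define it.

```python
import json
import time


def build_render_body(session_id, draft, ts=None):
    """Assemble the JSON body for POST /api/render/proxy/lambda."""
    if ts is None:
        ts = int(time.time())  # assumption: the id suffix is a Unix timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,  # the draft object fetched from the state endpoint
        "output": {"format": "mp4", "quality": "high"},
    }


if __name__ == "__main__":
    print(json.dumps(build_render_body("my-session", {"t": []}), indent=2))
```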

Accepted file types: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
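
A client-side pre-check against the accepted types and the 500MB limit might look like the following. This is a sketch: `check_upload` is a hypothetical helper, it reuses the 4001 and 4002 codes only as convenient labels, and interpreting 500MB as 500 × 1024 × 1024 bytes is an assumption.

```python
import os

ACCEPTED_TYPES = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
MAX_BYTES = 500 * 1024 * 1024  # assumption: "500MB" means 500 MiB


def check_upload(filename, size_bytes):
    """Return 0 if the file looks uploadable, else 4001 or 4002."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in ACCEPTED_TYPES:
        return 4001  # unsupported file type
    if size_bytes > MAX_BYTES:
        return 4002  # file too large
    return 0
```

Checking locally avoids spending an upload round trip on a file the backend would reject anyway; the backend's own response remains the authority.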

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: ai-video-editor-for-captions
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request; omitting them triggers a 402 on export.
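
The header set described above could be built like this. It is a sketch with hypothetical helper names (`detect_platform`, `request_headers`); the version is passed in rather than parsed from the frontmatter, and matching the install path by substring is an assumption about how detection works.

```python
def detect_platform(install_path):
    """Map the skill's install path to a platform label."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def request_headers(token, version, install_path):
    """Authorization plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-video-editor-for-captions",
        "X-Skill-Version": version,  # read from the frontmatter by the caller
        "X-Skill-Platform": detect_platform(install_path),
    }
```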

Error Codes

0 — success, continue normally
1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
1002 — session not found; create a new one
2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
4001 — unsupported file type; show accepted formats
4002 — file too large; suggest compressing or trimming
400 — missing X-Client-Id; generate one and retry
402 — free plan export blocked; not a credit issue, subscription tier
429 — rate limited; wait 30s and retry once
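
One way to keep these codes in a single place is a lookup table. This is a sketch: the action labels and the `plan_for` and `retry_delay` helpers are hypothetical names, not part of the backend API.

```python
# Code -> short label for what the agent should do next.
ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "prompt_register_or_top_up",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_and_retry_once",
}


def plan_for(code):
    """Look up the next step for a response code."""
    return ACTIONS.get(code, "report_unexpected_error")


def retry_delay(code, retries_done):
    """Seconds to wait before retrying, or None if no retry is due."""
    if code == 429 and retries_done == 0:
        return 30  # rate limited: wait 30s, retry once
    return None
```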

SSE Event Handling

| Event | Action |
| --- | --- |
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
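
The keep-waiting and no-text cases can be illustrated with a small stream parser. This is a sketch only: the source does not describe the JSON schema of the event payloads, so the code just separates keepalives from other payloads, and `iter_sse_events`, `is_keepalive`, and `needs_state_check` are hypothetical names.

```python
def iter_sse_events(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data or event != "message":
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def is_keepalive(event, data):
    """Heartbeats and empty data lines only mean 'keep waiting'."""
    return event == "heartbeat" or data == ""


def needs_state_check(lines):
    """True when the stream closed without any non-keepalive payload."""
    return not any(
        not is_keepalive(event, data) for event, data in iter_sse_events(lines)
    )
```

When `needs_state_check` returns True, the agent would fetch the latest session state and summarize the change itself instead of waiting for text that never arrives.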

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
| --- | --- |
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):

1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
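
A track summary like the one above could be derived from the abbreviated draft keys. This is a sketch under a loud assumption: the source lists the keys (`t`, `tt`, `sg`, `d`) but not how they nest, so the layout used here (tracks containing segments, each with a duration in ms) is a guess, and `summarize_draft` is a hypothetical helper.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return one summary line per track in the draft."""
    summary = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Assumed: each segment carries its own duration `d` in milliseconds.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        summary.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return summary
```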

Common Workflows

Quick edit: Upload → "add captions in English and Spanish with auto-sync" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add captions in English and Spanish with auto-sync" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.

Export as MP4 for widest compatibility across all platforms.

Usage Guidance

This skill appears to genuinely call a cloud backend to generate and burn in captions, which explains the NEMO_TOKEN and the network endpoints. Before installing:

1. Verify you trust the domain (mega-api-prod.nemovideo.ai / nemovideo.ai) and their privacy policy, since your videos will be uploaded to that service.
2. Ask the author whether the anonymous token or session_id is persistently written to disk (e.g., ~/.config/nemovideo/) or stored only in memory; prefer ephemeral tokens if you have privacy concerns.
3. Clarify the registry-vs-SKILL.md mismatch about required config paths.
4. If you need stronger guarantees, avoid providing a long-lived NEMO_TOKEN and prefer using a short-lived anonymous token or an account token you can revoke.

If you are uncomfortable with automatic outbound auth/network calls or unknown third-party hosting of your media, do not install.

Capability Analysis

Type: OpenClaw Skill
Name: ai-video-editor-for-captions
Version: 1.0.0

The skill is a functional wrapper for a cloud-based video captioning service (nemovideo.ai). It provides clear instructions for the AI agent to manage authentication, session handling, and video processing via REST API calls. While it requests a specific environment variable (NEMO_TOKEN) and performs automated registration for anonymous tokens, these actions are transparently documented and aligned with the stated purpose of the tool. There is no evidence of data exfiltration, malicious execution, or harmful prompt injection.

Capability Assessment

ℹ Purpose & Capability

Name/description align with cloud-based captioning and the SKILL.md directs requests to a nemovideo.ai backend, which is expected. However the SKILL.md's frontmatter declares a config path (~/.config/nemovideo/) while the registry metadata above the file listed no required config paths — this mismatch should be clarified.

⚠ Instruction Scope

Instructions routinely send user video and metadata to https://mega-api-prod.nemovideo.ai (upload, SSE, render). That is coherent for a cloud render service, but the skill also instructs the agent to automatically create anonymous tokens and to detect install paths (e.g., ~/.clawhub/, ~/.cursor/skills/) and read this file's frontmatter for attribution headers. Detecting install path implies filesystem checks beyond simply handling an uploaded video; automatic anonymous token generation means the agent will make outbound auth/network calls if NEMO_TOKEN isn't present.

✓ Install Mechanism

No install spec and no code files — instruction-only. This minimizes disk-write/install risk.

ℹ Credentials

Only one environment variable (NEMO_TOKEN) is declared, which is appropriate for an API-backed service. However the SKILL.md behavior (generating/storing anonymous token, storing session_id, and frontmatter reference to a config path) implies the skill may persist tokens or session state to a config location; the registry-level metadata contradicted the SKILL.md on required config paths. Confirm where tokens/sessions are stored and how long-lived they are.

✓ Persistence & Privilege

always:false and normal autonomous invocation. The skill uses session tokens that can orphan cloud render jobs if you close the UI, but it does not request permanent platform privileges.

Version History

v1.0.0

AI Video Editor for Captions, Version 1.0.0

- Initial release: auto-generates and burns in captions to uploaded videos (MP4, MOV, AVI, WebM, up to 500MB).
- Supports multi-language captions with auto-sync (e.g. English and Spanish).
- Cloud-based pipeline: handles upload, editing, captioning, and 1080p MP4 export without local installation.
- Interactive workflow: upload clips, give editing instructions via chat, and download captioned videos within ~30–90 seconds.
- Tracks credits and session status automatically; provides clear error handling for authentication, file type, and export issues.

Metadata

Slug ai-video-editor-for-captions

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Ai Video Editor For Captions?

Get captioned video files ready to post, without touching a single slider. Upload your video clips (MP4, MOV, AVI, WebM, up to 500MB), say something like "ad... It is an AI Agent Skill for Claude Code / OpenClaw, with 82 downloads so far.

How do I install Ai Video Editor For Captions?

Run "/install ai-video-editor-for-captions" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ai Video Editor For Captions free?

Yes, Ai Video Editor For Captions is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ai Video Editor For Captions support?

Ai Video Editor For Captions is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ai Video Editor For Captions?

It is built and maintained by francemichaell-15 (@francemichaell-15); the current version is v1.0.0.
