francemichaell-15

Ai Video Editor For Captions

by francemichaell-15 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 82 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install ai-video-editor-for-captions
Description
Get captioned video files ready to post, without touching a single slider. Upload your video clips (MP4, MOV, AVI, WebM, up to 500MB), say something like "ad...
README (SKILL.md)

Getting Started

Got video clips to work with? Send them over and tell me what you need. I'll take care of the AI caption generation.

Try saying:

  • "turn a 3-minute talking-head YouTube video into a captioned 1080p MP4"
  • "add captions in English and Spanish with auto-sync"
  • "add subtitles to my YouTube and social media videos"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
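
The two steps above can be sketched in Python using only the standard library. This is a minimal sketch, not the skill's actual implementation: the function names, the 30-second timeout, and the choice to return the whole session reply are assumptions. The text does not say where session_id sits in the response, so the caller is left to read it out.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id: str) -> urllib.request.Request:
    """Step 1: anonymous-token request, identified by a random client UUID."""
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str) -> urllib.request.Request:
    """Step 2: session request carrying the bearer token and a JSON body."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send_json(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON reply (timeout is an assumption)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def connect(language: str) -> tuple[str, dict]:
    """Reuse NEMO_TOKEN when set; otherwise fetch an anonymous token first."""
    token = os.environ.get("NEMO_TOKEN")
    if not token:
        reply = send_json(build_token_request(str(uuid.uuid4())))
        token = reply["data"]["token"]  # the text names data.token
    # Where session_id sits in the reply is not documented, so the whole
    # decoded session response is returned for the caller to inspect.
    return token, send_json(build_session_request(token, language))
```

Building the requests separately from sending them keeps the header and body logic checkable without touching the network.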

Keep setup communication brief. Don't display raw API responses or token values to the user.

AI Video Editor for Captions — Auto-Generate and Burn In Captions

Drop your video clips in the chat and tell me what you need. I'll handle the AI caption generation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send a 3-minute talking-head YouTube video, ask me to add captions in English and Spanish with auto-sync, and about 30-60 seconds later you've got an MP4 file ready to download. The whole thing runs at 1080p by default.

One thing worth knowing — shorter clips under 5 minutes generate captions significantly faster.

Matching Input to Actions

User prompts referencing ai video editor for captions, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says... Action Skip SSE?
"export" / "导出" / "download" / "send me the video" → §3.5 Export
"credits" / "积分" / "balance" / "余额" → §3.3 Credits
"status" / "状态" / "show tracks" → §3.4 State
"upload" / "上传" / user sends file → §3.2 Upload
Everything else (generate, edit, add BGM…) → §3.1 SSE
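
The routing table above can be approximated with a plain keyword lookup. The sketch below covers only the keyword half; the intent classification the text mentions is not modelled, and the names ROUTES and route are hypothetical. Substring matching is deliberately naive, so a message such as "add captions and upload to YouTube" would be routed to upload.

```python
# Keyword lists copied from the routing table; order decides ties.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message: str, has_attachment: bool = False) -> str:
    """Return the action name for a user message; 'sse' is the fallback."""
    if has_attachment:
        return "upload"  # "user sends file" row of the table
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"  # everything else goes to the SSE endpoint
```

For example, route("show tracks") selects the state action, while a caption request with no matching keyword falls through to SSE.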

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Base URL: https://mega-api-prod.nemovideo.ai

Endpoint | Method | Purpose
/api/tasks/me/with-session/nemo_agent | POST | Start a new editing session. Body: {"task_name":"project","language":"<lang>"}. Returns session_id.
/run_sse | POST | Send a user message. Body includes app_name, session_id, new_message. Stream response with Accept: text/event-stream. Timeout: 15 min.
/api/upload-video/nemo_agent/me/<sid> | POST | Upload a file (multipart) or URL.
/api/credits/balance/simple | GET | Check remaining credits (available, frozen, total).
/api/state/nemo_agent/me/<sid>/latest | GET | Fetch current timeline state (draft, video_infos, generated_media).
/api/render/proxy/lambda | POST | Start export. Body: {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll status every 30s.
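
As an illustration of the export row, the sketch below builds the render body and polls on a 30-second interval. Several parts are assumptions: the timestamp is taken to be epoch milliseconds, the poll limit of 10 is arbitrary, and because the text names neither a status endpoint nor the shape of a finished job, the caller supplies a check callable that returns None until the job is done.

```python
import time
from typing import Callable, Optional


def build_export_body(session_id: str, draft: dict,
                      now_ms: Optional[int] = None) -> dict:
    """Assemble the export body; the epoch-millisecond id suffix is an assumption."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_until(check: Callable[[], Optional[str]],
               interval_s: float = 30.0,
               max_polls: int = 10,
               sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call check() until it returns a value, sleeping between attempts."""
    for attempt in range(max_polls):
        result = check()
        if result is not None:
            return result
        if attempt < max_polls - 1:
            sleep(interval_s)  # the table asks for a 30s poll interval
    return None  # gave up; the caller decides how to report it
```

Passing sleep as a parameter lets the loop be exercised without real waiting.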

Accepted file types: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: ai-video-editor-for-captions
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request; omitting them triggers a 402 on export.
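
One way the attribution headers might be assembled is sketched below. The helper names are hypothetical, the frontmatter is read with a simple line scan rather than a YAML parser, and falling back to "unknown" when no version is found is an assumption, since the text only defines that fallback for the platform.

```python
import re


def read_frontmatter_version(skill_md: str) -> str:
    """Pull `version:` out of a leading ---/--- frontmatter block."""
    match = re.match(r"\A---\s*\n(.*?)\n---", skill_md, re.DOTALL)
    if match is None:
        return "unknown"  # assumption: no fallback is documented
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "version":
            return value.strip().strip("\"'")
    return "unknown"


def detect_platform(install_path: str) -> str:
    """Map the install path to a platform name, else 'unknown'."""
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, platform: str) -> dict:
    """Headers the text asks for on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-video-editor-for-captions",
        "X-Skill-Version": version,
        "X-Skill-Platform": platform,
    }
```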

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once
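
The code list above lends itself to a lookup table. In this sketch the action names are hypothetical labels. The 30-second wait and single retry for 429 come from the list; the single retry for 400 is an assumption (the list says to retry but gives no limit), and the fallback for unlisted codes is also an assumption.

```python
from typing import NamedTuple


class Recovery(NamedTuple):
    action: str       # hypothetical label for what the agent does next
    wait_s: int = 0   # seconds to wait before retrying
    retries: int = 0  # automatic retries to attempt


RECOVERY = {
    0: Recovery("continue"),
    1001: Recovery("reacquire_token"),
    1002: Recovery("create_session"),
    2001: Recovery("prompt_registration_or_top_up"),
    4001: Recovery("show_accepted_formats"),
    4002: Recovery("suggest_compress_or_trim"),
    400: Recovery("generate_client_id", retries=1),  # retry limit assumed
    402: Recovery("explain_subscription_tier"),
    429: Recovery("wait_and_retry", wait_s=30, retries=1),
}


def plan_recovery(code: int) -> Recovery:
    """Look up the documented response; unlisted codes get a generic report."""
    return RECOVERY.get(code, Recovery("report_unexpected_code"))
```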

SSE Event Handling

Event | Action
Text response | Apply GUI translation (§4), present to user
Tool call/result | Process internally, don't forward
heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes | Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
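
A rough sketch of the stream handling follows. The line parsing uses the standard server-sent-events framing (data: fields, blank-line separators, ":" comment lines). The payload schema is not documented here, so reading text from a JSON key named "text" is purely an assumption, as are the function names.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event, skipping heartbeats and empty data."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment line, often used as a heartbeat
        if line == "":
            payload = "\n".join(buffer)
            buffer = []
            if payload.strip():
                yield payload
            continue
        field, _, value = line.partition(":")
        if field == "data":
            buffer.append(value[1:] if value.startswith(" ") else value)
    # Lenient choice: flush an event left unterminated when the stream closes.
    payload = "\n".join(buffer)
    if payload.strip():
        yield payload


def collect_text(payloads: Iterable[str], text_key: str = "text") -> str:
    """Join user-facing text; text_key is a hypothetical field name."""
    parts = []
    for payload in payloads:
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and isinstance(event.get(text_key), str):
            parts.append(event[text_key])
    return "".join(parts)
```

If collect_text returns an empty string once the stream has closed, that is the cue to poll the session state and summarize the change instead.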

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says | You do
"click [button]" / "点击" | Execute via API
"open [panel]" / "打开" | Query session state
"drag/drop" / "拖拽" | Send edit via SSE
"preview in timeline" | Show track summary
"Export button" / "导出" | Execute export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):

  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
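
To show how the abbreviated draft fields might turn into a track summary, here is a small sketch. The nesting is an assumption: it supposes the draft holds a list of tracks under t, each with a tt type and a list of segments under sg, and that d is stored per segment. Only the abbreviations themselves come from the mapping above.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list[str]:
    """One summary line per track: type, segment count, total duration."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```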

Common Workflows

Quick edit: Upload → "add captions in English and Spanish with auto-sync" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add captions in English and Spanish with auto-sync" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
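
A local pre-check can catch oversized or unsupported files before an upload is attempted. The sketch below reuses the documented accepted types and the 4001/4002 codes. Treating 500MB as 500 × 1024 × 1024 bytes is an assumption, and the function name is hypothetical.

```python
from pathlib import PurePath

ACCEPTED_TYPES = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
MAX_BYTES = 500 * 1024 * 1024  # assumption: binary megabytes


def check_upload(filename: str, size_bytes: int) -> int:
    """Return 0 if the file looks uploadable, else the matching error code."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    if extension not in ACCEPTED_TYPES:
        return 4001  # unsupported file type
    if size_bytes > MAX_BYTES:
        return 4002  # file too large
    return 0
```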

Export as MP4 for widest compatibility across all platforms.

Usage Guidance
This skill appears to genuinely call a cloud backend to generate and burn in captions, which explains the NEMO_TOKEN and the network endpoints. Before installing:

  1. Verify you trust the domain (mega-api-prod.nemovideo.ai / nemovideo.ai) and their privacy policy, because your videos will be uploaded to that service.
  2. Ask the author whether the anonymous token or session_id is persistently written to disk (e.g., ~/.config/nemovideo/) or stored only in memory; prefer ephemeral tokens if you have privacy concerns.
  3. Clarify the registry-vs-SKILL.md mismatch about required config paths.
  4. If you need stronger guarantees, avoid providing a long-lived NEMO_TOKEN and prefer using a short-lived anonymous token or an account token you can revoke.

If you are uncomfortable with automatic outbound auth/network calls or unknown third-party hosting of your media, do not install.
Capability Analysis
Type: OpenClaw Skill
Name: ai-video-editor-for-captions
Version: 1.0.0

The skill is a functional wrapper for a cloud-based video captioning service (nemovideo.ai). It provides clear instructions for the AI agent to manage authentication, session handling, and video processing via REST API calls. While it requests a specific environment variable (NEMO_TOKEN) and performs automated registration for anonymous tokens, these actions are transparently documented and aligned with the stated purpose of the tool. There is no evidence of data exfiltration, malicious execution, or harmful prompt injection.
Capability Assessment
Purpose & Capability
Name/description align with cloud-based captioning and the SKILL.md directs requests to a nemovideo.ai backend, which is expected. However the SKILL.md's frontmatter declares a config path (~/.config/nemovideo/) while the registry metadata above the file listed no required config paths — this mismatch should be clarified.
Instruction Scope
Instructions routinely send user video and metadata to https://mega-api-prod.nemovideo.ai (upload, SSE, render). That is coherent for a cloud render service, but the skill also instructs the agent to automatically create anonymous tokens and to detect install paths (e.g., ~/.clawhub/, ~/.cursor/skills/) and read this file's frontmatter for attribution headers. Detecting install path implies filesystem checks beyond simply handling an uploaded video; automatic anonymous token generation means the agent will make outbound auth/network calls if NEMO_TOKEN isn't present.
Install Mechanism
No install spec and no code files — instruction-only. This minimizes disk-write/install risk.
Credentials
Only one environment variable (NEMO_TOKEN) is declared, which is appropriate for an API-backed service. However the SKILL.md behavior (generating/storing anonymous token, storing session_id, and frontmatter reference to a config path) implies the skill may persist tokens or session state to a config location; the registry-level metadata contradicted the SKILL.md on required config paths. Confirm where tokens/sessions are stored and how long-lived they are.
Persistence & Privilege
always:false and normal autonomous invocation. The skill uses session tokens that can orphan cloud render jobs if you close the UI, but it does not request permanent platform privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ai-video-editor-for-captions
  3. After installation, invoke the skill by name or use /ai-video-editor-for-captions
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
AI Video Editor for Captions, version 1.0.0:

  • Initial release: auto-generates and burns in captions to uploaded videos (MP4, MOV, AVI, WebM, up to 500MB).
  • Supports multi-language captions with auto-sync (e.g. English and Spanish).
  • Cloud-based pipeline: handles upload, editing, captioning, and 1080p MP4 export without local installation.
  • Interactive workflow: upload clips, give editing instructions via chat, and download captioned videos within ~30–90 seconds.
  • Tracks credits and session status automatically; provides clear error handling for authentication, file type, and export issues.
Metadata
Slug ai-video-editor-for-captions
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Ai Video Editor For Captions?

Get captioned video files ready to post, without touching a single slider. Upload your video clips (MP4, MOV, AVI, WebM, up to 500MB), say something like "ad... It is an AI Agent Skill for Claude Code / OpenClaw, with 82 downloads so far.

How do I install Ai Video Editor For Captions?

Run "/install ai-video-editor-for-captions" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ai Video Editor For Captions free?

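The skill itself is free to download, install and use under the MIT-0 license. The cloud backend it calls meters usage in credits: an anonymous token comes with 100 free credits and is valid for 7 days, and export can be blocked on the free plan (error 402).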

Which platforms does Ai Video Editor For Captions support?

Ai Video Editor For Captions is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ai Video Editor For Captions?

It is built and maintained by francemichaell-15 (@francemichaell-15); the current version is v1.0.0.
