Description

Turn a 3-minute YouTube video in MP4 format into 1080p captioned video files just by typing what you need. Whether it's adding subtitles to videos automatica...

README (SKILL.md)

Getting Started

Share your video files and I'll get started on AI subtitle generation. Or just tell me what you're thinking.

Try saying:

"add my video files"
"export 1080p MP4"
"generate subtitles in English and add"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

Generate a UUID as client identifier
POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
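
The connection flow above can be sketched as plain request descriptions that any HTTP client could send. This is a minimal sketch under stated assumptions: the helper names and the dict layout are hypothetical, the `Content-Type` header is assumed, and response parsing is left out because the source only says the responses contain a token and a `session_id`. The attribution headers that this document requires on every call are omitted for brevity.

```python
import json
import os
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id=None):
    """Describe the POST that acquires a free starter token."""
    # Hypothetical helper: a fresh UUID serves as the client identifier.
    client_id = client_id or str(uuid.uuid4())
    return {
        "method": "POST",
        "url": BASE_URL + "/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def session_request(token, language="en"):
    """Describe the POST that creates a session for the given token."""
    return {
        "method": "POST",
        "url": BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",  # assumed, not stated in the source
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }


def first_request(env=os.environ):
    """Use NEMO_TOKEN when present, otherwise ask for an anonymous token."""
    token = env.get("NEMO_TOKEN")
    return session_request(token) if token else anonymous_token_request()
```

Returning descriptions instead of sending requests keeps the sketch independent of any particular HTTP library; `first_request()` shows the branch on `NEMO_TOKEN` described above.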

Tell the user you're ready. Keep the technical details out of the chat.

AI Subtitles Free — Generate and Embed Video Captions

Name: Ai Subtitles Free
Author: vcarolxhberger

Send me your video files and describe the result you want. The AI subtitle generation runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a 3-minute YouTube video in MP4 format, type "generate subtitles in English and add them to my video for free", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: subtitles for clips under 5 minutes are generated significantly faster.

Matching Input to Actions

User prompts referencing ai subtitles free, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
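
The routing table above can be approximated with a simple keyword lookup. This is only a sketch: the source mentions keyword and intent classification, whereas the code below uses plain substring matching, and the function name, action labels, and the rule that an attached file takes priority are assumptions.

```python
# Keyword groups copied from the routing table; order follows the table rows.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_attachment=False):
    """Return (action, skip_sse) for a user message."""
    # Assumption: a message that carries a file is always an upload.
    if has_attachment:
        return ("upload", True)
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through the SSE chat.
    return ("sse", False)
```

For example, `route("export 1080p MP4")` selects the export action and skips SSE, while a free-form editing request falls through to the SSE chat. Substring matching will misfire on phrases such as "do not export yet", so a real router would need the intent step the source mentions.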

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

Session — POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
Chat (SSE) — POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
Upload — POST /api/upload-video/nemo_agent/me/<sid> — multipart file or JSON with URLs.
Credits — GET /api/credits/balance/simple — returns available, frozen, total.
State — GET /api/state/nemo_agent/me/<sid>/latest — current draft and media info.
Export — POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
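
The export polling step can be sketched as a small loop. This is a hedged illustration: `fetch_status` is a hypothetical caller-supplied function that performs the GET and returns decoded JSON, the `status` and `url` field names are assumptions about the response shape, and the attempt limit is arbitrary.

```python
import time


def poll_render(fetch_status, render_id, interval_s=30, max_attempts=10,
                sleep=time.sleep):
    """Poll a render job until it reports completion, then return its URL.

    fetch_status(render_id) is expected to call
    GET /api/render/proxy/lambda/<id> and return the response as a dict.
    The "status" and "url" keys are assumed field names.
    """
    for attempt in range(max_attempts):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result.get("url")
        # Wait between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(
        f"render {render_id} not completed after {max_attempts} polls"
    )
```

Passing `sleep` as a parameter makes the loop easy to exercise without real delays; in normal use the default 30-second interval matches the polling cadence given above.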

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: ai-subtitles-free
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
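
A sketch of how the authorization and attribution headers might be assembled follows. The function names are hypothetical, and the version string is passed in by the caller rather than parsed from the frontmatter, since the frontmatter itself is not shown here.

```python
import os

# Install-path prefixes and the platform value each one maps to.
PREFIXES = (
    ("~/.clawhub/", "clawhub"),
    ("~/.cursor/skills/", "cursor"),
)


def _normalize(path):
    """Expand "~" and collapse separators so prefixes compare reliably."""
    return os.path.normpath(os.path.expanduser(path))


def detect_platform(install_path):
    """Map an install path to the X-Skill-Platform value."""
    path = _normalize(install_path)
    for prefix, platform in PREFIXES:
        root = _normalize(prefix)
        if path == root or path.startswith(root + os.sep):
            return platform
    return "unknown"


def build_headers(token, version, install_path):
    """Return the headers this document requires on every API call."""
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": "ai-subtitles-free",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Comparing against `root + os.sep` avoids treating a sibling directory such as `~/.clawhub-old/` as a match.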

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
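
As a rough illustration of the short keys, the sketch below turns a draft into a summary line similar to the example. The nesting is an assumption: the source names the keys but not their layout, so the code assumes `t` is a list of tracks, each with a `tt` type and an `sg` list of segments carrying `d` in milliseconds. Clip names and volume levels from the example are not reproduced because their location in the draft is not documented.

```python
# Track type codes as listed for the "tt" key.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft):
    """Build a one-line timeline summary from a short-key draft dict."""
    tracks = draft.get("t", [])
    parts = []
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Other")
        # Assumption: a track's length is the sum of its segment durations.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        parts.append(f"{index}. {label} ({total_ms / 1000:g}s)")
    return f"Timeline ({len(tracks)} tracks): " + " ".join(parts)
```

With three tracks of 10 s video, 10 s audio, and 3 s text, this produces `Timeline (3 tracks): 1. Video (10s) 2. Audio (10s) 3. Text (3s)`.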

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"Export" or "导出" → run the export workflow

SSE Event Handling

Event	Action
Text response	Apply GUI translation (§4), present to user
Tool call/result	Process internally, don't forward
`heartbeat` / empty `data:`	Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes	Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
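
The event handling above can be sketched as a line-based SSE reader. The wire format of the backend's events is not documented here, so this assumes standard `event:` and `data:` fields, assumes heartbeats arrive as an event named `heartbeat`, and uses hypothetical function names.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event, skipping keep-alives."""
    event_name, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line ends the current event.
            payload = "\n".join(data_lines)
            # Heartbeats and empty data fields mean "keep waiting".
            if payload and event_name != "heartbeat":
                yield payload
            event_name, data_lines = None, []


def needs_state_poll(payloads):
    """True when the stream closed without text, so state should be polled."""
    return len(payloads) == 0
```

After the stream closes, `needs_state_poll(list(iter_sse_data(stream)))` indicates whether to fall back to the state endpoint and summarize the changes from there.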

Error Codes

0 — success, continue normally
1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
1002 — session not found; create a new one
2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
4001 — unsupported file type; show accepted formats
4002 — file too large; suggest compressing or trimming
400 — missing X-Client-Id; generate one and retry
402 — export blocked on the free plan; this is a subscription-tier restriction, not a credit issue
429 — rate limited; wait 30s and retry once
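
One way to keep the recovery steps in a single place is a lookup table keyed by code. The step labels and the function name below are hypothetical; only the codes and their meanings come from the list above.

```python
# Recovery step for each documented code; labels are illustrative only.
RECOVERY = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "show_credit_options",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def recovery_step(code):
    """Return the recovery step for a code, or a fallback for unknown ones."""
    return RECOVERY.get(code, "report_unknown_error")
```

Unknown codes fall back to a generic report rather than raising, so a new backend code does not interrupt the chat.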

Common Workflows

Quick edit: Upload → "generate subtitles in English and add them to my video for free" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "generate subtitles in English and add them to my video for free" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
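
A client-side check before uploading can catch the two file errors early. This is a sketch with assumptions: "500MB" is read as 500 × 1024 × 1024 bytes, the function name is hypothetical, and returning the backend's 4001 and 4002 codes locally is a convenience, not documented behavior.

```python
import os

# Accepted extensions, taken from the formats list in this document.
ACCEPTED = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
MAX_BYTES = 500 * 1024 * 1024  # assumption: "500MB" interpreted as 500 MiB


def check_upload(filename, size_bytes):
    """Return 0 if the file looks uploadable, else 4001 or 4002."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in ACCEPTED:
        return 4001  # unsupported file type
    if size_bytes > MAX_BYTES:
        return 4002  # file too large
    return 0
```

The extension check is case-insensitive, so `clip.MP4` passes while `notes.txt` is rejected before any bytes are sent.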

Export as MP4 for widest compatibility across platforms and devices.

Usage Guidance

This skill appears to do what it says: it uploads videos to nemovideo.ai to produce captioned MP4s and needs a NEMO_TOKEN (or it will request a temporary anonymous token). Before installing, consider:

(1) Privacy — video files will be uploaded to an external service; do not upload sensitive content unless you trust nemovideo.ai and reviewed its retention/privacy policy.
(2) Credentials — only provide NEMO_TOKEN if it is specifically for this service; the skill can also request an anonymous token automatically.
(3) Filesystem access — the skill asks the agent to read the skill frontmatter and check a couple of install-path prefixes to set attribution headers; while limited, this reads your home dir and is unnecessary for core functionality.
(4) Metadata mismatch — the registry metadata and SKILL.md disagree about config paths; verify which is correct.

If you need a stricter review, ask for the exact network request bodies and a privacy/retention policy or the service owner identity (source/homepage is unknown).

Capability Analysis

Type: OpenClaw Skill
Name: ai-subtitles-free
Version: 1.0.0

The ai-subtitles-free skill (defined in SKILL.md and _meta.json) is a functional integration for the NemoVideo AI service, allowing users to generate and embed video captions. It provides the agent with specific instructions for session management, anonymous token acquisition, and handling multipart uploads to 'https://mega-api-prod.nemovideo.ai'. While the skill includes telemetry via custom headers (X-Skill-Platform) and sends user media to a third-party API, these behaviors are transparently documented and directly support the stated purpose of cloud-based video processing without evidence of malicious intent or unauthorized data exfiltration.

Capability Assessment

ℹ Purpose & Capability

The skill name/description (generate and embed subtitles) aligns with the single required credential (NEMO_TOKEN) and the API endpoints described. Minor inconsistency: registry metadata at the top of the report listed no required config paths, while the SKILL.md frontmatter includes a configPaths value (~/.config/nemovideo/). This is likely a metadata mismatch but should be reconciled.

ℹ Instruction Scope

SKILL.md is an instruction-only runtime spec that limits actions to: obtain/use NEMO_TOKEN (or request an anonymous token), create a session, upload video files, call render/export endpoints, and poll status. It does instruct the agent to read this file's YAML frontmatter at runtime and to detect install path prefixes (~/.clawhub/, ~/.cursor/skills/) to set an attribution header — these are limited filesystem checks but are not strictly necessary for subtitle generation and are worth noting.

✓ Install Mechanism

There is no install spec and no code files; this is instruction-only and thus does not write or execute new code on disk. That is the lowest-risk install model.

✓ Credentials

The only required environment variable is NEMO_TOKEN (declared as primaryEnv). The SKILL.md provides a fallback flow to obtain an anonymous token via the service's auth endpoint if NEMO_TOKEN is absent. No unrelated secrets or multiple credentials are requested.

✓ Persistence & Privilege

always is false and the skill does not request elevated persistent platform privileges or modification of other skills. It will make network calls to the external rendering service and may poll for job status, which is expected for this function.

Version History

v1.0.0

Initial release of AI Subtitles Free — automate subtitle generation and embedding for video files.

- Instantly generate and embed AI subtitles for videos up to 3 minutes, supporting various formats (MP4, MOV, AVI, WebM, etc.).
- No installation or timeline editing needed; just upload and describe the results you want for quick cloud-rendered 1080p videos (30–60s).
- Easy session/credit system with 100 free credits for new users; transparent feedback for status and errors.
- API-driven editing, export, and state management with intuitive text commands (e.g., "generate subtitles in English and add").
- Error handling covers authentication, credits, file types, and rate limits; recovery steps provided in chat.
- Supports batch processing, iterative editing with previews, and direct download of final videos.

Metadata

Slug ai-subtitles-free

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Ai Subtitles Free?

Turn a 3-minute YouTube video in MP4 format into 1080p captioned video files just by typing what you need. Whether it's adding subtitles to videos automatica... It is an AI Agent Skill for Claude Code / OpenClaw, with 76 downloads so far.

How do I install Ai Subtitles Free?

Run "/install ai-subtitles-free" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ai Subtitles Free free?

Yes, Ai Subtitles Free is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ai Subtitles Free support?

Ai Subtitles Free is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Ai Subtitles Free?

It is built and maintained by vcarolxhberger (@vcarolxhberger); the current version is v1.0.0.
