Description

Turn video files into captioned video files with this skill. Works with MP4, MOV, AVI, and WebM files up to 500MB. Content creators use it for adding free automat...

Usage Instructions (SKILL.md)

Getting Started

Send me your video files and I'll handle the automatic caption generation. Or just describe what you're after.

Try saying:

"turn a 3-minute tutorial video recording into a captioned 1080p MP4"
"add free captions to my video automatically"
"add free automatic captions to videos for content creators"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
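
The setup steps above can be sketched as plain request descriptions. This is a minimal sketch, assuming the agent builds each request as a dict and sends it with whatever HTTP client it has. The helper names (anonymous_token_request, session_request, resolve_token) and the Content-Type header are hypothetical; only the URLs, the X-Client-Id header, the request body, and the data.token field come from the text above.

```python
import json
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id=None):
    """Describe the POST that requests a free anonymous token."""
    client_id = client_id or str(uuid.uuid4())
    return {
        "method": "POST",
        "url": BASE_URL + "/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def session_request(token):
    """Describe the POST that opens a session with Bearer auth."""
    return {
        "method": "POST",
        "url": BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": "Bearer " + token,
            # Assumption: the body is sent as JSON.
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project"}),
    }


def resolve_token(env, fetch_anonymous_token):
    """Prefer NEMO_TOKEN from the environment; otherwise fetch a free one.

    fetch_anonymous_token is a caller-supplied function that sends the
    described request and returns the parsed JSON response.
    """
    token = env.get("NEMO_TOKEN")
    if token:
        return token
    response = fetch_anonymous_token(anonymous_token_request())
    return response["data"]["token"]
```

In practice env would be os.environ, and the token returned here would be passed to session_request before any other call is made.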

Free Caption — Auto-Generate Captions for Videos

Name: Free Caption
Author: vcarolxhberger

This tool takes your video files and runs automatic caption generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 3-minute tutorial video recording and want captions added automatically. The backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: shorter clips under 2 minutes generate captions fastest.

Matching Input to Actions

User prompts that mention captions, aspect ratio, text overlays, or audio tracks are routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
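
The routing table above can be approximated with a small keyword matcher. This is a sketch only: it uses plain substring matching, whereas the text also mentions intent classification, and the function name route and the action labels are hypothetical.

```python
# Keyword groups copied from the routing table. Substring matching is an
# assumption; the real routing also uses intent classification.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_attachment=False):
    """Return (action, skip_sse) for a user message."""
    if has_attachment:
        # A sent file always counts as an upload.
        return ("upload", True)
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through the SSE chat.
    return ("sse", False)
```

For example, route("Please export the video") selects the export action and skips SSE, while route("add captions") falls through to the SSE chat.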

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Returns a session_id.
Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. A stream can stay open for up to 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>, sent as a multipart file or as JSON with URLs.
Credits: GET /api/credits/balance/simple, which returns available, frozen, and total.
State: GET /api/state/nemo_agent/me/<sid>/latest, which returns the current draft and media info.
Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
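
Two of the endpoints above benefit from a concrete shape: the chat body and the export polling loop. The sketch below assumes session_id sits at the top level of the chat body and that the render status response exposes "status" and "url" keys. Both are hypothetical, since the text only names new_message.parts[0].text, a completed status, and a download URL.

```python
import time

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def chat_request(session_id, text):
    """Describe the SSE chat POST (session_id placement is an assumption)."""
    return {
        "method": "POST",
        "url": BASE_URL + "/run_sse",
        "headers": {"Accept": "text/event-stream"},
        "body": {
            "session_id": session_id,
            "new_message": {"parts": [{"text": text}]},
        },
    }


def poll_render(fetch_status, interval_s=30, max_polls=10, sleep=time.sleep):
    """Poll a render job until it completes or the poll budget runs out.

    fetch_status is a caller-supplied function that performs
    GET /api/render/proxy/lambda/<id> and returns the parsed JSON.
    The "status" and "url" keys are hypothetical field names.
    """
    for attempt in range(max_polls):
        result = fetch_status()
        if result.get("status") == "completed":
            return result.get("url")
        if attempt < max_polls - 1:
            sleep(interval_s)
    return None
```

Passing sleep as a parameter keeps the 30-second interval from the text as the default while letting the loop be exercised without real waiting.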

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Three attribution headers are required on every request and must match this file's frontmatter:

Header	Value
`X-Skill-Source`	`free-caption`
`X-Skill-Version`	frontmatter `version`
`X-Skill-Platform`	auto-detect: `clawhub` / `cursor` / `unknown` from install path

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
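
The header requirements above can be collected in one helper so that no request goes out without them. This is a sketch under assumptions: the function names are hypothetical, and detecting the platform by looking for "clawhub" or "cursor" as a substring of the install path is a guess at what "auto-detect from install path" means.

```python
def detect_platform(install_path):
    """Guess the platform from the install path (substring match is an assumption)."""
    path = install_path.lower()
    for name in ("clawhub", "cursor"):
        if name in path:
            return name
    return "unknown"


def build_headers(token, skill_version, install_path):
    """Return the auth and attribution headers required on every request."""
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": "free-caption",
        # Should match the version field in this file's frontmatter.
        "X-Skill-Version": skill_version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

The install path passed in is whatever location the skill was loaded from; an unrecognized path simply yields "unknown", which is one of the three values the table allows.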

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
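
The short draft keys can be decoded into a track summary like the one above. This sketch assumes a nesting that the text does not confirm: a top-level "t" list of tracks, each with a "tt" type and an "sg" list of segments, and a "d" duration in milliseconds on each segment. The function name summarize_draft is hypothetical.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return one line per track: index, type, segment count, total seconds.

    Assumes draft["t"] is a list of tracks and that "d" (duration in ms)
    lives on each segment inside "sg"; the real layout may differ.
    """
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```

If the state endpoint returns a different structure, only the key lookups need to change; the type mapping itself comes directly from the field list.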

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says	You do
"click [button]" / "点击"	Execute via API
"open [panel]" / "打开"	Query session state
"drag/drop" / "拖拽"	Send edit via SSE
"preview in timeline"	Show track summary
"Export button" / "导出"	Execute export workflow

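The translation table above can be sketched as a phrase lookup. The action labels and the function name are hypothetical, and substring matching is a simplification. The export entry is checked first so that a message such as "click the Export button" maps to the export workflow rather than a generic click.

```python
# Phrase groups from the translation table; action labels are hypothetical.
GUI_TRANSLATIONS = [
    (("export button", "导出"), "run_export_workflow"),
    (("click", "点击"), "execute_via_api"),
    (("open", "打开"), "query_session_state"),
    (("drag", "drop", "拖拽"), "send_edit_via_sse"),
    (("preview in timeline",), "show_track_summary"),
]


def translate_backend_text(text):
    """Map GUI-style backend wording to an API-side action label."""
    lowered = text.lower()
    for phrases, action in GUI_TRANSLATIONS:
        if any(phrase in lowered for phrase in phrases):
            return action
    # No GUI wording found: the text can go to the user unchanged.
    return "pass_through"
```
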
Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
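
The stream handling described above can be sketched as a line reader plus two small checks. The event schema is not given in the text, so this sketch treats every non-empty data: payload as user-facing text, which is a simplification (real tool-call events would need to be filtered out). The function names are hypothetical.

```python
def read_sse(lines):
    """Collect non-empty data: payloads and count keep-alive lines."""
    payloads = []
    keepalives = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                payloads.append(payload)
            else:
                # Empty data: line, the backend is still working.
                keepalives += 1
        elif line.startswith(":"):
            # SSE comment lines are commonly used as heartbeats.
            keepalives += 1
    return payloads, keepalives


def progress_due(last_shown_s, now_s, every_s=120):
    """True when it is time to show the 'Still working...' notice again."""
    return now_s - last_shown_s >= every_s


def needs_state_check(payloads):
    """True when the stream closed without text, so /api/state should be polled."""
    return len(payloads) == 0
```

When needs_state_check returns True, the agent would fetch the latest state and report what changed instead of relaying stream text.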

Error Codes

0: success, continue normally
1001: token expired or invalid; re-acquire via /api/auth/anonymous-token
1002: session not found; create a new one
2001: out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
4001: unsupported file type; show accepted formats
4002: file too large; suggest compressing or trimming
400: missing X-Client-Id; generate one and retry
402: free plan export blocked; this is a subscription-tier restriction, not a credit issue
429: rate limited; wait 30s and retry once
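
The error codes above map naturally onto a lookup table plus one retry helper for the rate-limit case. This is a sketch: the action labels and function names are hypothetical, and the helper assumes the caller supplies a function that performs the request and returns the resulting code.

```python
import time

# Codes from the list above; action labels are hypothetical.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "prompt_registration_or_top_up",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def next_action(code):
    """Return the action label for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(call, sleep=time.sleep):
    """Run call(); on 429 wait 30 seconds and retry exactly once."""
    code = call()
    if code == 429:
        sleep(30)
        code = call()
    return code
```

A second 429 is returned to the caller as-is, which matches the "retry once" wording rather than looping indefinitely.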

Common Workflows

Quick edit: Upload → "add free captions to my video automatically" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add free captions to my video automatically" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.

Export as MP4 for widest compatibility.
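
The size and format tips above can be checked locally before uploading, which avoids a round trip that would end in error 4001 or 4002. This sketch assumes "500MB" means 500 × 1024 × 1024 bytes; the function name and return labels are hypothetical.

```python
import os

# Assumption: "500MB" is interpreted as 500 * 1024 * 1024 bytes.
MAX_BYTES = 500 * 1024 * 1024
# The four formats recommended for the smoothest experience.
RECOMMENDED = {".mp4", ".mov", ".avi", ".webm"}


def check_upload(filename, size_bytes):
    """Return "ok", "too_large", or "not_recommended" for a candidate file."""
    ext = os.path.splitext(filename)[1].lower()
    if size_bytes > MAX_BYTES:
        return "too_large"
    if ext not in RECOMMENDED:
        return "not_recommended"
    return "ok"
```

A "not_recommended" result is only advisory here, since the wider format list (mkv, gif, mp3, and so on) is also accepted by the backend.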

Security Recommendations

Before installing, understand that this skill is cloud-based: it can create a NemoVideo session, use or generate a NEMO_TOKEN, upload selected media, and export rendered videos. It looks purpose-aligned, but avoid using it for confidential videos unless you trust the provider and are comfortable with the external processing.

Functional Analysis

Type: OpenClaw Skill
Name: free-caption
Version: 1.0.0

The skill is a functional wrapper for a third-party video processing service (nemovideo.ai). It provides detailed instructions for an AI agent to manage authentication, session state, file uploads, and polling for video rendering tasks. While it involves uploading user media to a remote endpoint (mega-api-prod.nemovideo.ai) and requires an API token (NEMO_TOKEN), these behaviors are transparently documented and directly aligned with the stated purpose of providing cloud-based video captioning.

Capability Assessment

ℹ Purpose & Capability

The stated purpose is coherent with the API workflow: SKILL.md says it will run automatic caption generation through a cloud rendering pipeline. Users should note it also describes adjacent video-editing actions such as aspect ratio, text overlays, audio tracks, and export.

ℹ Instruction Scope

SKILL.md instructs the agent to connect to the processing API on first interaction and to translate backend GUI-style responses into API actions. This is purpose-aligned but means the skill may perform service-side workflow steps without showing every raw API action.

ℹ Install Mechanism

There is no install spec and no code files; this is instruction-only. The review therefore rests on SKILL.md behavior, registry metadata, and the disclosed API endpoints rather than inspectable helper code.

ℹ Credentials

Uploading user-provided videos to https://mega-api-prod.nemovideo.ai is proportionate for cloud caption generation, but users should treat uploaded videos as shared with that external provider.

ℹ Persistence & Privilege

The skill uses or creates a NEMO_TOKEN and saves a session_id for the service workflow. Frontmatter also lists ~/.config/nemovideo/ as a config path, although the supplied instructions do not show concrete local reads or writes.

Version History

v1.0.0

Initial release of Free Caption — Auto-Generate Captions for Videos.
- Adds automatic video captioning for MP4, MOV, AVI, and WebM files up to 500MB
- Supports seamless 30–90 second cloud processing; outputs 1080p MP4 videos
- Handles quick edits, multi-file batch processing, and iterative workflows with session state tracking
- Simple onboarding: auto-generate or use provided token for API access
- Provides clear user feedback, error handling, and status updates throughout the process

Metadata

Slug free-caption

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

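FAQ

What is Free Caption?

Turn video files into captioned video files with this skill. Works with MP4, MOV, AVI, and WebM files up to 500MB. Content creators use it for adding free automat... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 57 total downloads to date.

How do I install Free Caption?

Run the command "/install free-caption" in the OpenClaw or Claude Code chat box for one-click installation; no extra configuration is needed.

Is Free Caption free?

Yes. Free Caption is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does Free Caption support?

Free Caption is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Free Caption?

Developed and maintained by vcarolxhberger (@vcarolxhberger); the current version is v1.0.0.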
