
Free Subtitle Maker

by tk8544-b · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
107 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw: /install free-subtitle-maker
Description
Drop a video and describe your subtitle style — this free-subtitle-maker skill transcribes your audio, formats the captions, and burns them directly into you...
README (SKILL.md)

Getting Started

Welcome! Ready to add subtitles to your video for free? Upload your mp4, mov, or other video file and tell me your caption preferences — font, color, placement, or just leave it to the defaults — and I'll burn accurate subtitles directly into your footage.

Try saying:

  • "Add white subtitles with a black outline at the bottom of this mp4 tutorial video"
  • "Generate subtitles for my interview clip and use a large bold font so it's easy to read on mobile"
  • "Burn captions into this webm product demo — keep the style clean and minimal, centered at the bottom"

Automatic Setup

On first interaction, connect to the NemoVideo API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Read or create a UUID at ~/.config/nemovideo/client_id. Then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <your-uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry). If the token has expired, request a new one with the same Client-Id.
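The free-token step can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the helper names are hypothetical, and sending an empty POST body is an assumption, since the text only specifies the X-Client-Id header and the data.token response field.

```python
import json
import uuid
import urllib.request
from pathlib import Path

AUTH_URL = "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token"


def load_or_create_client_id(path: Path) -> str:
    """Return the stored client UUID, creating and saving one if absent."""
    if path.exists():
        stored = path.read_text(encoding="utf-8").strip()
        if stored:
            return stored
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id, encoding="utf-8")
    return client_id


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request.

    Assumption: the endpoint accepts an empty POST body.
    """
    return urllib.request.Request(
        AUTH_URL, data=b"", method="POST", headers={"X-Client-Id": client_id}
    )


def extract_token(response_body: str) -> str:
    """Read data.token from the JSON response body."""
    return json.loads(response_body)["data"]["token"]
```

In real use the path would be Path.home() / ".config" / "nemovideo" / "client_id", and the request would be sent with urllib.request.urlopen. Reusing the stored UUID matters because an expired token is renewed with the same Client-Id.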

Session: POST to the same host at /api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.

Turn Any Video Into a Captioned, Accessible Masterpiece

Getting subtitles onto a video used to mean juggling transcription services, SRT files, and video editors — all before you could share a single clip. This free-subtitle-maker skill collapses that entire process into one step. Upload your video, describe any preferences you have for how the subtitles should look, and walk away with a fully captioned file ready to publish.

The skill listens to your video's audio track, breaks it into timed segments, and overlays clean, readable text directly onto the frames. Whether you're subtitling a tutorial, a short film, a product demo, or a social reel, the output is a polished video file — not a separate caption file you still have to attach somewhere.

This tool is especially valuable for creators working across languages or accessibility requirements. Subtitles increase watch time, improve comprehension for non-native speakers, and make content usable in sound-off environments like social feeds. You don't need an account with a transcription platform or a video editing subscription — just your video and a prompt.

Routing Subtitle Generation Requests

Every user request — whether auto-generating SRT captions, burning hardcoded subtitles, or adjusting font and timing — is parsed and routed to the matching NemoVideo endpoint based on the detected action type.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | Yes |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | Yes |
| "status" / "状态" / "show tracks" | → §3.4 State | Yes |
| "upload" / "上传" / user sends file | → §3.2 Upload | Yes |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | No |
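One way to picture the routing rules above is a small keyword matcher. This is only a sketch: the function name, the returned labels, and the case-insensitive substring matching are assumptions, and the real skill may detect the action type differently.

```python
# Keyword groups taken from the routing rules; the labels are illustrative.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_attachment: bool = False) -> str:
    """Pick an action for a user message; anything unmatched goes to SSE."""
    if has_attachment:
        return "upload"  # "user sends file" routes to upload
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

For example, route("Please export the video") yields "export", while a styling request such as "add white subtitles" falls through to "sse".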

NemoVideo API Backend Reference

Free Subtitle Maker runs entirely on the NemoVideo backend, which handles speech-to-text transcription, subtitle rendering, and frame-accurate burn-in encoding without requiring any local processing. All subtitle jobs — including multi-language captions and styled text overlays — are queued, processed, and returned as downloadable video files through the NemoVideo pipeline.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: free-subtitle-maker
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
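A minimal sketch of assembling these headers is shown below. The function names are hypothetical, and the platform check assumes the install-path rule means "a path under ~/.clawhub/ maps to clawhub, a path under ~/.cursor/skills/ maps to cursor, anything else is unknown".

```python
def detect_platform(install_path: str) -> str:
    """Guess the platform label from the skill's install path (assumed rule)."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict:
    """Return the auth and attribution headers required on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "free-subtitle-maker",
        "X-Skill-Version": version,  # read from the frontmatter version
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to forget an attribution header, which the text says causes export to fail with 402.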

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id. After creating a session, give the user a link: https://nemovideo.com/workspace/claim?token=$TOKEN&task=<task_id>&session=<session_id>&skill_name=free-subtitle-maker&skill_version=1.0.0&skill_source=<platform>
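The claim link can be built with standard URL encoding so that unusual characters in the token or IDs do not break the query string. The function name is hypothetical; the parameter names come from the link format above.

```python
from urllib.parse import urlencode

CLAIM_BASE = "https://nemovideo.com/workspace/claim"


def build_claim_url(token: str, task_id: str, session_id: str,
                    platform: str, skill_version: str = "1.0.0") -> str:
    """Assemble the workspace claim link for the current session."""
    query = urlencode({
        "token": token,
        "task": task_id,
        "session": session_id,
        "skill_name": "free-subtitle-maker",
        "skill_version": skill_version,
        "skill_source": platform,
    })
    return f"{CLAIM_BASE}?{query}"
```

Because the resulting URL carries the token in its query string, it should be handed only to the user and kept out of logs.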

Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and header Accept: text/event-stream. Max timeout: 15 minutes.
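As an illustration, the request for /run_sse could be prepared like this. The function name is hypothetical, and the Content-Type header is an assumption; the text only names the Accept header and the body fields.

```python
import json

SSE_PATH = "/run_sse"
SSE_TIMEOUT_SECONDS = 15 * 60  # "Max timeout: 15 minutes"


def build_sse_request(session_id: str, message: str):
    """Return (headers, JSON body) for one message sent over SSE."""
    headers = {
        "Accept": "text/event-stream",
        "Content-Type": "application/json",  # assumption, not stated in the text
    }
    body = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }
    # ensure_ascii=False keeps non-English prompts readable in the payload.
    return headers, json.dumps(body, ensure_ascii=False)
```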

Upload: POST /api/upload-video/nemo_agent/me/<sid>. File: multipart -F "files=@/path"; or URL: {"urls":["<url>"],"source_type":"url"}
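For the URL variant of upload, the path and JSON body might be put together as below. This is a sketch with a hypothetical helper name; the multipart file variant is not covered here.

```python
def build_url_upload(session_id: str, urls: list) -> tuple:
    """Return (request path, JSON-serializable body) for a URL-based upload."""
    if not urls:
        raise ValueError("at least one URL is required")
    path = f"/api/upload-video/nemo_agent/me/{session_id}"
    body = {"urls": list(urls), "source_type": "url"}
    return path, body
```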

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
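The export-and-poll flow could look roughly like the sketch below. Several details are assumptions: <ts> is treated as a Unix timestamp in seconds, the poll response is assumed to expose status and output.url at the top level, and max_polls is a hypothetical safety cap. The HTTP call is injected as fetch_status so the loop stays independent of any particular client.

```python
import time
from typing import Callable


def make_render_request(session_id: str, draft: dict, now: float = None) -> dict:
    """Build the export request body (assumes <ts> is a Unix timestamp)."""
    ts = int(now if now is not None else time.time())
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(fetch_status: Callable[[str], dict], render_id: str,
                interval: float = 30.0, max_polls: int = 60,
                sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until status is 'completed', then return the download URL."""
    for _ in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete in time")
```

Here fetch_status would wrap GET /api/render/proxy/lambda/<id>. The text does not describe a failure status, so this sketch only stops on completion or when the cap is reached.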

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
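A rough sketch of this handling follows: collect the text parts from the stream, ignore heartbeats and tool events, and fall back to a state poll when no text arrived. The event shape (content.parts[].text) is an assumption that mirrors the request body's new_message format; the actual stream may differ.

```python
import json
from typing import Iterable, List


def collect_text(lines: Iterable[str]) -> List[str]:
    """Extract user-facing text from raw SSE lines (assumed event shape)."""
    texts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other SSE fields
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data line, treated as a heartbeat
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # not JSON, nothing to show
        if not isinstance(event, dict):
            continue
        parts = (event.get("content") or {}).get("parts") or []
        for part in parts:
            # Tool calls and results carry no "text" key and are skipped.
            if isinstance(part, dict) and part.get("text"):
                texts.append(part["text"])
    return texts


def needs_state_poll(texts: List[str]) -> bool:
    """True when the stream produced no text and session state should be checked."""
    return not texts
```

When needs_state_poll returns True, the next step would be the session state request, followed by a summary of what changed.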

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
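The field mapping above can be turned into a simple track summary along these lines. The nesting is an assumption made for illustration: the draft is taken to hold a list of tracks under t, each track a tt code and a list of segments under sg, and each segment a duration d in milliseconds. The real draft likely carries more detail (names, start offsets, volume) than this sketch reads.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list:
    """Return one summary line per track (assumed draft layout)."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```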

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
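The error codes above lend themselves to a lookup table. The sketch below uses hypothetical action labels, and the fallback for unlisted codes is an assumption, since the text does not say how to treat them.

```python
# Code -> (meaning, short action label); labels are illustrative only.
ERROR_ACTIONS = {
    0: ("Success", "continue"),
    1001: ("Bad/expired token", "reauth"),
    1002: ("Session not found", "new_session"),
    2001: ("No credits", "show_topup_or_registration"),
    4001: ("Unsupported file", "show_supported_formats"),
    4002: ("File too large", "suggest_compress_or_trim"),
    400: ("Missing X-Client-Id", "generate_client_id_and_retry"),
    402: ("Free plan export blocked", "suggest_registration"),
    429: ("Rate limit", "retry_once_after_30s"),
}


def action_for(code: int) -> str:
    """Map a response code to an action label; unknown codes are reported."""
    _, action = ERROR_ACTIONS.get(code, ("Unknown error", "report_to_user"))
    return action
```

Keeping 402 separate from 2001 reflects the note that a blocked export is a subscription tier issue rather than a credits problem.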

Use Cases

This free-subtitle-maker skill fits naturally into a wide range of real-world workflows. Educators recording lecture videos or screencasts can add subtitles to make content accessible to students with hearing impairments or those studying in a second language. A single upload-and-prompt workflow replaces what used to require dedicated captioning software.

Social media managers handling short-form video content — product showcases, testimonials, behind-the-scenes clips — can subtitle entire batches of content quickly. Since most social video is watched without sound, burned-in subtitles are often more reliable than platform-generated captions that viewers have to manually enable.

Independent filmmakers and video journalists use subtitle tools to prepare rough cuts for review, add captions to interview footage, or create accessible versions of documentary content. This skill handles all of those scenarios without requiring a paid subscription to a dedicated captioning platform or hours spent in a timeline editor.

Performance Notes

Subtitle accuracy depends heavily on audio clarity. Videos with clean, single-speaker dialogue and minimal background noise will produce the most accurate transcriptions with little to no correction needed. Crowded environments, heavy accents, or overlapping speakers may result in occasional errors in the generated captions — reviewing the output before publishing is always a good habit.

File size and video length affect processing time. Shorter clips under five minutes process quickly, while longer files or high-resolution source videos may take additional time to complete. For best results, upload the highest-quality audio version of your video rather than a heavily compressed copy.

Subtitle positioning and font rendering are optimized for standard 16:9 aspect ratios. Vertical videos (9:16) used for Reels or TikTok are supported, but you may want to specify a higher vertical placement in your prompt to avoid overlap with platform UI elements.

Usage Guidance
This skill uploads your video/audio to the NemoVideo service and requires an API token (NEMO_TOKEN) or will generate a short-lived anonymous token. Before installing:
  1. Confirm you trust https://nemovideo.com and review its privacy/terms because uploaded media and transcripts will leave your device.
  2. Prefer using an anonymous token when possible (short-lived) for sensitive content.
  3. Be aware the skill builds a claim URL that places the token in the query string; avoid sharing that link and treat it like a secret.
  4. Check that storing ~/.config/nemovideo/client_id on your machine is acceptable.
If any of these are unacceptable, do not install or use the skill.
Capability Analysis
Type: OpenClaw Skill
Name: free-subtitle-maker
Version: 1.0.0
The free-subtitle-maker skill provides a functional integration with the NemoVideo API for automated video captioning. It includes detailed instructions for the agent to manage authentication via a local configuration file (~/.config/nemovideo/client_id), handle long-running tasks via Server-Sent Events (SSE), and perform video uploads and exports. While it requires network access and local file system permissions to store its client ID and process videos, these actions are directly aligned with its stated purpose, and no evidence of malicious intent or data exfiltration was found.
Capability Assessment
Purpose & Capability
The skill claims to run subtitles via the NemoVideo backend and only requests a single credential (NEMO_TOKEN) plus a per-user client_id config path; these align with a remote transcription/encoding service. No unrelated credentials, binaries, or installs are requested.
Instruction Scope
SKILL.md instructs the agent to create/read ~/.config/nemovideo/client_id, obtain or use a NEMO_TOKEN, create sessions, upload videos, and stream SSE messages to the NemoVideo API—all consistent with the described purpose. It also asks the agent to read the SKILL.md frontmatter and detect the agent install path to set X-Skill-Platform; reading those paths is unnecessary for core functionality but low-risk. Important privacy note: the workflow builds a claim URL that includes the token in the query string (token=$TOKEN), which can expose auth tokens if the URL is shared or logged.
Install Mechanism
Instruction-only skill with no install steps and no code files. This minimizes on-disk changes and is proportionate to the described functionality.
Credentials
Only NEMO_TOKEN is declared as required and is justified by the remote API calls. The skill will also create/read ~/.config/nemovideo/client_id for anonymous tokens. Those items are proportional, but embedding tokens in a claim URL (and storing client_id locally) creates token-handling/privacy considerations that users should accept knowingly.
Persistence & Privilege
always:false (no forced global inclusion). The skill stores per-user client_id in its own config path (~/.config/nemovideo/) and does not request system-wide settings or other skills' credentials.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install free-subtitle-maker
  3. After installation, invoke the skill by name or use /free-subtitle-maker
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
  • Initial release of Free Subtitle Maker: auto-generate and burn subtitles into any video for free.
  • Supports mp4, mov, avi, webm, mkv (and other formats).
  • Customize subtitle style: font, size, placement, color, or use defaults.
  • No paid software, file juggling, or separate SRT; get a ready-to-share captioned video file.
  • Seamless workflow: upload video, set preferences, and receive a fully processed file via the NemoVideo backend.
  • Handles session, authentication, credits, and error checking automatically.
Metadata
Slug free-subtitle-maker
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Free Subtitle Maker?

Drop a video and describe your subtitle style — this free-subtitle-maker skill transcribes your audio, formats the captions, and burns them directly into you... It is an AI Agent Skill for Claude Code / OpenClaw, with 107 downloads so far.

How do I install Free Subtitle Maker?

Run "/install free-subtitle-maker" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Free Subtitle Maker free?

Yes, Free Subtitle Maker is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Free Subtitle Maker support?

Free Subtitle Maker is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Free Subtitle Maker?

It is built and maintained by tk8544-b (@tk8544-b); the current version is v1.0.0.
