
Ai Ai Subtitle Generator

by vynbosserman65 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 93 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install ai-ai-subtitle-generator
Description
Turn a 3-minute YouTube tutorial video into 1080p captioned video files just by typing what you need. Whether it's adding auto-generated subtitles to videos...
README (SKILL.md)

Getting Started

Send me your video files and I'll handle the AI subtitle generation. Or just describe what you're after.

Try saying:

  • "turn a 3-minute YouTube tutorial video into a captioned 1080p MP4"
  • "generate subtitles in English and Japanese and burn them into the video"
  • "add auto-generated subtitles to videos for YouTubers, content creators, and educators"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

  • Generate a UUID as client identifier
  • POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
  • The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
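
The two requests above can be sketched as plain request descriptions. This is a minimal illustration, not the skill's actual client: the helper names are hypothetical, the Content-Type header is an assumption, and nothing is sent over the network. The response field names are not documented here, so the sketch does not parse a response.

```python
import json
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id=None):
    """Describe the POST that asks for a free starter token (hypothetical helper)."""
    # A fresh UUID serves as the client identifier when none is supplied.
    client_id = client_id or str(uuid.uuid4())
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def create_session_request(token, language="en"):
    """Describe the POST that creates a session (hypothetical helper)."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            # Assumption: the JSON body is sent with this content type.
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }
```

Returning descriptions instead of sending requests keeps every outbound call visible and easy to log before it happens.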

Tell the user you're ready. Keep the technical details out of the chat.

AI Subtitle Generator — Auto-Generate and Embed Video Subtitles

This tool takes your video files and runs AI subtitle generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 3-minute YouTube tutorial video and want to generate subtitles in English and Japanese and burn them into the video — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: shorter clips under 5 minutes produce the most accurate subtitle timing.

Matching Input to Actions

Prompts that mention the AI subtitle generator, aspect ratio, text overlays, or audio tracks are routed to the corresponding action by keyword and intent classification.

User says... Action Skip SSE?
"export" / "导出" / "download" / "send me the video" → §3.5 Export
"credits" / "积分" / "balance" / "余额" → §3.3 Credits
"status" / "状态" / "show tracks" → §3.4 State
"upload" / "上传" / user sends file → §3.2 Upload
Everything else (generate, edit, add BGM…) → §3.1 SSE
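
The keyword half of the table above could look roughly like this. It is a simplified sketch: it does plain substring matching only and does not attempt the intent classification mentioned earlier. The function name, the action labels, and the has_attachment flag are hypothetical.

```python
# Keyword groups copied from the routing table; order decides ties.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_attachment=False):
    """Return the action label for a user message (hypothetical helper)."""
    # A file sent by the user counts as an upload regardless of the text.
    if has_attachment:
        return "upload"
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    # Everything else (generate, edit, add BGM...) goes to the SSE chat.
    return "sse"
```

For example, route("show tracks") selects the state action, while route("add BGM") falls through to the SSE chat.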

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

  1. Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
  2. Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
  3. Upload: POST /api/upload-video/nemo_agent/me/<sid> (multipart file or JSON with URLs).
  4. Credits: GET /api/credits/balance/simple. Returns available, frozen, total.
  5. State: GET /api/state/nemo_agent/me/<sid>/latest. Returns current draft and media info.
  6. Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
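
The export polling step can be sketched as a small loop. This is an illustration under stated assumptions: fetch_status is a hypothetical callable that performs the GET and returns the decoded response as a dict, and the "status" field name is assumed, since only the "completed" value is mentioned above. The attempt limit is also an assumption.

```python
import time


def poll_render(fetch_status, interval_s=30, max_attempts=10, sleep=time.sleep):
    """Call fetch_status until the render reports completion (hypothetical helper)."""
    for attempt in range(max_attempts):
        result = fetch_status()
        # Assumption: the response carries a "status" field.
        if result.get("status") == "completed":
            return result
        # Wait between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError("render did not complete within the allowed attempts")
```

Passing sleep as a parameter makes the loop easy to exercise without waiting 30 seconds per attempt.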

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: ai-ai-subtitle-generator
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request; omitting them triggers a 402 on export.
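
A minimal sketch of assembling these headers follows. The helper names are hypothetical, and the path-to-platform mapping is one reading of the rule above (a path containing .clawhub maps to clawhub, one containing .cursor/skills maps to cursor). The install path is passed in as a string; the sketch does not inspect the filesystem.

```python
def detect_platform(install_path):
    """Map an install path string to a platform label (hypothetical helper)."""
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Return the auth and attribution headers sent with every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-ai-subtitle-generator",
        # The version is expected to come from the frontmatter; here it is passed in.
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```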

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.
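
To make a draft easier to read, the short keys can be expanded recursively. This is only a sketch: the sample draft and its nesting are invented for illustration, and the long key names are hypothetical. Only the short keys and the track type numbers come from the description above.

```python
KEY_NAMES = {
    "t": "tracks",
    "tt": "track_type",
    "sg": "segments",
    "d": "duration_ms",
    "m": "metadata",
}
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def expand(node):
    """Recursively rename known short keys; leave everything else untouched."""
    if isinstance(node, dict):
        return {KEY_NAMES.get(key, key): expand(value) for key, value in node.items()}
    if isinstance(node, list):
        return [expand(item) for item in node]
    return node


# Hypothetical draft: the real nesting may differ.
draft = {
    "t": [
        {"tt": 0, "sg": [{"d": 10000}]},
        {"tt": 7, "sg": [{"d": 3000}]},
    ],
    "m": {},
}
readable = expand(draft)
```

Note that the renaming applies at every depth, so an unrelated key that happens to be named t or d inside metadata would be renamed as well.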

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says → You do
"click [button]" / "点击" → Execute via API
"open [panel]" / "打开" → Query session state
"drag/drop" / "拖拽" → Send edit via SSE
"preview in timeline" → Show track summary
"Export button" / "导出" → Execute export workflow

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
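
The stream handling above can be sketched as a small line reader. It is an illustration under assumptions: the event payload schema is not described here, so payloads are kept as raw strings, and lines starting with a colon (SSE comment lines) are counted as keep-alives alongside empty data: lines. Both helper names are hypothetical.

```python
def read_sse_lines(lines):
    """Split raw SSE lines into data payloads and a keep-alive count."""
    payloads = []
    keepalives = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                payloads.append(payload)
            else:
                # An empty data: line means the backend is still working.
                keepalives += 1
        elif line.startswith(":"):
            # SSE comment line, commonly used as a heartbeat.
            keepalives += 1
    return payloads, keepalives


def needs_state_check(text_events):
    """True when the stream ended with no text, so the state should be polled."""
    return len(text_events) == 0
```

Separating text events from tool calls depends on the payload format, so that step is left to the caller.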

Error Handling

Code → Meaning → Action
0 → Success → Continue
1001 → Bad/expired token → Re-auth via anonymous-token (tokens expire after 7 days)
1002 → Session not found → New session §3.0
2001 → No credits → Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account"
4001 → Unsupported file → Show supported formats
4002 → File too large → Suggest compress/trim
400 → Missing X-Client-Id → Generate Client-Id and retry (see §1)
402 → Free plan export blocked → Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429 → Rate limit (1 token/client/7 days) → Retry in 30s once
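
The table above maps naturally onto a lookup. This is a sketch only: the action labels are hypothetical names for the steps listed, and the fallback for unlisted codes is an assumption, since the table does not say what to do with them.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_session",
    2001: "show_credit_options",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the action label for a backend code (hypothetical helper)."""
    # Assumption: unlisted codes are surfaced to the user rather than retried.
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as separate entries preserves the distinction between a subscription tier issue and running out of credits.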

Common Workflows

Quick edit: Upload → "generate subtitles in English and Japanese and burn them into the video" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "generate subtitles in English and Japanese and burn them into the video" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
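
A local check before uploading could catch both limits early. This is a sketch, not part of the skill: the helper name and return labels are hypothetical, the extension list is taken from the supported formats given earlier, and treating 500MB as 500 × 1024 × 1024 bytes is an assumption.

```python
import os

SUPPORTED_EXTENSIONS = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
# Assumption: "500MB" means 500 * 1024 * 1024 bytes.
MAX_BYTES = 500 * 1024 * 1024


def precheck(filename, size_bytes):
    """Classify a file as ok, unsupported_format, or too_large before upload."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        return "unsupported_format"
    if size_bytes > MAX_BYTES:
        return "too_large"
    return "ok"
```

The check looks only at the file name and size, so a mislabeled file would still be rejected by the backend.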

Export as MP4 for widest compatibility across platforms and devices.

Usage Guidance
Key things to consider before installing or enabling this skill:

  • Source verification: The skill has no homepage and an unknown source. Prefer skills from known publishers. Ask the publisher for provenance or a code repo.
  • Token handling: The skill requires NEMO_TOKEN. If you provide a long-lived token in your environment, the skill will send it to mega-api-prod.nemovideo.ai for all requests. Consider using a short-lived or scoped token, or prefer anonymous tokens if available. Do not supply sensitive, multi-service credentials.
  • Filesystem probing: The SKILL.md instructs the agent to inspect install/config paths (~/.config/nemovideo/ and install directories) to construct attribution headers. This could expose unrelated local configuration. Ask the maintainer why that is needed and request an option to disable local path checks.
  • Hidden behavior: The instruction to 'keep the technical details out of the chat' means the agent may not show full request/response details. If you need auditability, require verbose logging or explicit opt-in before outbound requests.
  • Network endpoints: All network calls target https://mega-api-prod.nemovideo.ai. If you plan to use it, verify the domain and privacy/data-retention practices of that service.

If you need to proceed: limit exposure by (1) using a minimally privileged/anonymous token, (2) running the skill in an environment/container with no extra credentials or sensitive files, and (3) requesting the skill author to remove or explain filesystem checks and to surface operation logs.

Additional information that would raise confidence to 'high': a known, signed source repo or homepage; clarification/consistency about required config paths in registry metadata; and a version of the SKILL.md that omits filesystem probing or makes it optional.
Capability Analysis
Type: OpenClaw Skill
Name: ai-ai-subtitle-generator
Version: 1.0.0

The skill bundle is a legitimate integration for the NemoVideo AI service, designed to automate video subtitling and rendering. It provides the AI agent with detailed instructions for managing sessions, uploading media, and polling for render status via the 'mega-api-prod.nemovideo.ai' backend. The requested permissions (environment variables and config paths) and network activities are strictly aligned with the stated purpose, and there is no evidence of data exfiltration, malicious execution, or harmful prompt injection.
Capability Assessment
Purpose & Capability
The skill's functionality (cloud subtitle/render service) aligns with requiring a single service token (NEMO_TOKEN). However, the SKILL.md frontmatter requests access to a local config path (~/.config/nemovideo/) and later asks the agent to inspect install paths to set an attribution header, while the registry metadata lists no required config paths. This mismatch is incoherent and not explained by the stated purpose.
Instruction Scope
The instructions direct the agent to: use or obtain a token from an external API, create sessions, upload user video files, stream SSE, poll render status, and return download URLs — all consistent with the claimed service. Concerns: (1) the skill tells the agent to detect install path and set X-Skill-Platform based on local paths (requires filesystem probing), and (2) it explicitly instructs to 'keep the technical details out of the chat,' which grants the agent discretion to hide operational actions from users. Both expand scope beyond a simple API client and could expose unrelated local data or obscure behavior.
Install Mechanism
No install spec or code files are present (instruction-only). That minimizes on-disk risk because nothing is downloaded or executed by default.
Credentials
The only declared required credential is NEMO_TOKEN (primaryEnv), which is proportionate for a cloud subtitle service. However, the SKILL.md instructs reading local config and install paths (potentially exposing other tokens/config stored in those locations). There's also a discrepancy between the registry metadata (no config paths) and SKILL.md frontmatter (lists ~/.config/nemovideo/), which is unexplained and increases risk of inadvertent credential access.
Persistence & Privilege
The skill is not always-included and uses default autonomous-invocation settings. Autonomous invocation is platform default and not flagged alone, but combined with the instruction to hide technical details and the filesystem probing behavior above, the agent could perform networked actions without surfacing full logs to the user. The skill does not request system-wide configuration changes.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ai-ai-subtitle-generator
  3. After installation, invoke the skill by name or use /ai-ai-subtitle-generator
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of AI Subtitle Generator: instantly add auto-generated subtitles and captions to your videos via cloud rendering.

  • Upload videos, describe your output (e.g., subtitles in multiple languages, aspect ratio, overlays), and receive a captioned 1080p video in ~30–60 seconds.
  • Automatic connection flow: uses `NEMO_TOKEN` if present, or registers for a free trial token with 100 credits.
  • Supports English and Japanese subtitle generation; outputs to popular video and audio formats (MP4, MOV, AVI, etc).
  • Handles common requests for export, balance/credits, state/status, and uploads via chat commands.
  • Robust error handling for session, token expiry, credits, and file compatibility issues.
  • Allows iterative and batch editing workflows, with seamless download after processing.
Metadata
Slug ai-ai-subtitle-generator
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Ai Ai Subtitle Generator?

Turn a 3-minute YouTube tutorial video into 1080p captioned video files just by typing what you need. Whether it's adding auto-generated subtitles to videos... It is an AI Agent Skill for Claude Code / OpenClaw, with 93 downloads so far.

How do I install Ai Ai Subtitle Generator?

Run "/install ai-ai-subtitle-generator" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ai Ai Subtitle Generator free?

Yes, Ai Ai Subtitle Generator is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ai Ai Subtitle Generator support?

Ai Ai Subtitle Generator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ai Ai Subtitle Generator?

It is built and maintained by vynbosserman65 (@vynbosserman65); the current version is v1.0.0.
