Ai Ai Subtitle Generator

Author: vynbosserman65 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 93
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install ai-ai-subtitle-generator
Description
Turn a 3-minute YouTube tutorial video into 1080p captioned video files just by typing what you need. Whether it's adding auto-generated subtitles to videos...
Instructions (SKILL.md)

Getting Started

Send me your video files and I'll handle the AI subtitle generation. Or just describe what you're after.

Try saying:

  • "turn a 3-minute YouTube tutorial video into a 1080p MP4"
  • "generate subtitles in English and Japanese and burn them into the video"
  • "add auto-generated subtitles to videos for YouTubers, content creators, and educators"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

  • Generate a UUID as client identifier
  • POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
  • The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
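
The connection steps above can be sketched as two request builders. This is a minimal illustration, not the service's official client: the function names are hypothetical, the `Content-Type` header is an assumption, and the requests are only constructed here, never sent.

```python
import json
import os
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def existing_token(env=os.environ):
    """Return NEMO_TOKEN from the environment, or None if it is not set."""
    return env.get("NEMO_TOKEN")


def build_token_request(client_id=None):
    """Build the anonymous-token POST; a fresh UUID is used as the client ID."""
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token, language="en"):
    """Build the session-creation POST with Bearer authorization."""
    body = json.dumps({"task_name": "project", "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed, not stated in the source
        },
    )
```

Passing either request to `urllib.request.urlopen` would perform the call; the response field names (for the token and `session_id`) should be checked against a real response before relying on them.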

Tell the user you're ready. Keep the technical details out of the chat.

AI Subtitle Generator — Auto-Generate and Embed Video Subtitles

This tool takes your video files and runs AI subtitle generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 3-minute YouTube tutorial video and want to generate subtitles in English and Japanese and burn them into the video — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: shorter clips under 5 minutes produce the most accurate subtitle timing.

Matching Input to Actions

User prompts that mention AI subtitle generation, aspect ratio, text overlays, or audio tracks are routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | |
| "status" / "状态" / "show tracks" | → §3.4 State | |
| "upload" / "上传" / user sends file | → §3.2 Upload | |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | |
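
The keyword side of this routing could look like the sketch below. It uses plain substring matching, which is an assumption; the source also mentions intent classification, which is not modeled here. The action names are hypothetical labels for the sections referenced in the table.

```python
# Keyword groups taken from the routing table; first match wins.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_attachment=False):
    """Return the action label for a user message; default is the SSE chat path."""
    if has_attachment:  # "user sends file" maps to Upload
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

Substring matching is deliberately crude: a request such as "generate subtitles, then download" would route to export, so a real router would need the intent step as well.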

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

  1. Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
  2. Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
  3. Upload: POST /api/upload-video/nemo_agent/me/<sid> with a multipart file or JSON with URLs.
  4. Credits: GET /api/credits/balance/simple, which returns available, frozen, total.
  5. State: GET /api/state/nemo_agent/me/<sid>/latest, which returns the current draft and media info.
  6. Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
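
As a rough sketch of endpoints 2 and 6, the helpers below build the chat payload and poll for a finished render. The response keys `status` and `download_url` are assumptions (the source only says the poll yields a completed status and a download URL), and `fetch_status` is a hypothetical callable that performs the GET.

```python
import time


def build_chat_payload(session_id, text):
    """Body for POST /run_sse: the message goes in new_message.parts[0].text."""
    return {"session_id": session_id, "new_message": {"parts": [{"text": text}]}}


def poll_render(fetch_status, render_id, interval_s=30, max_attempts=30, sleep=time.sleep):
    """Call fetch_status(render_id) every interval_s seconds until it reports completion.

    fetch_status is expected to return a dict; the keys used below are assumed.
    """
    for attempt in range(max_attempts):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result.get("download_url")
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"render {render_id} not completed after {max_attempts} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real waiting.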

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: ai-ai-subtitle-generator
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request; omitting them triggers a 402 on export.
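
A sketch of how these headers might be assembled follows. It assumes the platform rule means "a path under ~/.clawhub/ maps to clawhub, a path under ~/.cursor/skills/ maps to cursor". The install path is passed in as a string, so nothing on disk is inspected; the function names are hypothetical.

```python
def detect_platform(install_path):
    """Map an install path string to a platform label (assumed interpretation)."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Authorization plus the three attribution headers sent on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-ai-subtitle-generator",
        "X-Skill-Version": version,  # read from the frontmatter by the caller
        "X-Skill-Platform": detect_platform(install_path),
    }
```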

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
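
To illustrate the short keys, here is a minimal summarizer. The nesting is an assumption: it treats `t` as a list of tracks, each with a `tt` type and an `sg` list of segments carrying `d` in milliseconds. The real draft schema may differ, and names, start offsets, and volume are not reconstructed.

```python
# Track type codes as listed in the source: 0=video, 1=audio, 7=text.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize(draft):
    """Return a short per-track summary with total segment duration in seconds."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Unknown")
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {label}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```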

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
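
The stream handling described here might be sketched as follows. It relies only on the standard SSE line format (`data:` fields, and `:` comment lines often used as heartbeats). Treating every non-empty `data:` payload as user-facing text is a simplification: separating tool calls from text depends on the backend's event schema, which the source does not give.

```python
def read_stream(lines):
    """Split SSE lines into non-empty data payloads and a keepalive count."""
    payloads, keepalives = [], 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):  # SSE comment line, commonly a heartbeat
            keepalives += 1
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                payloads.append(payload)
            else:  # empty data: line means the backend is still working
                keepalives += 1
    return payloads, keepalives


def needs_state_poll(payloads):
    """True when the stream closed without text, so the state endpoint should be checked."""
    return not payloads
```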

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
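
One way to encode this table is a simple lookup. The action labels below are hypothetical names chosen for illustration; only the codes and their meanings come from the table.

```python
# Code -> hypothetical action label, following the error table.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_new_session",
    2001: "show_credit_options",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the action label for a response code, with a fallback for unknown codes."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as separate entries mirrors the table's distinction between a subscription-tier block and running out of credits.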

Common Workflows

Quick edit: Upload → "generate subtitles in English and Japanese and burn them into the video" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "generate subtitles in English and Japanese and burn them into the video" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.

Export as MP4 for widest compatibility across platforms and devices.

Security Recommendations
Key things to consider before installing or enabling this skill:

  • Source verification: The skill has no homepage and an unknown source. Prefer skills from known publishers. Ask the publisher for provenance or a code repo.
  • Token handling: The skill requires NEMO_TOKEN. If you provide a long-lived token in your environment, the skill will send it to mega-api-prod.nemovideo.ai for all requests. Consider using a short-lived or scoped token, or prefer anonymous tokens if available. Do not supply sensitive, multi-service credentials.
  • Filesystem probing: The SKILL.md instructs the agent to inspect install/config paths (~/.config/nemovideo/ and install directories) to construct attribution headers; this could expose unrelated local configuration. Ask the maintainer why that is needed and request an option to disable local path checks.
  • Hidden behavior: The instruction to 'keep the technical details out of the chat' means the agent may not show full request/response details. If you need auditability, require verbose logging or explicit opt-in before outbound requests.
  • Network endpoints: All network calls target https://mega-api-prod.nemovideo.ai. If you plan to use it, verify the domain and privacy/data-retention practices of that service.

If you need to proceed: limit exposure by (1) using a minimally privileged/anonymous token, (2) running the skill in an environment/container with no extra credentials or sensitive files, and (3) requesting the skill author to remove or explain filesystem checks and to surface operation logs.

Additional information that would raise confidence to 'high': a known, signed source repo or homepage; clarification/consistency about required config paths in registry metadata; and a version of the SKILL.md that omits filesystem probing or makes it optional.
Functional Analysis
Type: OpenClaw Skill
Name: ai-ai-subtitle-generator
Version: 1.0.0

The skill bundle is a legitimate integration for the NemoVideo AI service, designed to automate video subtitling and rendering. It provides the AI agent with detailed instructions for managing sessions, uploading media, and polling for render status via the 'mega-api-prod.nemovideo.ai' backend. The requested permissions (environment variables and config paths) and network activities are strictly aligned with the stated purpose, and there is no evidence of data exfiltration, malicious execution, or harmful prompt injection.
Capability Assessment
Purpose & Capability
The skill's functionality (cloud subtitle/render service) aligns with requiring a single service token (NEMO_TOKEN). However, the SKILL.md frontmatter requests access to a local config path (~/.config/nemovideo/) and later asks the agent to inspect install paths to set an attribution header, while the registry metadata lists no required config paths. This mismatch is incoherent and not explained by the stated purpose.
Instruction Scope
The instructions direct the agent to: use or obtain a token from an external API, create sessions, upload user video files, stream SSE, poll render status, and return download URLs — all consistent with the claimed service. Concerns: (1) the skill tells the agent to detect install path and set X-Skill-Platform based on local paths (requires filesystem probing), and (2) it explicitly instructs to 'keep the technical details out of the chat,' which grants the agent discretion to hide operational actions from users. Both expand scope beyond a simple API client and could expose unrelated local data or obscure behavior.
Install Mechanism
No install spec or code files are present (instruction-only). That minimizes on-disk risk because nothing is downloaded or executed by default.
Credentials
The only declared required credential is NEMO_TOKEN (primaryEnv), which is proportionate for a cloud subtitle service. However, the SKILL.md instructs reading local config and install paths (potentially exposing other tokens/config stored in those locations). There's also a discrepancy between the registry metadata (no config paths) and SKILL.md frontmatter (lists ~/.config/nemovideo/), which is unexplained and increases risk of inadvertent credential access.
Persistence & Privilege
The skill is not always-included and uses default autonomous-invocation settings. Autonomous invocation is platform default and not flagged alone, but combined with the instruction to hide technical details and the filesystem probing behavior above, the agent could perform networked actions without surfacing full logs to the user. The skill does not request system-wide configuration changes.
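How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install ai-ai-subtitle-generator
  3. Once installed, invoke the Skill by name or trigger it with /ai-ai-subtitle-generator
  4. Provide the required inputs per the Skill's parameter description to get structured output
Version History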
v1.0.0
Initial release of AI Subtitle Generator: instantly add auto-generated subtitles and captions to your videos via cloud rendering.
  • Upload videos, describe your output (e.g., subtitles in multiple languages, aspect ratio, overlays), and receive a captioned 1080p video in ~30-60 seconds.
  • Automatic connection flow: uses `NEMO_TOKEN` if present, or registers for a free trial token with 100 credits.
  • Supports English and Japanese subtitle generation; outputs to popular video and audio formats (MP4, MOV, AVI, etc).
  • Handles common requests for export, balance/credits, state/status, and uploads via chat commands.
  • Robust error handling for session, token expiry, credits, and file compatibility issues.
  • Allows iterative and batch editing workflows, with seamless download after processing.
Metadata
Slug ai-ai-subtitle-generator
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Versions 1
FAQ

What is Ai Ai Subtitle Generator?

Turn a 3-minute YouTube tutorial video into 1080p captioned video files just by typing what you need. Whether it's adding auto-generated subtitles to videos... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 93 total downloads so far.

How do I install Ai Ai Subtitle Generator?

Run the command "/install ai-ai-subtitle-generator" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is Ai Ai Subtitle Generator free?

Yes. Ai Ai Subtitle Generator is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does Ai Ai Subtitle Generator support?

Ai Ai Subtitle Generator is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Ai Ai Subtitle Generator?

It is developed and maintained by vynbosserman65 (@vynbosserman65); the current version is v1.0.0.
