Description

Get karaoke lyric videos ready to post, without touching a single slider. Upload your audio or video files (MP3, MP4, WAV, MOV, up to 200MB), say something l...

README (SKILL.md)

Getting Started

Send me your audio or video files and I'll handle the AI karaoke video creation. Or just describe what you're after.

Try saying:

"turn a 3-minute MP3 song with lyrics into a 1080p MP4"
"sync lyrics to the beat and display them as bouncing karaoke text on screen"
"create a lyric-synced karaoke video from my song for musicians, singers, and content creators"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
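The setup steps above can be sketched in Python as follows. This is a minimal sketch using only the standard library, not the skill's own implementation. The helper names (build_token_request, build_session_request, send_json, connect) are hypothetical, and the location of session_id in the response is an assumption, since the text only says to save it "from the response". The attribution headers described later in this document are left out for brevity.

```python
import json
import os
import urllib.request
import uuid
from typing import Optional, Tuple

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send_json(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def connect() -> Tuple[str, Optional[str]]:
    """Reuse NEMO_TOKEN if set, else fetch a free token; then open a session."""
    token = os.environ.get("NEMO_TOKEN")
    if not token:
        reply = send_json(build_token_request(str(uuid.uuid4())))
        token = reply["data"]["token"]
    session = send_json(build_session_request(token))
    # Assumption: session_id is either top-level or nested under "data".
    session_id = session.get("session_id") or session.get("data", {}).get("session_id")
    return token, session_id
```

Keeping request construction separate from sending makes it possible to inspect the URL, method, and headers without touching the network, and keeps the token out of any printed output.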

AI Karaoke Video Maker Free — Create Lyric-Synced Karaoke Videos

Name: Ai Karaoke Video Maker Free
Author: susan4731-wilfordf

Send me your audio or video files and describe the result you want. The AI karaoke video creation runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a 3-minute MP3 song with lyrics, type "sync lyrics to the beat and display them as bouncing karaoke text on screen", and you'll get a 1080p MP4 back in roughly 1-2 minutes. All rendering happens server-side.

Worth noting: shorter songs under 3 minutes process and sync more accurately.

Matching Input to Actions

User prompts referencing ai karaoke video maker free, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
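As a rough illustration of the routing table above, here is a keyword-only sketch. The text describes "keyword and intent classification", so plain substring matching is a simplification, and the function name route and the action labels are hypothetical.

```python
from typing import Tuple

# Keyword groups copied from the table above, checked in table order.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_attachment: bool = False) -> Tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    if has_attachment:
        # "user sends file" row: treat a bare attachment as an upload.
        return "upload", True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

Substring matching will misroute phrases such as "balance the audio levels", which is one reason a real router would also need the intent classification the text mentions.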

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: ai-karaoke-video-maker-free
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer \x3CNEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
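A minimal sketch of assembling these headers, assuming the caller has already read the version from the frontmatter and knows the skill's install path. The function names detect_platform and build_headers are hypothetical.

```python
from pathlib import Path


def detect_platform(install_path: str) -> str:
    """Map an install path to the platform labels listed above."""
    path = Path(install_path).expanduser()
    home = Path.home()
    if path.is_relative_to(home / ".clawhub"):
        return "clawhub"
    if path.is_relative_to(home / ".cursor" / "skills"):
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict:
    """Bearer auth plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-karaoke-video-maker-free",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

This version only compares path prefixes of a path it is handed (Path.is_relative_to needs Python 3.9 or later); it does not scan the home directory.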

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent, body {"task_name":"project","language":"<lang>"}, returns task_id, session_id.

Send message (SSE): POST /run_sse, body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid>, file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple, returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest, key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda, body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
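The export step (POST a render job, then poll every 30s) can be sketched like this. It assumes status and output.url sit at the top level of the poll response, which the text does not confirm, and it takes the HTTP call as a caller-supplied function so the loop itself has no network dependency. The names build_render_payload and poll_render are hypothetical, as is the max_polls cap.

```python
import time
from typing import Callable


def build_render_payload(session_id: str, draft: dict, timestamp: int) -> dict:
    """Body for POST /api/render/proxy/lambda, following the shape above."""
    return {
        "id": f"render_{timestamp}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until the job reports completed; return output.url."""
    for attempt in range(max_polls):
        job = fetch_status()
        # Assumption: "status" and "output" are top-level keys of the reply.
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError("render did not complete within the polling budget")
```

Here fetch_status would wrap GET /api/render/proxy/lambda/<id> with the auth and attribution headers. Passing sleep as a parameter lets the loop be exercised without real waiting.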

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
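One way to picture the stream handling described above is the sketch below. The event payload format is not documented here, so the code assumes each non-empty data: line carries JSON and leaves text extraction to a caller-supplied function. The name read_stream and its return shape are hypothetical.

```python
import json
from typing import Callable, Iterable, List, Optional, Tuple


def read_stream(
    lines: Iterable[str],
    extract_text: Callable[[dict], Optional[str]],
) -> Tuple[List[str], int, bool]:
    """Return (texts, heartbeat_count, needs_state_poll) for one SSE stream."""
    texts: List[str] = []
    heartbeats = 0
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields are ignored here
        payload = line[len("data:"):].strip()
        if not payload:
            heartbeats += 1  # empty data: line, backend still working
            continue
        # Assumption: non-empty payloads are JSON objects.
        text = extract_text(json.loads(payload))
        if text:
            texts.append(text)
    # No text before the stream closed: fall back to polling /api/state.
    return texts, heartbeats, not texts
```

When the third value is True, the caller would fetch the latest session state and describe what changed, as the paragraph above explains.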

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says	You do
"click [button]" / "点击"	Execute via API
"open [panel]" / "打开"	Query session state
"drag/drop" / "拖拽"	Send edit via SSE
"preview in timeline"	Show track summary
"Export button" / "导出"	Execute export workflow

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
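To show how the short keys could be turned into a summary of this kind, here is a sketch under a guessed nesting: a top-level t list of tracks, each with tt and an sg list whose items carry d. Only the key meanings come from the text; the nesting, the function name summarize_draft, and the "Other" fallback label are assumptions.

```python
# Track type codes from the text: 0=video, 1=audio, 7=text.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> str:
    """Build a one-line timeline summary from short-key draft JSON."""
    tracks = draft.get("t", [])
    parts = []
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Other")
        # Assumption: each segment stores its own duration "d" in milliseconds.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        parts.append(f"{index}. {label}: {total_ms / 1000:g}s")
    return f"Timeline ({len(tracks)} tracks): " + " ".join(parts)
```

For example, a draft with one 10-second video track and one 3-second text track yields "Timeline (2 tracks): 1. Video: 10s 2. Text: 3s". Clip names and volume levels, as in the example summary above, would have to come from the m metadata, whose layout is not described.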

Error Codes

0: success, continue normally
1001: token expired or invalid; re-acquire via /api/auth/anonymous-token
1002: session not found; create a new one
2001: out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
4001: unsupported file type; show accepted formats
4002: file too large; suggest compressing or trimming
400: missing X-Client-Id; generate one and retry
402: free plan export blocked; not a credit issue, subscription tier
429: rate limited; wait 30s and retry once
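The code list above maps naturally onto a lookup table. The sketch below is illustrative only: the action labels and function names are hypothetical, and it assumes the caller has already extracted an integer code from either the HTTP status or the response body.

```python
import time
from typing import Callable

# Action labels are hypothetical shorthand for the handling described above.
ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def next_step(code: int) -> str:
    """Look up the handling for a code; unknown codes get a generic label."""
    return ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(
    call: Callable[[], int],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run call(); on 429 wait 30s and retry exactly once."""
    code = call()
    if code == 429:
        sleep(30)
        code = call()
    return code
```

A second 429 is returned to the caller rather than retried again, matching the "retry once" rule.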

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "sync lyrics to the beat and display them as bouncing karaoke text on screen" — concrete instructions get better results.

Max file size is 200MB. Stick to MP3, MP4, WAV, MOV for the smoothest experience.

Export as MP4 for widest compatibility across YouTube, TikTok, and karaoke apps.
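The size and format limits can be checked locally before uploading. A minimal sketch, assuming "200MB" means 200 MiB (the text does not say which) and using the supported-formats list from the API section. The function name check_upload and its return labels are hypothetical.

```python
from pathlib import PurePath

# Assumption: "200MB" is read as 200 MiB.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024
PREFERRED = {".mp3", ".mp4", ".wav", ".mov"}
SUPPORTED = PREFERRED | {
    ".avi", ".webm", ".mkv", ".jpg", ".png", ".gif", ".webp", ".m4a", ".aac",
}


def check_upload(filename: str, size_bytes: int) -> str:
    """Classify a file as ok, ok_not_preferred, unsupported, or too_large."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in SUPPORTED:
        return "unsupported"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "too_large"
    return "ok" if suffix in PREFERRED else "ok_not_preferred"
```

Catching these cases up front avoids a round trip that would otherwise end in error 4001 or 4002.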

Common Workflows

Quick edit: Upload → "sync lyrics to the beat and display them as bouncing karaoke text on screen" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Usage Guidance

This skill appears to implement a cloud-based karaoke video service and needs a NEMO_TOKEN to upload your media and create videos. Before installing or using it:

1) Verify the service domain and publisher (there is no homepage and the owner is unknown).
2) Understand that your audio/video files will be uploaded to mega-api-prod.nemovideo.ai. Do not send sensitive or private media unless you trust that endpoint and its privacy policy.
3) Clarify the config-path discrepancy: SKILL.md mentions ~/.config/nemovideo/ and detecting install paths (reading ~/.clawhub/, ~/.cursor/skills/). Ask the publisher why filesystem probing is needed and what exact paths will be read.
4) If possible, use the anonymous-token flow (throwaway token) instead of a long-lived personal token, and confirm token scope and expiry.
5) Ask the publisher for a homepage, privacy policy, and contact info; absence of these lowers trust.

Providing those items or publishing the skill code would increase confidence.

Capability Analysis

Type: OpenClaw Skill
Name: ai-karaoke-video-maker-free
Version: 1.0.0

The skill facilitates cloud-based video processing by interacting with an external API (mega-api-prod.nemovideo.ai). In SKILL.md, it instructs the agent to upload user files to this remote server, access the NEMO_TOKEN environment variable, and perform platform detection by checking local directory paths (e.g., ~/.clawhub/). While these behaviors are aligned with the stated purpose of creating karaoke videos, the combination of remote data transmission, credential handling, and filesystem discovery represents high-risk capabilities that warrant a suspicious classification under the provided criteria.

Capability Assessment

ℹ Purpose & Capability

The skill's name and description (karaoke lyric video creation) align with the API calls and workflows described in SKILL.md (upload, render, export). Requiring a service token (NEMO_TOKEN) is proportional. However, the SKILL.md frontmatter declares a config path (~/.config/nemovideo/) while the registry metadata provided earlier listed no required config paths — that inconsistency should be clarified by the publisher.

⚠ Instruction Scope

The instructions direct the agent to upload user audio/video to an external domain (mega-api-prod.nemovideo.ai) and to persist/use a session token and session_id — behavior consistent with a cloud render service but privacy-sensitive. The SKILL.md also instructs the agent to read this file's YAML frontmatter and detect the install path (e.g., checking ~/.clawhub/ or ~/.cursor/skills/), which requires probing filesystem locations in the user's home directory. Reading install-paths to set an attribution header is plausible but is an extra, potentially sensitive filesystem access that should be explicitly justified.

✓ Install Mechanism

This is an instruction-only skill with no install spec or bundled code, so nothing will be written to disk by an installer. That minimizes installation risk.

ℹ Credentials

The skill requires one credential (NEMO_TOKEN) and uses it for API calls — expected for a third-party cloud rendering service. It also documents an anonymous-token endpoint to obtain a temporary token, which is reasonable for a free/guest flow. Still, the token grants access to upload media and start render jobs, so users should ensure the token is issued by a trusted service and avoid reusing sensitive account tokens. The earlier registry metadata's omission of config paths vs SKILL.md's metadata that references a config path is a discrepancy to resolve.

✓ Persistence & Privilege

always is false and the skill does not request system-wide or perpetual privileges. It instructs saving session_id and using the token for API calls, which is normal for a web service integration and does not itself indicate excessive privilege.

Version History

v1.0.0

Initial release of AI Karaoke Video Maker Free: create lyric-synced karaoke videos easily.

- Upload audio or video files (MP3, MP4, WAV, MOV, up to 200MB) and generate synced karaoke videos with AI-powered lyric display.
- No manual editing required; describe your desired karaoke style (e.g., "bouncing karaoke text") for automatic rendering.
- Exports 1080p MP4 videos suitable for YouTube, TikTok, and karaoke apps.
- Built-in free token system with 100 credits and a 7-day expiry for new users; no registration needed.
- Multi-language support for user commands and backend responses, with smart error handling and workflow tips.
- Cloud-based rendering pipeline for fast, server-side processing; no install or design skills required.

Metadata

Slug ai-karaoke-video-maker-free

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Ai Karaoke Video Maker Free?

Get karaoke lyric videos ready to post, without touching a single slider. Upload your audio or video files (MP3, MP4, WAV, MOV, up to 200MB), say something l... It is an AI Agent Skill for Claude Code / OpenClaw, with 80 downloads so far.

How do I install Ai Karaoke Video Maker Free?

Run "/install ai-karaoke-video-maker-free" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ai Karaoke Video Maker Free free?

Yes, Ai Karaoke Video Maker Free is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ai Karaoke Video Maker Free support?

Ai Karaoke Video Maker Free is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ai Karaoke Video Maker Free?

It is built and maintained by susan4731-wilfordf (@susan4731-wilfordf); the current version is v1.0.0.
