/install lipsyncvideo-ai
Getting Started
LipSync Video AI is ready. Upload your video and audio, or describe what you need synced.
Try saying:
- "sync this voiceover to the speaker"
- "replace the audio and match lip movements"
- "dub this clip with my recording"
Initial Setup
On first run, the skill connects to the processing backend and shows a brief "Getting ready..." message.
Token: Check for NEMO_TOKEN in environment. If present, go straight to session setup.
- Grab a free token: Generate a UUID client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token using the X-Client-Id header with your UUID. Response data.token is your auth token (100 credits, good for 7 days).
- Start session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent, Bearer auth, body: {"task_name":"project","language":"<lang>"}. Save the session_id for later calls.
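The setup steps above can be sketched as plain request descriptions. This is a minimal sketch, not the skill's actual implementation: the helper names (`resolve_token`, `anonymous_token_request`, `session_request`) and the `Content-Type` header are assumptions, and the attribution headers described under Backend Architecture are left out for brevity.

```python
import json
import os
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def resolve_token(env=os.environ):
    """Return NEMO_TOKEN if it is set, otherwise None."""
    return env.get("NEMO_TOKEN") or None


def anonymous_token_request(client_id=None):
    """Describe the POST that asks for a free anonymous token."""
    client_id = client_id or str(uuid.uuid4())  # fresh UUID client identifier
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def session_request(token, language):
    """Describe the POST that starts a session (Bearer auth)."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption, not in the source
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }
```

Each dictionary can be handed to whatever HTTP client is available. The token would then be read from `data.token` in the first response and the `session_id` saved from the second.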
Raw JSON and tokens stay hidden from the user.
Sync Audio to Lip Movements in Your Clips
Upload your video with the audio you want synced. Cloud GPUs do the heavy lifting — no local processing.
Here is how it works in practice. A training video had a speaker whose mic died halfway through. The fix: record a clean voiceover separately, upload both files, and type "sync the new audio to match the speaker's mouth movements". The result came back clean in about 75 seconds. Output is 1080p MP4.
Pro tip: shorter clips give tighter sync. If you have a long video, consider breaking it into segments first.
Request Categories
Your input gets matched to the right processing path automatically.
| You type... | Goes to... | Uses SSE? |
|---|---|---|
| "export" / "download" / "get video" / "导出" | Export pipeline | No |
| "credits" / "balance" / "remaining" / "积分" | Balance check | No |
| "status" / "show me the tracks" / "状态" | Session state | No |
| "upload" / attached file / "上传" | File ingestion | No |
| Anything else (sync, dub, match, adjust...) | SSE processing | Yes |
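The routing table can be read as a keyword lookup with an SSE fallback. The sketch below assumes case-insensitive substring matching and checks the rows in table order. The source does not say how matching is actually done, so `classify` and the route labels are illustrative names only.

```python
# Keyword groups copied from the table; route labels are hypothetical.
ROUTES = [
    ("export", ("export", "download", "get video", "导出")),
    ("balance", ("credits", "balance", "remaining", "积分")),
    ("state", ("status", "show me the tracks", "状态")),
    ("upload", ("upload", "上传")),
]


def classify(message, has_attachment=False):
    """Pick a processing path for one user message."""
    if has_attachment:
        return "upload"  # an attached file always means file ingestion
    text = message.lower()
    for route, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return route
    return "sse"  # anything else (sync, dub, match, adjust...)
```

For example, `classify("dub this clip with my recording")` falls through every keyword group and returns `"sse"`.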
Backend Architecture
Files go to a GPU farm for processing. Output is encoded at 8Mbps for 1080p. Lip sync boundaries are frame-level accurate.
Required on every request: Authorization: Bearer <NEMO_TOKEN> and attribution headers X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution means export fails with 402.
Attribution comes from this file's YAML: X-Skill-Source is lipsyncvideo-ai, X-Skill-Version is whatever version is in frontmatter, X-Skill-Platform depends on install location (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, otherwise unknown).
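One way to assemble these headers is sketched below. The function names and the idea of passing the skill file's path explicitly are assumptions. The version string would come from the frontmatter, which is not parsed here.

```python
import os


def detect_platform(skill_path):
    """Map the install location to the X-Skill-Platform value."""
    path = os.path.expanduser(skill_path).replace("\\", "/")
    home = os.path.expanduser("~").replace("\\", "/").rstrip("/")
    if path.startswith(home + "/.clawhub/"):
        return "clawhub"
    if path.startswith(home + "/.cursor/skills/"):
        return "cursor"
    return "unknown"


def request_headers(token, version, skill_path):
    """Build the auth and attribution headers sent on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "lipsyncvideo-ai",
        "X-Skill-Version": version,  # taken from the YAML frontmatter
        "X-Skill-Platform": detect_platform(skill_path),
    }
```

Building the headers in one place makes it harder to forget the attribution fields, which matters because a missing one causes export to fail with 402.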
Root URL: https://mega-api-prod.nemovideo.ai
New session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Returns task_id, session_id.
SSE message: POST /run_sse with {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and Accept: text/event-stream. Cap: 15 min.
File upload: POST /api/upload-video/nemo_agent/me/<sid>, either multipart (-F "files=@/path") or URL mode ({"urls":["<url>"],"source_type":"url"}).
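The SSE message and URL-mode upload payloads can be built as follows. This is a sketch under assumptions: the helper names are made up, the 15-minute cap is treated here as a client-side timeout, and the auth and attribution headers are omitted for brevity.

```python
import json

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def sse_request(session_id, message):
    """Describe the POST /run_sse call for one user message."""
    body = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }
    return {
        "method": "POST",
        "url": f"{BASE_URL}/run_sse",
        "headers": {"Accept": "text/event-stream"},
        "body": json.dumps(body, ensure_ascii=False),
        "timeout_s": 15 * 60,  # assumption: the 15 min cap used as a timeout
    }


def url_upload_request(session_id, urls):
    """Describe a URL-mode upload for an existing session."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/upload-video/nemo_agent/me/{session_id}",
        "body": json.dumps({"urls": list(urls), "source_type": "url"}),
    }
```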
Balance: GET /api/credits/balance/simple returns available, frozen, total.
State: GET /api/state/nemo_agent/me/<sid>/latest. Check data.state.draft, data.state.video_infos, data.state.generated_media.
Export (free): POST /api/render/proxy/lambda with {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s. Done when status = completed. File at output.url.
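The export-then-poll sequence might look like the sketch below. Several details are assumptions: `<ts>` is taken to be a Unix timestamp in seconds, `status` and `output.url` are read from the top level of the poll response, and `max_polls` is a hypothetical safety limit. `fetch_status` is a caller-supplied function that performs the GET and returns the decoded JSON.

```python
import time


def export_payload(session_id, draft, now=None):
    """Build the body for POST /api/render/proxy/lambda."""
    ts = int(now if now is not None else time.time())  # assumption: Unix seconds
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_export(fetch_status, render_id, interval_s=30, max_polls=40,
                sleep=time.sleep):
    """Poll until status is 'completed', then return output.url."""
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)  # the documented 30 s polling interval
    raise TimeoutError(f"{render_id} not completed after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to exercise without waiting.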
Handles: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
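A file can be checked against this list before uploading, which avoids a round trip that would end in error 4001. This is a minimal sketch: `is_supported` is an illustrative name, and matching by file extension alone is an assumption.

```python
import os

SUPPORTED_EXTENSIONS = {
    "mp4", "mov", "avi", "webm", "mkv",   # video
    "jpg", "png", "gif", "webp",          # image
    "mp3", "wav", "m4a", "aac",           # audio
}


def is_supported(filename):
    """Return True if the file extension is in the handled list."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```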
Errors
| Code | Means | Fix |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad token | Re-authenticate via anonymous-token endpoint |
| 1002 | No session | Make a new one |
| 2001 | No credits left | Anonymous: share registration link with ?bind=<id>. Others: top up |
| 4001 | Can't handle that file type | Share supported formats |
| 4002 | Too large | Suggest trimming or compressing |
| 400 | Missing X-Client-Id | Generate and retry |
| 402 | Free plan export limit | Needs registration or upgrade |
| 429 | Rate capped | Wait 30s, try again once |
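The table translates naturally into a lookup. In this sketch the action labels and the `next_action` helper are hypothetical. Only the codes and the "retry once" rule for 429 come from the table.

```python
# Codes from the table mapped to hypothetical action labels.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "new_session",
    2001: "prompt_registration_or_top_up",
    4001: "list_supported_formats",
    4002: "suggest_trim_or_compress",
    400: "generate_client_id_and_retry",
    402: "prompt_registration_or_upgrade",
    429: "wait_30s_and_retry",
}


def next_action(code, retries_so_far=0):
    """Choose what to do after a response code."""
    if code == 429 and retries_so_far >= 1:
        return "give_up"  # the table allows only one retry
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```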
Converting GUI Instructions
Backend outputs reference a visual interface. Convert them:
| Backend output | Your action |
|---|---|
| "click [X]" / "点击" | Invoke the API equivalent |
| "open [panel]" / "打开" | Read session state |
| "drag/drop" / "拖拽" | Post edit through SSE |
| "preview in timeline" | Output track listing |
| "Export button" / "导出" | Start export sequence |
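The conversion table could be applied with a small phrase matcher. This sketch assumes substring matching and checks the more specific phrases first, so that "click the Export button" resolves to the export sequence and not the generic click rule. The action labels are made up for illustration.

```python
# Most specific phrases first; labels are hypothetical.
GUI_RULES = [
    (("export button", "导出"), "start_export"),
    (("preview in timeline",), "show_track_listing"),
    (("drag", "drop", "拖拽"), "post_edit_via_sse"),
    (("open", "打开"), "read_session_state"),
    (("click", "点击"), "invoke_api_equivalent"),
]


def translate_gui_instruction(text):
    """Return the action for a GUI-style instruction, or None."""
    lowered = text.lower()
    for phrases, action in GUI_RULES:
        if any(phrase in lowered for phrase in phrases):
            return action
    return None  # plain text: forward to the user unchanged
```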
How SSE Works
Forward text events to user (after GUI translation). Absorb tool calls. Heartbeat and empty data lines = still processing. Every 2 minutes of quiet, say "Hang on, still processing..."
About 30% of edit ops return no text. If the stream closes empty, check state to confirm the edit stuck, then tell the user.
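The stream handling described above can be sketched as a line filter. The event payload shape is an assumption: each `data:` line is taken to carry JSON with `content.parts[].text`, mirroring the `new_message` shape of the request. Parts without `text` (for example tool calls) are absorbed silently.

```python
import json


def collect_text(lines):
    """Extract user-facing text from raw SSE lines."""
    texts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # heartbeats, comments, blank lines
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data line: still processing
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        content = event.get("content") if isinstance(event, dict) else None
        if not isinstance(content, dict):
            continue
        for part in content.get("parts") or []:
            if isinstance(part, dict) and part.get("text"):
                texts.append(part["text"])
    return texts
```

If the stream closes and `collect_text` returns an empty list, that is the cue to fetch the session state and confirm the edit was applied before replying to the user.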
Draft keys: t (tracks), tt (track type: 0=video, 1=audio, 7=text), sg (segments), d (duration, ms), m (metadata).
Timeline (2 tracks):
1. Video: interview clip (0-45s)
2. Audio: dubbed voiceover (0-45s)
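A track listing like the one above could be produced from the draft keys. The nesting is an assumption: this sketch treats `t` as a list of track objects that each carry `tt` and `d`. Clip names are left out because the source does not say where they are stored.

```python
# Track type codes from the draft key description.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def track_listing(draft):
    """Render a plain-text timeline summary from a draft dict."""
    tracks = draft.get("t", [])  # assumption: list of track objects
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        seconds = track.get("d", 0) / 1000  # d is a duration in ms
        lines.append(f"{index}. {kind}: 0-{seconds:g}s")
    return "\n".join(lines)
```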
Common Workflows
Basic lip sync: Upload video + audio, ask for sync. Done.
Audio replacement: Upload new audio, tell the skill to swap it in and match the mouth movements.
Multi-speaker: Works best when speakers take turns. For overlapping speech, split into separate segments first.
FAQ
How accurate is the sync? Frame-level for clear speech. Mumbling or fast-talking may be slightly off.
What audio formats? MP3, WAV, M4A, AAC all work.
File size limit? 500MB. Compress if you're over.
Cost? First 100 operations free. No signup required.
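- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install lipsyncvideo-ai
- Once installed, call the Skill by name or use /lipsyncvideo-ai to trigger it
- Provide the required inputs according to the Skill's parameter description to get structured output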
What is Lipsyncvideo Ai?
Match audio tracks to lip movements in your videos. lipsyncvideo-ai uploads your clip to a cloud GPU, syncs the audio you provide to the speaker's mouth, and... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 101 downloads to date.
How do I install Lipsyncvideo Ai?
Run the command "/install lipsyncvideo-ai" in the OpenClaw or Claude Code chat box for a one-step install. No extra configuration is needed.
Is Lipsyncvideo Ai free?
Yes. Lipsyncvideo Ai is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does Lipsyncvideo Ai support?
Lipsyncvideo Ai is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Lipsyncvideo Ai?
Developed and maintained by mory128 (@mory128). Current version: v1.0.1.