
Ai Music Maker

Author: whitejohnk-26 · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 57
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install ai-music-maker
Description
Get music-scored videos ready to post, without touching a single slider. Upload your video clips (MP4, MOV, AVI, WebM, up to 500MB), say something like "add...
Usage Instructions (SKILL.md)

Getting Started

Send me your video clips and I'll handle the AI music generation. Or just describe what you're after.

Try saying:

  • "turn my 60-second travel montage into a 1080p MP4 with music"
  • "add an upbeat background track that matches the video mood and tempo"
  • "add AI-generated background music to my video"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
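
The setup steps above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the skill's actual implementation: the function names are hypothetical, the requests are built but not sent, and the attribution headers that the skill requires on every call are left out for brevity.

```python
import json
import os
import uuid
import urllib.request

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def existing_token(env=os.environ):
    """Return NEMO_TOKEN if it is already set, otherwise None."""
    return env.get("NEMO_TOKEN") or None


def build_anonymous_token_request(client_id=None):
    """Build (but do not send) the anonymous-token request."""
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        BASE_URL + "/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token):
    """Build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )
```

Sending either request with urllib.request.urlopen and decoding the JSON body would then give data.token (for the first request) or session_id (for the second), as described above.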

AI Music Maker — Generate Music for Your Videos

Send me your video clips and describe the result you want. The AI music generation runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a 60-second travel montage video, type "add an upbeat background track that matches the video mood and tempo", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: shorter videos allow the AI to better match music energy to scene changes.

Matching Input to Actions

User prompts that mention AI music generation, aspect ratio, text overlays, or audio tracks are routed to the corresponding action via keyword and intent classification.

User says... Action Skip SSE?
"export" / "导出" / "download" / "send me the video" → §3.5 Export
"credits" / "积分" / "balance" / "余额" → §3.3 Credits
"status" / "状态" / "show tracks" → §3.4 State
"upload" / "上传" / user sends file → §3.2 Upload
Everything else (generate, edit, add BGM…) → §3.1 SSE
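
The routing table above can be approximated with a plain keyword matcher. The sketch below covers only the keyword half of the classification (intent classification is not attempted), and the action labels and function name are hypothetical.

```python
# Rows are checked in the same order as the table above.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_attachment=False):
    """Return the action label for a user message; default to the SSE chat."""
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
        # "user sends file" counts as an upload even without a keyword.
        if action == "upload" and has_attachment:
            return action
    return "sse"
```

Because matching is by substring and first match wins, a message such as "upload this and add music" would be routed to the upload action; a real classifier would likely need to handle such mixed requests more carefully.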

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

  1. Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
  2. Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
  3. Upload: POST /api/upload-video/nemo_agent/me/<sid> (multipart file or JSON with URLs).
  4. Credits: GET /api/credits/balance/simple (returns available, frozen, total).
  5. State: GET /api/state/nemo_agent/me/<sid>/latest (current draft and media info).
  6. Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
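
The export polling in step 6 might look like the loop below. It is a sketch under assumptions: fetch_status is a hypothetical caller-supplied function that performs the GET and returns the decoded JSON, and the "status" key name is a guess at the response shape, which is not documented here. Only the 30-second interval and the "completed" status come from the text above.

```python
import time


def poll_render(fetch_status, render_id, interval_s=30, max_polls=20,
                sleep=time.sleep):
    """Poll until the render reports completion, then return the response.

    fetch_status(render_id) is assumed to return a dict such as
    {"status": "completed", ...}; that shape is an assumption.
    """
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result
        if attempt < max_polls - 1:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError("render %s did not complete in time" % render_id)
```

Passing sleep as a parameter keeps the loop easy to test without real delays; the max_polls cap is an added safeguard, not something the source specifies.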

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is ai-music-maker, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
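
One way to assemble these headers is sketched below. The function names are hypothetical, and the platform check is a deliberately loose substring match on the install path rather than a verified reproduction of how the skill detects it.

```python
def detect_platform(install_path):
    """Map an install path to the X-Skill-Platform value."""
    path = install_path.replace("\\", "/")  # tolerate Windows separators
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Return the headers that every request is expected to carry."""
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": "ai-music-maker",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

For example, build_headers(token, "1.0.0", "~/.clawhub/skills/ai-music-maker") would report the platform as clawhub.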

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
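
To make a draft easier to read, the short keys can be expanded with the mapping above. The sketch below is illustrative only: the nesting of the draft JSON is not documented here, so the helper simply walks whatever structure it is given, and the long key names are hypothetical.

```python
FIELD_NAMES = {
    "t": "tracks",
    "tt": "track_type",
    "sg": "segments",
    "d": "duration_ms",
    "m": "metadata",
}
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def expand_draft(node):
    """Recursively replace short draft keys with readable names."""
    if isinstance(node, list):
        return [expand_draft(item) for item in node]
    if isinstance(node, dict):
        expanded = {}
        for key, value in node.items():
            name = FIELD_NAMES.get(key, key)  # unknown keys pass through
            if key == "tt":
                expanded[name] = TRACK_TYPES.get(value, value)
            else:
                expanded[name] = expand_draft(value)
        return expanded
    return node
```

Unknown keys and unknown track type codes are passed through unchanged, so the helper should not lose information if the real draft carries more fields than the five listed.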

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow
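
The mapping above could be expressed as an ordered rule list. In this sketch the action labels are hypothetical, and the rule order (so that "click Export" resolves to the export workflow rather than a generic click) is a judgment call, not something the source specifies.

```python
# More specific phrases are checked before generic verbs like "click".
GUI_RULES = [
    (("preview in timeline",), "summarize_tracks"),
    (("export", "导出"), "run_export"),
    (("drag", "drop", "拖拽"), "send_edit_via_sse"),
    (("click", "点击"), "call_endpoint"),
    (("open", "打开"), "query_state"),
]


def translate_gui_instruction(text):
    """Return an action label for a GUI-style instruction, or None."""
    lowered = text.lower()
    for phrases, action in GUI_RULES:
        if any(phrase in lowered for phrase in phrases):
            return action
    return None  # plain text: nothing to translate
```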

SSE Event Handling

Event → Action
Text response → Apply GUI translation (§4), present to user
Tool call/result → Process internally, don't forward
heartbeat / empty data: → Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes → Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
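
A rough sketch of this handling follows. It reads SSE lines, skips heartbeats and empty data fields, and reports whether a state poll is needed because no text arrived. The event payload schema is not documented here, so sample_extract assumes a hypothetical {"text": "..."} shape; in practice a different extractor would be passed in. For simplicity each data line is treated as its own payload, whereas the SSE format allows one event to span several data lines.

```python
import json


def read_sse_data(lines):
    """Yield non-empty data payloads from an iterable of SSE lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # comment heartbeats, event:/id: fields, blank lines
        payload = line[len("data:"):].strip()
        if payload:  # an empty data field is treated as a keep-alive
            yield payload


def sample_extract(payload):
    """Hypothetical extractor: assumes a JSON object with a "text" key."""
    try:
        return json.loads(payload).get("text")
    except (ValueError, AttributeError):
        return None


def collect_text(lines, extract_text=sample_extract):
    """Return (text, needs_state_poll) once the stream has closed."""
    parts = [t for t in map(extract_text, read_sse_data(lines)) if t]
    return "".join(parts), not parts
```

When needs_state_poll is true, the caller would query the session state endpoint and summarize the changes itself, as described above.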

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once
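
The list above maps naturally onto a lookup table plus a single-retry helper for 429. The action labels and function names below are hypothetical, and send stands in for whatever function performs the request and returns its code.

```python
import time

ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def action_for(code):
    """Return the handling label for a code, with a fallback for unknowns."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(send, sleep=time.sleep):
    """Call send(); on 429 wait 30 seconds and try exactly once more."""
    code = send()
    if code == 429:
        sleep(30)
        code = send()
    return code
```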

Common Workflows

Quick edit: Upload → "add an upbeat background track that matches the video mood and tempo" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add an upbeat background track that matches the video mood and tempo" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
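
A pre-upload check along these lines can catch format and size problems before any bytes are sent. This is a sketch: the function name and return labels are hypothetical, and treating 500MB as 500 × 1024 × 1024 bytes is an assumption, since the exact server limit is not stated.

```python
import os

PREFERRED = {"mp4", "mov", "avi", "webm"}
ACCEPTED = PREFERRED | {
    "mkv", "jpg", "png", "gif", "webp", "mp3", "wav", "m4a", "aac",
}
MAX_BYTES = 500 * 1024 * 1024  # assumption: "500MB" read as 500 MiB


def check_upload(filename, size_bytes):
    """Classify a file before upload based on extension and size."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    if ext not in ACCEPTED:
        return "unsupported"  # corresponds to error 4001
    if size_bytes > MAX_BYTES:
        return "too_large"  # corresponds to error 4002
    return "ok" if ext in PREFERRED else "ok_not_preferred"
```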

Export as MP4 for widest compatibility.

Security Recommendations
This skill appears to do what it says: it uploads your videos to a remote service to generate background music and deliver rendered MP4s. Before installing or using it, confirm you trust the remote domain (mega-api-prod.nemovideo.ai) and are comfortable uploading your videos (they may contain sensitive or personally identifiable content). Because the skill can obtain an anonymous NEMO_TOKEN and will send files externally, avoid sending confidential footage until you've verified the provider's privacy policy and reputation. Do not provide other unrelated credentials, and keep any returned tokens secret. If you need stronger guarantees, ask the publisher for a homepage, privacy policy, or service documentation before proceeding.
Functional Analysis
Type: OpenClaw Skill · Name: ai-music-maker · Version: 1.0.0
The skill provides instructions for an AI agent to interface with a cloud-based video processing service at mega-api-prod.nemovideo.ai. It outlines standard API workflows for authentication, session management, file uploads, and media rendering. The use of the NEMO_TOKEN environment variable and the automated acquisition of anonymous tokens are consistent with the stated purpose of the tool, and there is no evidence of data exfiltration, malicious command execution, or deceptive prompt injection in SKILL.md.
Capability Assessment
Purpose & Capability
Name/description (AI music for videos) align with the declared runtime behavior: uploading user video files and requesting renders from mega-api-prod.nemovideo.ai. The single required env var (NEMO_TOKEN) and the optional config path (~/.config/nemovideo/) are coherent with a remote media-processing service. No unrelated cloud providers or broad system credentials are requested.
Instruction Scope
SKILL.md instructs the agent to upload user videos and to obtain or use a NEMO_TOKEN, create sessions, poll render status, and return download URLs — all expected. It does not instruct reading unrelated local files or harvesting unrelated environment variables. Minor note: it uses install-path detection to set an X-Skill-Platform header (reads/detects install path), which is a small additional system probe but consistent with adding attribution headers.
Install Mechanism
There is no install spec and no code files — this is instruction-only, so nothing will be written to disk or fetched at install time. That is the lowest-risk install model.
Credentials
Only one credential (NEMO_TOKEN) is required and it's the service token the skill uses to authenticate with the nemovideo API. The SKILL.md describes how to obtain an anonymous token if none is present, which fits the stated workflow. No unrelated secrets or platform credentials are requested.
Persistence & Privilege
The skill is not forced-always; it does not request elevated or permanent platform privileges. It asks to save a session_id and to use a token for API calls, which is normal for a remote-rendering workflow. Metadata lists a config path (~/.config/nemovideo/) that the skill may expect; SKILL.md does not direct writing system-wide configs or modifying other skills.
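How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install ai-music-maker
  3. Once installed, invoke the Skill by name or trigger it with /ai-music-maker
  4. Provide the inputs described in the Skill's parameter notes to get structured output
Version History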
v1.0.0
AI Music Maker 1.0.0: Generate music for videos with a simple prompt and fast cloud rendering.
  • Launches AI-powered music generation for video content creators, with a streamlined workflow.
  • Supports video uploads up to 500MB (MP4, MOV, AVI, WebM); generates royalty-free background music tailored to video mood and tempo.
  • Automatic setup includes token management and session initiation.
  • Exports 1080p MP4s, with cloud GPU processing and quick turnaround (30-90s typical).
  • User-friendly: just upload a video and describe the desired music, no manual editing required.
  • Integrated workflow for upload, edit, preview, export, and credit management.
Metadata
Slug ai-music-maker
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
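Frequently Asked Questions

What is Ai Music Maker?

Get music-scored videos ready to post, without touching a single slider. Upload your video clips (MP4, MOV, AVI, WebM, up to 500MB), say something like "add... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 57 total downloads so far.

How do I install Ai Music Maker?

Run the command "/install ai-music-maker" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Ai Music Maker free?

Yes. Ai Music Maker is completely free and is released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Ai Music Maker support?

Ai Music Maker is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Ai Music Maker?

It is developed and maintained by whitejohnk-26 (@whitejohnk-26); the current version is v1.0.0.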
