francemichaell-15

Background Music To

by francemichaell-15 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 64 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install background-music-to
Description
Turn a 2-minute travel montage video into 1080p music-backed videos just by typing what you need. Whether it's adding background music to videos automaticall...
README (SKILL.md)

Getting Started

Got video clips to work with? Send them over and tell me what you need — I'll take care of the AI music addition.

Try saying:

  • "turn a 2-minute travel montage video into a 1080p MP4"
  • "add background music to my video that matches the mood"
  • "add background music to my videos automatically"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
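
The two setup steps above can be sketched roughly as follows. This is a minimal sketch, not the skill's actual implementation: the helper names (`build_token_request`, `build_session_request`, `send_json`, `connect`) are hypothetical, and the session reply is returned whole because the exact position of session_id in the response body is not specified here.

```python
import json
import os
import urllib.request
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id):
    """Step 1: describe the anonymous-token request (hypothetical helper)."""
    return {
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
        "body": None,
    }


def build_session_request(token, language):
    """Step 2: describe the session-creation request (hypothetical helper)."""
    payload = {"task_name": "project", "language": language}
    return {
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload).encode("utf-8"),
    }


def send_json(spec):
    """POST a request description and decode the JSON reply."""
    request = urllib.request.Request(
        spec["url"], data=spec["body"], headers=spec["headers"], method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def connect(language="en", send=send_json):
    """Reuse NEMO_TOKEN when it is set, otherwise fetch an anonymous token."""
    token = os.environ.get("NEMO_TOKEN")
    if not token:
        reply = send(build_token_request(str(uuid.uuid4())))
        token = reply["data"]["token"]  # documented as data.token
    # Assumption: the caller extracts session_id from the returned reply.
    session_reply = send(build_session_request(token, language))
    return token, session_reply
```

Passing `send` as a parameter keeps the network call replaceable, so the flow can be exercised without contacting the backend. Nothing is printed, in line with the note about not showing token values.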

Background Music To — Add Music to Your Videos

Send me your video clips and describe the result you want. The AI music addition runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a 2-minute travel montage video, type "add background music to my video that matches the mood", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: shorter clips get better music sync results.

Matching Input to Actions

User prompts referencing background music, aspect ratio, text overlays, or audio tracks are routed to the corresponding action via keyword and intent classification.

User says... Action Skip SSE?
"export" / "导出" / "download" / "send me the video" → §3.5 Export
"credits" / "积分" / "balance" / "余额" → §3.3 Credits
"status" / "状态" / "show tracks" → §3.4 State
"upload" / "上传" / user sends file → §3.2 Upload
Everything else (generate, edit, add BGM…) → §3.1 SSE
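
As a rough illustration of the table above, the keyword half of the routing could look like the sketch below. The function name `route` and the action labels are hypothetical, and plain substring matching stands in for whatever intent classification the skill actually applies.

```python
# Keyword groups copied from the routing table; action labels are illustrative.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_file=False):
    """Pick an action for a user message; anything unmatched goes to SSE."""
    if has_file:
        return "upload"  # the table routes "user sends file" to Upload
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"


print(route("please export the video"))  # export
print(route("add BGM that fits the mood"))  # sse
```

Rows are checked in table order, so a message that mentions several keywords resolves to the first matching row.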

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request — omitting them triggers a 402 on export.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: background-music-to
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
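
One way to assemble these headers is sketched below. It assumes the platform is inferred by looking for `.clawhub/` or `.cursor/skills/` in the install path, and that the version string has already been read from the frontmatter; `detect_platform` and `attribution_headers` are hypothetical names.

```python
def detect_platform(install_path):
    """Guess the host platform from the install path (assumed mapping)."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def attribution_headers(install_path, version):
    """Build the three attribution headers sent with every request."""
    return {
        "X-Skill-Source": "background-music-to",
        "X-Skill-Version": version,  # assumed to come from the frontmatter
        "X-Skill-Platform": detect_platform(install_path),
    }


print(attribution_headers("~/.cursor/skills/background-music-to/SKILL.md", "1.0.0"))
```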

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
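
The export-then-poll loop described above might be sketched like this. Several details are assumptions: the render id uses a Unix timestamp in seconds, `status` sits at the top level of the poll reply, and the loop gives up after a fixed number of polls. `fetch_status` is a hypothetical callable that performs the GET and returns the decoded JSON.

```python
import time


def build_export_body(session_id, draft, timestamp=None):
    """Build the export request body; the timestamp format is assumed."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def wait_for_export(render_id, fetch_status, interval=30.0, max_polls=20,
                    sleep=time.sleep):
    """Poll until the job reports completed, then return output.url."""
    for attempt in range(max_polls):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)  # 30s between polls, as described above
    raise TimeoutError(f"render {render_id} did not complete in time")
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular HTTP client and lets it run instantly under test.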

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once
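
A lookup table is one simple way to act on these codes. The sketch below mirrors the list above; the action labels and the `next_step` helper are illustrative names, not part of the API.

```python
# Codes from the list above mapped to illustrative follow-up actions.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "subscription_tier_required",
    429: "wait_30s_and_retry_once",
}


def next_step(code):
    """Return the follow-up action for a code, or a fallback label."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


print(next_step(1001))  # reacquire_token
print(next_step(9999))  # report_unknown_error
```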

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow
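
The mapping above could be approximated with a small phrase matcher. This is only a sketch under assumptions: `translate_gui` and its action labels are hypothetical, and the rule order (export before click, so "click Export" runs the export workflow) is a design choice rather than documented behavior.

```python
# Phrase groups from the list above; more specific phrases are checked first.
GUI_RULES = [
    (("preview in timeline",), "summarize_tracks"),
    (("export", "导出"), "run_export"),
    (("drag", "drop", "拖拽"), "send_edit_via_sse"),
    (("open", "打开"), "query_session_state"),
    (("click", "点击"), "call_relevant_endpoint"),
]


def translate_gui(instruction):
    """Map a GUI-style instruction to an API-side action, or None."""
    text = instruction.lower()
    for phrases, action in GUI_RULES:
        if any(phrase in text for phrase in phrases):
            return action
    return None


print(translate_gui("Click the Export button"))  # run_export
print(translate_gui("Open the media panel"))  # query_session_state
```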

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
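
A minimal sketch of the stream handling described above follows. The event payload schema is not documented here, so payloads are kept as opaque strings and separating text from tool calls is left out. Treating SSE comment lines (those starting with ":") as heartbeats is an assumption, and `classify_line` and `read_stream` are hypothetical names.

```python
def classify_line(raw):
    """Label one SSE line as 'payload', 'heartbeat', or 'other'."""
    line = raw.rstrip("\r\n")
    if line.startswith(":"):
        return "heartbeat"  # SSE comment line, assumed to be a keep-alive
    if line.startswith("data:"):
        return "payload" if line[5:].strip() else "heartbeat"
    return "other"


def read_stream(lines):
    """Collect payloads and flag streams that closed without any content."""
    payloads = []
    heartbeats = 0
    for raw in lines:
        kind = classify_line(raw)
        if kind == "payload":
            payloads.append(raw.rstrip("\r\n")[5:].strip())
        elif kind == "heartbeat":
            heartbeats += 1
    # When nothing arrived, the caller should poll the state endpoint instead.
    return {"payloads": payloads, "heartbeats": heartbeats,
            "poll_state": not payloads}


print(read_stream(["data:\n", "data: hello\n", "\n"]))
```

A caller would pair this with a timer that shows the "Still working" notice every 2 minutes while only heartbeats arrive.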

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
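
To show how the short keys might be turned into such a summary, here is a sketch under a loudly assumed draft shape: `t` is a list of tracks, each track has `tt` and a list `sg`, and each segment carries its own `d`. The real nesting is not documented here, and the output is simpler than the example above because names and volume levels would live in fields this sketch does not know about.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft):
    """Summarize tracks by type and total segment duration (assumed shape)."""
    tracks = draft.get("t", [])
    parts = []
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        parts.append(f"{index}. {kind} ({total_ms / 1000:g}s)")
    return f"Timeline ({len(tracks)} tracks): " + " ".join(parts)


example = {
    "t": [
        {"tt": 0, "sg": [{"d": 10000}]},
        {"tt": 1, "sg": [{"d": 4000}, {"d": 6000}]},
        {"tt": 7, "sg": [{"d": 3000}]},
    ]
}
print(summarize_draft(example))
# Timeline (3 tracks): 1. Video (10s) 2. Audio (10s) 3. Text (3s)
```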

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add background music to my video that matches the mood" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
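
A local pre-check before uploading could look like the sketch below. It assumes "500MB" means 500 × 1024 × 1024 bytes and uses the supported-formats list given earlier; `check_upload` is a hypothetical helper, and the server remains the final authority on what it accepts.

```python
import os

SUPPORTED_EXTENSIONS = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumption: MB read as mebibytes


def check_upload(filename, size_bytes):
    """Return a problem description, or None when the file looks acceptable."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return "unsupported file type"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "file too large"
    return None


print(check_upload("trip.MOV", 120 * 1024 * 1024))  # None
print(check_upload("notes.txt", 10))  # unsupported file type
```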

Export as MP4 for widest compatibility.

Common Workflows

Quick edit: Upload → "add background music to my video that matches the mood" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Usage Guidance
This skill appears to do what it says (remote video processing) but you should: (1) avoid setting a high-privilege/personal NEMO_TOKEN unless you trust the unknown backend domain (mega-api-prod.nemovideo.ai); prefer letting the skill obtain an anonymous token if you need to test it; (2) confirm where the session_id and any tokens will be stored and for how long; (3) be aware that the skill will read local install paths and its frontmatter for attribution headers — check that this doesn't leak data you care about; (4) ask the publisher for source code or a homepage (none provided) before giving access to real or sensitive videos. The mismatch between the SKILL.md metadata (configPaths) and the registry metadata (no config paths) is a small inconsistency worth clarifying with the author.
Capability Assessment
Purpose & Capability
The skill's name/description (adding background music and exporting rendered videos) aligns with the runtime instructions that call a remote nemo video API. However the SKILL.md includes an openclaw metadata entry listing a config path (~/.config/nemovideo/) while the registry metadata lists no required config paths — a minor inconsistency in declared requirements. Otherwise required items (only NEMO_TOKEN) are proportionate to the stated purpose.
Instruction Scope
Instructions are focused on session creation, uploads, SSE, and export workflows for the remote renderer — all within the stated purpose. They also instruct the agent to detect install path (e.g., ~/.clawhub/ or ~/.cursor/skills/) to set attribution headers and to read the file's YAML frontmatter for version info; this implies the agent will access local install paths and metadata. That's reasonable for attribution but worth noting because it touches local paths.
Install Mechanism
No install spec and no code files (instruction-only). This is the lowest-risk install model: nothing is downloaded or written by an installer step.
Credentials
Only NEMO_TOKEN is declared as required, which is consistent with a remote API integration. The skill also provides a flow to auto-acquire an anonymous token if NEMO_TOKEN is not present. Be mindful that if you set NEMO_TOKEN as an environment variable it may represent an account-level credential — the skill will use it for all API calls. There are no other unrelated credentials requested.
Persistence & Privilege
always:false and default autonomous invocation are normal. The skill asks to store session_id for subsequent requests (expected for session-based APIs) but does not request elevated or permanent platform privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install background-music-to
  3. After installation, invoke the skill by name or use /background-music-to
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
- Initial release of "Background Music To": add background music to videos automatically in seconds.
- Upload video clips and describe the result you want; no timeline fiddling or export settings required.
- Supports instant connection and authentication, including auto token acquisition with 100 free credits.
- All processing, editing, and exporting happens via a cloud API with 1080p MP4 exports in 30–90 seconds.
- Clear error handling, session management, and user-friendly text responses for common actions (upload, export, check credits, etc).
- Designed for content creators seeking fast, AI-assisted video music workflows.
Metadata
Slug background-music-to
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Background Music To?

Turn a 2-minute travel montage video into 1080p music-backed videos just by typing what you need. Whether it's adding background music to videos automaticall... It is an AI Agent Skill for Claude Code / OpenClaw, with 64 downloads so far.

How do I install Background Music To?

Run "/install background-music-to" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Background Music To free?

Yes, Background Music To is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Background Music To support?

Background Music To is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Background Music To?

It is built and maintained by francemichaell-15 (@francemichaell-15); the current version is v1.0.0.
