Description

Turn your NuwaAI digital avatar into a singing performer! The avatar sings with lip-synced mouth movements driven by vocal audio, with synchronized backgroun...

README (SKILL.md)

🎤 Digital Singer — NuwaAI Avatar Singing

Name: Digital Singer
Author: jianglingling007

Turn your NuwaAI digital avatar into a singing performer with lip-synced mouth movements.

How It Works

Audio files → FFmpeg PCM conversion → NuwaAI humanctrl avatar messages → Avatar sings with lip-sync
                                    + Browser <audio> plays accompaniment in sync

Key concept: Two audio streams work together:

speech (vocal-only): Drives the avatar's mouth movements via NuwaAI avatar control messages
music (accompaniment): Plays simultaneously via browser <audio> element

Battle Flow

Avatar greets user, lists available songs
User picks a song
Avatar sings the upper half (vocal drives lip-sync + accompaniment plays in sync)
Avatar says "your turn" → accompaniment for lower half plays → user sings along (ASR captures voice)
Battle scoring + blind box reward
Ask to continue

User Preparation (What You Need Before Using)

1. NuwaAI Account (Required)

Sign up at nuwaai.com (free) and create a digital avatar. You need:

API Key — from your NuwaAI dashboard
Avatar ID — the avatar you want to sing
User ID — your NuwaAI user ID

Enter these in the browser interface, or pre-configure in .nuwa-config.json:

{
  "avatarId": "your-avatar-id",
  "apiKey": "your-api-key",
  "userId": "your-user-id"
}
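As a minimal sketch of how the pre-configured file could be checked before use: the three field names come from the example above, while `parseNuwaConfig` and its placeholder detection are hypothetical and not taken from server.mjs.

```javascript
// Hypothetical helper: validates the text of .nuwa-config.json.
// Only the three field names come from the README; the rest is illustrative.
const REQUIRED_FIELDS = ["avatarId", "apiKey", "userId"];

function parseNuwaConfig(jsonText) {
  const config = JSON.parse(jsonText); // throws SyntaxError on malformed JSON
  if (config === null || typeof config !== "object") {
    throw new TypeError("config must be a JSON object");
  }
  const missing = REQUIRED_FIELDS.filter((field) => {
    const value = config[field];
    // Treat empty strings and the README's "your-..." placeholders as unset.
    return typeof value !== "string" || value === "" || value.startsWith("your-");
  });
  return { config, missing };
}

const sample = '{"avatarId": "abc123", "apiKey": "your-api-key", "userId": ""}';
const result = parseNuwaConfig(sample);
console.log(result.missing); // fields that would still need to be entered in the browser
```

In a real setup the text would typically be read from disk first, for example with `fs.readFileSync(path, "utf8")`.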

2. LLM API (Required)

The singing host (conversation agent) needs an OpenAI-compatible LLM API. Configure in server.mjs or via environment variables:

DASHSCOPE_BASE_URL — API endpoint (default: Dashscope)
DASHSCOPE_API_KEY — API key
QWEN_MODEL — Model name (default: qwen-plus)
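The environment-variable lookup might look like the sketch below. The variable names and the `qwen-plus` default come from the list above; `loadLlmConfig` is a hypothetical name, and because the default endpoint URL is not spelled out here, the sketch takes it as a parameter instead of guessing it.

```javascript
// Hypothetical helper; not the actual code in server.mjs.
function loadLlmConfig(env, defaultBaseUrl) {
  const apiKey = env.DASHSCOPE_API_KEY;
  if (!apiKey) {
    // Fail early rather than falling back to a key embedded in source code.
    throw new Error("DASHSCOPE_API_KEY is not set");
  }
  return {
    baseUrl: env.DASHSCOPE_BASE_URL || defaultBaseUrl,
    apiKey,
    model: env.QWEN_MODEL || "qwen-plus",
  };
}

// Example with an in-memory environment instead of process.env.
// The URL below is a placeholder, not a real endpoint.
const llm = loadLlmConfig({ DASHSCOPE_API_KEY: "sk-example" }, "https://llm.example.invalid/v1");
console.log(llm.model); // "qwen-plus"
```

Passing `process.env` as the first argument would read the real environment.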

3. Song Audio Files (Required)

Each song needs 3 audio files placed in the skill directory:

| File | Purpose | Example |
| --- | --- | --- |
| `{song}高潮上清唱.wav` | Upper half vocal (a cappella); drives avatar lip-sync | `十年高潮上清唱.wav` |
| `{song}高潮上伴奏.MP3` | Upper half accompaniment; plays in sync with avatar | `十年高潮上伴奏.MP3` |
| `{song}高潮下伴奏.MP3` | Lower half accompaniment; plays when user sings | `十年高潮下伴奏.MP3` |

How to prepare these files:

Use any audio editing tool (e.g. Audacity, Adobe Audition) to split songs into upper/lower halves
Use vocal separation tools (e.g. UVR, Demucs) to extract a cappella (vocal-only) from the upper half
Export accompaniment as MP3, vocal as WAV (any FFmpeg-supported format works)
Place all files in the skill directory (same folder as server.mjs)

4. Song Registry (Required)

After preparing audio files, register each song in server.mjs SONGS object:

const SONGS = {
  "十年": {
    artist: "陈奕迅",
    acappella_upper: "十年高潮上清唱.wav",
    accomp_upper: "十年高潮上伴奏.MP3",
    accomp_lower: "十年高潮下伴奏.MP3",
  },
  // Add more songs...
};
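Since a registry entry is only useful if its three files are actually present, a quick consistency check can help. The sketch below is hypothetical (`findMissingSongFiles` is not part of server.mjs) and assumes the caller supplies the directory listing, for example from `fs.readdirSync(skillDir)`.

```javascript
// Hypothetical check: which registered files are absent from the skill directory?
const FILE_FIELDS = ["acappella_upper", "accomp_upper", "accomp_lower"];

function findMissingSongFiles(songs, presentFiles) {
  const present = new Set(presentFiles);
  const missing = [];
  for (const [title, entry] of Object.entries(songs)) {
    for (const field of FILE_FIELDS) {
      const fileName = entry[field];
      // Exact, case-sensitive match: "x.MP3" and "x.mp3" count as different files.
      if (!fileName || !present.has(fileName)) {
        missing.push({ title, field, fileName });
      }
    }
  }
  return missing;
}

const demoSongs = {
  "十年": {
    artist: "陈奕迅",
    acappella_upper: "十年高潮上清唱.wav",
    accomp_upper: "十年高潮上伴奏.MP3",
    accomp_lower: "十年高潮下伴奏.MP3",
  },
};
const onDisk = ["server.mjs", "十年高潮上清唱.wav", "十年高潮上伴奏.MP3"];
const missingFiles = findMissingSongFiles(demoSongs, onDisk);
console.log(missingFiles); // reports the accomp_lower file for "十年" as missing
```

The comparison is deliberately case-sensitive, because the `.MP3` extension in the registry would not match a `.mp3` file on a case-sensitive file system.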

5. FFmpeg (Required)

Install FFmpeg for audio format conversion:

# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg

6. Node.js 18+ (Required)

node --version  # must be >= 18

Quick Start

Complete all preparation steps above
Copy example song files from assets/songs/ to the skill directory:
```
cp <skill-dir>/assets/songs/* <skill-dir>/
```
This includes a ready-to-use demo song: 十年 (陈奕迅) with vocal, upper and lower accompaniment files.
Start the server:
```
node <skill-dir>/server.mjs
```
Open http://localhost:3098 in browser
Enter NuwaAI credentials (if not pre-configured)
Pick a song and start singing!

Included Example Song

The skill ships with one demo song in assets/songs/:

十年高潮上清唱.wav — vocal (a cappella)
十年高潮上伴奏.MP3 — upper half accompaniment
十年高潮下伴奏.MP3 — lower half accompaniment

Copy them to the skill root directory to use. The song "十年" is pre-registered in server.mjs.

NuwaAI Integration

Uses humanctrl WebSocket with ASR enabled. Avatar control message format:

{
  "type": "avatar",
  "data": {
    "content": "",
    "audio": {
      "segment": 0,
      "speech": "<base64 PCM>",
      "music": "<base64 PCM>"
    }
  }
}

speech: Vocal-only PCM driving lip-sync (≤10KB per chunk)
music: Same as speech for a cappella mode
segment: Chunk index, -1 = last chunk
Audio: 16kHz mono PCM, base64 encoded
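The chunking described by these fields can be sketched as follows. The field names, the 10 KB ceiling, and `segment: -1` for the final chunk come from the description above. The sketch assumes that the ceiling applies to raw PCM bytes before base64 encoding, that samples are 16-bit, and that segment indices start at 0; none of these is confirmed here, and `buildAvatarMessages` is a hypothetical name.

```javascript
// Hypothetical chunker for a cappella mode (music mirrors speech).
const MAX_CHUNK_BYTES = 10 * 1024; // assumption: limit counts raw PCM bytes

function buildAvatarMessages(pcm, chunkBytes = MAX_CHUNK_BYTES) {
  // Assumed 16-bit samples are 2 bytes each, so keep boundaries on even offsets.
  const step = chunkBytes - (chunkBytes % 2);
  const messages = [];
  for (let offset = 0, index = 0; offset < pcm.length; offset += step, index += 1) {
    const chunk = pcm.subarray(offset, offset + step).toString("base64");
    const isLast = offset + step >= pcm.length;
    messages.push({
      type: "avatar",
      data: {
        content: "",
        audio: { segment: isLast ? -1 : index, speech: chunk, music: chunk },
      },
    });
  }
  return messages;
}

// One second of 16 kHz mono 16-bit silence is 32000 bytes.
const silence = Buffer.alloc(16000 * 2);
const messages = buildAvatarMessages(silence);
console.log(messages.map((m) => m.data.audio.segment)); // [ 0, 1, 2, -1 ]
```

Each message would then be serialized with `JSON.stringify` and sent over the humanctrl WebSocket in order.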

Features

🎤 Avatar lip-sync singing via NuwaAI humanctrl
🎵 Synchronized accompaniment playback
🗣️ ASR voice capture for user singing
🎯 Fun battle scoring system
🎁 Blind box rewards
⏸️ Interruptible speech (not during singing)
📱 Responsive web interface

Requirements

Node.js 18+
FFmpeg (for audio → PCM conversion)
NuwaAI account with avatar
Modern browser (WebRTC + microphone)

Usage Guidance

What to consider before installing:

- Do not run the code until you review/remove embedded keys: the package contains hard-coded API keys (in chorus_agent.py and a demo key in server.mjs). These keys are unexpected; using them could expose your usage to a third party. If you or your org ever used those keys, rotate them immediately.
- Clarify which component you intend to run: SKILL.md and Quick Start describe running the Node server (server.mjs) and a browser UI. There is also a separate Python agent (chorus_agent.py) that is not mentioned in the README and behaves differently (plays audio via afplay, uses a different songs folder, calls the LLM service directly). If you only want the Node web UI, do not run the Python agent.
- Remove or replace hard-coded credentials and use environment variables: server.mjs already supports env vars for DASHSCOPE_*, QWEN_MODEL and stores Nuwa credentials in .nuwa-config.json. Replace any embedded keys with your own keys stored in environment variables or local config files and never commit secrets to code.
- Inspect network calls and endpoints: the code talks to Dashscope-compatible endpoints (https://dashscope.aliyuncs.com/...) and expects NuwaAI humanctrl WebSocket. Confirm these endpoints are what you expect and that no unexpected remote hosts are contacted.
- Run in an isolated environment: if you decide to test, run the code in a sandbox or container, do not expose any sensitive credentials, and avoid running chorus_agent.py unless you fully audit it. Check for platform-specific commands (afplay) and remove/modify them for your OS.
- File placement mismatch: make sure song files are placed where the server actually expects them (SKILL.md says skill root; the Python agent expects a '清唱' directory). Harmonize filenames/paths before use.
- If you lack confidence in the origin (homepage unknown, source unknown), prefer not to run it on production machines.

Ask the publisher/maintainer for clarification about the Python agent, the embedded keys, and confirm which files are required.

Capability Analysis

Type: OpenClaw Skill
Name: digital-singer
Version: 1.3.0

The skill contains a critical Remote Code Execution (RCE) vulnerability in `server.mjs` where the `/api/song/pcm` endpoint passes unsanitized user input directly into a shell command via `execSync` and `ffmpeg`. Additionally, multiple files (`server.mjs` and `chorus_agent.py`) contain hardcoded API keys for DashScope and NuwaAI, which is a significant security risk. While these appear to be unintentional security flaws rather than deliberate malware, the lack of input validation and the exposure of sensitive credentials make the bundle highly risky.
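One common way to close this kind of shell-injection hole is to resolve the requested song against the registry and pass FFmpeg an argument array instead of a command string. The sketch below illustrates that idea only; the real `/api/song/pcm` handler is not shown in this document, so the request shape (a song title plus a part name) and the helper `buildFfmpegArgs` are assumptions, as is the 16-bit sample format.

```javascript
// Hypothetical hardening sketch, not the actual server.mjs code.
const SONG_PARTS = ["acappella_upper", "accomp_upper", "accomp_lower"];

function buildFfmpegArgs(songs, title, part) {
  // Accept only titles and parts that are already registered.
  if (!Object.hasOwn(songs, title) || !SONG_PARTS.includes(part)) {
    throw new Error("unknown song or part");
  }
  const fileName = songs[title][part]; // taken from the registry, never from the request
  if (typeof fileName !== "string" || fileName === "") {
    throw new Error("unknown song or part");
  }
  // Argument array: convert to 16 kHz mono signed 16-bit PCM on stdout.
  return ["-i", fileName, "-f", "s16le", "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", "pipe:1"];
}

const registry = { "十年": { acappella_upper: "十年高潮上清唱.wav" } };
const ffmpegArgs = buildFfmpegArgs(registry, "十年", "acappella_upper");
console.log(ffmpegArgs.join(" "));

try {
  buildFfmpegArgs(registry, "x.wav; rm -rf ~", "acappella_upper");
} catch (err) {
  console.log(err.message); // "unknown song or part"
}
```

The array is meant for `execFile` or `spawn` from `node:child_process`, which hand arguments directly to the process rather than through a shell, so metacharacters in a request are never interpreted.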

Capability Tags

requires-sensitive-credentials

Capability Assessment

⚠ Purpose & Capability

The SKILL.md and server.mjs describe a Node-based web server (server.mjs) that drives a browser front-end and asks the user for NuwaAI credentials and an LLM API. However the package also includes an unrelated-sounding Python agent (chorus_agent.py) with its own hard-coded Dashscope API key and different song directory conventions. The presence of a Python agent and baked-in credentials is not justified by the Node-centric Quick Start in SKILL.md and appears incoherent.

⚠ Instruction Scope

SKILL.md instructs users to run node server.mjs and to place song files in the skill root; it does not mention running chorus_agent.py. chorus_agent.py on the other hand expects audio files in a '清唱' folder, calls out to Dashscope using an embedded API key, and runs local playback via afplay (macOS). The instructions do not mention these behaviors, so runtime behavior could differ from what users expect.

ℹ Install Mechanism

There is no install spec (instruction-only), which limits automatic installation risk. The package.json declares one dependency (ws) which is reasonable for a WebSocket front-end. Node.js and ffmpeg are required per SKILL.md — this is proportionate to the media-processing use case. No remote downloads or extract-from-URL installs are present.

⚠ Credentials

SKILL.md asks for a NuwaAI API key and LLM configuration (DASHSCOPE_BASE_URL, DASHSCOPE_API_KEY). Those are reasonable. But the codebase contains hard-coded API keys (see chorus_agent.py and server.mjs DEMO_CONFIG) embedded in files rather than declared as required env variables; this both leaks secrets and suggests the shipped code may call external LLM endpoints without the user's explicit configuration. The skill also does not declare required env vars in metadata despite needing LLM and NuwaAI credentials.

✓ Persistence & Privilege

The always flag is false and there is no indication the skill force-enables itself. The server writes a local .nuwa-config.json to persist user-entered NuwaAI credentials, which is consistent with the described UI behavior. No other system-wide or cross-skill modifications are present.

Version History

v1.3.0

Removed private deployment URLs and API keys from defaults. Sanitized .nuwa-config.json to use placeholders.

v1.2.0

Added example song (十年 by 陈奕迅) in assets/songs/ so users can try immediately

v1.1.0

Improved user preparation docs: clear step-by-step guide for NuwaAI account, LLM API, song audio files, song registry, FFmpeg and Node.js requirements

v1.0.0

Initial release: NuwaAI avatar singing with lip-sync, duet battle mode, accompaniment sync, ASR voice capture, scoring and blind box rewards

Metadata

Slug digital-singer

Version 1.3.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 4

Frequently Asked Questions

What is Digital Singer?

Turn your NuwaAI digital avatar into a singing performer! The avatar sings with lip-synced mouth movements driven by vocal audio, with synchronized backgroun... It is an AI Agent Skill for Claude Code / OpenClaw, with 75 downloads so far.

How do I install Digital Singer?

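Run "/install digital-singer" in the OpenClaw or Claude Code chat to install it in one step. The prerequisites listed under User Preparation (NuwaAI credentials, an LLM API, song audio files, FFmpeg, and Node.js 18+) still need to be in place before the skill will run.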

Is Digital Singer free?

Yes, Digital Singer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Digital Singer support?

Digital Singer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Digital Singer?

It is built and maintained by jianglingling007 (@jianglingling007); the current version is v1.3.0.
