
Digital Singer

Author: jianglingling007 · GitHub · v1.3.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 75
Favorites: 0
Current installs: 0
Versions: 4
Install in OpenClaw
/install digital-singer
Description
Turn your NuwaAI digital avatar into a singing performer! The avatar sings with lip-synced mouth movements driven by vocal audio, with synchronized backgroun...
Usage Instructions (SKILL.md)

🎤 Digital Singer — NuwaAI Avatar Singing

Turn your NuwaAI digital avatar into a singing performer with lip-synced mouth movements.

How It Works

Audio files → FFmpeg PCM conversion → NuwaAI humanctrl avatar messages → Avatar sings with lip-sync
                                    + Browser <audio> plays accompaniment in sync
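
The FFmpeg step in the pipeline above can be sketched as follows. This is a minimal illustration, not the skill's actual code: `ffmpegPcmArgs` is a hypothetical helper, and 16-bit little-endian samples (`s16le`) are an assumption, since the NuwaAI Integration section only specifies "16kHz mono PCM".

```javascript
// Hypothetical helper: builds the FFmpeg argument list that converts any
// FFmpeg-readable input file into raw PCM written to stdout.
// Assumption: 16-bit little-endian samples; the document only states
// "16kHz mono PCM".
function ffmpegPcmArgs(inputPath) {
  return [
    "-hide_banner",
    "-loglevel", "error",
    "-i", inputPath,        // source audio (WAV, MP3, ...)
    "-f", "s16le",          // headerless raw PCM output
    "-acodec", "pcm_s16le", // 16-bit little-endian samples (assumed)
    "-ar", "16000",         // 16 kHz sample rate
    "-ac", "1",             // mono
    "pipe:1",               // write the result to stdout
  ];
}

// Print the equivalent command line for the bundled demo vocal file.
console.log(["ffmpeg", ...ffmpegPcmArgs("十年高潮上清唱.wav")].join(" "));
```

One way to run it is `execFile("ffmpeg", ffmpegPcmArgs(file), { encoding: "buffer", maxBuffer: 64 * 1024 * 1024 }, callback)` from `node:child_process`. Passing the arguments as an array avoids shell interpolation, and a raised `maxBuffer` is needed because about a minute of 16 kHz 16-bit mono audio already exceeds the 1 MB default.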

Key concept: Two audio streams work together:

  • speech (vocal-only): Drives the avatar's mouth movements via NuwaAI avatar control messages
  • music (accompaniment): Plays simultaneously via browser <audio> element

Battle Flow

  1. Avatar greets user, lists available songs
  2. User picks a song
  3. Avatar sings the upper half (vocal drives lip-sync + accompaniment plays in sync)
  4. Avatar says "your turn" → accompaniment for lower half plays → user sings along (ASR captures voice)
  5. Battle scoring + blind box reward
  6. Ask to continue
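
The six steps above form a small linear state machine with one branch at the end. The sketch below is illustrative only: the state names and the `nextState` function are hypothetical, and returning to song selection when the user continues is an assumption.

```javascript
// Hypothetical state names mirroring steps 1-5 of the battle flow.
const NEXT_STATE = {
  greeting: "song_selection",     // 1. avatar greets, lists songs
  song_selection: "avatar_sings", // 2. user picks a song
  avatar_sings: "user_sings",     // 3. avatar sings the upper half
  user_sings: "scoring",          // 4. user sings the lower half
  scoring: "ask_continue",        // 5. scoring + blind box reward
};

// Returns the state that follows `state`. Step 6 ("ask to continue") is the
// only branch: loop back to song selection (assumed) or finish.
function nextState(state, wantsAnotherRound = false) {
  if (state === "ask_continue") {
    return wantsAnotherRound ? "song_selection" : "done";
  }
  if (!Object.hasOwn(NEXT_STATE, state)) {
    throw new Error(`Unknown state: ${state}`);
  }
  return NEXT_STATE[state];
}

// Walk one full round and print the visited states.
let state = "greeting";
const visited = [state];
while (state !== "ask_continue") {
  state = nextState(state);
  visited.push(state);
}
console.log(visited.join(" -> "));
```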

User Preparation (What You Need Before Using)

1. NuwaAI Account (Required)

Sign up at nuwaai.com (free) and create a digital avatar. You need:

  • API Key — from your NuwaAI dashboard
  • Avatar ID — the avatar you want to sing
  • User ID — your NuwaAI user ID

Enter these in the browser interface, or pre-configure in .nuwa-config.json:

{
  "avatarId": "your-avatar-id",
  "apiKey": "your-api-key",
  "userId": "your-user-id"
}
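
Before starting the server it can help to confirm that the three fields are actually filled in. The following is a minimal sketch; `missingNuwaFields` is a hypothetical helper, and treating values that still start with `your-` as unset is an assumption based on the placeholder values shown above.

```javascript
const REQUIRED_FIELDS = ["avatarId", "apiKey", "userId"];

// Hypothetical helper: returns the names of required fields that are absent,
// empty, or still set to a "your-..." placeholder.
function missingNuwaFields(config) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = config?.[field];
    return (
      typeof value !== "string" ||
      value.trim() === "" ||
      value.startsWith("your-")
    );
  });
}

// The template above, unmodified, reports every field as missing.
const template = {
  avatarId: "your-avatar-id",
  apiKey: "your-api-key",
  userId: "your-user-id",
};
console.log(missingNuwaFields(template));
```

In practice the argument would come from something like `JSON.parse(fs.readFileSync(".nuwa-config.json", "utf8"))`.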

2. LLM API (Required)

The singing host (conversation agent) needs an OpenAI-compatible LLM API. Configure in server.mjs or via environment variables:

  • DASHSCOPE_BASE_URL — API endpoint (default: Dashscope)
  • DASHSCOPE_API_KEY — API key
  • QWEN_MODEL — Model name (default: qwen-plus)
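
Reading these variables might look like the sketch below. `loadLlmConfig` is a hypothetical function, not the code in server.mjs; failing fast when the key is missing is a design choice of this sketch, and the default endpoint is deliberately left to the server because its exact value is not given here.

```javascript
// Hypothetical loader for the three documented environment variables.
function loadLlmConfig(env = process.env) {
  const apiKey = env.DASHSCOPE_API_KEY;
  if (!apiKey) {
    // Assumption of this sketch: refuse to start rather than fall back
    // to any key embedded in the source.
    throw new Error("DASHSCOPE_API_KEY is not set");
  }
  return {
    baseUrl: env.DASHSCOPE_BASE_URL, // undefined: let the server use its default
    apiKey,
    model: env.QWEN_MODEL || "qwen-plus", // documented default model
  };
}

// Example with an explicit environment object instead of process.env.
console.log(loadLlmConfig({ DASHSCOPE_API_KEY: "sk-example" }).model);
```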

3. Song Audio Files (Required)

Each song needs 3 audio files placed in the skill directory:

File | Purpose | Example
{song}高潮上清唱.wav | Upper half vocal (a cappella); drives avatar lip-sync | 十年高潮上清唱.wav
{song}高潮上伴奏.MP3 | Upper half accompaniment; plays in sync with avatar | 十年高潮上伴奏.MP3
{song}高潮下伴奏.MP3 | Lower half accompaniment; plays when user sings | 十年高潮下伴奏.MP3
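
The naming convention in the table can be expressed as a small function. `songFileNames` is a hypothetical helper; its keys reuse the field names from the SONGS registry so the result can be pasted into a registry entry.

```javascript
// Hypothetical helper: derives the three expected file names for a song title
// following the {song}高潮上清唱 / 高潮上伴奏 / 高潮下伴奏 convention.
function songFileNames(song) {
  return {
    acappella_upper: `${song}高潮上清唱.wav`, // upper half vocal, drives lip-sync
    accomp_upper: `${song}高潮上伴奏.MP3`,    // upper half accompaniment
    accomp_lower: `${song}高潮下伴奏.MP3`,    // lower half accompaniment
  };
}

console.log(songFileNames("十年"));
```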

How to prepare these files:

  • Use any audio editing tool (e.g. Audacity, Adobe Audition) to split songs into upper/lower halves
  • Use vocal separation tools (e.g. UVR, Demucs) to extract a cappella (vocal-only) from the upper half
  • Export accompaniment as MP3, vocal as WAV (any FFmpeg-supported format works)
  • Place all files in the skill directory (same folder as server.mjs)

4. Song Registry (Required)

After preparing audio files, register each song in server.mjs SONGS object:

const SONGS = {
  "十年": {
    artist: "陈奕迅",
    acappella_upper: "十年高潮上清唱.wav",
    accomp_upper: "十年高潮上伴奏.MP3",
    accomp_lower: "十年高潮下伴奏.MP3",
  },
  // Add more songs...
};
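
A registry entry is only useful if the files it names are actually present. The sketch below reports entries whose files are missing; `missingSongFiles` is a hypothetical helper, and the existence check is injected so the function stays independent of where the skill directory lives.

```javascript
const AUDIO_FIELDS = ["acappella_upper", "accomp_upper", "accomp_lower"];

// Hypothetical helper: lists every registry field whose file is unset or
// not found. `fileExists` is any function (name) => boolean.
function missingSongFiles(songs, fileExists) {
  const missing = [];
  for (const [title, entry] of Object.entries(songs)) {
    for (const field of AUDIO_FIELDS) {
      const file = entry[field];
      if (!file || !fileExists(file)) {
        missing.push(`${title}: ${field} (${file ?? "not set"})`);
      }
    }
  }
  return missing;
}

// Example: the lower-half accompaniment has not been copied yet.
const present = new Set(["十年高潮上清唱.wav", "十年高潮上伴奏.MP3"]);
const registry = {
  "十年": {
    artist: "陈奕迅",
    acappella_upper: "十年高潮上清唱.wav",
    accomp_upper: "十年高潮上伴奏.MP3",
    accomp_lower: "十年高潮下伴奏.MP3",
  },
};
console.log(missingSongFiles(registry, (name) => present.has(name)));
```

With real files, the check could be `(name) => fs.existsSync(path.join(skillDir, name))`. File names are case-sensitive on Linux, so the `.MP3` extension must match exactly.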

5. FFmpeg (Required)

Install FFmpeg for audio format conversion:

# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg

6. Node.js 18+ (Required)

node --version  # must be >= 18
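
The same check can be done from inside a script, for example at the top of a launcher. This is a small sketch; `isSupportedNode` is a hypothetical helper.

```javascript
// Hypothetical helper: true when the major version is at least `minMajor`.
// process.versions.node has the form "18.17.0" (no leading "v").
function isSupportedNode(version = process.versions.node, minMajor = 18) {
  const major = Number.parseInt(version.split(".")[0], 10);
  return major >= minMajor;
}

if (!isSupportedNode()) {
  console.error(`Node.js 18+ is required, found ${process.versions.node}`);
}
```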

Quick Start

  1. Complete all preparation steps above
  2. Copy example song files from assets/songs/ to the skill directory:
    cp <skill-dir>/assets/songs/* <skill-dir>/
    
    This includes a ready-to-use demo song: 十年 (陈奕迅) with vocal, upper and lower accompaniment files.
  3. Start the server:
    node <skill-dir>/server.mjs
    
  4. Open http://localhost:3098 in browser
  5. Enter NuwaAI credentials (if not pre-configured)
  6. Pick a song and start singing!

Included Example Song

The skill ships with one demo song in assets/songs/:

  • 十年高潮上清唱.wav — vocal (a cappella)
  • 十年高潮上伴奏.MP3 — upper half accompaniment
  • 十年高潮下伴奏.MP3 — lower half accompaniment

Copy them to the skill root directory to use. The song "十年" is pre-registered in server.mjs.

NuwaAI Integration

Uses humanctrl WebSocket with ASR enabled. Avatar control message format:

{
  "type": "avatar",
  "data": {
    "content": "",
    "audio": {
      "segment": 0,
      "speech": "<base64 PCM>",
      "music": "<base64 PCM>"
    }
  }
}

  • speech: Vocal-only PCM driving lip-sync (≤10KB per chunk)
  • music: Same as speech for a cappella mode
  • segment: Chunk index, -1 = last chunk
  • Audio: 16kHz mono PCM, base64 encoded
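
Putting the message format and the chunking rules together, a PCM buffer could be split into avatar messages roughly as follows. This is a sketch under stated assumptions, not the skill's implementation: `buildAvatarMessages` is a hypothetical name, the 10 KB limit is taken to apply to the raw PCM bytes before base64 encoding, and `music` simply repeats `speech` as described for a cappella mode.

```javascript
// Hypothetical helper: splits raw PCM into avatar control messages.
// Assumption: the "≤10KB per chunk" limit refers to raw bytes, so each chunk
// holds at most 10 * 1024 bytes (an even number, which keeps 16-bit samples
// intact if the PCM is 16-bit).
function buildAvatarMessages(pcm, chunkBytes = 10 * 1024) {
  const messages = [];
  for (let offset = 0, index = 0; offset < pcm.length; offset += chunkBytes, index++) {
    const chunk = pcm.subarray(offset, offset + chunkBytes);
    const isLast = offset + chunkBytes >= pcm.length;
    const encoded = chunk.toString("base64");
    messages.push({
      type: "avatar",
      data: {
        content: "",
        audio: {
          segment: isLast ? -1 : index, // -1 marks the final chunk
          speech: encoded,              // drives lip-sync
          music: encoded,               // same as speech in a cappella mode
        },
      },
    });
  }
  return messages;
}

// 25,000 bytes of silence become three messages with segments 0, 1, -1.
const demo = buildAvatarMessages(Buffer.alloc(25000));
console.log(demo.map((m) => m.data.audio.segment));
```

Each message would then be serialized with `JSON.stringify` and sent over the humanctrl WebSocket. An empty buffer yields no messages at all, so a caller that needs an explicit end marker would have to handle that case separately.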

Features

  • 🎤 Avatar lip-sync singing via NuwaAI humanctrl
  • 🎵 Synchronized accompaniment playback
  • 🗣️ ASR voice capture for user singing
  • 🎯 Fun battle scoring system
  • 🎁 Blind box rewards
  • ⏸️ Interruptible speech (not during singing)
  • 📱 Responsive web interface

Requirements

  • Node.js 18+
  • FFmpeg (for audio → PCM conversion)
  • NuwaAI account with avatar
  • Modern browser (WebRTC + microphone)
Security Recommendations
What to consider before installing:

  • Do not run the code until you review/remove embedded keys: the package contains hard-coded API keys (in chorus_agent.py and a demo key in server.mjs). These keys are unexpected; using them could expose your usage to a third party. If you or your org ever used those keys, rotate them immediately.
  • Clarify which component you intend to run: SKILL.md and Quick Start describe running the Node server (server.mjs) and a browser UI. There is also a separate Python agent (chorus_agent.py) that is not mentioned in the README and behaves differently (plays audio via afplay, uses a different songs folder, calls the LLM service directly). If you only want the Node web UI, do not run the Python agent.
  • Remove or replace hard-coded credentials and use environment variables: server.mjs already supports env vars for DASHSCOPE_*, QWEN_MODEL and stores Nuwa credentials in .nuwa-config.json. Replace any embedded keys with your own keys stored in environment variables or local config files and never commit secrets to code.
  • Inspect network calls and endpoints: the code talks to Dashscope-compatible endpoints (https://dashscope.aliyuncs.com/...) and expects NuwaAI humanctrl WebSocket. Confirm these endpoints are what you expect and that no unexpected remote hosts are contacted.
  • Run in an isolated environment: if you decide to test, run the code in a sandbox or container, do not expose any sensitive credentials, and avoid running chorus_agent.py unless you fully audit it. Check for platform-specific commands (afplay) and remove/modify them for your OS.
  • File placement mismatch: make sure song files are placed where the server actually expects them (SKILL.md says skill root; python agent expects a '清唱' directory). Harmonize filenames/paths before use.
  • If you lack confidence in the origin (homepage unknown, source unknown), prefer not to run it on production machines.

Ask the publisher/maintainer for clarification about the Python agent, the embedded keys, and confirm which files are required.
Functional Analysis
Type: OpenClaw Skill
Name: digital-singer
Version: 1.3.0

The skill contains a critical Remote Code Execution (RCE) vulnerability in `server.mjs` where the `/api/song/pcm` endpoint passes unsanitized user input directly into a shell command via `execSync` and `ffmpeg`. Additionally, multiple files (`server.mjs` and `chorus_agent.py`) contain hardcoded API keys for DashScope and NuwaAI, which is a significant security risk. While these appear to be unintentional security flaws rather than deliberate malware, the lack of input validation and the exposure of sensitive credentials make the bundle highly risky.
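
One common way to close this kind of injection hole is to never place request input in a command at all: look the requested song up in the SONGS registry and pass only the registry's own file name to FFmpeg. The sketch below illustrates the lookup. `resolveSongFile` is a hypothetical function, and the parameter names are assumptions, since the actual request shape of `/api/song/pcm` is not shown in this listing.

```javascript
const SONG_FIELDS = ["acappella_upper", "accomp_upper", "accomp_lower"];

// Hypothetical allow-list lookup: returns a file name taken from the registry,
// or throws if the request does not match a registered song and audio part.
function resolveSongFile(songs, title, field) {
  // Object.hasOwn rejects inherited keys such as "constructor" or "__proto__".
  if (typeof title !== "string" || !Object.hasOwn(songs, title)) {
    throw new Error("Unknown song");
  }
  if (!SONG_FIELDS.includes(field)) {
    throw new Error("Unknown audio part");
  }
  return songs[title][field];
}

const SONGS = {
  "十年": {
    artist: "陈奕迅",
    acappella_upper: "十年高潮上清唱.wav",
    accomp_upper: "十年高潮上伴奏.MP3",
    accomp_lower: "十年高潮下伴奏.MP3",
  },
};

console.log(resolveSongFile(SONGS, "十年", "acappella_upper"));
try {
  resolveSongFile(SONGS, "x.wav; rm -rf ~", "acappella_upper");
} catch (err) {
  console.log("rejected:", err.message);
}
```

The returned name can then be passed as one element of an argument array to `execFile` from `node:child_process`, which does not spawn a shell by default, instead of being interpolated into an `execSync` command string.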
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The SKILL.md and server.mjs describe a Node-based web server (server.mjs) that drives a browser front-end and asks the user for NuwaAI credentials and an LLM API. However the package also includes an unrelated-sounding Python agent (chorus_agent.py) with its own hard-coded Dashscope API key and different song directory conventions. The presence of a Python agent and baked-in credentials is not justified by the Node-centric Quick Start in SKILL.md and appears incoherent.
Instruction Scope
SKILL.md instructs users to run node server.mjs and to place song files in the skill root; it does not mention running chorus_agent.py. chorus_agent.py on the other hand expects audio files in a '清唱' folder, calls out to Dashscope using an embedded API key, and runs local playback via afplay (macOS). The instructions do not mention these behaviors, so runtime behavior could differ from what users expect.
Install Mechanism
There is no install spec (instruction-only), which limits automatic installation risk. The package.json declares one dependency (ws) which is reasonable for a WebSocket front-end. Node.js and ffmpeg are required per SKILL.md — this is proportionate to the media-processing use case. No remote downloads or extract-from-URL installs are present.
Credentials
SKILL.md asks for NuwaAI API key and LLM configuration (DASHSCOPE_BASE_URL, DASHSCOPE_API_KEY). Those are reasonable. But the codebase contains hard-coded API keys (see chorus_agent.py and server.mjs DEMO_CONFIG) embedded in files rather than declared as required env variables; this both leaks secrets and suggests the shipped code may call external LLM endpoints without the user's explicit configuration. The skill also does not declare required env vars in metadata despite needing LLM and NuwaAI credentials.
Persistence & Privilege
always is false and there is no indication the skill force-enables itself. The server writes a local .nuwa-config.json to persist user-entered NuwaAI credentials which is consistent with the described UI behavior. No other system-wide or cross-skill modifications are present.
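How to Use
  1. Make sure OpenClaw is installed (local or Docker deployment)
  2. Enter the install command in the chat box: /install digital-singer
  3. Once installed, invoke the skill by name or trigger it with /digital-singer
  4. Provide the required inputs according to the skill's parameter description to get structured output
Version History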
v1.3.0
Removed private deployment URLs and API keys from defaults. Sanitized .nuwa-config.json to use placeholders.
v1.2.0
Added example song (十年 by 陈奕迅) in assets/songs/ so users can try immediately
v1.1.0
Improved user preparation docs: clear step-by-step guide for NuwaAI account, LLM API, song audio files, song registry, FFmpeg and Node.js requirements
v1.0.0
Initial release: NuwaAI avatar singing with lip-sync, duet battle mode, accompaniment sync, ASR voice capture, scoring and blind box rewards
Metadata
Slug: digital-singer
Version: 1.3.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 4
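FAQ

What is Digital Singer?

Turn your NuwaAI digital avatar into a singing performer! The avatar sings with lip-synced mouth movements driven by vocal audio, with synchronized backgroun... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 75 total downloads to date.

How do I install Digital Singer?

Run the command "/install digital-singer" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required to install.

Is Digital Singer free?

Yes. Digital Singer is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does Digital Singer support?

Digital Singer is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Digital Singer?

It is developed and maintained by jianglingling007 (@jianglingling007); the current version is v1.3.0.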
