gmsx000-cloud

digital-human-training

by gmsx000-cloud · GitHub ↗ · v1.0.0
cross-platform ✓ Security Clean
437 Downloads · 1 Star · 2 Active Installs · 1 Version
Install in OpenClaw
/install digital-human-training
Description
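Digital Human Training and Deployment Skill: end-to-end training advice and technical support, from voice cloning and lip sync to real-time interactive digital humans.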
README (SKILL.md)

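Digital Human Training and Deployment Skill

Provides end-to-end guidance for building real-time interactive digital humans, from source-material collection through model training.

Core Capabilities

  • 🎙️ Voice Cloning: guidance on high-fidelity voice training with GPT-SoVITS or Fish Speech.
  • 😶 Lip Sync: technical approaches adapted to SadTalker, Live2D, or Wav2Lip.
  • 🧠 Brain Integration (LLM): connects OpenClaw's logic layer to the digital human's visual layer.
  • Real-Time Inference: optimizes inference latency to reach interaction feedback in < 500 ms.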
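
As a rough illustration of the sub-500 ms target above, the sketch below sums per-stage latencies and compares the total with the budget. The stage names and millisecond figures are placeholder assumptions, not measurements from this skill or from any of the tools it names.

```python
from dataclasses import dataclass

BUDGET_MS = 500.0  # end-to-end target stated in the capability list


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: float


def total_latency_ms(stages):
    """Sum the latency of stages that run one after another."""
    return sum(stage.latency_ms for stage in stages)


def within_budget(stages, budget_ms=BUDGET_MS):
    """True when the summed latency is strictly below the budget."""
    return total_latency_ms(stages) < budget_ms


def slowest_stage(stages):
    """Return the stage that contributes the most latency."""
    return max(stages, key=lambda stage: stage.latency_ms)


# Hypothetical numbers for illustration only; measure your own pipeline.
EXAMPLE_PIPELINE = [
    Stage("speech recognition", 120.0),
    Stage("LLM first token", 180.0),
    Stage("voice synthesis (first chunk)", 110.0),
    Stage("lip sync (first frames)", 80.0),
]
```

Calling `within_budget(EXAMPLE_PIPELINE)` with real measurements shows whether the pipeline meets the target, and `slowest_stage` points at the stage worth optimizing first.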

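Technical Roadmap

  1. Material preparation: HD video (green-screen background) and a clean 1-3 minute dry vocal sample.
  2. Model selection
    • 2D real person: the HeyGen route, or a self-hosted Easy-Wav2Lip deployment.
    • 3D/Live2D: Unity integration.
  3. Deployment options: local GPU (NVIDIA RTX) vs. cloud API.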
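
The 1-3 minute sample length from step 1 can be checked before training. This minimal sketch uses Python's standard `wave` module and assumes the recording is an uncompressed WAV file; the function names are illustrative and not part of the skill.

```python
import wave

MIN_SECONDS = 60.0   # lower bound of the 1-3 minute recommendation
MAX_SECONDS = 180.0  # upper bound of the 1-3 minute recommendation


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_valid_sample_length(seconds):
    """True when the duration falls inside the recommended range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

This only verifies length. It does not assess background noise or reverb, so the "dry" quality of the recording still has to be judged by ear.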

Example Usage

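Prompt: I want to build a digital human that can answer questions in real time. How should I choose the stack?

Output:

  • Option A (self-built): GPT-SoVITS (voice) + Easy-Wav2Lip (visuals) + OpenClaw (logic).
  • Option B (low-code): HeyGen Streaming API integration.
  • Key advice: watch the synchronization latency between audio and video; streaming transport is recommended.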
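
One way to read the streaming advice above: rather than waiting for the full LLM reply, pass it to the voice and lip-sync stages one sentence at a time. This is a minimal sketch of that chunking step only. The splitting rule is an assumption made for illustration, and no API of GPT-SoVITS, Easy-Wav2Lip, or HeyGen is called.

```python
import re

# Split right after CJK sentence punctuation, or after ASCII sentence
# punctuation that is followed by whitespace (so "3.5" stays intact).
_SENTENCE_BOUNDARY = re.compile(r"(?<=[。!?])|(?<=[.!?])\s+")


def stream_sentences(text):
    """Yield sentence-sized chunks so downstream stages can start early."""
    for piece in _SENTENCE_BOUNDARY.split(text):
        piece = piece.strip()
        if piece:
            yield piece
```

Each yielded chunk could be synthesized and lip-synced while the next one is still being generated, which keeps the audio and video of a chunk together.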

Developed by 小爱 (Xiao Ai) | Derived from a digital human project

Usage Guidance
This skill is a how-to for building voice-cloned, lip-synced digital humans and appears internally consistent, but it handles highly sensitive media and refers to third-party/cloud services. Before using: (1) Ensure you have clear consent from any person whose voice/video will be used; (2) Source models and binaries from trusted repositories (avoid unknown downloads); (3) Keep API keys and cloud credentials private — the skill doesn't manage them, so provide them only to trusted integrations and never paste secrets into public channels; (4) Prefer local processing for high-risk content when feasible; (5) Confirm licenses and terms for third-party tools (e.g., HeyGen, Wav2Lip, SoVITS) and verify data-retention policies if using hosted APIs. If you want a stronger safety assessment, provide any external install scripts or a list of exact third-party endpoints/APIs you plan to integrate so those can be evaluated for coherence and risk.
Capability Analysis
Type: OpenClaw Skill
Name: digital-human-training
Version: 1.0.0
The skill bundle consists entirely of documentation (`SKILL.md`, `examples/minimal-deployment.md`) describing the process of training and deploying a digital human. There is no executable code, no external network calls, no file system access, and no instructions that could be interpreted as prompt injection attempts against the AI agent. The content is purely informational and aligns with the stated purpose, lacking any high-risk behaviors or indicators of malicious intent.
Capability Assessment
Purpose & Capability
The name/description match the SKILL.md content: it is a how-to for building voice-cloned, lip-synced digital humans. However, the instructions reference cloud services and third-party tools (e.g., HeyGen streaming API, private model deployments) without declaring how API keys or downloads are expected to be supplied — this is plausible for an instruction-only skill but worth noting.
Instruction Scope
The SKILL.md stays on-topic (collect audio/video samples, select models, wire up Whisper/TTS/LLM/Wav2Lip). It explicitly instructs collecting user voice/video samples (sensitive personal data). It does not direct the agent to read unrelated system files, environment variables, or exfiltrate data to hidden endpoints.
Install Mechanism
There is no install spec and no code files; this reduces risk because nothing is automatically written to disk or fetched by the skill. The guidance does recommend using external projects/tools, but those are not installed by the skill itself.
Credentials
The skill declares no required environment variables or credentials. That is consistent with being instruction-only, but some recommended integrations (cloud APIs like HeyGen or model-hosting services) would typically require API keys — the skill does not request them or explain credential handling, so the user must supply and manage those outside the skill.
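Because the skill leaves credential handling to the user, one common pattern is to read the key from an environment variable instead of hard-coding it. In this minimal sketch the variable name `HEYGEN_API_KEY` is a hypothetical choice, not something defined by the skill or confirmed by HeyGen.

```python
import os


def load_api_key(var_name="HEYGEN_API_KEY", environ=None):
    """Read an API key from the environment (variable name is hypothetical)."""
    env = os.environ if environ is None else environ
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} in the environment; do not hard-code it.")
    return key


def mask(secret, visible=4):
    """Hide all but the last few characters so the key is safe to log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging `mask(load_api_key())` rather than the raw value keeps the secret out of chat transcripts and log files.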
Persistence & Privilege
always is false and the skill is user-invocable; it does not request permanent presence or attempt to modify other skills or system settings. Autonomous model invocation is allowed by default (normal) and there are no extra privilege requests.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install digital-human-training
  3. After installation, invoke the skill by name or use /digital-human-training
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: provides end-to-end technical support and guidance for building and deploying interactive digital humans.
  • Covers the full process from data collection to model training.
  • Supports voice cloning (using GPT-SoVITS or Fish Speech) and lip sync (SadTalker, Live2D, Wav2Lip).
  • Enables logical integration with LLMs (OpenClaw) for real-time interaction.
  • Recommends optimized deployment methods (local GPU or cloud API) for response times under 500 ms.
  • Includes practical selection advice and example workflows for building live digital characters.
Metadata
Slug digital-human-training
Version 1.0.0
License
All-time Installs 2
Active Installs 2
Total Versions 1
Frequently Asked Questions

What is digital-human-training?

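Digital Human Training and Deployment Skill: end-to-end training advice and technical support, from voice cloning and lip sync to real-time interactive digital humans. It is an AI Agent Skill for Claude Code / OpenClaw, with 437 downloads so far.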

How do I install digital-human-training?

Run "/install digital-human-training" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is digital-human-training free?

Yes, digital-human-training is completely free (open-source). You can download, install and use it at no cost.

Which platforms does digital-human-training support?

digital-human-training is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created digital-human-training?

It is built and maintained by gmsx000-cloud (@gmsx000-cloud); the current version is v1.0.0.
