
digital-human-training

Name: digital-human-training
Author: gmsx000-cloud

Author: gmsx000-cloud · GitHub · v1.0.0

cross-platform ✓ Security check passed

Total downloads: 437


Install in OpenClaw

/install digital-human-training

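Description

Digital Human Training and Deployment Skill: provides end-to-end training advice and technical support, from voice cloning and lip sync to real-time interactive digital humans.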

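Usage Instructions (SKILL.md)

Digital Human Training and Deployment Skill

Provides end-to-end guidance for building real-time interactive digital humans, covering everything from material collection to model training.

Core Capabilities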

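🎙️ Voice Cloning: guidance on training high-fidelity voices with GPT-SoVITS or Fish Speech.
😶 Lip Sync: technical approaches built on SadTalker, Live2D, or Wav2Lip.
🧠 LLM Integration (the "brain"): connects OpenClaw's logic layer to the digital human's visual layer.
⚡ Real-Time Inference: optimizes inference latency to deliver digital human interaction feedback in under 500 ms.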
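The 500 ms target is an end-to-end figure, so it helps to split it across pipeline stages and see which one dominates. The following is a minimal sketch, assuming a strictly sequential four-stage pipeline (speech recognition, LLM, speech synthesis, lip sync). The stage names and per-stage millisecond values are illustrative placeholders, not measurements from this skill.

```python
from dataclasses import dataclass

TARGET_MS = 500  # end-to-end target stated in the skill description


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: float  # hypothetical figure; replace with your own measurement


def total_latency(stages):
    """Sum per-stage latency for a strictly sequential pipeline."""
    return sum(s.latency_ms for s in stages)


def slowest(stages):
    """Return the stage that contributes the most latency."""
    return max(stages, key=lambda s: s.latency_ms)


# Illustrative numbers only.
pipeline = [
    Stage("speech_recognition", 120),
    Stage("llm_first_token", 200),
    Stage("tts_first_chunk", 150),
    Stage("lip_sync_first_frame", 100),
]

total = total_latency(pipeline)
print(f"total={total} ms, over budget by {max(0, total - TARGET_MS)} ms")
print(f"optimize first: {slowest(pipeline).name}")
```

Measuring time to the first token, first audio chunk, and first frame (rather than time to full completion) is a deliberate choice here: with streaming, perceived latency depends on when output starts, not when it ends.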

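Technical Roadmap

Material preparation: high-definition video (green-screen background) and a clean 1-3 minute dry vocal sample.
Model selection:
- 2D real person: the HeyGen route, or a self-hosted Easy-Wav2Lip deployment.
- 3D/Live2D: Unity integration.
Deployment options: local GPU (NVIDIA RTX) vs. cloud API.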
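The 1-3 minute guideline for the dry vocal sample can be checked before any training run. The following is a minimal sketch using Python's standard `wave` module, assuming the recording is an uncompressed WAV file. The function names are hypothetical, and `write_silence` exists only to produce a stand-in file for trying the check without a real recording.

```python
import wave

MIN_SECONDS = 60    # lower bound of the 1-3 minute guideline
MAX_SECONDS = 180   # upper bound of the 1-3 minute guideline


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def within_guideline(seconds):
    """True if the duration falls inside the 1-3 minute range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS


def write_silence(path, seconds, rate=16000):
    """Write a silent mono 16-bit WAV file as a stand-in for a real recording."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
```

Calling `within_guideline(wav_duration_seconds(path))` on your own recording reports whether its length fits the range. It says nothing about noise or clarity, which still need to be checked by ear.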

Example Usage

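Prompt: I want to build a digital human that can answer questions in real time. How should I choose the tech stack?

Output:

Option A (self-built): GPT-SoVITS (voice) + Easy-Wav2Lip (visuals) + OpenClaw (logic).
Option B (low-code): HeyGen Streaming API integration.
Key recommendation: watch the synchronization delay between audio and video; streaming transmission is recommended.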
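The streaming advice can be illustrated with plain Python generators: each sentence is synthesized and rendered as soon as it is available, rather than after the full reply. This is a sketch under stated assumptions. `fake_tts` and `fake_lip_sync` are hypothetical stand-ins for whatever engines you deploy (they do not call GPT-SoVITS or Easy-Wav2Lip), and all durations are made up.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class AudioChunk:
    text: str
    start_ms: int      # position on the shared playback timeline
    duration_ms: int


@dataclass
class VideoSegment:
    start_ms: int
    duration_ms: int
    frame_count: int


def fake_tts(sentences: Iterable[str], ms_per_char: int = 60) -> Iterator[AudioChunk]:
    """Hypothetical TTS stand-in: yields one audio chunk per sentence as it becomes ready."""
    cursor = 0
    for sentence in sentences:
        duration = len(sentence) * ms_per_char
        yield AudioChunk(sentence, cursor, duration)
        cursor += duration


def fake_lip_sync(
    chunks: Iterable[AudioChunk], fps: int = 25
) -> Iterator[Tuple[AudioChunk, VideoSegment]]:
    """Hypothetical lip-sync stand-in: renders frames per chunk and reuses its timestamps."""
    for chunk in chunks:
        frames = round(chunk.duration_ms * fps / 1000)
        yield chunk, VideoSegment(chunk.start_ms, chunk.duration_ms, frames)


def max_drift_ms(pairs: Iterable[Tuple[AudioChunk, VideoSegment]], fps: int = 25) -> float:
    """Largest gap between audio length and rendered video length across chunks."""
    return max(abs(a.duration_ms - v.frame_count * 1000 / fps) for a, v in pairs)


reply = ["Hello.", "I am a digital human."]
pairs = list(fake_lip_sync(fake_tts(reply)))
for audio, video in pairs:
    print(audio.start_ms, audio.duration_ms, video.frame_count)
print("max drift (ms):", max_drift_ms(pairs))
```

Stamping every audio chunk and video segment with the same timeline position is what lets a player keep the two aligned. Rounding to whole frames bounds the per-chunk drift at half a frame interval in this sketch.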

Developed by 小爱 | Derived from a digital human project

Safe Usage Recommendations

This skill is a how-to for building voice-cloned, lip-synced digital humans and appears internally consistent, but it handles highly sensitive media and refers to third-party/cloud services. Before using: (1) Ensure you have clear consent from any person whose voice/video will be used; (2) Source models and binaries from trusted repositories (avoid unknown downloads); (3) Keep API keys and cloud credentials private — the skill doesn't manage them, so provide them only to trusted integrations and never paste secrets into public channels; (4) Prefer local processing for high-risk content when feasible; (5) Confirm licenses and terms for third-party tools (e.g., HeyGen, Wav2Lip, SoVITS) and verify data-retention policies if using hosted APIs. If you want a stronger safety assessment, provide any external install scripts or a list of exact third-party endpoints/APIs you plan to integrate so those can be evaluated for coherence and risk.

Functional Analysis

Type: OpenClaw Skill
Name: digital-human-training
Version: 1.0.0

The skill bundle consists entirely of documentation (`SKILL.md`, `examples/minimal-deployment.md`) describing the process of training and deploying a digital human. There is no executable code, no external network calls, no file system access, and no instructions that could be interpreted as prompt injection attempts against the AI agent. The content is purely informational and aligns with the stated purpose, lacking any high-risk behaviors or indicators of malicious intent.

Capability Assessment

ℹ Purpose & Capability

The name/description match the SKILL.md content: it is a how-to for building voice-cloned, lip-synced digital humans. However, the instructions reference cloud services and third-party tools (e.g., HeyGen streaming API, private model deployments) without declaring how API keys or downloads are expected to be supplied — this is plausible for an instruction-only skill but worth noting.

ℹ Instruction Scope

The SKILL.md stays on-topic (collect audio/video samples, select models, wire up Whisper/TTS/LLM/Wav2Lip). It explicitly instructs collecting user voice/video samples (sensitive personal data). It does not direct the agent to read unrelated system files, environment variables, or exfiltrate data to hidden endpoints.

✓ Install Mechanism

There is no install spec and no code files; this reduces risk because nothing is automatically written to disk or fetched by the skill. The guidance does recommend using external projects/tools, but those are not installed by the skill itself.

ℹ Credentials

The skill declares no required environment variables or credentials. That is consistent with being instruction-only, but some recommended integrations (cloud APIs like HeyGen or model-hosting services) would typically require API keys — the skill does not request them or explain credential handling, so the user must supply and manage those outside the skill.

✓ Persistence & Privilege

`always` is set to false and the skill is user-invocable; it does not request permanent presence or attempt to modify other skills or system settings. Autonomous model invocation is allowed by default (normal), and there are no extra privilege requests.

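How to Use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install digital-human-training
Once installed, invoke the Skill by name or trigger it with /digital-human-training
Provide the required inputs according to the Skill's parameter description to get structured output.

Version History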

v1.0.0

Initial release: provides end-to-end technical support and guidance for building and deploying interactive digital humans.
- Covers the full process from data collection to model training.
- Supports voice cloning (using GPT-SoVITS or Fish Speech) and lip-sync (SadTalker, Live2D, Wav2Lip).
- Enables logical integration with LLMs (OpenClaw) for real-time interaction.
- Recommends optimized deployment methods (local GPU or cloud API) for response times under 500ms.
- Includes practical selection advice and example workflows for building live digital characters.

Metadata

Slug digital-human-training

Version 1.0.0

License: not specified

Total installs 2

Current installs 2

Historical versions 1

FAQ