oliviapp8

Digital Avatar

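Author: Olivia_Pp · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 159
Favorites: 2
Current installs: 0
Versions: 1
Install in OpenClaw
/install digital-avatar
Description
Generates digital avatars (virtual personas) and produces talking-head videos. Supports multiple backends: Kling (可灵), Jimeng (即梦), HeyGen, D-ID, and Synthesia. Takes an appearance description or a photo of a real person as input and outputs a digital avatar resource ID or a talking-head video clip. Trigger words: 数字人, 虚拟人, AI主播, avatar, 口播视频, talking head.
Usage instructions (SKILL.md)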

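Digital Avatar Generation

Features

  1. Create a digital avatar: generate an avatar from a description or a photo
  2. Voice cloning (optional): upload an audio sample → clone the voice
  3. Generate a talking-head video: avatar + script/audio → talking-head video clip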

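Voice source options

Method | Description | Recommended scenario
Built-in platform TTS | Use one of the platform's preset voices | Quick tests
Uploaded audio | Provide a pre-recorded audio file | A voice-over already exists
Platform voice cloning | Upload a sample → the platform clones it | Recommended; keeps the whole pipeline on one platform
External TTS | Generate audio with MiniMax or a similar service, then upload it | When the platform does not support cloning

Recommendation: prefer the platform's own voice cloning to keep the backend consistent.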

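⚠️ Important: backend consistency principle

A digital avatar project must use the same backend from start to finish!

  • A Kling avatar_id and a Jimeng avatar_id are not interchangeable
  • An avatar created in Jimeng cannot be used in Kling, and vice versa
  • Once a backend is chosen, use it for every step from avatar creation to talking-head generation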
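
One way to enforce this rule in calling code is to store each avatar ID together with the backend that produced it and to reject cross-backend use. The following is a minimal sketch; `AvatarRef` and `ensure_same_backend` are hypothetical helper names and are not part of the skill.

```python
from dataclasses import dataclass

# Backend names as listed in this document's parameter tables.
SUPPORTED_BACKENDS = {"kling", "jimeng", "heygen", "d-id", "synthesia"}


@dataclass(frozen=True)
class AvatarRef:
    """An avatar ID paired with the backend that created it (hypothetical helper)."""
    avatar_id: str
    backend: str


def ensure_same_backend(avatar: AvatarRef, requested_backend: str) -> None:
    """Raise ValueError if a later step targets a different backend than the avatar's."""
    if requested_backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unknown backend: {requested_backend!r}")
    if avatar.backend != requested_backend:
        raise ValueError(
            f"avatar {avatar.avatar_id!r} was created on {avatar.backend!r} "
            f"and cannot be used with {requested_backend!r}"
        )
```

Calling `ensure_same_backend` before every voice-clone or generate step turns a silent cross-backend mistake into an immediate error.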

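Supported backends

Backend   Avatar   Talking-head video   Voice cloning   Characteristics
Kling (可灵)   High quality; first choice among Chinese-made backends
Jimeng (即梦)   Fast; good Chinese lip sync; Jianying (剪映) ecosystem
HeyGen   Rich template library; suited to overseas/English-language content
D-ID   -   Simple and fast
Synthesia   Enterprise-grade, multilingual

Recommendation: for projects in mainland China, prefer Kling or Jimeng; pick one of the two and use it throughout.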

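Workflow

Flow A: Create a digital avatar

Input: appearance description / photo of a real person
  ↓
Choose a backend
  ↓
Call the API to generate the avatar
  ↓
Output: avatar_id + preview image

Flow B: Generate a talking-head video

Input: avatar_id + script text/audio
  ↓
Call the backend's talking-head API
  ↓
Wait for rendering
  ↓
Output: video file URL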
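
The "wait for rendering" step in Flow B is typically a polling loop. The sketch below assumes the caller supplies `fetch_status`, a function that wraps the chosen backend's status endpoint and returns a dict shaped like the `video` output described in this document. The `failed` status value, the poll interval, and the timeout are assumptions, not documented behavior of any backend.

```python
import time
from typing import Any, Callable, Mapping


def wait_for_video(
    fetch_status: Callable[[], Mapping[str, Any]],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the render completes and return the video URL."""
    waited = 0.0
    while True:
        video = fetch_status()
        status = video.get("status")
        if status == "completed":
            return video["url"]
        if status == "failed":  # assumed failure marker
            raise RuntimeError("rendering failed")
        if waited >= timeout_seconds:
            raise TimeoutError(f"video not ready after {waited:.0f} s")
        sleep(poll_seconds)
        waited += poll_seconds
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.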

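Input parameters

Create a digital avatar

Parameter | Required | Description
mode | yes | create
backend | - | kling / jimeng / heygen / d-id / synthesia
description | one of two | Appearance description
photo | one of two | Path to a photo of a real person
style | - | realistic / cartoon / 3d
gender | - | male / female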
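
The "one of two" rule and the enumerated values above can be checked before any API call is made. This is a minimal sketch; `build_create_request` is a hypothetical helper, and the returned dict is an illustrative shape rather than any backend's actual request format.

```python
from typing import Optional

BACKENDS = {"kling", "jimeng", "heygen", "d-id", "synthesia"}
STYLES = {"realistic", "cartoon", "3d"}
GENDERS = {"male", "female"}


def build_create_request(
    description: Optional[str] = None,
    photo: Optional[str] = None,
    backend: Optional[str] = None,
    style: Optional[str] = None,
    gender: Optional[str] = None,
) -> dict:
    """Validate create-mode parameters and return them as a dict."""
    # Exactly one of description / photo must be given.
    if (description is None) == (photo is None):
        raise ValueError("provide exactly one of description or photo")
    for name, value, allowed in (
        ("backend", backend, BACKENDS),
        ("style", style, STYLES),
        ("gender", gender, GENDERS),
    ):
        if value is not None and value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    request = {
        "mode": "create",
        "description": description,
        "photo": photo,
        "backend": backend,
        "style": style,
        "gender": gender,
    }
    # Drop optional fields that were not set.
    return {key: value for key, value in request.items() if value is not None}
```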

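Voice cloning (optional)

Parameter | Required | Description
mode | yes | voice_clone
backend | - | kling / jimeng (the backend must support cloning)
audio_sample | yes | Audio sample (10 s to 3 min)
name | - | Name for the cloned voice

Output: voice_id, used later when generating talking-head videos.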
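
The 10 s to 3 min limit on `audio_sample` can be checked locally before uploading. The sketch below uses Python's standard `wave` module, so it only handles WAV files; treating both bounds as inclusive is an assumption, and both function names are hypothetical.

```python
import wave

MIN_SAMPLE_SECONDS = 10.0   # lower bound stated in the parameter table
MAX_SAMPLE_SECONDS = 180.0  # 3 minutes


def validate_sample_duration(seconds: float) -> float:
    """Return the duration if it lies within the documented range, else raise."""
    if not MIN_SAMPLE_SECONDS <= seconds <= MAX_SAMPLE_SECONDS:
        raise ValueError(
            f"audio sample is {seconds:.1f} s; expected "
            f"{MIN_SAMPLE_SECONDS:.0f}-{MAX_SAMPLE_SECONDS:.0f} s"
        )
    return seconds


def wav_duration_seconds(path: str) -> float:
    """Compute the duration of a WAV file from its frame count and sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

A typical call would be `validate_sample_duration(wav_duration_seconds("sample.wav"))`, where `sample.wav` is a placeholder path.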

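Generate a talking-head video

Parameter | Required | Description
mode | yes | generate
backend | - | Same as above
avatar_id | yes | Digital avatar ID
text | one of three | Script text
audio | one of three | Path to an audio file
voice_id | one of three | Cloned voice ID, used together with text
emotion | - | neutral / happy / serious
speed | - | Speech rate, 0.5-2.0 (default 1.0)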
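
The three voice options (text only, audio only, or voice_id with text) and the `speed` range lend themselves to a small validation step. The following is a sketch under the assumption that the three options are mutually exclusive; `build_generate_request` is a hypothetical helper, and the returned dict is illustrative, not a backend request format.

```python
from typing import Optional

EMOTIONS = {"neutral", "happy", "serious"}


def build_generate_request(
    avatar_id: str,
    text: Optional[str] = None,
    audio: Optional[str] = None,
    voice_id: Optional[str] = None,
    emotion: Optional[str] = None,
    speed: float = 1.0,
) -> dict:
    """Validate generate-mode parameters and return them as a dict."""
    if not avatar_id:
        raise ValueError("avatar_id is required")
    if audio is not None:
        # Option 2: a ready-made audio file, used on its own.
        if text is not None or voice_id is not None:
            raise ValueError("audio cannot be combined with text or voice_id")
    elif text is None:
        # Options 1 and 3 both need text; voice_id alone is not enough.
        raise ValueError("provide text, audio, or voice_id together with text")
    if emotion is not None and emotion not in EMOTIONS:
        raise ValueError(f"emotion must be one of {sorted(EMOTIONS)}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    request = {
        "mode": "generate",
        "avatar_id": avatar_id,
        "text": text,
        "audio": audio,
        "voice_id": voice_id,
        "emotion": emotion,
        "speed": speed,
    }
    # Drop optional fields that were not set.
    return {key: value for key, value in request.items() if value is not None}
```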

Output format

Create a digital avatar

avatar:
  id: "avatar_abc123"
  backend: jimeng
  preview_url: "https://..."
  style: realistic
  created_at: "2024-01-01T00:00:00Z"

Generate a talking-head video

video:
  id: "video_xyz789"
  avatar_id: "avatar_abc123"
  url: "https://..."
  duration: 15.5
  status: completed
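
Downstream code may find it convenient to load this output into a typed record. The sketch below assumes the output has already been parsed into a Python dict (for example by a YAML parser); `VideoResult` is a hypothetical name, and its fields simply mirror the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoResult:
    """Typed view of the `video` output shown above (hypothetical helper)."""
    id: str
    avatar_id: str
    url: str
    duration: float
    status: str

    @classmethod
    def from_payload(cls, payload: dict) -> "VideoResult":
        """Build a VideoResult from a dict with a top-level `video` key."""
        video = payload["video"]
        return cls(
            id=video["id"],
            avatar_id=video["avatar_id"],
            url=video["url"],
            duration=float(video["duration"]),
            status=video["status"],
        )
```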

Backend configuration

Configure the backend in openclaw.json (only the backends you actually use need to be configured):

Kling (可灵, recommended)

{
  "kling": {
    "access_key": "your_access_key",
    "secret_key": "your_secret_key"
  }
}

Jimeng (即梦)

{
  "jimeng": {
    "api_key": "ak-xxxxxxxx"
  }
}

HeyGen

{
  "heygen": {
    "api_key": "xxx"
  }
}

See references/backend-setup.md for details.
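
A loader can fail early when the chosen backend's section or keys are missing. The sketch below checks only the three backends whose key names appear above; the key names for D-ID and Synthesia are not shown in this document, so they are left out. The location of openclaw.json is passed in by the caller because this document does not state it, and both function names are hypothetical.

```python
import json
from pathlib import Path

# Key names taken from the configuration examples in this document.
REQUIRED_KEYS = {
    "kling": ("access_key", "secret_key"),
    "jimeng": ("api_key",),
    "heygen": ("api_key",),
}


def select_backend_config(config: dict, backend: str) -> dict:
    """Return the config section for `backend`, verifying its required keys."""
    if backend not in REQUIRED_KEYS:
        raise ValueError(f"no key list known for backend {backend!r}")
    section = config.get(backend)
    if not isinstance(section, dict):
        raise KeyError(f"config has no {backend!r} section")
    missing = [key for key in REQUIRED_KEYS[backend] if not section.get(key)]
    if missing:
        raise KeyError(f"{backend!r} section is missing: {', '.join(missing)}")
    return section


def load_backend_config(path: Path, backend: str) -> dict:
    """Read a JSON config file and return the validated section for `backend`."""
    with open(path, encoding="utf-8") as handle:
        return select_backend_config(json.load(handle), backend)
```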

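Usage examples

Create from a description

User: Create a digital avatar for me: a professional woman of about 25 with a neat short haircut

Steps:
1. mode=create, description="25-year-old professional woman, neat short hair", style=realistic
2. Call the Jimeng API
3. Return the avatar_id

Create from a photo

User: Create a digital avatar from this photo [image attached]

Steps:
1. mode=create, photo=<image path>
2. Call the API to upload the photo
3. Return the avatar_id

Generate a talking-head video

User: Have avatar_abc123 say this: "Hello everyone, today I'll show you..."

Steps:
1. mode=generate, avatar_id="avatar_abc123", text="Hello everyone..."
2. Call the talking-head API
3. Wait for rendering to finish
4. Return the video URL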

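Upstream and downstream integration

Upstream: the narration field output by video-script-generator

Downstream: scene-video-generator / video-stitcher consume the talking-head video

Notes

  1. Obtain authorization before using a photo of a real person
  2. For commercial use, check the backend's copyright and licensing terms
  3. Rendering a talking-head video may take 1-5 minutes
  4. Cache the avatar_id to avoid creating the same avatar repeatedly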
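
The caching advice in note 4 can be sketched as a lookup keyed by backend and description, so that an avatar created on one backend is never reused for another. In the example below, `create` stands for whatever function actually calls the backend; it and the other names are hypothetical, and the cache is a plain in-memory dict.

```python
import hashlib
from typing import Callable, Dict


def avatar_cache_key(backend: str, description: str) -> str:
    """Build a cache key that is unique per backend and per description."""
    digest = hashlib.sha256(description.encode("utf-8")).hexdigest()
    return f"{backend}:{digest}"


def get_or_create_avatar(
    cache: Dict[str, str],
    backend: str,
    description: str,
    create: Callable[[str, str], str],
) -> str:
    """Return a cached avatar_id, calling `create` only on a cache miss."""
    key = avatar_cache_key(backend, description)
    if key not in cache:
        cache[key] = create(backend, description)
    return cache[key]
```

Persisting `cache` between runs (for example as a JSON file) is left to the caller.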
Security recommendations
This skill appears to be what it claims (creating avatars and talking-head videos), but it has two important issues you should resolve before installing or using it:

  1. Manifest mismatch: SKILL.md requires API keys/config in openclaw.json (Kling, Jimeng, HeyGen, D-ID, Synthesia), but the skill metadata declares no required credentials or config paths. Ask the author to explicitly declare which credentials/config locations the skill will read and to update the registry metadata.
  2. Sensitive uploads: the skill will upload user photos and voice samples to third-party services for avatar creation and voice cloning. Only supply media you own or have explicit permission to use. For voice cloning, follow legal and ethical rules for consent. If possible, create limited-scope API keys for each backend and verify vendor data-retention/privacy policies before use.

Additional recommended checks:

  • Confirm the exact path/name of openclaw.json the skill will read and whether it will read any other files.
  • Ask where avatar/video URLs are hosted and how long third parties retain uploaded media.
  • Prefer creating per-skill or per-project API keys with minimal permissions and rotate/delete them if you stop using the skill.

If the author cannot clarify or update the manifest to list required credentials/config paths, treat the skill as untrusted for sensitive data.
Functional analysis
Type: OpenClaw Skill
Name: digital-avatar
Version: 1.0.0
The skill bundle contains documentation and metadata for a digital avatar and video generation tool. It provides instructions for an AI agent to interface with legitimate third-party services such as Kling, Jimeng, HeyGen, and D-ID. There is no executable code provided, and the instructions in SKILL.md and backend-setup.md are strictly aligned with the stated purpose of generating AI avatars, with no evidence of prompt injection, data exfiltration, or malicious intent.
Capability assessment
Purpose & Capability
The skill's stated purpose (creating digital avatars and spoken videos) legitimately requires API credentials for the listed backends (Kling, Jimeng, HeyGen, D-ID, Synthesia). However, the registry metadata declares no required environment variables, no primary credential, and no required config paths — despite SKILL.md instructing the agent to read API keys from openclaw.json. That mismatch (manifest declares no credentials but runtime instructions expect them) is incoherent.
Instruction Scope
SKILL.md explicitly instructs uploading user photos and audio samples to third-party APIs and reading configuration from openclaw.json. Uploading personally identifiable images and voice samples and performing voice cloning are sensitive actions but are in-scope for this feature. The concern is the instructions require access to user media and a local config file that the registry did not declare; the skill will transmit user content to external services (the documented vendor domains).
Install Mechanism
This is an instruction-only skill with no install spec and no code files — lowest install risk. There is no downloader or extracted archive; nothing is written or installed by the skill itself.
Credentials
The SKILL.md and references/backend-setup.md require multiple third-party API keys (Kling access_key/secret_key, Jimeng api_key, HeyGen/D-ID/Synthesia keys) to operate. The skill metadata lists no required env vars or primary credential and no required config paths, so the credential access is not declared. Requiring many unrelated service keys without declaring them in the manifest is disproportionate and reduces transparency.
Persistence & Privilege
always is false and the skill is user-invocable; it does not request permanent inclusion. The SKILL.md suggests caching avatar_id for convenience (normal behavior) but does not ask to modify other skills or system-wide settings.
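How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install digital-avatar
  3. After installation, invoke the skill by name or trigger it with /digital-avatar
  4. Provide the inputs described in the skill's parameter documentation to get structured output
Version history
v1.0.0
Initial release
Metadata
Slug: digital-avatar
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
FAQ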

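What is Digital Avatar?

It generates digital avatars (virtual personas) and produces talking-head videos. It supports multiple backends: Kling (可灵), Jimeng (即梦), HeyGen, D-ID, and Synthesia. It takes an appearance description or a photo of a real person as input and outputs a digital avatar resource ID or a talking-head video clip. Trigger words: 数字人, 虚拟人, AI主播, avatar, 口播视频, talking head. It is an AI agent skill plugin for Claude Code / OpenClaw, with 159 total downloads so far.

How do I install Digital Avatar?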

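Run the command "/install digital-avatar" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra setup, but SKILL.md requires the API keys for your chosen backend to be configured in openclaw.json before the skill can be used.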

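Is Digital Avatar free?

Yes. Digital Avatar is completely free and is released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Digital Avatar support?

Digital Avatar is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Digital Avatar?

It is developed and maintained by Olivia_Pp (@oliviapp8). The current version is v1.0.0.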
