Aliyun Wan Digital Human

Name: Aliyun Wan Digital Human
Author: cinience
Version: v1.0.0
License: MIT-0

Tags: cross-platform, ⚠ suspicious

Total downloads: 125
Current installs: 0
Versions: 1

Install in OpenClaw

/install aliyun-wan-digital-human

Description

Use when generating talking, singing, or presentation videos from a single character image and audio with Alibaba Cloud Model Studio digital-human model `wan...

Usage (SKILL.md)

Category: provider

Model Studio Digital Human

Validation

mkdir -p output/aliyun-wan-digital-human
python -m py_compile skills/ai/video/aliyun-wan-digital-human/scripts/prepare_digital_human_request.py && echo "py_compile_ok" > output/aliyun-wan-digital-human/validate.txt

Pass criteria: command exits 0 and output/aliyun-wan-digital-human/validate.txt is generated.
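The same check can be run from Python when a shell is not convenient. This is a minimal sketch using the standard-library py_compile module; the function name validate_script is hypothetical and not part of the skill.

```python
import py_compile
from pathlib import Path


def validate_script(script_path: str, out_dir: str = "output/aliyun-wan-digital-human") -> Path:
    """Byte-compile script_path and write the validate.txt marker on success."""
    # doraise=True turns a syntax error into a PyCompileError instead of a printed message.
    py_compile.compile(script_path, doraise=True)
    marker = Path(out_dir) / "validate.txt"
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("py_compile_ok\n", encoding="utf-8")
    return marker
```

The marker file is written only after compilation succeeds, which mirrors the `&&` in the shell command.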

Output And Evidence

Save normalized request payloads, chosen resolution, and task polling snapshots under output/aliyun-wan-digital-human/.
Record image/audio URLs and whether the input image passed detection.
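One way to keep this evidence is to write each record as a small JSON file. The sketch below is illustrative only: the helper name save_evidence, the recorded_at field, and the file naming are assumptions, not something the bundled script provides.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

EVIDENCE_DIR = Path("output/aliyun-wan-digital-human")


def save_evidence(name: str, record: dict, base_dir: Path = EVIDENCE_DIR) -> Path:
    """Write one evidence record as JSON under base_dir and return its path."""
    base_dir.mkdir(parents=True, exist_ok=True)
    # Stamp each record (hypothetical field) so polling snapshots can be ordered later.
    stamped = {**record, "recorded_at": datetime.now(timezone.utc).isoformat()}
    path = base_dir / f"{name}.json"
    path.write_text(json.dumps(stamped, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

For example, save_evidence("inputs", {"image_url": ..., "audio_url": ..., "detection_passed": True}) would record the input URLs together with the detection outcome.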

Use this skill for characters that speak, sing, or present, driven by a single image plus an audio track.

Critical model names

Use these exact model strings:

wan2.2-s2v-detect
wan2.2-s2v

Selection guidance:

Run wan2.2-s2v-detect first to validate the image.
Use wan2.2-s2v for the actual video generation job.
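The detect-then-generate order can be enforced in code. The sketch below assumes two hypothetical callables, detect and generate, that wrap whatever component actually talks to the API; neither is provided by this skill.

```python
from typing import Callable

DETECT_MODEL = "wan2.2-s2v-detect"
GENERATE_MODEL = "wan2.2-s2v"


def detect_then_generate(
    image_url: str,
    audio_url: str,
    detect: Callable[[dict], bool],
    generate: Callable[[dict], str],
) -> str:
    """Run detection first; submit the generation job only if the image passes.

    `detect` and `generate` are hypothetical wrappers around the real API calls.
    """
    if not detect({"model": DETECT_MODEL, "image_url": image_url}):
        raise ValueError(f"image failed {DETECT_MODEL}; not submitting to {GENERATE_MODEL}")
    return generate(
        {"model": GENERATE_MODEL, "image_url": image_url, "audio_url": audio_url}
    )
```

Passing the API calls in as parameters keeps the ordering logic testable without network access.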

Prerequisites

China mainland (Beijing) only.
Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
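A lookup that follows this order (environment first, then the credentials file) might look like the sketch below. The layout of ~/.alibabacloud/credentials is not specified here, so the code assumes an INI-style file with dashscope_api_key stored in some section; treat that as an assumption to verify. The function name load_api_key is hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def load_api_key(
    credentials_path: Optional[Path] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the API key from DASHSCOPE_API_KEY, else from the credentials file."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY")
    if key:
        return key
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if not path.is_file():
        return None
    # ASSUMPTION: the file is INI-formatted; interpolation is off so '%' in keys is safe.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None  # Not an INI file after all; give up rather than guess.
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value.strip()
    return None
```

Returning None instead of raising lets the caller decide whether a missing key is fatal.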
Input audio should contain clear speech or singing, and input image should depict a clear subject.

Normalized interface (video.digital_human)

Detect Request

model (string, optional): default wan2.2-s2v-detect
image_url (string, required)

Generate Request

model (string, optional): default wan2.2-s2v
image_url (string, required)
audio_url (string, required)
resolution (string, optional): 480P or 720P
scenario (string, optional): talk, sing, or perform
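The two request shapes above can be sketched as small builder functions that apply the documented defaults and reject values outside the listed options. The function names are hypothetical, and omitting unset optional fields is a design assumption rather than documented behavior.

```python
from typing import Optional

ALLOWED_RESOLUTIONS = {"480P", "720P"}
ALLOWED_SCENARIOS = {"talk", "sing", "perform"}


def build_detect_request(image_url: str, model: str = "wan2.2-s2v-detect") -> dict:
    """Build a normalized detect request."""
    if not image_url:
        raise ValueError("image_url is required")
    return {"model": model, "image_url": image_url}


def build_generate_request(
    image_url: str,
    audio_url: str,
    resolution: Optional[str] = None,
    scenario: Optional[str] = None,
    model: str = "wan2.2-s2v",
) -> dict:
    """Build a normalized generate request, validating the optional fields."""
    if not image_url or not audio_url:
        raise ValueError("image_url and audio_url are required")
    request = {"model": model, "image_url": image_url, "audio_url": audio_url}
    if resolution is not None:
        if resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
        request["resolution"] = resolution
    if scenario is not None:
        if scenario not in ALLOWED_SCENARIOS:
            raise ValueError(f"scenario must be one of {sorted(ALLOWED_SCENARIOS)}")
        request["scenario"] = scenario
    return request
```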

Response

task_id (string)
task_status (string)
video_url (string, when finished)
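Because video_url only appears once the task finishes, a caller has to poll. The following is a rough sketch of such a loop. The fetch callable is hypothetical, and the status strings SUCCEEDED, FAILED, and CANCELED are assumptions: this interface does not list the possible task_status values, so check them against the service documentation.

```python
import time
from typing import Callable, Optional, Sequence


def poll_task(
    task_id: str,
    fetch: Callable[[str], dict],
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
    on_snapshot: Optional[Callable[[dict], None]] = None,
    success: str = "SUCCEEDED",  # assumed status string
    failure: Sequence[str] = ("FAILED", "CANCELED"),  # assumed status strings
) -> dict:
    """Poll until the task reaches a terminal status and return the final response."""
    for _ in range(max_attempts):
        snapshot = fetch(task_id)  # expected keys: task_id, task_status, video_url
        if on_snapshot is not None:
            on_snapshot(snapshot)  # e.g. persist each polling snapshot as evidence
        status = snapshot.get("task_status")
        if status == success:
            return snapshot
        if status in failure:
            raise RuntimeError(f"task {task_id} ended with status {status}")
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

The sleep and fetch parameters are injectable so the loop can be exercised without waiting or calling a real endpoint.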

Quick start

python skills/ai/video/aliyun-wan-digital-human/scripts/prepare_digital_human_request.py \
  --image-url "https://example.com/anchor.png" \
  --audio-url "https://example.com/voice.mp3" \
  --resolution 720P \
  --scenario talk

Operational guidance

Use a portrait, half-body, or full-body image with a clear face and stable framing.
Match audio length to the desired output duration; the output follows the audio length up to the model limit.
Keep image and audio as public HTTP/HTTPS URLs.
If the image fails detection, do not proceed directly to video generation.
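A cheap pre-flight check can catch URLs that are clearly not public HTTP/HTTPS before any request is prepared. This sketch only inspects the URL string: it rejects other schemes, localhost, and non-global IP literals, and it does not resolve DNS or fetch the resource. The helper name is hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def looks_public_http_url(url: str) -> bool:
    """Return True if url is http(s) and does not obviously point at a private host."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname
    except ValueError:
        return False  # malformed URL, e.g. an unbalanced IPv6 bracket
    if parsed.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a hostname rather than an IP literal; not resolved here
    return ip.is_global
```

A hostname that passes this check can still resolve to a private address, so this is a first filter rather than a guarantee.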

Output location

Default output: output/aliyun-wan-digital-human/request.json
Override base dir with OUTPUT_DIR.
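The path resolution can be sketched as follows. It assumes OUTPUT_DIR replaces only the leading output/ component and that the skill subdirectory and file name stay the same; the text does not spell this out, so the bundled script may behave differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_request_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the request.json path, honoring OUTPUT_DIR when it is set."""
    env = os.environ if env is None else env
    # ASSUMPTION: OUTPUT_DIR overrides the "output" base directory only.
    base = Path(env.get("OUTPUT_DIR") or "output")
    return base / "aliyun-wan-digital-human" / "request.json"
```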

References

references/sources.md

Security recommendations

What to check before installing or using this skill:
- Expectation vs. code: the included script only creates a request.json. It does not call any remote API, perform detection, or poll tasks, so SKILL.md describes runtime steps that are not implemented here. Ask which component actually sends requests to Alibaba Cloud and performs detection and polling, and obtain that code or documentation before sending real credentials or data.
- Credential confusion: SKILL.md asks for DASHSCOPE_API_KEY or a dashscope_api_key entry in ~/.alibabacloud/credentials, but the registry metadata lists no required env vars. Verify what DASHSCOPE_API_KEY is (an Alibaba Cloud key, a Dashscope proxy key, or something else) and provide only least-privileged keys. Prefer IAM or temporary credentials, and avoid placing broad keys in a shared ~/.alibabacloud/credentials unless you understand the access scope.
- Data privacy: the skill processes images and audio, which may be sensitive. Confirm which service and endpoint will receive the media (Alibaba Cloud Model Studio in the Beijing region is claimed) and that you are comfortable with the regional data handling and retention policies.
- Operational testing: run the bundled script in a safe environment to confirm that it only writes request.json. Do not provide production credentials until you can trace the complete request flow (who or what reads request.json and where it is sent).
- Ask the publisher: request the full implementation that performs detection and submission (or a homepage or repository), and ask why SKILL.md promises outputs and behaviors not present in the bundle. The mismatch could be benign (this is only a helper) or could indicate missing or misplaced code that runs elsewhere. Confirm before trusting it with credentials or uploading private media.

Functional analysis

Type: OpenClaw Skill
Name: aliyun-wan-digital-human
Version: 1.0.0

The skill is a legitimate utility for preparing request payloads for Alibaba Cloud's digital human video generation service. The Python script `scripts/prepare_digital_human_request.py` safely constructs JSON objects based on user input without performing network calls or accessing sensitive files. The instructions in `SKILL.md` are well-documented and align with the stated purpose of generating videos from images and audio.

Capability assessment

⚠ Purpose & Capability

The name/description (generate talking/singing avatar video via Alibaba Cloud Model Studio) matches the included script which only prepares a JSON request payload. However the SKILL.md also demands a DASHSCOPE_API_KEY or an entry in ~/.alibabacloud/credentials even though the repository metadata declares no required env vars and the included script does not read credentials or make network calls. That mismatch is unexpected and unexplained.

⚠ Instruction Scope

SKILL.md instructs validating images, running a detection model first, polling tasks, and saving normalized payloads, chosen resolution, and task polling snapshots under output/. The provided script only writes a single request.json and prints its path; it does not perform detection, polling, network calls, or record the additional evidence SKILL.md promises. The instructions also tell users to set DASHSCOPE_API_KEY or a credentials file entry, but no code in this bundle uses those values.

✓ Install Mechanism

This is instruction-only with a tiny helper script and no install spec — minimal disk/runtime footprint and no archived downloads. Low install risk.

⚠ Credentials

Registry metadata claims no required env vars, but SKILL.md requires DASHSCOPE_API_KEY (or a dashscope_api_key in ~/.alibabacloud/credentials). Requesting cloud credentials is reasonable for a cloud integration, but the specific variable name and the mismatch with metadata are unexplained. It's unclear whether DASHSCOPE_API_KEY is an Alibaba Cloud credential, a third-party proxy, or something else — this should be clarified before trusting keys.

✓ Persistence & Privilege

The skill does not request always: true, does not include install scripts that modify global agent settings, and is user-invocable only. No elevated persistence or system-wide changes are present.

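How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install aliyun-wan-digital-human
Once installation finishes, invoke the skill by name or trigger it with /aliyun-wan-digital-human
Provide the required inputs according to the skill's parameter descriptions to get structured output.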

Version history

v1.0.0

Initial release of aliyun-wan-digital-human skill.
- Enables generation of talking, singing, or presentation videos from a character image and audio using Alibaba Cloud Model Studio digital-human models.
- Supports image validation and video generation workflows with distinct model names: `wan2.2-s2v-detect` for validation and `wan2.2-s2v` for video.
- Exposes a normalized interface for detection and video creation requests.
- Requires API key setup and China (Beijing) region.
- Outputs all requests, responses, and task snapshots to a dedicated directory for traceability.

Metadata

Slug: aliyun-wan-digital-human

Version: 1.0.0

License: MIT-0

Total installs: 0

Current installs: 0

Versions: 1

FAQ