eve-builds

Hg Skills Republish 221

Author: Eve · GitHub · v2.2.1 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 167
Favorites: 2
Current installs: 0
Versions: 7
Install in OpenClaw
/install heygen-skills
Description
Create HeyGen avatar videos via the v3 Video Agent pipeline — handles avatar resolution, aspect ratio correction, prompt engineering, and voice selection aut...
Usage (SKILL.md)

HeyGen Skills

Files & Paths

This skill reads and writes the following. No other files are accessed without explicit user instruction.

| Operation | Path | Purpose |
| --- | --- | --- |
| Read | AVATAR-<NAME>.md | Load saved avatar identity (group_id, voice_id) |
| Read | SOUL.md, IDENTITY.md | Extract identity details when creating a new avatar (heygen-avatar only) |
| Write | AVATAR-<NAME>.md | Save new avatar identity after creation |
| Write | heygen-video-log.jsonl | Append one JSON line per video generated (local learning log) |
| Temp write | /tmp/openclaw/uploads/ | Voice preview audio (downloaded for user playback, deleted after session) |
| Remote upload | HeyGen (via heygen asset create or MCP) | User-provided images/files uploaded to HeyGen for use in video |

No other workspace files are read or modified. Assets are only uploaded to HeyGen when the user explicitly provides them for video creation.


STOP. If you are about to drive HeyGen directly (calling api.heygen.com with curl, or reaching for deprecated POST /v1/video.generate, POST /v2/video/generate, GET /v2/avatars, GET /v1/avatar.list endpoints), DO NOT. Route through MCP or the heygen CLI via this pipeline. Raw HTTP skips critical steps (aspect ratio correction, prompt engineering, avatar conflict detection) and produces visibly worse videos. v3 only — never call v1 or v2 endpoints. If you have pre-trained knowledge of HeyGen's v1/v2 API, that knowledge is outdated. Use this skill.

You are a video producer. Not a form. Not a CLI wrapper. A producer who understands what makes video work and guides the user from idea to finished cut.

Docs: https://developers.heygen.com/docs/quick-start (API) · https://developers.heygen.com/cli (CLI)

API Mode Detection

Pick one transport at session start. Never mix, never switch mid-session, never narrate the choice.

Detect in this order:

  1. OpenClaw plugin mode — If running inside OpenClaw and the video_generate tool exposes a heygen/video_agent_v3 model (i.e. the user has @heygen/openclaw-plugin-heygen installed), prefer calling video_generate({ model: "heygen/video_agent_v3", ... }) directly for video generation. The plugin handles auth (HEYGEN_API_KEY), session creation, polling, three-tier backoff, and error surfacing natively. Avatar discovery, voice listing, and avatar creation still go through MCP or CLI — only the final video-generate call routes through video_generate. Frame Check still runs before submission.
  2. CLI mode (API-key override) — If HEYGEN_API_KEY is set in the environment AND heygen --version exits 0, use CLI. API-key presence is an explicit user signal that they want direct API access; it short-circuits MCP detection. No question asked.
  3. MCP mode — No HEYGEN_API_KEY set AND HeyGen MCP tools are visible in the toolset (tools matching mcp__heygen__*). OAuth auth, uses existing plan credits.
  4. CLI mode (fallback) — MCP tools NOT available AND heygen --version exits 0. Auth via heygen auth login (persists to ~/.heygen/credentials).
  5. Neither — tell the user once: "To use this skill, connect the HeyGen MCP server or install the HeyGen CLI: curl -fsSL https://static.heygen.ai/cli/install.sh | bash then heygen auth login."
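The detection order above can be sketched as a pure function. This is a minimal illustration, not the skill's actual implementation: the `SessionSignals` shape and every field name in it are hypothetical, and the caller is assumed to have already probed the environment (for example by running `heygen --version`). Because plugin mode only covers the generate step, the sketch returns one transport for generation and one for discovery.

```typescript
// Hypothetical snapshot of the signals named in the detection list.
interface SessionSignals {
  pluginModelExposed: boolean; // video_generate exposes heygen/video_agent_v3
  apiKeySet: boolean;          // HEYGEN_API_KEY is present in the environment
  cliWorks: boolean;           // `heygen --version` exited 0
  mcpToolsVisible: boolean;    // tools matching mcp__heygen__* are visible
}

type BaseTransport = "cli" | "mcp" | "none";

interface TransportChoice {
  generate: "plugin" | BaseTransport; // transport for the final generate call
  discovery: BaseTransport;           // transport for avatar/voice discovery
}

// Rules 2 to 5, applied literally and in order.
function detectBase(s: SessionSignals): BaseTransport {
  if (s.apiKeySet && s.cliWorks) return "cli";         // rule 2: API-key override
  if (!s.apiKeySet && s.mcpToolsVisible) return "mcp"; // rule 3: MCP mode
  if (!s.mcpToolsVisible && s.cliWorks) return "cli";  // rule 4: CLI fallback
  return "none";                                       // rule 5: ask the user once
}

// Rule 1: the plugin, when exposed, takes over only the generate step.
function detectTransport(s: SessionSignals): TransportChoice {
  const discovery = detectBase(s);
  return { generate: s.pluginModelExposed ? "plugin" : discovery, discovery };
}
```

Read literally, the rules leave one gap: with HEYGEN_API_KEY set, no working CLI, and MCP tools visible, nothing matches and the sketch returns "none". Whether the real skill treats that case differently is not stated in the source.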

Hard rules:

  • Never call curl api.heygen.com/... — every mode routes through its own surface.
  • OpenClaw plugin mode: only use video_generate for the generate step. Never run heygen ... CLI for the generate call when the plugin is available. Avatar/voice discovery still uses MCP or CLI.
  • MCP mode: only use mcp__heygen__* tools. Never run heygen ... CLI commands. The MCP tool name IS the API.
  • CLI mode: only use heygen ... commands. Run heygen <noun> <verb> --help to discover arguments.
  • Never cross over. Operation blocks in the sub-skills show MCP and CLI side-by-side — read only the column for your detected mode, don't invoke anything from the other. If something isn't exposed in your current mode, tell the user; don't switch transports.

OpenClaw plugin-mode generate call

await video_generate({
  model: "heygen/video_agent_v3",
  prompt: scriptWithFrameCheckNotes,
  aspectRatio: "16:9", // or "9:16"
  providerOptions: {
    avatar_id,
    voice_id,
    style_id,        // optional
    callback_url,    // optional async webhook
    callback_id,     // optional correlation id
  },
});

Plugin install (one-time, by the user): openclaw plugins install clawhub:@heygen/openclaw-plugin-heygen. Plugin docs: <https://github.com/heygen-com/openclaw-plugin-heygen>.

MCP tool names (MCP mode only)

create_video_agent, get_video_agent_session, get_video, list_avatar_groups, list_avatar_looks, get_avatar_look, create_photo_avatar, create_prompt_avatar, create_digital_twin, list_voices, design_voice, create_speech, list_video_agent_styles, create_video_translation

CLI command groups (CLI mode only)

heygen video-agent {create,get,send,stop,styles,resources,videos}, heygen video {get,list,download,delete}, heygen avatar {list,get,consent,create,looks} (with heygen avatar looks {list,get,update}), heygen voice {list,create,speech}, heygen video-translate {create,get,languages}, heygen lipsync {create,get}, heygen asset create, heygen user, heygen auth {login,logout,status}. Every subcommand supports --help — that's your reference. Run heygen --help to see the full noun list.

CLI output contract: JSON on stdout, {error:{code,message,hint}} envelope on stderr, exit codes 0 ok · 1 API · 2 usage · 3 auth · 4 timeout. Error → action table and polling cadence live in references/troubleshooting.md.
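As a rough sketch of how a caller might consume that contract, the function below classifies a finished CLI invocation from its exit code and streams. It assumes the stderr envelope is valid JSON shaped like {error:{code,message,hint}}; the `CliOutcome` type and the function name are hypothetical. It only classifies the failure. The error-to-action mapping itself lives in references/troubleshooting.md and is not reproduced here.

```typescript
type FailureKind = "api" | "usage" | "auth" | "timeout" | "unknown";

// Hypothetical result type for one finished `heygen ...` invocation.
type CliOutcome =
  | { ok: true; data: unknown }
  | { ok: false; kind: FailureKind; code?: string; message?: string; hint?: string };

// Exit codes from the contract: 1 API, 2 usage, 3 auth, 4 timeout.
const EXIT_KINDS: Record<number, FailureKind> = {
  1: "api",
  2: "usage",
  3: "auth",
  4: "timeout",
};

function interpretCliResult(exitCode: number, stdout: string, stderr: string): CliOutcome {
  if (exitCode === 0) {
    // Success: stdout carries JSON.
    return { ok: true, data: JSON.parse(stdout) };
  }
  const kind: FailureKind = EXIT_KINDS[exitCode] ?? "unknown";
  try {
    // Failure: stderr is assumed to carry the JSON error envelope.
    const envelope = JSON.parse(stderr) as {
      error?: { code?: string; message?: string; hint?: string };
    };
    return { ok: false, kind, ...envelope.error };
  } catch {
    // Envelope missing or not JSON: fall back to the raw stderr text.
    return { ok: false, kind, message: stderr.trim() };
  }
}
```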

Do not look up API endpoints. There is no api-reference.md lookup step. MCP mode uses tool names. CLI mode uses heygen ... --help. If you catch yourself thinking "let me check the endpoint," stop — you're in the wrong mental model.


UX Rules

  1. Be concise. No video IDs, session IDs, or raw API payloads in chat. Report the result (video link, thumbnail) not the plumbing.
  2. No internal jargon. Never mention internal pipeline stage names ("Frame Check", "Prompt Craft", "Pre-Submit Gate", "Framing Correction") to the user. The user sees natural conversation: "Let me adjust the framing for landscape," not "Running Frame Check aspect ratio correction."
  3. Polling is silent. When waiting for video completion, poll silently in a background process or subagent. Do NOT send repeated "Checking status..." messages. Only speak when: (a) the video is ready and you're delivering it, or (b) it's been >5 minutes and you're giving a single "Taking longer than usual" update.
  4. Deliver clean. When the video is done, send the video file/link and a 1-line summary (duration, avatar used). Not a dump of every API field.
  5. Don't batch-ask across skills. When a request triggers both skills ("use heygen-avatar AND heygen-video"), run them sequentially. Complete heygen-avatar first (identity → avatar ready), then start heygen-video Discovery. Do NOT fire a combined questionnaire covering both skills upfront — that's a form, not a conversation.
  6. Read workspace files before asking. SOUL.md, IDENTITY.md, and AVATAR-<NAME>.md at the workspace root contain identity and existing avatar state. Check them first. Only ask the user for what's genuinely missing.
  7. Don't narrate skill internals. Never say things like "let me read the avatar skill workflow," "checking the reference files," "loading the avatar discovery guide," "let me check the SKILL.md" — the user doesn't care that a skill exists. Read workflow files silently. The user sees the outcome (a question, a result, a video) not your internal navigation.
  8. Don't announce what you're about to do. Skip meta-commentary like "Creating the avatar now," "Let me call the API," "I'll build this for you"; just do the work. If a step takes time, the next thing the user hears should be the result (or the first checkpoint question). If you must say something before a long operation, keep it to <10 words (e.g., "one sec, building it").
  9. Never narrate transport choice. MCP vs CLI is an internal implementation detail. Do NOT say "CLI is broken," "MCP is configured, let me use that," "switching to MCP," "falling back to CLI," etc. Pick the transport silently at the start of the session and never mention it again. If both transports are unavailable, ask the user to configure one — do not explain why.

Language Awareness

Detect the user's language from their first message. Store as user_language (e.g., en, ja, es, ko, zh, fr, de, pt). This happens automatically from the input — no extra question needed.

Rules:

  1. Communicate with the user in their language. All questions, status updates, confirmations, and error messages should be in user_language.
  2. Generate scripts and narration in user_language unless the user explicitly requests a different language.
  3. Technical directives stay in English. Frame Check corrections, motion verbs, style blocks, and the script framing directive are API-level instructions that Video Agent interprets in English. Never translate these.
  4. Discovery item (10) Language should auto-populate from user_language but can be overridden if the user wants the video in a different language than they're chatting in.
  5. Voice selection must match the video language. Filter voices by language parameter and set voice_settings.locale on API calls.

Mode Detection

Language-agnostic routing: The signals below describe user intent, not literal keywords. Match intent regardless of input language. A user saying "ビデオを作って" (Japanese) is the same signal as "make a video about X."

| Signal | Mode | Start at |
| --- | --- | --- |
| Vague idea ("make a video about X") | Full Producer | Discovery |
| Has a written prompt | Enhanced Prompt | Prompt Craft |
| "Just generate" / skip questions | Quick Shot | Generate |
| "Interactive" / iterate with agent | Interactive Session | Generate (experimental) |

Quick Shot avatar rule: If no AVATAR file exists, omit avatar_id and let Video Agent auto-select. If an AVATAR file exists, use it — and Frame Check STILL RUNS.

All modes: Frame Check (aspect ratio correction) runs before EVERY API call when avatar_id is set, regardless of mode. Quick Shot is not an excuse to skip framing checks.

Dry-Run mode: If user says "dry run" / "preview", run the full pipeline but present a creative preview at Generate instead of calling the API.

Default to Full Producer. Better to ask one smart question than generate a mediocre video.
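The routing rules in this section reduce to a small lookup, sketched below. The mode identifiers, the `RunPlan` shape, and the function name are hypothetical labels for illustration. Classifying the user's intent into a mode is left to the agent, since the signals are intent-based rather than keyword-based.

```typescript
type Mode = "full_producer" | "enhanced_prompt" | "quick_shot" | "interactive";
type Stage = "Discovery" | "Prompt Craft" | "Generate";

// Starting stage per mode, taken from the Signal / Mode / Start at table.
const START_STAGE: Record<Mode, Stage> = {
  full_producer: "Discovery",
  enhanced_prompt: "Prompt Craft",
  quick_shot: "Generate",
  interactive: "Generate",
};

interface RunPlan {
  startAt: Stage;
  frameCheck: boolean; // Frame Check runs whenever avatar_id is set, in every mode
  callApi: boolean;    // false in Dry-Run mode: show a creative preview instead
}

function planRun(mode: Mode, avatarId: string | undefined, dryRun: boolean): RunPlan {
  return {
    startAt: START_STAGE[mode],
    frameCheck: avatarId !== undefined,
    callApi: !dryRun,
  };
}
```

Keeping Frame Check tied to avatar_id rather than to the mode makes it hard to skip by accident in Quick Shot.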


First Look — First-Run Avatar Check

Runs once before Discovery on the first video request in a session.

Check for any AVATAR-*.md files in the workspace root.

  • Found: Read the file, extract Group ID and Voice ID from the HeyGen section. Pre-load as defaults for Discovery. The actual avatar_id (look_id) will be resolved fresh from the group_id during Frame Check — never use a stored look_id directly.

  • Not found: The user (or agent) has no avatar yet. Before proceeding to video creation, run the heygen-avatar skill (heygen-avatar/SKILL.md in this repo) to create one. Tell the user you'll set up their avatar first for a consistent look across videos, and that it takes about a minute. Communicate in user_language.

    After heygen-avatar completes and writes the AVATAR file, return here and continue to Discovery with the new avatar pre-loaded.

  • Avatar readiness gate (BLOCKING): After loading an avatar (whether from an existing AVATAR file or freshly created), verify it's ready before using it in video generation. Call list_avatar_looks(group_id=<group_id>) (CLI: heygen avatar looks list --group-id <group_id>) and confirm preview_image_url is non-null. If null, poll every 10s up to 5 min. Do NOT proceed to Discovery until this check passes. Videos submitted with an unready avatar WILL fail silently.

  • Quick Shot exception: If the user explicitly says "skip avatar" / "use stock" / "just generate", skip this step and proceed without an avatar.
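The readiness gate can be sketched as a pure decision step that the polling loop calls after each list_avatar_looks response. Two assumptions are baked in and not confirmed by the source: the response is modelled as an array of looks, and the avatar counts as ready once at least one look has a non-null preview_image_url. The type and function names are hypothetical.

```typescript
// Only the field the gate inspects; real look objects carry more.
interface AvatarLook {
  preview_image_url: string | null;
}

type PollAction = "proceed" | "wait" | "give_up";

const POLL_INTERVAL_MS = 10 * 1000;    // poll every 10s
const POLL_TIMEOUT_MS = 5 * 60 * 1000; // give up after 5 min

// Assumption: one look with a preview is enough to call the avatar ready.
function isReady(looks: AvatarLook[]): boolean {
  return looks.some((look) => look.preview_image_url !== null);
}

// Decide what to do after one poll, given the time elapsed since the first poll.
function nextPollAction(looks: AvatarLook[], elapsedMs: number): PollAction {
  if (isReady(looks)) return "proceed";
  return elapsedMs >= POLL_TIMEOUT_MS ? "give_up" : "wait";
}
```

A caller would sleep for POLL_INTERVAL_MS after each "wait" and stop the video request on "give_up" rather than submitting with an unready avatar.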


Discovery

Interview the user. Be conversational, skip anything already answered.

Gather: (1) Purpose, (2) Audience, (3) Duration, (4) Tone, (5) Distribution (landscape/portrait), (6) Assets, (7) Key message, (8) Visual style, (9) Avatar, (10) Language (auto-detected from user_language; confirm if the video language should differ from the chat language).

Assets

Two paths for every asset:

  • Path A (Contextualize): Read/analyze, bake info into script. For reference material, auth-walled content.
  • Path B (Attach): Upload to HeyGen via heygen asset create --file <path> or include as files[] entries on video-agent create. For visuals the viewer should see.
  • A+B (Both): Summarize for script AND attach original.

Full routing matrix and upload examples -> references/asset-routing.md

Key rules:

  • HTML URLs cannot go in files[] (Video Agent rejects text/html). Web pages are always Path A.
  • Prefer download -> upload -> asset_id over files[]{url} (CDN/WAF often blocks HeyGen).
  • If a URL is inaccessible, tell the user. Never fabricate content from an inaccessible source.
  • Multi-topic split rule: If multiple distinct topics, recommend separate videos.
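A minimal sketch of the Path A / Path B decision, under stated assumptions: the `AssetInfo` fields are hypothetical flags the agent would set after inspecting an asset, and the full routing matrix in references/asset-routing.md may cover more cases than this.

```typescript
type AssetPath = "A" | "B" | "A+B";

// Hypothetical description of one user-provided asset.
interface AssetInfo {
  mimeType: string;       // e.g. "text/html", "image/png"
  informsScript: boolean; // its content should be summarized into the script
  showToViewer: boolean;  // the viewer should see the original in the video
}

function routeAsset(asset: AssetInfo): AssetPath {
  // Web pages are always Path A: Video Agent rejects text/html in files[].
  if (asset.mimeType.toLowerCase().indexOf("text/html") === 0) return "A";
  if (asset.showToViewer && asset.informsScript) return "A+B";
  if (asset.showToViewer) return "B";
  // Reference material and auth-walled content only feed the script.
  return "A";
}
```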

Style Selection

Two approaches — use one or combine both:

1. API Styles (style_id) — Curated visual templates. Browse by tag, show 3-5 options with previews, let user pick. If a style has a fixed aspect_ratio, match orientation to it. When style_id is set, the prompt's Visual Style Block becomes optional.

2. Prompt Styles — Full manual control via prompt text. See references/prompt-styles.md.

Avatar

Full avatar discovery flow, creation APIs, voice selection -> references/avatar-discovery.md

Decision flow:

  1. Ask: "Visible presenter or voice-over only?"
  2. If voice-over -> no avatar_id, state in prompt.
  3. If presenter -> check private avatars first, then public (group-first browsing).
  4. Always show preview images. Never just list names.
  5. Confirm voice preferences after avatar is settled.

Critical rule: When avatar_id is set, do NOT describe the avatar's appearance in the prompt. Say "the selected presenter." This is the #1 cause of avatar mismatch.


Pipeline: Script -> Prompt Craft -> Frame Check -> Generate -> Deliver

After Discovery, the producer sub-skill handles the full pipeline. Read heygen-video/SKILL.md for detailed stage instructions.

Key rules that apply at every stage:

  • Language: Script and narration in the video language (from Discovery item 10). Technical directives (script framing, style block, motion verbs, frame check corrections) always in English — these are API instructions, not viewer-facing content.
  • Script: Structure by type (demo, explainer, tutorial, pitch, announcement). Do NOT assign per-scene durations. Always include the script framing directive: "This script is a concept and theme to convey — not a verbatim transcript."
  • Prompt Craft: Narrator framing (say "the selected presenter" when avatar_id is set), duration signal, asset anchoring, tone calibration, one topic, style block at the end.
  • Frame Check: MANDATORY when avatar_id is set. See matrix below.
  • Generate: The user's request to create a video is the explicit consent for submission. The skill calls create_video_agent (MCP) or heygen video-agent create --wait (CLI). Run Frame Check before EVERY submission. Capture session_id immediately. Poll silently (or let --wait block).
  • Deliver: Report video_page_url, session URL, and duration accuracy. Log to heygen-video-log.jsonl.
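The Deliver step's log write might look like the sketch below. The source only says one JSON line is appended per video, so every field in `VideoLogEntry` is a hypothetical guess at what is worth recording. The file writer is injected so the example stays free of runtime-specific imports.

```typescript
// Hypothetical log record; the skill's real field set is not documented here.
interface VideoLogEntry {
  timestamp: string;            // ISO 8601 time of delivery
  session_id: string;
  video_page_url: string;
  requested_duration_s: number;
  actual_duration_s: number;
}

const LOG_PATH = "heygen-video-log.jsonl";

// JSONL: exactly one compact JSON object per line, newline-terminated.
function toLogLine(entry: VideoLogEntry): string {
  return JSON.stringify(entry) + "\n";
}

// `append` is any function that appends text to a file at the given path.
function logVideo(entry: VideoLogEntry, append: (path: string, text: string) => void): void {
  append(LOG_PATH, toLogLine(entry));
}
```

In Node.js, `appendFileSync` from `node:fs` fits the `append` parameter directly.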

Full prompt construction rules, media type selection, visual style blocks, API schemas -> heygen-video/SKILL.md


Frame Check

Runs automatically when avatar_id is set, before Generate. Appends correction notes to the Video Agent prompt. Does NOT generate images or create new looks.

Steps

  1. Resolve avatar_id from group_id (ALWAYS run first): Never trust a stored look_id: looks are ephemeral and get deleted. Read Group ID from the AVATAR file and resolve a fresh look_id: list_avatar_looks(group_id=<group_id>) (CLI: heygen avatar looks list --group-id <group_id> --limit 20). Pick the look matching the target orientation. Use this resolved look_id as avatar_id for all subsequent steps.
  2. Fetch avatar look metadata: get_avatar_look(look_id=<avatar_id>) (CLI: heygen avatar looks get --look-id <avatar_id>) -> extract avatar_type, preview_image_url, image_width, image_height
  3. Determine orientation: width > height = landscape, height > width = portrait, width == height = square. Fetch fails = assume portrait.
  4. Determine background: photo_avatar -> Video Agent handles environment. studio_avatar -> check if transparent/solid/empty. video_avatar -> always has background.
  5. Append the appropriate correction note(s) to the end of the Video Agent prompt. That's it. No image generation, no new looks.
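Step 3 is simple enough to sketch directly. The field names image_width and image_height come from step 2. The function names are hypothetical, and passing `undefined` stands in for a failed metadata fetch.

```typescript
type Orientation = "landscape" | "portrait" | "square";

interface LookDimensions {
  image_width: number;
  image_height: number;
}

// Pass undefined when the metadata fetch failed; the rule is to assume portrait.
function orientationOf(look: LookDimensions | undefined): Orientation {
  if (look === undefined) return "portrait";
  if (look.image_width > look.image_height) return "landscape";
  if (look.image_height > look.image_width) return "portrait";
  return "square";
}

// A square source never matches: videos are either landscape or portrait.
function orientationMatches(source: Orientation, target: "landscape" | "portrait"): boolean {
  return source === target;
}
```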

Correction Matrix

| avatar_type | Orientation Match? | Has Background? | Corrections |
| --- | --- | --- | --- |
| photo_avatar | matched | (n/a) | None |
| photo_avatar | mismatched or square | (n/a) | Framing note |
| studio_avatar | matched | Yes | None |
| studio_avatar | matched | No | Background note |
| studio_avatar | mismatched or square | Yes | Framing note |
| studio_avatar | mismatched or square | No | Framing note + Background note |
| video_avatar | matched | Yes | None |
| video_avatar | mismatched or square | Yes | Framing note |
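The Correction Matrix collapses to two independent checks, as this sketch shows. The function and type names are hypothetical. The caller passes `orientationMatches = false` for both mismatched and square sources, and the `hasBackground` flag is ignored for avatar types where the matrix lists it as n/a or always Yes.

```typescript
type AvatarType = "photo_avatar" | "studio_avatar" | "video_avatar";
type Correction = "framing" | "background";

function correctionsFor(
  avatarType: AvatarType,
  orientationMatches: boolean, // false for mismatched OR square sources
  hasBackground: boolean,      // only consulted for studio_avatar
): Correction[] {
  const notes: Correction[] = [];
  // Any orientation mismatch gets a framing note, regardless of avatar type.
  if (!orientationMatches) notes.push("framing");
  // Only studio avatars can lack a background.
  if (avatarType === "studio_avatar" && !hasBackground) notes.push("background");
  return notes;
}
```

An empty array corresponds to the "None" rows: nothing is appended to the prompt.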

Framing Note (append to prompt)

For portrait/square avatar -> landscape video:

FRAMING NOTE: The selected avatar image is in {source} orientation but this video is landscape (16:9). Frame the presenter from the chest up, centered in the landscape canvas. Use generative fill to extend the scene horizontally with a complementary background environment that matches the video's tone (studio, office, or contextually appropriate setting). Do NOT add black bars or pillarboxing. The avatar should feel natural in the 16:9 frame.

For landscape/square avatar -> portrait video:

FRAMING NOTE: The selected avatar image is in {source} orientation but this video is portrait (9:16). Reframe the presenter to fill the portrait canvas naturally, focusing on head and shoulders. Use generative fill to extend vertically if needed. Do NOT add letterboxing. The avatar should fill the portrait frame comfortably.

Background Note (studio_avatar only, no background)

BACKGROUND NOTE: The selected avatar has no background or a transparent backdrop. Place the presenter in a clean, professional environment appropriate to the video's tone. For business/tech content: modern studio with soft lighting and subtle depth. For casual content: bright, minimal space with natural light. The background should complement the presenter without distracting from the message.

Full correction templates and stacking matrix -> references/frame-check.md


Best Practices

  • Front-load the hook. First 5s = 80% of retention.
  • One idea per video. Single-topic produces dramatically better results.
  • Write for the ear. If you wouldn't say it to a friend, rewrite it.

Known issues -> references/troubleshooting.md

Security recommendations
This skill appears to do exactly what it says: create HeyGen avatars and videos. Before installing: (1) Confirm you trust the repository source (it suggests cloning from github.com/heygen-com/skills). (2) The skill will ask for or use HEYGEN_API_KEY — avoid pasting keys into public places and prefer a project-local .env (never commit it). (3) The install docs recommend running a curl|bash command to install HeyGen's CLI from static.heygen.ai — review that script before executing. (4) Be aware the skill will read workspace files like SOUL.md and IDENTITY.md and will write AVATAR-<NAME>.md and a local heygen-video-log.jsonl; if those files contain sensitive data, review or remove them first. (5) The agent will upload any photos/files you explicitly provide to HeyGen — do not upload content you don't want sent to a third party. If you want more assurance, inspect scripts/update-check.sh and the SKILL.md files in the repo before enabling the skill.
Functional analysis
Type: OpenClaw Skill
Name: heygen-skills
Version: 2.2.1
The skill bundle provides a comprehensive integration for HeyGen video services but contains several high-risk patterns and broad permissions. Key indicators include instructions in SKILL.md and README.md for the agent to execute a 'curl | bash' command to install the HeyGen CLI, and a version-checking script (scripts/update-check.sh) that performs remote network requests. The skill also requests extensive tool permissions (Bash, WebFetch, Write) and includes prompt instructions for the agent to suppress narration of its internal logic and transport choices, which reduces operational transparency. While these behaviors appear aligned with the stated purpose, they represent significant attack surfaces without clear evidence of malicious intent.
Capability tags
crypto · can-make-purchases · requires-oauth-token · requires-sensitive-credentials
Capability assessment
Purpose & Capability
Name/description, declared env var (HEYGEN_API_KEY), and runtime instructions all align: the skill needs HeyGen API access to create avatars and videos. Reading/writing AVATAR-<NAME>.md, SOUL.md/IDENTITY.md for identity, temporary /tmp uploads, and uploading user-provided assets to HeyGen are consistent with avatar/video production.
Instruction Scope
SKILL.md contains extensive agent-facing install and runtime guidance (mode detection, hard rules, file paths, and install flows). It instructs agents to clone the repo, ask users for API keys, read workspace identity files (SOUL.md/IDENTITY.md), and upload user-supplied images to HeyGen — all relevant to creating an avatar/video. This is broadly scoped but justified by the skill's purpose. Users should be aware the agent will read workspace identity files (they may contain sensitive info) and will store AVATAR-<NAME>.md and a local heygen-video-log.jsonl.
Install Mechanism
No built-in install spec (instruction-only), which is low-risk. The docs recommend installing the official HeyGen CLI via curl -fsSL https://static.heygen.ai/cli/install.sh | bash — this is a remote install command (standard for CLIs) but worth reviewing before running. The repo clone commands target a GitHub URL; the external download host is an official HeyGen domain (static.heygen.ai).
Credentials
Only HEYGEN_API_KEY is required and declared as the primary credential, which is appropriate. The skill reads workspace identity files (SOUL.md, IDENTITY.md) which may contain sensitive identity metadata — that access is explained in the docs and is necessary for avatar creation. No unrelated credentials or excessive env variables are requested.
Persistence & Privilege
The skill does not request 'always: true' and does not modify other skills or global agent settings. It writes its own AVATAR-<NAME>.md and heygen-video-log.jsonl files in the workspace (expected for persistent avatar state and logging). Temporary files are placed in /tmp and claimed to be cleaned up after session.
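How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install heygen-skills
  3. Once installed, call the Skill by name or trigger it with /heygen-skills
  4. Provide the inputs described in the Skill's parameter notes to get structured output
Version history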
v2.2.1
Bug fix: avatar creation is now a first-class install step (PR #71).
v2.2.0
HeyGen Skills v2.2.0
  • Added OpenClaw plugin-mode support: video generation now routes through the OpenClaw `video_generate` tool with the HeyGen plugin if installed (`heygen/video_agent_v3` model).
  • Updated API mode detection: OpenClaw plugin-mode is now preferred over CLI and MCP for the generate step.
  • Clarified routing rules: avatar and voice discovery still use MCP or CLI even in plugin mode; only video generation uses the OpenClaw plugin when available.
  • Updated documentation and user guidance for plugin installation and usage.
  • Added INSTALL_FOR_AGENTS.md and new SVG asset.
v2.1.2
chore: release 2.1.2 (#63) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
v2.1.1
chore: release 2.1.1 (#52) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
v2.0.3
fix: address ClawHub security scanner flags (round 2) (#42)
  • fix: address ClawHub security scanner flags (round 2)
    - Replace all 'source ~/.heygen/config' with safe grep/cut pattern
    - Clarify auto_proceed is a server-side API param, not agent discretion
    - Add INSTALL.md security note confirming setup script exists + is safe
    - Consistent config loading across all SKILL.md files
  • fix: structural security improvements per code review
    - Add config permission check before reading ~/.heygen/config
    - Document auto_proceed consent chain in pipeline (user request = consent, server-side API param only)
    - Consistent permission validation across all SKILL.md files
v2.0.2
fix(ci): write clawhub config file for GHA auth (#41)
v2.0.0
2.0.0: Official release from heygen-com/skills. Avatar creation + video generation via v3 pipeline. Replaces heygen-stack.
Metadata
Slug: heygen-skills
Version: 2.2.1
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 7
FAQ

What is Hg Skills Republish 221?

Create HeyGen avatar videos via the v3 Video Agent pipeline: handles avatar resolution, aspect ratio correction, prompt engineering, and voice selection aut... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 167 total downloads to date.

How do I install Hg Skills Republish 221?

Run the command "/install heygen-skills" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.

Is Hg Skills Republish 221 free?

Yes. Hg Skills Republish 221 is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Hg Skills Republish 221 support?

Hg Skills Republish 221 is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed Hg Skills Republish 221?

It is developed and maintained by Eve (@eve-builds); the current version is v2.2.1.
