Explainer

Name: Explainer
Author: 0xfango

Description

Create explainer videos with narration and AI-generated visuals. Triggers on: "解说视频", "explainer video", "explain this as a video", "tutorial video", "introd...

README (SKILL.md)

When to Use

User wants to create an explainer or tutorial video
User asks to "explain" something in video form
User wants narrated content with AI-generated visuals
User says "explainer video", "解说视频", "tutorial video"

When NOT to Use

User wants audio-only content without visuals (use /speech or /podcast)
User wants a podcast-style discussion (use /podcast)
User wants to generate a standalone image (use /image-gen)
User wants to read text aloud without video (use /speech)

Purpose

Generate explainer videos that combine a single narrator's voiceover with AI-generated visuals. Ideal for product introductions, concept explanations, and tutorials. Supports text-only script generation or full text + video output.

Hard Constraints

No shell scripts. Construct curl commands from the reference files listed in API Reference
Always read shared/authentication.md for API key and headers
Follow shared/common-patterns.md for polling, errors, and interaction patterns
Always read config following shared/config-pattern.md before any interaction
Never hardcode speaker IDs — always fetch from the speakers API
Never save files to ~/Downloads/ — use .listenhub/explainer/ from config
Explainer uses exactly 1 speaker
Mode must be info (for Info style) or story (for Story style) — never slides (use /slides skill instead)

<HARD-GATE> Use the AskUserQuestion tool for every multiple-choice step. Do NOT print options as plain text. Ask one question at a time. Wait for the user's answer before proceeding to the next step. After all parameters are collected, summarize the choices and ask the user to confirm. Do NOT call any generation API until the user has explicitly confirmed.

</HARD-GATE>

Step -1: API Key Check

Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.
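
The actual check is defined in shared/config-pattern.md, which is not reproduced here. As a minimal sketch, assuming the key lives in the LISTENHUB_API_KEY environment variable used by the polling commands later in this document, the gate could look like this (the function name `check_api_key` is hypothetical):

```shell
# Hypothetical helper: returns 0 when LISTENHUB_API_KEY is set and non-empty,
# 1 otherwise. The real procedure is in shared/config-pattern.md.
check_api_key() {
  if [ -z "${LISTENHUB_API_KEY:-}" ]; then
    echo "LISTENHUB_API_KEY is not set; stopping." >&2
    return 1
  fi
  return 0
}
```

A caller would run `check_api_key || exit 1` before any other step.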

Step 0: Config Setup

Follow shared/config-pattern.md Step 0.

If file doesn't exist — ask location, then create immediately:

mkdir -p ".listenhub/explainer"
echo '{"outputDir":".listenhub","outputMode":"inline","language":null,"defaultStyle":null,"defaultSpeakers":{}}' > ".listenhub/explainer/config.json"
CONFIG_PATH=".listenhub/explainer/config.json"
# (or $HOME/.listenhub/explainer/config.json for global)

Then run Setup Flow below.

If file exists — read config, display summary, and confirm:

Current config (explainer):
  Output mode: {inline / download / both}
  Language: {zh / en / not set}
  Default style: {info / story / not set}
  Default speaker: {speakerName / not set}

Ask: "Use the saved config?" → Confirm and continue / Reconfigure
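
One possible way to render that summary from the config file with jq is sketched below. It assumes the config has the keys written by the creation command above (outputMode, language, defaultStyle, defaultSpeakers). The function name `print_config_summary` is hypothetical, and it prints the stored speaker IDs as JSON rather than looking up a speaker name, since that lookup is not shown in this document.

```shell
# Hypothetical helper: print a summary of the config file given as $1.
# Null or empty fields are shown as "not set".
print_config_summary() {
  jq -r '
    "Current config (explainer):",
    "  Output mode: \(.outputMode // "not set")",
    "  Language: \(.language // "not set")",
    "  Default style: \(.defaultStyle // "not set")",
    "  Default speakers: \(if (.defaultSpeakers | length) > 0 then (.defaultSpeakers | tojson) else "not set" end)"
  ' "$1"
}
```

Usage: `print_config_summary "$CONFIG_PATH"`.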

Setup Flow (first run or reconfigure)

Ask these questions in order, then save all answers to config at once:

outputMode: Follow shared/output-mode.md § Setup Flow Question.
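Language (optional): "Default language?"
- "Chinese (zh)"
- "English (en)"
- "Choose manually each time" → keep null
Style (optional): "Default style?"
- "Info: informational presentation"
- "Story: narrative storytelling"
- "Choose manually each time" → keep null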

After collecting answers, save immediately:

# Follow shared/output-mode.md § Save to Config
NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
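
The command above stores only outputMode. A sketch of how the optional language and style answers might be merged in the same pass follows. The function name `merge_setup_answers` and the convention that an empty string means "choose manually each time" are assumptions, not part of the skill.

```shell
# Hypothetical helper: merge setup answers into a config JSON string.
# $1 = current config JSON, $2 = output mode, $3 = language, $4 = style.
# An empty language or style is stored as null (assumed convention).
merge_setup_answers() {
  echo "$1" | jq -c --arg m "$2" --arg lang "$3" --arg style "$4" '
    . + {
      "outputMode": $m,
      "language": (if $lang == "" then null else $lang end),
      "defaultStyle": (if $style == "" then null else $style end)
    }'
}
```

Usage: `merge_setup_answers "$CONFIG" "$OUTPUT_MODE" "zh" "" > "$CONFIG_PATH"`.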

Note: defaultSpeakers are saved after generation (see After Successful Generation section).

Interaction Flow

Step 1: Topic / Content

Free text input. Ask the user:

What would you like to explain or introduce?

Accept: topic description, text content, or concept to explain.

Step 2: Language

If config.language is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What language?"
Options:
  - "Chinese (zh)" — Content in Mandarin Chinese
  - "English (en)" — Content in English

Step 3: Style

If config.defaultStyle is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What style of explainer?"
Options:
  - "Info" — Informational, factual presentation style
  - "Story" — Narrative, storytelling approach

Step 4: Speaker Selection

Follow shared/speaker-selection.md for the full selection flow, including:

Default from config.defaultSpeakers.{language} (skip step if set)
Text table + free-text input
Input matching and re-prompt on no match

Only 1 speaker is supported for explainer videos.
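
The matching rules themselves live in shared/speaker-selection.md, which is not included here. As a rough illustration of "input matching and re-prompt on no match", the sketch below matches free-text input against a speaker list case-insensitively. The list contents, the `name` and `speakerId` field names, and the function name `match_speaker` are all hypothetical; real entries must come from the speakers API.

```shell
# Hypothetical speaker list for illustration only. Never hardcode real IDs;
# fetch the list from the speakers API instead.
SPEAKERS_JSON='[{"name":"Alice","speakerId":"alice-en"},{"name":"Bob","speakerId":"bob-en"}]'

# Hypothetical helper: $1 = user input. Prints the matching speakerId,
# or nothing when no speaker matches (the caller should then re-prompt).
match_speaker() {
  echo "$SPEAKERS_JSON" | jq -r --arg q "$1" '
    ($q | ascii_downcase) as $needle
    | map(select((.name | ascii_downcase) == $needle
                 or (.speakerId | ascii_downcase) == $needle))
    | .[0].speakerId // empty'
}
```

An empty result signals "no match", so the agent can ask again instead of guessing.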

Step 5: Output Type

Question: "What output do you want?"
Options:
  - "Text script only" — Generate narration script, no video
  - "Text + Video" — Generate full explainer video with AI visuals

Step 6: Confirm & Generate

Summarize all choices:

Ready to generate explainer:

  Topic: {topic}
  Language: {language}
  Style: {info/story}
  Speaker: {speaker name}
  Output: {text only / text + video}

  Proceed?

Wait for explicit confirmation before calling any API.
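
Once the user confirms, the collected choices have to become a request body. The sketch below builds one with jq, following the request shape shown in the Example section (sources, speakers, language, mode) and mapping the style answer to the mode value. The function name `build_episode_payload` is hypothetical, and the authoritative field list is in shared/api-storybook.md.

```shell
# Hypothetical helper: build the POST /storybook/episodes body.
# $1 = topic or content, $2 = speakerId, $3 = language (zh|en), $4 = style.
build_episode_payload() {
  local mode
  case "$4" in
    Info|info) mode="info" ;;
    Story|story) mode="story" ;;
    # Anything else (including slides) is rejected, per Hard Constraints.
    *) echo "Unsupported style: $4" >&2; return 1 ;;
  esac
  jq -n -c --arg content "$1" --arg speaker "$2" --arg lang "$3" --arg mode "$mode" '
    {
      sources: [{type: "text", content: $content}],
      speakers: [{speakerId: $speaker}],
      language: $lang,
      mode: $mode
    }'
}
```

Building the body with `jq --arg` rather than string concatenation keeps quotes and newlines in the topic text correctly escaped.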

Workflow

Submit (foreground): POST /storybook/episodes with content, speaker, language, mode → extract episodeId
Tell the user the task is submitted

Poll (background): Run the following exact bash command with run_in_background: true and timeout: 600000. Do NOT use python3, awk, or any other JSON parser — use jq as shown:

EPISODE_ID="<id-from-step-1>"
for i in $(seq 1 30); do
  RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
    -H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
  STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.processStatus // "pending"')
  case "$STATUS" in
    success|completed) echo "$RESULT"; exit 0 ;;
    failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
    *) sleep 10 ;;
  esac
done
echo "TIMEOUT" >&2; exit 2
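
The loop above makes up to 30 attempts with a 10-second sleep between them, so it gives up after roughly 300 seconds plus request time. It reports its outcome through the exit code: 0 for success, 1 for failure, 2 for timeout. A small sketch of how a caller might turn that code into a status word (the function name `describe_poll_exit` is hypothetical):

```shell
# Hypothetical helper: map the polling loop's exit code ($1) to a status word.
describe_poll_exit() {
  case "$1" in
    0) echo "ready" ;;     # processStatus was success or completed
    1) echo "failed" ;;    # processStatus was failed or error
    2) echo "timeout" ;;   # 30 attempts elapsed without a final status
    *) echo "unknown" ;;
  esac
}
```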

When notified, download and present script:

Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

inline or both: Present the script inline.

Present:
```
Explainer script generated!

"{title}"

View online: https://listenhub.ai/app/explainer/{episodeId}
```
download or both: Also save the script file.
- Create .listenhub/explainer/YYYY-MM-DD-{episodeId}/
- Write {episodeId}.md from the generated script content
- Present the download path in addition to the above summary.
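
The download steps can be sketched as below. The directory layout follows the list above. Where the script text sits in the API response is not shown in this document, so the sketch takes it as an argument. The function name `save_script` is hypothetical.

```shell
# Hypothetical helper: write the generated script to the job directory.
# $1 = episodeId, $2 = script text (its source field in the API response
# is not specified here). Prints the path of the written file.
save_script() {
  local job_dir
  job_dir=".listenhub/explainer/$(date +%Y-%m-%d)-$1"
  mkdir -p "$job_dir"
  printf '%s\n' "$2" > "$job_dir/$1.md"
  echo "$job_dir/$1.md"
}
```

The printed path is what gets shown to the user as the download location.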

If video requested: POST /storybook/episodes/{episodeId}/video (foreground) → poll again (background) using the exact bash command below with run_in_background: true and timeout: 600000. Poll for videoStatus, not processStatus:

EPISODE_ID="<id-from-step-1>"
for i in $(seq 1 30); do
  RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
    -H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
  STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.videoStatus // "pending"')
  case "$STATUS" in
    success|completed) echo "$RESULT"; exit 0 ;;
    failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
    *) sleep 10 ;;
  esac
done
echo "TIMEOUT" >&2; exit 2

When notified, download and present result:

Present result

Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

inline or both: Display video URL and audio URL as clickable links.

Present:

Explainer video generated!

Video URL: {videoUrl}
Audio URL: {audioUrl}
Duration: {duration}s
Credits used: {credits}
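
To fill in that summary, the fields have to be pulled out of the polling result. The sketch below assumes they sit under `.data` next to `videoStatus`, with the names used in the placeholders (videoUrl, audioUrl, duration, credits). Those paths are an assumption; check shared/api-storybook.md for the real response shape. The function name `format_video_result` is hypothetical.

```shell
# Hypothetical helper: format the result summary from a raw API response ($1).
# The .data.* field paths are assumed, not confirmed by this document.
format_video_result() {
  echo "$1" | tr -d '\000-\037\177' | jq -r '
    .data
    | "Video URL: \(.videoUrl // "n/a")",
      "Audio URL: \(.audioUrl // "n/a")",
      "Duration: \(.duration // "n/a")s",
      "Credits used: \(.credits // "n/a")"'
}
```

Missing fields print as "n/a" instead of aborting, so a partial response still produces a readable summary.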

download or both: Also download the audio file.

DATE=$(date +%Y-%m-%d)
JOB_DIR=".listenhub/explainer/${DATE}-{episodeId}"
mkdir -p "$JOB_DIR"
curl -sS -o "${JOB_DIR}/{episodeId}.mp3" "{audioUrl}"

Present the download path in addition to the above summary.

After Successful Generation

Update config with the choices made this session:

NEW_CONFIG=$(echo "$CONFIG" | jq \
  --arg lang "{language}" \
  --arg style "{info/story}" \
  --arg speakerId "{speakerId}" \
  '. + {"language": $lang, "defaultStyle": $style, "defaultSpeakers": (.defaultSpeakers + {($lang): [$speakerId]})}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"

Estimated times:

Text script only: 2-3 minutes
Text + Video: 3-5 minutes

API Reference

Speaker list: shared/api-speakers.md
Speaker selection guide: shared/speaker-selection.md
Episode creation: shared/api-storybook.md
Polling: shared/common-patterns.md § Async Polling
Config pattern: shared/config-pattern.md

Composability

Invokes: speakers API (for speaker selection); may invoke /speech for voiceover
Invoked by: content-planner (Phase 3)

Example

User: "Create an explainer video introducing Claude Code"

Agent workflow:

Topic: "Claude Code introduction"
Ask language → "English"
Ask style → "Info"
Fetch speakers, user picks "cozy-man-english"
Ask output → "Text + Video"

curl -sS -X POST "https://api.marswave.ai/openapi/v1/storybook/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sources": [{"type": "text", "content": "Introduce Claude Code: what it is, key features, and how to get started"}],
    "speakers": [{"speakerId": "cozy-man-english"}],
    "language": "en",
    "mode": "info"
  }'

Poll until text is ready, then generate video if requested.
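
Before polling can start, the episode ID has to be read from the creation response. A minimal sketch, assuming the ID is returned at `.data.episodeId` (an assumed path, by analogy with `.data.processStatus` in the polling command; the function name `extract_episode_id` is hypothetical):

```shell
# Hypothetical helper: extract the episode ID from the creation response ($1).
# Prints nothing when the assumed field .data.episodeId is absent.
extract_episode_id() {
  echo "$1" | tr -d '\000-\037\177' | jq -r '.data.episodeId // empty'
}
```

Usage: `EPISODE_ID=$(extract_episode_id "$RESPONSE")`, then stop with an error if `EPISODE_ID` is empty.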

Usage Guidance

This skill is mostly coherent with its purpose but has a few things to check before you install or provide your API key:
- Domain mismatch: SKILL.md uses api.marswave.ai while product links reference listenhub.ai. Ask the author which domain is the real API and why both appear.
- Required binaries: the runtime polling uses curl and jq, but the skill metadata lists no required binaries. Ensure your environment has curl and jq available, or request the skill author to declare them.
- Config file behavior: the skill will create and write .listenhub/explainer/config.json (current directory by default, with a comment about $HOME). Confirm where config is stored and what secrets (if any) are written to it.
- API key scope: provide a least-privilege LISTENHUB_API_KEY (create a limited key if possible) and verify what actions that key permits on the service.
- Background polling: the skill runs a background polling loop (curl + jq) and may poll the API for several minutes. Be comfortable with the network activity and endpoints used.

If you need to proceed, ask the author to clarify the API domain, explicitly declare required binaries (curl, jq), and show the exact HTTP endpoints and request shapes the skill will call so you can confirm there are no unexpected endpoints or data exfiltration paths.
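
Since the skill metadata does not declare curl and jq, one way to verify them up front is sketched below. The function name `missing_binaries` is hypothetical; `command -v` is the standard shell builtin for locating a command.

```shell
# Hypothetical helper: print each named tool that is not on PATH.
# Prints nothing when every tool is available.
missing_binaries() {
  for bin in "$@"; do
    command -v "$bin" >/dev/null 2>&1 || echo "$bin"
  done
}
```

Usage: `MISSING=$(missing_binaries curl jq)`; a non-empty `MISSING` lists what to install before running the skill.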

Capability Analysis

Type: OpenClaw Skill
Name: explainer-video
Version: 0.1.0

The skill is designed to generate explainer videos by interacting with the Marswave AI API (api.marswave.ai). It follows a structured workflow that includes configuration management, user confirmation gates, and asynchronous polling via bash scripts using curl and jq. All file operations are restricted to a local hidden directory (.listenhub/explainer/), and the use of environment variables (LISTENHUB_API_KEY) is consistent with the skill's stated purpose. No evidence of malicious intent, data exfiltration, or unauthorized execution was found in SKILL.md or the associated documentation.

Capability Assessment

ℹ Purpose & Capability

The skill claims to create explainer videos and requires a LISTENHUB_API_KEY, which is coherent for a hosted video-generation service. However, the SKILL.md references endpoints at api.marswave.ai and a UI URL at listenhub.ai — the mismatch of domains is unexplained and should be clarified by the author.

⚠ Instruction Scope

Instructions will write a config file ('.listenhub/explainer/config.json' or $HOME alternative) and perform network calls using curl and jq (polling loop). The skill's declared requirements list no required binaries, yet the runtime explicitly requires curl and jq. It also mandates reading shared/*.md resources (not included here) and persisting defaultSpeakers after generation. These steps are within the skill's goal but the missing declared binaries and file-write behaviors are inconsistencies to verify.

✓ Install Mechanism

No install spec or code files are included — the skill is instruction-only and does not download or install external packages, which reduces installation risk.

ℹ Credentials

Only one environment variable is required (LISTENHUB_API_KEY), which is appropriate for a remote API. The SKILL.md shows that API key will be sent as a Bearer token to the service. Confirm the API key's scope/permissions before use.

ℹ Persistence & Privilege

The skill writes/reads a local config under .listenhub/explainer and persists defaultSpeakers after generation. It does not request always:true or global privileges. Writing a config file is expected but you should confirm the exact location (current directory vs $HOME) and contents that will be stored.

Version History

v0.1.0

explainer-video v0.1.0
- Initial release: create explainer videos with narration and AI-generated visuals.
- Step-by-step, tool-driven config and workflow: collects topic, language, style, speaker, and output type.
- Supports both script-only and full video generation, with file output modes (inline, download, or both).
- Follows strict interaction and polling patterns as detailed in skill logic.
- Uses a single narrator and supports both Chinese and English languages.

Metadata

Slug explainer-video

Version 0.1.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Explainer?

Create explainer videos with narration and AI-generated visuals. Triggers on: "解说视频", "explainer video", "explain this as a video", "tutorial video", "introd... It is an AI Agent Skill for Claude Code / OpenClaw, with 261 downloads so far.

How do I install Explainer?

Run "/install explainer-video" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Explainer free?

Yes, Explainer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Explainer support?

Explainer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Explainer?

It is built and maintained by 0xFango (@0xfango); the current version is v0.1.0.
