Conversation Video
/install conversation-video
Generate multi-voice conversation videos from text transcripts. Two rendering paths are available: a quick ffmpeg path (no Node.js required) or a richer Remotion path (React animations).
Prerequisites
| Tool | Path / Notes |
|---|---|
| ffmpeg | System install or Jellyfin ffmpeg at /usr/lib/jellyfin-ffmpeg/ffmpeg |
| supertonic-tts | Python package for multi-voice TTS (see scripts/generate_audio.py for load logic) |
| Node.js + npm | Only needed for Remotion path |
Workflow
1. Build a transcript manifest
Create a JSON file with your conversation:
[
{"speaker": "NARRATOR", "text": "Customer Discovery Interview", "voice": "M1", "speed": 1.0, "align": "center"},
{"speaker": "INTERVIEWER","text": "Walk me through when you first realized...", "voice": "M5", "speed": 0.95, "align": "left"},
{"speaker": "CUSTOMER", "text": "I was looking for a marketer agent.", "voice": "M2", "speed": 1.0, "align": "right"}
]
Fields: speaker (label), text (spoken text), voice (supertonic voice name e.g. M1-M5, F1-F2), speed (optional playback speed), align (left/right/center for video placement).
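As a minimal sketch of the field rules above, the helper below checks a manifest before it is passed to the audio step. The function name `validate_manifest` is hypothetical, and the defaults (`speed` 1.0, `align` "center") are assumptions for illustration; scripts/generate_audio.py may apply different rules.

```python
import json

ALLOWED_ALIGN = {"left", "right", "center"}


def validate_manifest(entries):
    """Return a normalized copy of the manifest; raise ValueError on a bad entry."""
    normalized = []
    for i, entry in enumerate(entries):
        # speaker, text, and voice are treated as required non-empty strings.
        for key in ("speaker", "text", "voice"):
            value = entry.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"entry {i}: missing or empty '{key}'")
        # speed is optional; 1.0 is an assumed default.
        speed = entry.get("speed", 1.0)
        if not isinstance(speed, (int, float)) or speed <= 0:
            raise ValueError(f"entry {i}: speed must be a positive number")
        # Defaulting align to "center" is an assumption of this sketch.
        align = entry.get("align", "center")
        if align not in ALLOWED_ALIGN:
            raise ValueError(f"entry {i}: align must be one of {sorted(ALLOWED_ALIGN)}")
        normalized.append({
            "speaker": entry["speaker"],
            "text": entry["text"],
            "voice": entry["voice"],
            "speed": float(speed),
            "align": align,
        })
    return normalized


if __name__ == "__main__":
    raw = '[{"speaker": "NARRATOR", "text": "Customer Discovery Interview", "voice": "M1"}]'
    print(validate_manifest(json.loads(raw)))
```

Running a check like this first surfaces typos in `align` or a missing `voice` before any audio is synthesized.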
2. Generate audio + timing manifest
python scripts/generate_audio.py manifest.json output.wav
Outputs:
- output.wav: concatenated multi-voice audio
- output_timings.json: per-segment start/end times for video sync
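To illustrate how per-segment start/end times can follow from concatenation, here is a small sketch that accumulates segment durations into a timing list. The function name `build_timings`, the `gap` parameter, and the key names `start` and `end` are assumptions; the actual schema of output_timings.json is defined by scripts/generate_audio.py and may differ.

```python
def build_timings(segments, gap=0.0):
    """Turn (speaker, text, duration_seconds) tuples into start/end records.

    gap is an assumed optional pause, in seconds, inserted between segments.
    """
    timings = []
    cursor = 0.0
    for speaker, text, duration in segments:
        start = cursor
        end = start + duration
        timings.append({
            "speaker": speaker,
            "text": text,
            "start": round(start, 3),
            "end": round(end, 3),
        })
        # The next segment begins after this one ends, plus any pause.
        cursor = end + gap
    return timings


if __name__ == "__main__":
    demo = [("NARRATOR", "Customer Discovery Interview", 1.5),
            ("INTERVIEWER", "Walk me through when you first realized...", 2.0)]
    for row in build_timings(demo, gap=0.25):
        print(row)
```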
3. Render video (choose path)
Path A: ffmpeg — fast, no Node.js needed
python scripts/ffmpeg_render.py output_timings.json output.wav video.mp4
Options: --width, --height, --font-size, --bg, --font, --crf
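The sketch below shows one way timed overlays can be expressed with ffmpeg's drawtext filter: each segment gets a filter whose `enable='between(t,start,end)'` expression limits it to that segment's time window. This is not the code in scripts/ffmpeg_render.py. The helper names, the 80-pixel margin, the default font size, and the `seg_000.txt` file naming are all assumptions. The text is read through drawtext's `textfile` option so that the spoken text never has to be escaped inside the filter string; the sketch does not write those files.

```python
# Horizontal position expressions per alignment (80 px margin is an assumption).
X_EXPR = {
    "left": "80",
    "center": "(w-text_w)/2",
    "right": "w-text_w-80",
}


def drawtext_filter(index, start, end, align="center", color="cbd5e1", font_size=48):
    """Build one drawtext filter that is visible only between start and end."""
    return (
        f"drawtext=textfile=seg_{index:03d}.txt"
        f":fontsize={font_size}"
        f":fontcolor=0x{color}"
        f":x={X_EXPR[align]}"
        f":y=(h-text_h)/2"
        f":enable='between(t,{start:.3f},{end:.3f})'"
    )


def build_filtergraph(timings):
    """Chain one drawtext filter per segment, separated by commas."""
    return ",".join(
        drawtext_filter(i, t["start"], t["end"], t.get("align", "center"))
        for i, t in enumerate(timings)
    )


if __name__ == "__main__":
    demo = [{"start": 0.0, "end": 1.5, "align": "center"},
            {"start": 1.5, "end": 3.5, "align": "left"}]
    print(build_filtergraph(demo))
```

The resulting string would be passed to ffmpeg's `-vf` option; references/ffmpeg-guide.md covers the drawtext syntax the bundled renderer uses.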
Path B: Remotion — richer animations, React-based
Copy the boilerplate:
cp -r assets/remotion-boilerplate ./my-video
cd my-video
npm install
Edit src/Conversation.tsx:
- Replace the conversation array with your lines (duration in frames, 30fps)
- Set SpeakerConfig colors/alignment
- Uncomment <Audio src={staticFile("audio.wav")} /> and place audio in public/
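Because durations in the conversation array are given in frames at 30fps, timings in seconds need converting. The sketch below does this by rounding each segment's start and end to frame boundaries and taking the difference, which keeps rounding error from accumulating across lines. The function name `to_remotion_lines`, the input keys `start` and `end`, and the output shape (`speaker`, `text`, `duration`) are assumptions; match them to the array in src/Conversation.tsx.

```python
FPS = 30  # frame rate stated for the Remotion boilerplate


def to_remotion_lines(timings, fps=FPS):
    """Convert start/end times in seconds into per-line durations in frames."""
    lines = []
    for t in timings:
        start_frame = round(t["start"] * fps)
        end_frame = round(t["end"] * fps)
        lines.append({
            "speaker": t["speaker"],
            "text": t["text"],
            # Never emit a zero-length line, even for very short segments.
            "duration": max(1, end_frame - start_frame),
        })
    return lines


if __name__ == "__main__":
    demo = [
        {"speaker": "NARRATOR", "text": "Customer Discovery Interview", "start": 0.0, "end": 1.5},
        {"speaker": "CUSTOMER", "text": "I was looking for a marketer agent.", "start": 1.5, "end": 3.5},
    ]
    for line in to_remotion_lines(demo):
        print(line)
```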
Render:
npx remotion render src/index.ts Conversation out/video.mp4
Speaker Customization
Default color/alignment map (edit in either ffmpeg or Remotion):
| Speaker | Color | Align |
|---|---|---|
| NARRATOR | #cbd5e1 | center |
| INTERVIEWER | #60a5fa | left |
| CUSTOMER | #34d399 | right |
Add more by extending the config map in the respective renderer.
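As an illustration of extending the map, the sketch below holds the defaults from the table in a dictionary and merges in extra speakers. The names `DEFAULT_SPEAKERS`, `FALLBACK_STYLE`, and `speaker_style`, the fallback behavior for unknown speakers, and the example `EXPERT` entry are hypothetical; the real config lives in the respective renderer.

```python
# Defaults copied from the table above.
DEFAULT_SPEAKERS = {
    "NARRATOR": {"color": "#cbd5e1", "align": "center"},
    "INTERVIEWER": {"color": "#60a5fa", "align": "left"},
    "CUSTOMER": {"color": "#34d399", "align": "right"},
}

# Assumed fallback for speakers that are not in the map.
FALLBACK_STYLE = {"color": "#ffffff", "align": "center"}


def speaker_style(speaker, extra=None):
    """Look up a speaker's color and alignment, with optional extra entries."""
    config = {**DEFAULT_SPEAKERS, **(extra or {})}
    return config.get(speaker.upper(), FALLBACK_STYLE)


if __name__ == "__main__":
    extra = {"EXPERT": {"color": "#f472b6", "align": "right"}}  # hypothetical speaker
    print(speaker_style("customer"))
    print(speaker_style("EXPERT", extra))
    print(speaker_style("UNKNOWN"))
```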
Resources
- scripts/generate_audio.py — Multi-voice TTS with timing export
- scripts/ffmpeg_render.py — ffmpeg drawtext video renderer
- assets/remotion-boilerplate/ — Copyable Remotion project template
- references/remotion-patterns.md — Advanced Remotion techniques (JSON data loading, word-by-word reveal, audio sync)
- references/ffmpeg-guide.md — ffmpeg drawtext syntax and timing reference
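Installation
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install conversation-video
- Once installed, invoke the Skill by name or trigger it with /conversation-video
- Provide the required inputs as described in the Skill's parameter documentation to get structured output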
What is Conversation Video?
Generate animated conversation videos with multi-voice TTS audio and timed text overlays. Use when the user needs to (1) turn a transcript or dialogue into a... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 37 downloads to date.
How do I install Conversation Video?
Run the command /install conversation-video in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.
Is Conversation Video free?
Yes. Conversation Video is completely free and is released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Conversation Video support?
Conversation Video is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Conversation Video?
It is developed and maintained by Pratyush Chauhan (@pratyushchauhan). The current version is v1.0.0.