Conversation Video

Author: pratyushchauhan

by Pratyush Chauhan · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install conversation-video

Description

Generate animated conversation videos with multi-voice TTS audio and timed text overlays. Use when the user needs to (1) turn a transcript or dialogue into a...

README (SKILL.md)

Conversation Video

Generate multi-voice conversation videos from text transcripts. There are two rendering paths: a quick ffmpeg path (no Node.js required) or a richer Remotion path (React animations).

Prerequisites

Tool	Path / Notes
ffmpeg	System install or Jellyfin ffmpeg at `/usr/lib/jellyfin-ffmpeg/ffmpeg`
supertonic-tts	Python package for multi-voice TTS (see scripts/generate_audio.py for load logic)
Node.js + npm	Only needed for Remotion path
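
Since ffmpeg may live either on the PATH or at the Jellyfin location listed above, a small lookup helper can make that choice explicit. This is a minimal sketch, not the logic used by the bundled scripts; the function name `find_ffmpeg` is hypothetical.

```python
import os
import shutil
from typing import Optional

# Jellyfin path taken from the prerequisites table above.
JELLYFIN_FFMPEG = "/usr/lib/jellyfin-ffmpeg/ffmpeg"


def find_ffmpeg(fallback: str = JELLYFIN_FFMPEG) -> Optional[str]:
    """Return a usable ffmpeg path, or None if neither location has one."""
    on_path = shutil.which("ffmpeg")  # system install found via PATH
    if on_path:
        return on_path
    # Fall back to the Jellyfin build only if it exists and is executable.
    if os.path.isfile(fallback) and os.access(fallback, os.X_OK):
        return fallback
    return None


if __name__ == "__main__":
    print(find_ffmpeg() or "ffmpeg not found")
```

Checking the PATH first keeps a system install as the default and only uses the Jellyfin binary when nothing else is available.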

Workflow

1. Build a transcript manifest

Create a JSON file with your conversation:

[
  {"speaker": "NARRATOR",   "text": "Customer Discovery Interview", "voice": "M1", "speed": 1.0, "align": "center"},
  {"speaker": "INTERVIEWER","text": "Walk me through when you first realized...", "voice": "M5", "speed": 0.95, "align": "left"},
  {"speaker": "CUSTOMER",   "text": "I was looking for a marketer agent.", "voice": "M2", "speed": 1.0, "align": "right"}
]

Fields: speaker (label), text (spoken text), voice (supertonic voice name e.g. M1-M5, F1-F2), speed (optional playback speed), align (left/right/center for video placement).
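
It can help to check the manifest before spending time on TTS. The sketch below validates the fields described above; it is an illustration only, and it assumes that speaker, text, and voice are required while speed and align may be omitted. The function name `validate_manifest` is hypothetical and is not part of the skill's scripts.

```python
import json
from typing import Any, List

ALLOWED_ALIGN = ("left", "right", "center")
REQUIRED = ("speaker", "text", "voice")  # assumption: these three must be present


def validate_manifest(entries: Any) -> List[str]:
    """Return a list of problems; an empty list means the manifest looks fine."""
    if not isinstance(entries, list):
        return ["manifest must be a JSON array"]
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        for key in REQUIRED:
            value = entry.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {i}: missing or empty '{key}'")
        speed = entry.get("speed", 1.0)  # speed is optional
        if isinstance(speed, bool) or not isinstance(speed, (int, float)) or speed <= 0:
            problems.append(f"entry {i}: 'speed' must be a positive number")
        if "align" in entry and entry["align"] not in ALLOWED_ALIGN:
            problems.append(f"entry {i}: 'align' must be left, right, or center")
    return problems


if __name__ == "__main__":
    sample = json.loads(
        '[{"speaker": "NARRATOR", "text": "Customer Discovery Interview",'
        ' "voice": "M1", "speed": 1.0, "align": "center"}]'
    )
    print(validate_manifest(sample) or "manifest OK")
```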

2. Generate audio + timing manifest

python scripts/generate_audio.py manifest.json output.wav

Outputs:

output.wav — concatenated multi-voice audio
output_timings.json — per-segment start/end times for video sync

3. Render video (choose path)

Path A: ffmpeg — fast, no Node.js needed

python scripts/ffmpeg_render.py output_timings.json output.wav video.mp4

Options: --width, --height, --font-size, --bg, --font, --crf
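
To illustrate how a timed overlay can be expressed with ffmpeg's drawtext filter, the sketch below builds one filter string per segment using the timeline option `enable='between(t,start,end)'`. It is not the code in scripts/ffmpeg_render.py. The function name, the margin, and the default font size are assumptions, and the sketch deliberately rejects text containing characters that need drawtext escaping rather than attempting to escape them (see references/ffmpeg-guide.md for the escaping rules).

```python
# Characters this sketch refuses to handle because they need escaping in drawtext.
UNSAFE = ("\\", "'", ":", "%")

# Horizontal position per alignment, written as drawtext expressions.
X_EXPR = {
    "left": "{margin}",
    "center": "(w-text_w)/2",
    "right": "w-text_w-{margin}",
}


def drawtext_filter(text: str, start: float, end: float, align: str = "center",
                    font_size: int = 36, margin: int = 40) -> str:
    """Build a drawtext filter that shows `text` only between start and end seconds."""
    if any(ch in text for ch in UNSAFE):
        raise ValueError("text contains characters that need drawtext escaping")
    x = X_EXPR[align].format(margin=margin)
    return (
        f"drawtext=text='{text}':fontsize={font_size}:fontcolor=white:"
        f"x={x}:y=(h-text_h)/2:"
        f"enable='between(t,{start:.3f},{end:.3f})'"
    )


if __name__ == "__main__":
    segments = [("Customer Discovery Interview", 0.0, 2.1, "center"),
                ("Walk me through it", 2.1, 5.4, "left")]
    # Chaining the filters with commas applies every overlay to the same video stream.
    print(",".join(drawtext_filter(t, s, e, a) for t, s, e, a in segments))
```

The resulting string would be passed to ffmpeg as the value of `-vf`.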

Path B: Remotion — richer animations, React-based

Copy the boilerplate:

cp -r assets/remotion-boilerplate ./my-video
cd my-video
npm install

Edit src/Conversation.tsx:

Replace conversation array with your lines (duration in frames, 30fps)
Set SpeakerConfig colors/alignment
Uncomment <Audio src={staticFile("audio.wav")} /> and place audio in public/
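
Because the conversation array expects durations in frames at 30fps, the per-segment times from output_timings.json need converting. The sketch below shows one way to do that. It assumes each timing entry carries `start` and `end` in seconds; those key names, the output keys, and the function name are assumptions, so check the actual file before relying on them.

```python
import json
from typing import Dict, List

FPS = 30  # frame rate stated in the Remotion step


def to_frame_ranges(segments: List[Dict], fps: int = FPS) -> List[Dict]:
    """Convert start/end seconds into a start frame and a duration in frames."""
    out = []
    for seg in segments:
        # Round each boundary to a frame first, then subtract, so consecutive
        # segments stay contiguous and rounding error does not accumulate.
        start_frame = round(seg["start"] * fps)
        end_frame = round(seg["end"] * fps)
        out.append({
            "speaker": seg.get("speaker", ""),
            "text": seg.get("text", ""),
            "from": start_frame,
            "duration": max(1, end_frame - start_frame),  # never zero frames
        })
    return out


if __name__ == "__main__":
    demo = [
        {"speaker": "NARRATOR", "text": "Customer Discovery Interview",
         "start": 0.0, "end": 1.5},
        {"speaker": "CUSTOMER", "text": "I was looking for a marketer agent.",
         "start": 1.5, "end": 4.0},
    ]
    print(json.dumps(to_frame_ranges(demo), indent=2))
```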

Render:

npx remotion render src/index.ts Conversation out/video.mp4

Speaker Customization

Default color/alignment map (edit in either ffmpeg or Remotion):

Speaker	Color	Align
NARRATOR	#cbd5e1	center
INTERVIEWER	#60a5fa	left
CUSTOMER	#34d399	right

Add more by extending the config map in the respective renderer.
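
As a rough picture of what extending the map looks like on the Python side, the sketch below mirrors the default table and merges in extra speakers. The names `DEFAULT_STYLES`, `FALLBACK_STYLE`, and `style_for` are hypothetical, and the fallback style for unknown speakers is an assumption; the real renderers may structure this differently.

```python
from typing import Dict, Optional

# Defaults copied from the table above.
DEFAULT_STYLES: Dict[str, Dict[str, str]] = {
    "NARRATOR": {"color": "#cbd5e1", "align": "center"},
    "INTERVIEWER": {"color": "#60a5fa", "align": "left"},
    "CUSTOMER": {"color": "#34d399", "align": "right"},
}

# Assumption: unknown speakers get white, centered text.
FALLBACK_STYLE = {"color": "#ffffff", "align": "center"}


def style_for(speaker: str,
              extra: Optional[Dict[str, Dict[str, str]]] = None) -> Dict[str, str]:
    """Look up a speaker's style, letting `extra` add or override entries."""
    styles = {**DEFAULT_STYLES, **(extra or {})}
    return dict(styles.get(speaker, FALLBACK_STYLE))  # copy so callers cannot mutate defaults


if __name__ == "__main__":
    extra = {"EXPERT": {"color": "#f59e0b", "align": "left"}}  # hypothetical new speaker
    print(style_for("EXPERT", extra))
    print(style_for("UNKNOWN"))
```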

Resources

scripts/generate_audio.py — Multi-voice TTS with timing export
scripts/ffmpeg_render.py — ffmpeg drawtext video renderer
assets/remotion-boilerplate/ — Copyable Remotion project template
references/remotion-patterns.md — Advanced Remotion techniques (JSON data loading, word-by-word reveal, audio sync)
references/ffmpeg-guide.md — ffmpeg drawtext syntax and timing reference

Usage Guidance

Install only if you are comfortable running local media-generation commands and npm install for the optional Remotion template. Review transcript contents before use, because generated audio, timing JSON, terminal logs, and temporary WAV files may contain the spoken text.

Capability Assessment

✓ Purpose & Capability

The artifacts consistently support the stated purpose: generating conversation videos from transcript manifests using TTS, ffmpeg, and optional Remotion animation templates.

✓ Instruction Scope

The runtime steps are explicit and user-directed; the skill shows concrete commands for generating audio, rendering video, copying a boilerplate project, and optionally rendering with Remotion.

ℹ Install Mechanism

The Remotion path requires npm install for declared public packages, and the TTS script depends on supertonic-tts with possible model download behavior; these are disclosed as prerequisites and fit the purpose.

ℹ Credentials

The skill uses local Python scripts, ffmpeg subprocesses, temporary WAV files, and output media files, which are proportionate for video rendering but can process potentially sensitive transcript content locally.

ℹ Persistence & Privilege

No background persistence, credential access, privilege escalation, or hidden startup behavior was found; generated audio/video outputs and temporary audio work files are expected side effects.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install conversation-video
After installation, invoke the skill by name or use /conversation-video
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release: multi-voice TTS audio + timed text overlay video via ffmpeg or Remotion

Metadata

Slug conversation-video

Version 1.0.0

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 1

Frequently Asked Questions

What is Conversation Video?

Generate animated conversation videos with multi-voice TTS audio and timed text overlays. Use when the user needs to (1) turn a transcript or dialogue into a... It is an AI Agent Skill for Claude Code / OpenClaw, with 37 downloads so far.

How do I install Conversation Video?

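Run "/install conversation-video" in the OpenClaw or Claude Code chat to install the skill in one step. The tools listed under Prerequisites (ffmpeg, supertonic-tts, and Node.js + npm for the Remotion path) must be installed separately.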

Is Conversation Video free?

Yes, Conversation Video is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Conversation Video support?

Conversation Video is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Conversation Video?

It is built and maintained by Pratyush Chauhan (@pratyushchauhan); the current version is v1.0.0.
