PPT to Video Generator

by jeffli2002 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
Downloads: 107 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install ppt2video
Description
Convert PowerPoint presentations into narrated videos with Chinese voiceover, synchronized subtitles, and page-by-page audio sync. Use this skill when the us...
README (SKILL.md)

PPT to Video Generator

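Automatically converts PPT slide decks into narrated explainer videos with Chinese voiceover, synchronized subtitles, and precisely aligned audio and visuals.

Trigger scenarios

  • The user uploads a PPT file and asks for a video
  • A course walkthrough, training video, or demo video is needed
  • The request mentions "PPT转视频" (PPT to video), "课件视频" (courseware video), or "讲解视频" (explainer video)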

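Core principles

  1. Slides come first: the PPT already contains titles and key points, so no extra text animation is overlaid (this avoids overlap)
  2. Precise audio-visual sync: each page has its own audio clip, and the on-screen duration equals that clip's exact length
  3. Subtitle safe zone: a separate area in the bottom 15% that never overlaps the slide image
  4. Chinese throughout: narration and subtitles are both in Simplified Chinese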

Workflow

Step 1: Extract slide images from the PPT

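import os

from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

prs = Presentation("input.pptx")
output_dir = "public/slides"
os.makedirs(output_dir, exist_ok=True)

for i, slide in enumerate(prs.slides, 1):
    # Save the first picture shape on each slide as that slide's image.
    for shape in slide.shapes:
        if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
            image = shape.image
            # image.ext is the embedded file's own extension (for example png or jpg).
            filepath = os.path.join(output_dir, f"slide_{i:02d}.{image.ext}")
            with open(filepath, "wb") as f:
                f.write(image.blob)
            print(f"Slide {i}: {filepath}")
            break
    else:
        # No picture shape on this slide: it has to be exported or screenshotted.
        print(f"Slide {i}: no picture found")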

Note: if the PPT is text-based (not image-based), the slides must additionally be screenshotted or exported as images.
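For text-based decks, one possible way to do that export is a headless LibreOffice conversion to PDF followed by rasterizing each page with pdftoppm. Neither tool is named in the original workflow, so both are assumptions here, as are the helper name build_export_commands and the 110 dpi default. The sketch only builds the two command lines and leaves running them to the caller.

```python
from pathlib import Path


def build_export_commands(pptx_path, out_dir="public/slides", dpi=110):
    """Return two argv lists: PPTX -> PDF, then PDF -> one PNG per page.

    Assumption: LibreOffice (`soffice`) and poppler's `pdftoppm` are
    installed. Neither is part of the original skill.
    """
    pptx = Path(pptx_path)
    pdf = Path(out_dir) / (pptx.stem + ".pdf")
    to_pdf = ["soffice", "--headless", "--convert-to", "pdf",
              "--outdir", out_dir, str(pptx)]
    # pdftoppm appends its own page number to the output prefix, so the
    # files may need renaming to match the slide_01.png pattern.
    to_png = ["pdftoppm", "-png", "-r", str(dpi),
              str(pdf), str(Path(out_dir) / "slide")]
    return to_pdf, to_png


for cmd in build_export_commands("input.pptx"):
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the tools are confirmed to be available.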

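Step 2: Write the page-by-page narration script

Principles

  • 5-10 seconds of narration per page
  • Conversational tone; avoid formal written language
  • Emphasize key figures
  • Match the PPT content; do not add information that is not on the slides

Format (the narration itself stays in Chinese; "页N" means "page N")

页1:今天聊AI视频的双轨实践
页2:感性路线Seedance用AI画画,理性路线Remotion用代码控制
页3:Seedance四大能力,但单次只支持四到十五秒
...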
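The script format above is regular enough to parse mechanically. The following is a minimal sketch that reads "页N:" lines into a dictionary and gives a rough length estimate per page. The helper names and the speaking rate of 4.5 characters per second are assumptions for illustration, not values from the skill; the measured audio duration in Step 4 remains the authoritative number.

```python
import re

# Matches "页1:text" with either a full-width or an ASCII colon.
PAGE_LINE = re.compile(r"^页\s*(\d+)\s*[::]\s*(.+)$")


def parse_script(text):
    """Map page number -> narration text, skipping lines such as '...'."""
    pages = {}
    for line in text.splitlines():
        match = PAGE_LINE.match(line.strip())
        if match:
            pages[int(match.group(1))] = match.group(2).strip()
    return pages


def estimate_seconds(narration, chars_per_second=4.5):
    """Rough estimate only: counts letters and digits, ignores punctuation.

    chars_per_second is an assumed speaking rate, not a measured one.
    """
    spoken = sum(1 for ch in narration if ch.isalnum())
    return spoken / chars_per_second


script = """页1:今天聊AI视频的双轨实践
页2:感性路线Seedance用AI画画,理性路线Remotion用代码控制
..."""
for number, narration in parse_script(script).items():
    print(number, round(estimate_seconds(narration), 1))
```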

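Step 3: Generate per-page audio

Key point: generate a separate audio clip for each page; do not generate one full-length audio track.

cd audio/pages
edge-tts --voice zh-CN-XiaoxiaoNeural --text "今天聊AI视频的双轨实践" --write-media p01.mp3
edge-tts --voice zh-CN-XiaoxiaoNeural --text "感性路线Seedance用AI画画" --write-media p02.mp3
# ... one command per page

Recommended voice: zh-CN-XiaoxiaoNeural (female, professional and clear)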
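With many pages, typing one command per page gets tedious. As a sketch, the commands can be generated from a page-number-to-narration mapping. The flags are the ones used above; the helper name tts_commands and the mapping shape are assumptions.

```python
def tts_commands(pages, voice="zh-CN-XiaoxiaoNeural"):
    """Build one edge-tts argv list per page, named p01.mp3, p02.mp3, ..."""
    commands = []
    for number in sorted(pages):
        commands.append([
            "edge-tts",
            "--voice", voice,
            "--text", pages[number],
            "--write-media", f"p{number:02d}.mp3",
        ])
    return commands


pages = {1: "今天聊AI视频的双轨实践", 2: "感性路线Seedance用AI画画"}
for cmd in tts_commands(pages):
    print(cmd[-1], "<-", cmd[4])
```

Running each list with subprocess.run(cmd, check=True, cwd="audio/pages") would mirror the shell commands above.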

Step 4: Measure audio duration and compute frame counts

for f in p*.mp3; do
  duration=$(ffprobe -v error -show_entries format=duration -of csv=p=0 "$f")
  frames=$(python3 -c "print(int(float('$duration') * 24 + 0.5))")
  printf "%s: %.3fs = %d frames\n" "$f" "$duration" "$frames"
done

Formula: frames = int(duration_seconds * 24 + 0.5)
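The same measurement can be sketched in Python. probe_duration wraps the ffprobe call from the loop above, and frames_for applies the formula; both function names are assumptions. Because the formula rounds to the nearest frame, a page can end up to half a frame (about 0.021 s at 24 fps) shorter than its audio, so rounding up with math.ceil would be an alternative if clipped audio tails matter.

```python
import subprocess

FPS = 24


def probe_duration(path):
    """Return the duration of an audio file in seconds, as reported by ffprobe."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "csv=p=0", path],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def frames_for(duration_seconds, fps=FPS):
    """Round a duration to the nearest whole frame count."""
    return int(duration_seconds * fps + 0.5)


# Worked example: 3.12 s * 24 fps = 74.88, which rounds to 75 frames.
print(frames_for(3.12))  # 75
```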

Step 5: Create the Remotion project

Project structure

remotion-ppt-video/
├── src/
│   ├── index.tsx          # registerRoot
│   └── PPTVideo.tsx       # main component
├── public/
│   ├── slides/            # slide images
│   │   ├── slide_01.png
│   │   └── ...
│   └── audio/             # per-page audio
│       ├── p01.mp3
│       └── ...
├── audio/
│   └── pages/             # audio source files
├── out/                   # output directory
├── remotion.config.ts
└── tsconfig.json

PPTVideo.tsx core structure

const SLIDES = [
  { img: 'slides/slide_01.png', text: '今天聊AI视频的双轨实践', audio: 'audio/p01.mp3', frames: 75 },
  // ... one entry per page
];

// Compute the cumulative start frame of each page
const starts: number[] = [];
let acc = 0;
for (const s of SLIDES) {
  starts.push(acc);
  acc += s.frames;
}
export const TOTAL_FRAMES = acc;
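The start-frame bookkeeping is a running sum: each page starts where the previous one ended. A minimal Python sketch of the same calculation follows, which could be handy when generating the SLIDES array from the Step 4 measurements. The function name and the second and third frame counts are made up for illustration.

```python
from itertools import accumulate


def start_frames(frame_counts):
    """Return (starts, total): starts[i] is the sum of all earlier frame counts."""
    running = [0, *accumulate(frame_counts)]
    return running[:-1], running[-1]


# 75 comes from the example entry above; 120 and 96 are hypothetical.
starts, total = start_frames([75, 120, 96])
print(starts, total)  # [0, 75, 195] 291
```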

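Layout specification

  • Slide image: top 85%, objectFit: 'contain'
  • Subtitle area: bottom 15%, separate dark background #0a0a14
  • Divider: borderTop: '1px solid rgba(255,255,255,0.06)'
  • Page indicator: top-right corner, current page in orange #ff6b35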
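To make the 85%/15% split concrete, here is a small sketch that turns the ratios into pixel rectangles at 854×480 and works out how large a slide appears under a 'contain' fit. The function names and the 1920×1080 source size are assumptions.

```python
def split_layout(width=854, height=480, subtitle_ratio=0.15):
    """Return (x, y, w, h) rectangles for the slide area and the subtitle area."""
    subtitle_h = round(height * subtitle_ratio)
    slide_h = height - subtitle_h
    return {
        "slide": (0, 0, width, slide_h),
        "subtitle": (0, slide_h, width, subtitle_h),
    }


def contain(src_w, src_h, box_w, box_h):
    """Largest size that fits inside the box while keeping the aspect ratio."""
    scale = min(box_w / src_w, box_h / src_h)
    return round(src_w * scale), round(src_h * scale)


layout = split_layout()
print(layout)  # slide area is 854x408, subtitle area is 854x72
_, _, box_w, box_h = layout["slide"]
print(contain(1920, 1080, box_w, box_h))  # (725, 408) for a hypothetical 16:9 slide
```

A 16:9 slide is height-limited in the 854×408 area, so it is letterboxed left and right against the background color.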

Using Sequence

<Sequence from={starts[index]} durationInFrames={slide.frames}>
  <Audio src={staticFile(slide.audio)} volume={0.95} />
  <SlideScene ... />
</Sequence>
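Each Sequence covers the half-open frame range from starts[index] up to, but not including, starts[index] + slide.frames. A component outside the sequences, such as the page indicator, needs the reverse lookup from a global frame to a page index. One way to sketch that lookup is a binary search over the start frames; the function name and sample values are assumptions.

```python
from bisect import bisect_right


def slide_at(frame, starts):
    """Index of the page whose range contains the given global frame."""
    return bisect_right(starts, frame) - 1


starts = [0, 75, 195]          # hypothetical start frames for three pages
print(slide_at(0, starts))     # 0
print(slide_at(74, starts))    # 0, last frame of the first page
print(slide_at(75, starts))    # 1, first frame of the second page
```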

Step 6: Render the video

npx remotion render src/index.tsx ppt-video out/video.mp4 --overwrite --concurrency=1

VPS-friendly settings

  • Resolution: 854×480 (memory-friendly)
  • Frame rate: 24fps
  • Concurrency: 1 (avoids OOM)

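Key technical points

Audio-visual sync

Problem | Solution
Narration spills across pages | Separate audio clip per page
Slide changes drift from the narration | durationInFrames = audio seconds × fps
Subtitles out of sync with the narration | Each page's subtitle matches that page's narration exactly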
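The durationInFrames rule can be checked before rendering. The sketch below flags any page whose frame count differs from its measured audio length by more than half a frame, which is the most that rounding to the nearest frame can introduce. The function name, the tuple layout, and the sample numbers are assumptions.

```python
def out_of_sync(pages, fps=24):
    """Return names of pages whose frame count does not match the audio length.

    pages: iterable of (name, audio_seconds, frames) tuples.
    """
    tolerance = 0.5 / fps + 1e-9  # half a frame, plus slack for float error
    flagged = []
    for name, audio_seconds, frames in pages:
        drift = frames / fps - audio_seconds
        if abs(drift) > tolerance:
            flagged.append(name)
    return flagged


pages = [
    ("p01.mp3", 3.12, 75),   # 75 / 24 = 3.125 s, within half a frame
    ("p02.mp3", 5.00, 100),  # 100 / 24 is about 4.167 s, clearly too short
]
print(out_of_sync(pages))  # ['p02.mp3']
```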

Subtitle safe zone

┌────────────────────────────────┐
│                                │
│    Slide image (85%)           │
│    Fully visible, unobscured   │
│                                │
├────────────────────────────────┤  ← divider
│    Subtitle safe zone (15%)    │
│    Own background, no overlap  │
└────────────────────────────────┘

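Slide image handling

  • If the PPT text is already rendered into images: extract and use them directly
  • If the PPT is text plus shapes: export to PNG or take screenshots
  • Always use objectFit: 'contain' to preserve the aspect ratio
  • Background color: #0f0f1a (distinct from the subtitle area's #0a0a14)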

Output specification

  • Format: MP4 (H.264)
  • Resolution: 854×480
  • Frame rate: 24fps
  • Audio: AAC, mono or stereo
  • Subtitles: burned into the frame (bottom 15% area)

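Common issues

Q: No text comes out when extracting from the PPT? A: The PPT is probably image-based (text already rendered into images); use OCR or rebuild the text layer.

Q: Total audio runs longer than 90 seconds? A: Trim the narration to 5-8 seconds per page. Prioritize key information; details can be dropped.

Q: Out of memory while rendering? A: Lower the resolution to 854×480, use 24fps, and set concurrency to 1.

Q: Subtitles overlap text at the bottom of the slide? A: Check that the 85%/15% split is set correctly. The slide image must stay within the 85% area.

Q: Audio and visuals out of sync? A: Confirm that each page has its own audio clip and that durationInFrames equals audio duration × fps exactly. Do not use a single full-length audio track.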
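For the 90-second question above, a quick check over the measured durations shows whether the total is over budget and which pages exceed the 8-second guideline. This is only a sketch: the function name and the sample durations are assumptions, while the 90 s and 8 s limits come from the answer above.

```python
def over_budget(durations, total_limit=90.0, page_limit=8.0):
    """Return (is_over, total_seconds, long_pages) for per-page durations.

    durations: mapping of page name -> measured seconds.
    """
    total = sum(durations.values())
    long_pages = sorted(name for name, secs in durations.items() if secs > page_limit)
    return total > total_limit, total, long_pages


# Hypothetical measurements for three pages.
durations = {"p01.mp3": 3.12, "p02.mp3": 9.4, "p03.mp3": 6.0}
is_over, total, long_pages = over_budget(durations)
print(is_over, round(total, 2), long_pages)  # False 18.52 ['p02.mp3']
```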

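Parameter quick reference

Parameter | Recommended value | Notes
fps | 24 | Smooth and resource-efficient
Resolution | 854×480 | Safe for VPS rendering
Slide area | 85% | Slide image, top
Subtitle area | 15% | Subtitle safe zone, bottom
Voice | zh-CN-XiaoxiaoNeural | Female, professional
Fade-in duration | 0.3 s | Natural slide transition
Subtitle fade-in delay | 0.2 s | Slide appears first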
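At 24 fps the 0.3 s fade spans about 7.2 frames and the 0.2 s subtitle delay about 4.8 frames. How the skill implements the fade is not shown, so the following is only a sketch of a clamped linear ramp. The function name is an assumption, and so is reusing the 0.3 s fade length for the subtitle, since the table gives only its delay.

```python
def fade_opacity(frame, fps=24, fade_seconds=0.3, delay_seconds=0.0):
    """Opacity in [0, 1] at a page-local frame: 0 before the delay, then a linear ramp."""
    start = delay_seconds * fps
    length = fade_seconds * fps
    progress = (frame - start) / length
    return max(0.0, min(1.0, progress))


for frame in (0, 4, 8, 12):
    slide = fade_opacity(frame)
    subtitle = fade_opacity(frame, delay_seconds=0.2)
    print(frame, round(slide, 2), round(subtitle, 2))
```

The slide reaches full opacity by frame 8 and the delayed subtitle by frame 12, so the slide is always visible first.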
Usage Guidance
Before installing or using this skill, make sure the required tools are installed from trusted sources and pinned where possible. Use it on PPT files you are comfortable processing with the selected TTS tool, especially if the slides contain private or confidential information.
Capability Analysis
Type: OpenClaw Skill · Name: ppt2video · Version: 1.0.0
The skill bundle provides a legitimate workflow for converting PowerPoint presentations into narrated videos using standard tools like python-pptx, edge-tts, and Remotion. The provided Python and shell scripts are well-structured for the stated purpose, and there is no evidence of data exfiltration, malicious execution, or prompt injection intended to compromise the agent or the host environment.
Capability Assessment
Purpose & Capability
The described capabilities—extracting slide images, generating Chinese narration, measuring audio duration, composing subtitles, and rendering MP4 output—match the stated PPT-to-video purpose.
Instruction Scope
The workflow includes Python and shell commands that read the user's PPT-derived files and write media outputs. These are purpose-aligned, but should be run only on intended files and reviewed before execution.
Install Mechanism
The registry says there is no install spec and no required binaries, while the instructions rely on tools such as python-pptx, edge-tts, ffprobe, npx, and Remotion. Users must manage and verify those dependencies themselves.
Credentials
The skill writes project output folders such as public/slides, audio/pages, and out, and uses TTS tooling for narration. This is proportionate to video generation, but may involve slide-derived text leaving the local workflow depending on the TTS tool.
Persistence & Privilege
No background service, autostart behavior, credentials, elevated privileges, or persistent agent behavior is described.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ppt2video
  3. After installation, invoke the skill by name or use /ppt2video
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: Convert PPT presentations to narrated videos with Chinese voiceover, synchronized subtitles, and page-by-page audio sync
Metadata
Slug: ppt2video
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1
Frequently Asked Questions

What is PPT to Video Generator?

Convert PowerPoint presentations into narrated videos with Chinese voiceover, synchronized subtitles, and page-by-page audio sync. Use this skill when the us... It is an AI Agent Skill for Claude Code / OpenClaw, with 107 downloads so far.

How do I install PPT to Video Generator?

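Run "/install ppt2video" in the OpenClaw or Claude Code chat to install the skill in one step. The tools its workflow relies on (python-pptx, edge-tts, ffprobe, npx, and Remotion) are not installed by this step and must be set up separately.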

Is PPT to Video Generator free?

Yes, PPT to Video Generator is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PPT to Video Generator support?

PPT to Video Generator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PPT to Video Generator?

It is built and maintained by jeffli2002 (@jeffli2002); the current version is v1.0.0.
