lewislulu

PPT Presenter: Presentation-Grade PPT Generator with Word-for-Word Scripts

by luuuyi · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ⚠ suspicious
Downloads: 425 · Stars: 0 · Active Installs: 1 · Versions: 1
Install in OpenClaw: /install ai-ppt-presenter
Description
Generate professional HTML presentations with slide content, whiteboard-style images, and detailed word-for-word speaker scripts in presenter view for confid...
README (SKILL.md)

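PPT Presenter: Presentation-Grade PPT Generator with Word-for-Word Scripts

Generates a complete word-for-word speaker script plus a presenter view for every slide, so you can step on stage with confidence.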

Pipeline Overview

User Topic/Notes
     ↓
1. Content Planning (outline + page count)
     ↓
2. Per-Slide Content (title, board description, speaker notes)
     ↓
3. Image Generation (Gemini 3 Pro Image, one per slide)
     ↓
4. HTML Assembly (reveal.js + lightbox + presenter view)
     ↓
5. Speaker Scripts (word-for-word scripts in <aside class="notes">)

Step 1: Content Planning

From user's topic, notes, or markdown files, produce an outline:

  • Determine audience (technical level, mixed audiences)
  • Plan 15-25 slides grouped into 4-7 sections
  • Each section gets a divider slide (colored section number + title)
  • Structure: Opening hook → Sections → Quick Start → Takeaway + Q&A

If the user provides existing markdown or text content, read it and restructure it into a slide outline. Ask the user to confirm the page count and structure before proceeding.
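
The planning constraints above can be checked mechanically before asking the user to confirm. This is a minimal sketch, not part of the skill: the `Section` structure and `check_outline` helper are hypothetical, and counting each divider slide toward the 15-25 total is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical outline section: a title plus its content slide titles."""
    title: str
    slides: list[str] = field(default_factory=list)


def check_outline(sections: list[Section]) -> list[str]:
    """Return a list of problems; an empty list means the outline fits the plan."""
    problems = []
    if not 4 <= len(sections) <= 7:
        problems.append(f"expected 4-7 sections, got {len(sections)}")
    # Assumption: each section contributes one divider slide plus its content slides.
    total = sum(1 + len(s.slides) for s in sections)
    if not 15 <= total <= 25:
        problems.append(f"expected 15-25 slides, got {total}")
    return problems
```

An empty result means the outline is within range; otherwise the messages can be shown to the user alongside the draft outline.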

Step 2: Per-Slide Content

For each slide, define:

  • Page title — concise, with emoji prefix
  • Visual description — what to draw on a whiteboard (colors, layout, diagrams)
  • Key content — bullet points, tables, code blocks, comparison charts
  • Speaker script — 150-300 words per slide, conversational tone, can be read aloud directly

For mixed audiences, write three-layer explanations:

  • Programming novices (beginners): metaphors, daily-life analogies
  • Office professionals (business): ROI, efficiency, practical scenarios
  • Developers: architecture, code, protocols

Write content to a markdown file first for user review, then proceed to image generation.
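
One way to hold the four per-slide fields and write them out for review is sketched below. The `SlideDraft` record and the markdown layout are hypothetical; the skill does not prescribe a file format for the review draft.

```python
from dataclasses import dataclass


@dataclass
class SlideDraft:
    """Hypothetical per-slide record mirroring the four fields listed above."""
    title: str             # concise, with emoji prefix
    visual: str            # what to draw on the whiteboard
    key_points: list[str]  # bullet points shown on the slide
    script: str            # speaker script, read aloud directly


def to_markdown(slides: list[SlideDraft]) -> str:
    """Render drafts as a markdown document for user review."""
    out = []
    for n, s in enumerate(slides, start=1):
        out.append(f"## {n}. {s.title}")
        out.append(f"**Visual:** {s.visual}")
        out.extend(f"- {p}" for p in s.key_points)
        out.append(f"**Script:** {s.script}")
        out.append("")  # blank line between slides
    return "\n".join(out)
```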

Step 3: Image Generation

Generate one whiteboard-style illustration per slide using Gemini API.

Use the script: scripts/generate_slide_images.py

python3 scripts/generate_slide_images.py \
  --prompts-file prompts.json \
  --output-dir ./images \
  --api-key "$GEMINI_API_KEY"

Prompt Guidelines

  • Style: "whiteboard marker drawing, hand-drawn educational style, colorful markers (red, blue, green, orange)"
  • Include Chinese text in prompts when audience is Chinese
  • Describe specific visual elements: diagrams, flowcharts, comparison tables, icons
  • Keep prompts under 500 chars for best results
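
The guidelines above can be applied when building prompts. The sketch below prefixes the style string and enforces the 500-character limit. The schema that scripts/generate_slide_images.py expects in prompts.json is not documented here, so the `{"slide": ..., "prompt": ...}` layout is an assumption for illustration only.

```python
import json

STYLE = ("whiteboard marker drawing, hand-drawn educational style, "
         "colorful markers (red, blue, green, orange)")
MAX_CHARS = 500


def build_prompt(visual_description: str) -> str:
    """Prefix the style string and enforce the 500-character guideline."""
    prompt = f"{STYLE}. {visual_description.strip()}"
    if len(prompt) >= MAX_CHARS:
        raise ValueError(f"prompt is {len(prompt)} chars; keep it under {MAX_CHARS}")
    return prompt


def prompts_json(descriptions: list[str]) -> str:
    """Serialize one prompt per slide (hypothetical file layout)."""
    items = [{"slide": i, "prompt": build_prompt(d)}
             for i, d in enumerate(descriptions, start=1)]
    # ensure_ascii=False keeps Chinese text readable in the file.
    return json.dumps(items, ensure_ascii=False, indent=2)
```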

API Details

  • Model: gemini-3-pro-image-preview (preferred) or imagen-4.0-generate-001
  • Endpoint: https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
  • Request body for Gemini 3:
    {
      "contents": [{"parts": [{"text": "Generate this image: {prompt}"}]}],
      "generationConfig": {"responseModalities": ["IMAGE", "TEXT"]}
    }
    
  • Response: candidates[0].content.parts[].inlineData.data (base64)
  • Rate limit: add 2-3 second delay between requests
  • Retry on network errors (SSL EOF, connection reset)
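
The request and response shapes above can be sketched as follows. This is not the bundled script's code: `call_with_retry` takes a hypothetical zero-argument `send` callable that performs the HTTP request, and retrying on `OSError` is an assumption (it covers `ssl.SSLError` and `ConnectionResetError`, which are both subclasses).

```python
import base64
import time


def build_request_body(prompt: str) -> dict:
    """Request body in the shape listed above."""
    return {
        "contents": [{"parts": [{"text": f"Generate this image: {prompt}"}]}],
        "generationConfig": {"responseModalities": ["IMAGE", "TEXT"]},
    }


def extract_image_bytes(response: dict) -> bytes:
    """Decode the first inlineData part found in candidates[0]."""
    for part in response["candidates"][0]["content"]["parts"]:
        inline = part.get("inlineData")
        if inline and "data" in inline:
            return base64.b64decode(inline["data"])
    raise ValueError("response contains no inlineData image part")


def call_with_retry(send, attempts: int = 3, delay: float = 2.0):
    """Call send() and retry on network-level errors, sleeping between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
```

Text parts are skipped during extraction because the response may mix TEXT and IMAGE parts.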

Finding API Key

Check the Gemini Image Generation section of TOOLS.md, or ask the user for an API key.

Step 4: HTML Assembly

Use the template: assets/reveal-template.html

Structure

<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <!-- reveal.js 5.1.0 from CDN -->
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/reveal.js@5.1.0/dist/reveal.css">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/reveal.js@5.1.0/dist/theme/black.css">
  <!-- Custom styles (see template) -->
</head>
<body>
<div class="reveal"><div class="slides">
  <!-- Section dividers: colored background gradient + section number -->
  <section data-background-gradient="...">
    <div class="section-num">1</div>
    <h2>Section Title</h2>
  </section>

  <!-- Content slides -->
  <section>
    <h2>Slide Title</h2>
    <!-- Content: .two-col, .three-col, .card, table, .code-block, .timeline -->
    <aside class="notes">Speaker script here...</aside>
  </section>
</div></div>

<!-- Lightbox overlay (see template) -->
<!-- reveal.js init + lightbox JS (see template) -->
</body>
</html>
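
The two section shapes in this structure could be generated from Python along these lines. The helper names are hypothetical and the real template in assets/reveal-template.html may differ; this sketch only mirrors the markup shown above.

```python
from html import escape


def divider_section(number: int, title: str, gradient: str) -> str:
    """Section divider: gradient background plus section number, no notes."""
    return (
        f'<section data-background-gradient="{escape(gradient)}">\n'
        f'  <div class="section-num">{number}</div>\n'
        f'  <h2>{escape(title)}</h2>\n'
        f'</section>'
    )


def content_section(title: str, body_html: str, script: str) -> str:
    """Content slide: body_html is trusted markup, script is escaped text."""
    return (
        f'<section>\n'
        f'  <h2>{escape(title)}</h2>\n'
        f'  {body_html}\n'
        f'  <aside class="notes">{escape(script)}</aside>\n'
        f'</section>'
    )
```

Escaping the title and script keeps characters such as `<` or `&` in user-written text from breaking the markup.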

Key CSS Classes (defined in template)

Class | Use
.card | Dark rounded content box
.two-col / .three-col | Flex column layouts
.code-block | Styled code with .comment, .cmd, .flag
.warning-box | Red-border alert box
.slide-img | Clickable image (max-height 420px)
.flow-box + .flow-arrow | Horizontal flow diagrams
.timeline + .timeline-item | Vertical timeline with colored dots
.tag-blue/green/red/orange/purple | Colored pill tags
.accent/.green/.red/.blue/.purple/.yellow/.pink | Text colors
.checklist | Checkbox-style list
.photo-row | Horizontal image gallery
.gradient-text | Gradient colored heading

Lightbox Feature

All images with .slide-img, .slide-img-large, .slide-img-full, or inside .photo-row are clickable. Clicking opens fullscreen lightbox with:

  • Left/right navigation (arrows + keyboard)
  • Counter (current/total)
  • Close button + ESC key
  • Pauses reveal.js keyboard while open

The lightbox JS is included in the template.

reveal.js Configuration

Reveal.initialize({
  hash: true,
  slideNumber: 'c/t',
  transition: 'slide',
  width: 1280,
  height: 720,
  margin: 0.06,
  plugins: [RevealHighlight, RevealNotes]
});

Keyboard hints shown at bottom: ← → turn page · S presenter view · F fullscreen · O overview

Step 5: Speaker Scripts

Add <aside class="notes"> to every content slide (not section dividers).

Script Guidelines

  • Length: 150-300 words per slide (1-2 minutes speaking time)
  • Tone: Conversational, as if speaking to audience directly
  • Structure: Hook → explain visual → key points → transition to next
  • Language: Match audience language (Chinese for Chinese audience)
  • Include: Analogies, real examples, audience engagement ("Everyone, take a look...")
  • Avoid: Reading bullet points verbatim; scripts should expand on slide content
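
The length guideline (150-300 words for 1-2 minutes) implies a pace of roughly 150 words per minute, which allows a quick check of each script. The skill does not say how to count Chinese text, so counting each CJK character as one word is an assumption of this sketch.

```python
import re

WORDS_PER_MINUTE = 150  # implied by "150-300 words" for "1-2 minutes"


def count_words(script: str) -> int:
    """Count Latin-script words plus individual CJK characters.

    Assumption: each CJK character counts as one word, since Chinese
    text has no spaces between words.
    """
    latin = re.findall(r"[A-Za-z0-9']+", script)
    cjk = re.findall(r"[\u4e00-\u9fff]", script)
    return len(latin) + len(cjk)


def check_script(script: str) -> tuple[int, float, bool]:
    """Return (word count, estimated minutes, within the 150-300 range)."""
    words = count_words(script)
    return words, words / WORDS_PER_MINUTE, 150 <= words <= 300
```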

Presenter View

The user presses S to open the presenter view, which shows:

  • Current slide (left)
  • Speaker notes/script (right)
  • Next slide preview
  • Timer

Output Checklist

Before delivering, verify:

  • All slides have titles with emoji prefixes
  • All content slides have <aside class="notes"> with full scripts
  • All slide images generated and saved to images/ directory
  • Images use .slide-img class (clickable lightbox)
  • Section dividers have colored backgrounds and section numbers
  • Lightbox JS is included and functional
  • reveal.js loads from CDN (no local dependencies)
  • File opens correctly in browser (open index.html)
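
Two of these checks (notes on every content slide, images carrying a slide-img class) can be automated with the standard library. This is a sketch under stated assumptions: dividers are recognized by their data-background-gradient attribute, slides are not nested vertically, and images inside .photo-row without a slide-img class would be reported even though the lightbox handles them.

```python
from html.parser import HTMLParser


class DeckChecker(HTMLParser):
    """Collects facts about top-level <section> slides in a reveal.js deck."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # current <section> nesting depth
        self.slides = []  # one dict per top-level section

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        classes = (a.get("class") or "").split()
        if tag == "section":
            self.depth += 1
            if self.depth == 1:
                self.slides.append({
                    "divider": "data-background-gradient" in a,
                    "notes": False,
                    "bad_imgs": 0,
                })
        elif self.depth and tag == "aside" and "notes" in classes:
            self.slides[-1]["notes"] = True
        elif self.depth and tag == "img":
            # Accept .slide-img, .slide-img-large, .slide-img-full.
            if not any(c.startswith("slide-img") for c in classes):
                self.slides[-1]["bad_imgs"] += 1

    def handle_endtag(self, tag):
        if tag == "section" and self.depth:
            self.depth -= 1


def check_deck(html: str) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    checker = DeckChecker()
    checker.feed(html)
    problems = []
    for n, s in enumerate(checker.slides, start=1):
        if not s["divider"] and not s["notes"]:
            problems.append(f"slide {n}: missing <aside class=\"notes\">")
        if s["bad_imgs"]:
            problems.append(f"slide {n}: {s['bad_imgs']} image(s) without a slide-img class")
    return problems
```

The remaining items, such as emoji prefixes and a working lightbox, still need a manual look in the browser.
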
Usage Guidance
This skill is largely coherent with its stated purpose, but review these items before installing:

  • Expect to provide a Gemini API key (GEMINI_API_KEY) even though the metadata doesn't declare it. Ask the author to update the metadata to list this required credential.
  • The included script sends the API key in the URL query (?key=...), which can be logged by intermediaries. If you must use this skill, consider supplying the key only at runtime via the --api-key argument, or modify the script to send it in a request header (such as x-goog-api-key) instead.
  • The SKILL.md references TOOLS.md for finding the API key, but no TOOLS.md is bundled. Confirm where you should securely store and provide the key.
  • The skill sends slide prompts (user content) to Google's generative language endpoint for image generation. Make sure this is acceptable for any sensitive material.
  • If you are concerned about exfiltration or logging of secrets, inspect or modify the scripts locally before running them, and supply API keys with least privilege and rotation where possible.

If the author can (1) declare GEMINI_API_KEY in the metadata, (2) avoid sending the key in the URL, and (3) include or document the referenced TOOLS.md, the inconsistency would be resolved and the skill would be internally coherent.
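
As an illustration of keeping the key out of the URL, the sketch below builds (but does not send) a request that carries the key in the x-goog-api-key header. It is not the bundled script's code; `build_request` is a hypothetical helper, and the endpoint is the one listed in the README.

```python
import json
import urllib.request

ENDPOINT = ("https://generativelanguage.googleapis.com/v1beta/"
            "models/{model}:generateContent")


def build_request(model: str, body: dict, api_key: str) -> urllib.request.Request:
    """Build a POST request with the API key in a header, not the query string."""
    return urllib.request.Request(
        ENDPOINT.format(model=model),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # replaces ?key=... in the URL
        },
        method="POST",
    )
```

Reading the key from the GEMINI_API_KEY environment variable at runtime, rather than hard-coding it, keeps it out of the source as well.
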
Capability Analysis
Type: OpenClaw Skill
Name: ai-ppt-presenter
Version: 1.0.0

The skill bundle is a legitimate tool for generating HTML presentations using reveal.js and Gemini-powered image generation. The Python script `scripts/generate_slide_images.py` correctly handles API requests to Google's official endpoints (generativelanguage.googleapis.com) and saves images locally. The HTML template `assets/reveal-template.html` uses standard, reputable CDNs for reveal.js and includes a functional lightbox script. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found.
Capability Assessment
Purpose & Capability
Functionality (outline → per-slide content → Gemini image generation → reveal.js assembly) matches the name and description. However, metadata declares no required environment variables despite the SKILL.md and included script depending on a Gemini API key (GEMINI_API_KEY) — an incoherence between claimed requirements and actual capability.
Instruction Scope
SKILL.md instructions stay within the stated purpose: generating slide content, producing one image per slide via Gemini, and assembling an HTML reveal.js presentation. The instructions ask the agent to read user-supplied markdown/text when provided (expected). One problem: SKILL.md points to a TOOLS.md 'Gemini Image Generation' section for finding the API key, but no TOOLS.md is included in the bundle.
Install Mechanism
No install spec is provided (instruction-only install), which is low risk. There is one included script (scripts/generate_slide_images.py) and an HTML template; both are readable and not obfuscated. Nothing is downloaded from arbitrary URLs during install.
Credentials
The skill implicitly requires a Gemini API key (GEMINI_API_KEY) to generate images, but the registry metadata lists no required env vars or primary credential — this is inconsistent and could cause surprise when the key is requested. The included Python script also places the API key in the request URL (?key=...), which is less secure (can be logged by proxies or servers). No other unrelated credentials are requested.
Persistence & Privilege
The skill does not request permanent/always-on presence and uses default autonomous invocation. It does not modify other skill or agent configs and asks for no system-wide paths or elevated privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ai-ppt-presenter
  3. After installation, invoke the skill by name or use /ai-ppt-presenter
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
AI-powered reveal.js presentation builder with word-for-word speaker scripts and presenter view.
Metadata
Slug ai-ppt-presenter
Version 1.0.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is PPT Presenter: Presentation-Grade PPT Generator with Word-for-Word Scripts?

Generate professional HTML presentations with slide content, whiteboard-style images, and detailed word-for-word speaker scripts in presenter view for confid... It is an AI Agent Skill for Claude Code / OpenClaw, with 425 downloads so far.

How do I install PPT Presenter?

Run "/install ai-ppt-presenter" in the OpenClaw or Claude Code chat to install it in one step. Image generation additionally requires a Gemini API key (GEMINI_API_KEY), as noted under Usage Guidance.

Is PPT Presenter free?

Yes, PPT Presenter is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PPT Presenter support?

PPT Presenter is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PPT Presenter?

It is built and maintained by luuuyi (@lewislulu); the current version is v1.0.0.
