Description

Generate professional HTML presentations with slide content, whiteboard-style images, and detailed word-for-word speaker scripts in presenter view for confid...

Usage (SKILL.md)

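PPT Presenter: presentation-grade PPT generator with word-for-word scripts

Name: PPT Presenter (presentation-grade PPT generator with word-for-word scripts)
Author: lewislulu

Generates a complete word-for-word speaker script for every slide, plus a presenter view, so you can take the stage with confidence.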

Pipeline Overview

User Topic/Notes
     ↓
1. Content Planning (outline + page count)
     ↓
2. Per-Slide Content (title, board description, speaker notes)
     ↓
3. Image Generation (Gemini 3 Pro Image, one per slide)
     ↓
4. HTML Assembly (reveal.js + lightbox + presenter view)
     ↓
5. Speaker Scripts (word-for-word scripts in <aside class="notes">)

Step 1: Content Planning

From the user's topic, notes, or markdown files, produce an outline:

Determine audience (technical level, mixed audiences)
Plan 15-25 slides grouped into 4-7 sections
Each section gets a divider slide (colored section number + title)
Structure: Opening hook → Sections → Quick Start → Takeaway + Q&A

If the user provides existing markdown/text content, read it and restructure it into a slide outline. Ask the user to confirm the page count and structure before proceeding.
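The planning constraints above could be checked mechanically before asking the user to confirm. The following is a minimal sketch, not part of the skill: the `Section` class and `check_outline` function are hypothetical names, and it assumes that each section's divider slide counts toward the 15-25 slide total.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical outline unit: one divider slide plus its content slides."""
    title: str
    slide_titles: list[str] = field(default_factory=list)


def check_outline(sections: list[Section]) -> list[str]:
    """Return a list of problems; an empty list means the outline fits the plan."""
    problems = []
    if not 4 <= len(sections) <= 7:
        problems.append(f"expected 4-7 sections, got {len(sections)}")
    # Assumption: every section contributes one divider slide plus its content slides.
    total = sum(1 + len(s.slide_titles) for s in sections)
    if not 15 <= total <= 25:
        problems.append(f"expected 15-25 slides, got {total}")
    for s in sections:
        if not s.slide_titles:
            problems.append(f"section {s.title!r} has no content slides")
    return problems
```

An empty result means the outline is ready to show to the user; otherwise the messages say what to adjust.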

Step 2: Per-Slide Content

For each slide, define:

Page title — concise, with emoji prefix
Visual description — what to draw on a whiteboard (colors, layout, diagrams)
Key content — bullet points, tables, code blocks, comparison charts
Speaker script — 150-300 words per slide, conversational tone, can be read aloud directly

For mixed audiences, write three-layer explanations:

Beginners (no programming background): metaphors, daily-life analogies
Business professionals: ROI, efficiency, practical scenarios
Developers: architecture, code, protocols

Write content to a markdown file first for user review, then proceed to image generation.
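One way to produce the review file is to hold each slide's four fields in a small record and render it to markdown. This is only a sketch under assumptions: the `Slide` class, its field names, and the markdown layout are hypothetical, since the skill does not prescribe a file format.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    """Hypothetical per-slide record mirroring the four fields listed above."""
    title: str             # concise, with emoji prefix
    visual: str            # what to draw on the whiteboard
    key_points: list[str]  # bullet points shown on the slide
    script: str            # speaker script, read aloud directly


def slide_to_markdown(number: int, slide: Slide) -> str:
    """Render one slide as a markdown section for user review."""
    lines = [f"## {number}. {slide.title}", ""]
    lines += [f"**Visual:** {slide.visual}", ""]
    lines += [f"- {point}" for point in slide.key_points]
    lines += ["", "**Script:**", "", slide.script, ""]
    return "\n".join(lines)


def deck_to_markdown(slides: list[Slide]) -> str:
    """Render all slides, numbered from 1, into a single markdown document."""
    return "\n".join(slide_to_markdown(i, s) for i, s in enumerate(slides, start=1))
```

The resulting text can be written to a file and shown to the user before any images are generated.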

Step 3: Image Generation

Generate one whiteboard-style illustration per slide using the Gemini API.

Use the script: scripts/generate_slide_images.py

python3 scripts/generate_slide_images.py \
  --prompts-file prompts.json \
  --output-dir ./images \
  --api-key "$GEMINI_API_KEY"

Prompt Guidelines

Style: "whiteboard marker drawing, hand-drawn educational style, colorful markers (red, blue, green, orange)"
Include Chinese text in prompts when audience is Chinese
Describe specific visual elements: diagrams, flowcharts, comparison tables, icons
Keep prompts under 500 chars for best results
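The guidelines above can be applied when building the prompts file. The sketch below is an illustration only: the layout of `prompts.json` expected by the bundled script is not described here, so the list of `{"slide", "prompt"}` objects is a hypothetical format, as are the function names.

```python
import json

STYLE = ("whiteboard marker drawing, hand-drawn educational style, "
         "colorful markers (red, blue, green, orange)")
MAX_PROMPT_CHARS = 500  # guideline: keep prompts under 500 characters


def build_prompt(visual_description: str) -> str:
    """Prefix the style string and enforce the length guideline."""
    prompt = f"{STYLE}. {visual_description.strip()}"
    if len(prompt) >= MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt is {len(prompt)} chars; keep it under {MAX_PROMPT_CHARS}"
        )
    return prompt


def prompts_json(descriptions: list[str]) -> str:
    """Serialize one prompt per slide (hypothetical file layout)."""
    items = [
        {"slide": i, "prompt": build_prompt(d)}
        for i, d in enumerate(descriptions, start=1)
    ]
    # ensure_ascii=False keeps Chinese text readable in the file.
    return json.dumps(items, ensure_ascii=False, indent=2)
```

Failing early on an over-long prompt is a design choice; truncating silently could cut off the part of the description that matters.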

API Details

Model: gemini-3-pro-image-preview (preferred) or imagen-4.0-generate-001
Endpoint: https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent

Request body for Gemini 3:

{
  "contents": [{"parts": [{"text": "Generate this image: {prompt}"}]}],
  "generationConfig": {"responseModalities": ["IMAGE", "TEXT"]}
}

Response: candidates[0].content.parts[].inlineData.data (base64)
Rate limit: add 2-3 second delay between requests
Retry on network errors (SSL EOF, connection reset)
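The request and response shapes above can be sketched with the Python standard library. This is not the bundled `scripts/generate_slide_images.py`; the function names, timeout, and retry count are assumptions. It also assumes the key is sent in the `x-goog-api-key` header instead of the URL query string.

```python
import base64
import json
import time
import urllib.error
import urllib.request

ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"


def build_body(prompt: str) -> dict:
    """Request body in the shape shown above."""
    return {
        "contents": [{"parts": [{"text": f"Generate this image: {prompt}"}]}],
        "generationConfig": {"responseModalities": ["IMAGE", "TEXT"]},
    }


def extract_image(response: dict) -> bytes:
    """Return the first inline image in candidates[0], decoded from base64."""
    for part in response["candidates"][0]["content"]["parts"]:
        if "inlineData" in part:
            return base64.b64decode(part["inlineData"]["data"])
    raise ValueError("response contains no inlineData part")


def generate_image(prompt: str, api_key: str,
                   model: str = "gemini-3-pro-image-preview",
                   retries: int = 3, delay: float = 2.5) -> bytes:
    request = urllib.request.Request(
        ENDPOINT.format(model=model),
        data=json.dumps(build_body(prompt)).encode("utf-8"),
        # Assumption: key sent as a header rather than in the URL query.
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    )
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(request, timeout=120) as resp:
                return extract_image(json.load(resp))
        except urllib.error.HTTPError:
            raise  # HTTP status errors are not retried in this sketch
        except (urllib.error.URLError, ConnectionError) as exc:
            if attempt == retries:
                raise
            print(f"attempt {attempt} failed ({exc}); retrying")
            time.sleep(delay)
    raise ValueError("retries must be at least 1")
```

A caller looping over slides would also sleep 2-3 seconds between successive `generate_image` calls to respect the rate limit noted above.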

Finding API Key

Check TOOLS.md for the Gemini Image Generation section, or ask the user for an API key.

Step 4: HTML Assembly

Use the template: assets/reveal-template.html

Structure

<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <!-- reveal.js 5.1.0 from CDN -->
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/reveal.js@5.1.0/dist/reveal.css">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/reveal.js@5.1.0/dist/theme/black.css">
  <!-- Custom styles (see template) -->
</head>
<body>
<div class="reveal"><div class="slides">
  <!-- Section dividers: colored background gradient + section number -->
  <section data-background-gradient="...">
    <div class="section-num">1</div>
    <h2>Section Title</h2>
  </section>

  <!-- Content slides -->
  <section>
    <h2>Slide Title</h2>
    <!-- Content: .two-col, .three-col, .card, table, .code-block, .timeline -->
    <aside class="notes">Speaker script here...</aside>
  </section>
</div></div>

<!-- Lightbox overlay (see template) -->
<!-- reveal.js init + lightbox JS (see template) -->
</body>
</html>
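The two kinds of `<section>` in the structure above could be generated from slide data with small helper functions. This is a sketch with hypothetical function names, not code from the template. It assumes titles and scripts are plain text, which it escapes, and that the slide body is already trusted HTML.

```python
from html import escape


def render_divider(number: int, title: str, gradient: str) -> str:
    """Section divider: gradient background, section number, title."""
    return (
        f'<section data-background-gradient="{escape(gradient)}">\n'
        f'  <div class="section-num">{number}</div>\n'
        f'  <h2>{escape(title)}</h2>\n'
        f'</section>'
    )


def render_slide(title: str, body_html: str, script: str) -> str:
    """Content slide: body_html is inserted as-is; title and script are escaped."""
    return (
        '<section>\n'
        f'  <h2>{escape(title)}</h2>\n'
        f'  {body_html}\n'
        f'  <aside class="notes">{escape(script)}</aside>\n'
        '</section>'
    )
```

Escaping the script matters because a stray `<` or `&` in the speaker notes would otherwise be parsed as markup.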

Key CSS Classes (defined in template)

Class	Use
`.card`	Dark rounded content box
`.two-col` / `.three-col`	Flex column layouts
`.code-block`	Styled code with `.comment`, `.cmd`, `.flag`
`.warning-box`	Red-border alert box
`.slide-img`	Clickable image (max-height 420px)
`.flow-box` + `.flow-arrow`	Horizontal flow diagrams
`.timeline` + `.timeline-item`	Vertical timeline with colored dots
`.tag-blue/green/red/orange/purple`	Colored pill tags
`.accent/.green/.red/.blue/.purple/.yellow/.pink`	Text colors
`.checklist`	Checkbox-style list
`.photo-row`	Horizontal image gallery
`.gradient-text`	Gradient colored heading

Lightbox Feature

All images with .slide-img, .slide-img-large, .slide-img-full, or inside .photo-row are clickable. Clicking opens fullscreen lightbox with:

Left/right navigation (arrows + keyboard)
Counter (current/total)
Close button + ESC key
Pauses reveal.js keyboard while open

The lightbox JS is included in the template.

reveal.js Configuration

Reveal.initialize({
  hash: true,
  slideNumber: 'c/t',
  transition: 'slide',
  width: 1280,
  height: 720,
  margin: 0.06,
  plugins: [RevealHighlight, RevealNotes]
});

Keyboard hints shown at bottom: ← → 翻页 (turn page) · S 演讲者视图 (presenter view) · F 全屏 (fullscreen) · O 总览 (overview)

Step 5: Speaker Scripts

Add <aside class="notes"> to every content slide (not section dividers).

Script Guidelines

Length: 150-300 words per slide (1-2 minutes speaking time)
Tone: Conversational, as if speaking to audience directly
Structure: Hook → explain visual → key points → transition to next
Language: Match audience language (Chinese for Chinese audience)
Include: Analogies, real examples, audience engagement ("大家看...", i.e. "Everyone, take a look at...")
Avoid: Reading bullet points verbatim; scripts should expand on slide content
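The length guideline above can be checked automatically. The sketch below is a rough estimate under a stated assumption: each CJK character counts as one "word" and each run of ASCII letters or digits counts as one word, since whitespace splitting does not work for Chinese scripts. The function names are hypothetical.

```python
import re

# Assumption: one CJK character = one word; an ASCII alphanumeric run
# (optionally with an apostrophe suffix such as "don't") = one word.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def script_length(script: str) -> int:
    """Approximate word count for mixed Chinese/English scripts."""
    return len(_TOKEN.findall(script))


def check_script(script: str, low: int = 150, high: int = 300) -> str:
    """Compare the script length against the 150-300 word guideline."""
    n = script_length(script)
    if n < low:
        return f"too short ({n} < {low})"
    if n > high:
        return f"too long ({n} > {high})"
    return "ok"
```

Punctuation is ignored by the pattern, so the count reflects spoken content only.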

Presenter View

The user presses S to open the presenter view, which shows:

Current slide (left)
Speaker notes/script (right)
Next slide preview
Timer

Output Checklist

Before delivering, verify:

All slides have titles with emoji prefixes
All content slides have <aside class="notes"> with full scripts
All slide images generated and saved to images/ directory
Images use .slide-img class (clickable lightbox)
Section dividers have colored backgrounds and section numbers
Lightbox JS is included and functional
reveal.js loads from CDN (no local dependencies)
File opens correctly in browser (open index.html)
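Some items on this checklist can be verified by parsing the generated HTML. The following sketch uses the standard-library `html.parser`; the class and function names are hypothetical. It assumes a flat deck with no nested vertical slides, and that section dividers are the sections carrying a `data-background-gradient` attribute.

```python
from html.parser import HTMLParser


class DeckChecker(HTMLParser):
    """Collects checklist violations from a flat (non-nested) reveal.js deck."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self.slide_no = 0
        self.is_divider = False
        self.has_notes = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "section":
            self.slide_no += 1
            # Assumption: dividers are marked by a gradient background.
            self.is_divider = "data-background-gradient" in attrs
            self.has_notes = False
        elif tag == "aside" and "notes" in classes:
            self.has_notes = True
        elif tag == "img" and not any(c.startswith("slide-img") for c in classes):
            self.problems.append(
                f"slide {self.slide_no}: <img> without .slide-img class"
            )

    def handle_endtag(self, tag):
        if tag == "section" and not self.is_divider and not self.has_notes:
            self.problems.append(
                f'slide {self.slide_no}: missing <aside class="notes">'
            )


def check_deck(html_text: str) -> list[str]:
    """Return checklist violations found in the deck HTML (empty if none)."""
    checker = DeckChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems
```

This covers only the notes and image-class items; whether the lightbox works and the file opens correctly still needs a manual check in a browser.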

Security Recommendations

This skill is largely coherent with its stated purpose, but review these items before installing:

- Expect to provide a Gemini API key (GEMINI_API_KEY) even though the metadata doesn't declare it. Ask the author to update the metadata to list this required credential.
- The included script sends the API key in the URL query (?key=...), which can be logged by intermediaries. If you must use this skill, consider passing the key via the --api-key argument only at runtime, or modify the script to send the key in a request header instead (the Gemini API accepts x-goog-api-key).
- The SKILL.md references TOOLS.md for finding the API key, but no TOOLS.md is bundled; confirm where you should securely store and provide the key.
- The skill sends slide prompts (user content) to Google's generative language endpoint for image generation. Make sure this is acceptable for any sensitive material.
- If you are concerned about exfiltration or logging of secrets, inspect or modify the scripts locally before running them, and supply API keys with least privilege and rotation where possible.

If the author can (1) declare GEMINI_API_KEY in the metadata, (2) avoid sending the key in the URL, and (3) include or document the referenced TOOLS.md, the inconsistency would be resolved and the skill would be internally coherent.

Functional Analysis

Type: OpenClaw Skill
Name: ai-ppt-presenter
Version: 1.0.0

The skill bundle is a legitimate tool for generating HTML presentations using reveal.js and Gemini-powered image generation. The Python script `scripts/generate_slide_images.py` correctly handles API requests to Google's official endpoints (generativelanguage.googleapis.com) and saves images locally. The HTML template `assets/reveal-template.html` uses standard, reputable CDNs for reveal.js and includes a functional lightbox script. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found.

Capability Assessment

ℹ Purpose & Capability

Functionality (outline → per-slide content → Gemini image generation → reveal.js assembly) matches the name and description. However, metadata declares no required environment variables despite the SKILL.md and included script depending on a Gemini API key (GEMINI_API_KEY) — an incoherence between claimed requirements and actual capability.

✓ Instruction Scope

SKILL.md instructions stay within the stated purpose: generating slide content, producing one image per slide via Gemini, and assembling an HTML reveal.js presentation. The instructions ask the agent to read user-supplied markdown/text when provided (expected). One problem: SKILL.md points to a TOOLS.md 'Gemini Image Generation' section for finding the API key, but no TOOLS.md is included in the bundle.

✓ Install Mechanism

No install spec is provided (instruction-only install), which is low risk. There is one included script (scripts/generate_slide_images.py) and an HTML template; both are readable and not obfuscated. Nothing is downloaded from arbitrary URLs during install.

⚠ Credentials

The skill implicitly requires a Gemini API key (GEMINI_API_KEY) to generate images, but the registry metadata lists no required env vars or primary credential — this is inconsistent and could cause surprise when the key is requested. The included Python script also places the API key in the request URL (?key=...), which is less secure (can be logged by proxies or servers). No other unrelated credentials are requested.

✓ Persistence & Privilege

The skill does not request permanent/always-on presence and uses default autonomous invocation. It does not modify other skill or agent configs and asks for no system-wide paths or elevated privileges.

Version History

v1.0.0

AI-powered reveal.js presentation builder with word-for-word speaker scripts and presenter view.

Metadata

Slug ai-ppt-presenter

Version 1.0.0

License MIT-0

Total installs 1

Current installs 1

Number of past versions 1

FAQ