Description

Yum NoteBook — a local-first NotebookLM alternative for AI agents, built for real source capture. Ingest a web URL, YouTube video, or screenshot, then genera...

README (SKILL.md)

Yum NoteBook (yumnb) — Source → Summary + Talk-show + Slides

Name: Yum NoteBook
Author: yumyumtum

A local-first NotebookLM alternative for AI agents, built for real source capture.

What It Does

Given a source (URL / YouTube / screenshot / raw text), yumnb creates one folder per request under <output_dir>/<YYYYMMDD-HHMM-slug>/ containing:

source/ — raw material (downloaded HTML, transcript, screenshot copy …)
summary.md — AI summary (one-liner / key points / facts / takeaways)
talkshow.txt + talkshow.mp3 — dual-host script + MP3 (edge-tts)
deck.pptx — slide deck (bullets, tables, flow, images, summary)
links.json — record of what was generated and any share links
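
As a quick check of the layout listed above, here is a minimal sketch that reports which of the expected entries a note folder is missing. `missing_artifacts` is a hypothetical helper written for illustration, not part of yumnb.

```python
from pathlib import Path

# Entry names copied from the list above; "source" is a directory.
EXPECTED = [
    "source",
    "summary.md",
    "talkshow.txt",
    "talkshow.mp3",
    "deck.pptx",
    "links.json",
]


def missing_artifacts(note_dir):
    """Return the expected entries that do not exist under note_dir."""
    root = Path(note_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

An empty list means every listed artifact is present. A folder produced by `ingest` alone would be expected to report the later artifacts as missing.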

This “one request = one folder” model is intentional: many such folders can accumulate into a local knowledge base that can be read, searched, narrated, and presented later.
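
Because each note is an ordinary folder of plain files, searching the accumulated knowledge base needs nothing beyond the standard library. The sketch below is one possible approach, assuming every note folder sits directly under the output directory and contains a `summary.md`; `search_notes` is a hypothetical name, not a yumnb command.

```python
from pathlib import Path


def search_notes(output_dir, query):
    """Return names of note folders whose summary.md contains query.

    Matching is a plain case-insensitive substring test.
    """
    needle = query.lower()
    hits = []
    for summary in sorted(Path(output_dir).glob("*/summary.md")):
        text = summary.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():
            hits.append(summary.parent.name)
    return hits
```

Since folder names start with a `YYYYMMDD-HHMM` timestamp, the sorted result is also in chronological order.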

If a webhook is configured, a notification is posted to Slack / Discord / Teams Workflow. If deliver.provider is configured, yumnb can also push the finished outputs directly to an IM/chat surface through OpenClaw / Hermes (Telegram / Discord / Teams / Slack / etc.).

yumnb is positioned as a local-first alternative to NotebookLM: it keeps notebooks and generated artifacts as ordinary local files by default, and uploads or delivers them only if you explicitly configure that.

Two Ways to Run

A. Fully-automatic (built-in AI provider)

Configure ai.provider in config.yaml (openai / anthropic / gemini / ollama / cli) and run:

python -m yumnb auto "<URL or path>" [--title "short name"]

This runs ingest → AI summary → AI slide-plan → TTS → PPT → publish in one shot.

B. Step-by-step (agent-driven — you bring your own LLM)

If you're driving this from an agent CLI (GitHub Copilot CLI, Claude Code, Cursor, Aider, …), set ai.provider: none and call the subcommands individually. The agent reads the source, writes summary.md and deck.json itself, then asks yumnb to render TTS / PPT / publish.

# 1) Pull raw material
python -m yumnb ingest "<URL>" [--title "..."]
#    → prints the note folder path

# 2) (Agent writes <folder>/summary.md following the schema in README)

# 3) Render dual-host MP3 from a talkshow script the agent wrote
python -m yumnb tts "<folder>/talkshow.txt" --output "<folder>/talkshow.mp3"

# 4) Render PPT from a deck.json the agent wrote
python -m yumnb ppt "<folder>/deck.json" --output "<folder>/deck.pptx"

# 5) Finalize + optional webhook / IM delivery
python -m yumnb publish "<folder>"
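
An agent that prefers to script steps 3 to 5 rather than type them could build the same commands in Python. This is a sketch only: `step_commands` is a hypothetical helper, and it assumes the agent has already written `summary.md`, `talkshow.txt`, and `deck.json` into the folder printed by `ingest`. The subcommands and the `--output` flag are the ones shown above.

```python
import sys


def step_commands(folder):
    """Build argv lists for the tts, ppt and publish steps, in order."""
    base = [sys.executable, "-m", "yumnb"]
    return [
        base + ["tts", f"{folder}/talkshow.txt",
                "--output", f"{folder}/talkshow.mp3"],
        base + ["ppt", f"{folder}/deck.json",
                "--output", f"{folder}/deck.pptx"],
        base + ["publish", folder],
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the sequence before `publish` runs.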

Schemas the Agent Writes

`summary.md`

# <title>

> **Source**: <url or file>
> **Type**: youtube|url|image|text
> **Length**: <duration or word count>

## 🎯 One-line summary

## 📌 Key points (3-5)

## 🔑 Facts / data

## 💡 Takeaways

## 🤔 Open questions
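
To keep an agent-written `summary.md` aligned with this schema, it can help to start from a generated skeleton. The following is an illustrative sketch, not yumnb code; `summary_skeleton` and its parameter names are hypothetical, while the headings and the four `Type` values are taken from the schema above.

```python
HEADINGS = [
    "🎯 One-line summary",
    "📌 Key points (3-5)",
    "🔑 Facts / data",
    "💡 Takeaways",
    "🤔 Open questions",
]
SOURCE_TYPES = {"youtube", "url", "image", "text"}


def summary_skeleton(title, source, kind, length):
    """Return an empty summary.md body with the header block and headings."""
    if kind not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {kind!r}")
    lines = [
        f"# {title}",
        "",
        f"> **Source**: {source}",
        f"> **Type**: {kind}",
        f"> **Length**: {length}",
        "",
    ]
    for heading in HEADINGS:
        lines += [f"## {heading}", ""]
    return "\n".join(lines)
```

The agent then fills in the text under each heading and writes the result to `<folder>/summary.md`.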

`deck.json` (rendered to `deck.pptx`)

{
  "title": "Deck title",
  "subtitle": "Source / date",
  "slides": [
    {"type": "title",      "title": "...", "subtitle": "..."},
    {"type": "bullets",    "title": "...", "bullets": ["...", "..."]},
    {"type": "table",      "title": "...", "headers": ["A","B"], "rows": [["1","2"]]},
    {"type": "flow",       "title": "...", "steps": ["Step 1","Step 2","Step 3"]},
    {"type": "image",      "title": "...", "image_path": "/abs/path.png", "caption": "..."},
    {"type": "two_column", "title": "...", "left": "bullet text", "image_path": "..."},
    {"type": "summary",    "title": "...", "text": "..."}
  ]
}

Recommended: 5-12 slides (title, 1-2 overview, 3-6 main slides using bullets/table/flow/image, 1 summary). Reuse images from source/ (YouTube thumbnail, HTML hero image, original screenshot).
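
Before handing a `deck.json` to `yumnb ppt`, an agent may want to sanity-check it. The sketch below checks slide types, per-type fields, and the recommended slide count. Treating every field shown in the example as required is an assumption made for illustration; the renderer's actual validation rules are not documented here, and `check_deck` is a hypothetical name.

```python
# Fields each slide type carries in the deck.json example. Assumption: all
# of them are required. yumnb itself may be more lenient.
SLIDE_FIELDS = {
    "title": ["title", "subtitle"],
    "bullets": ["title", "bullets"],
    "table": ["title", "headers", "rows"],
    "flow": ["title", "steps"],
    "image": ["title", "image_path", "caption"],
    "two_column": ["title", "left", "image_path"],
    "summary": ["title", "text"],
}


def check_deck(deck):
    """Return a list of problems found; an empty list means none."""
    problems = []
    slides = deck.get("slides", [])
    if not 5 <= len(slides) <= 12:
        problems.append(f"{len(slides)} slides (5-12 recommended)")
    for i, slide in enumerate(slides, start=1):
        kind = slide.get("type")
        if kind not in SLIDE_FIELDS:
            problems.append(f"slide {i}: unknown type {kind!r}")
            continue
        for field in SLIDE_FIELDS[kind]:
            if field not in slide:
                problems.append(f"slide {i}: missing {field!r}")
        if kind == "table":
            width = len(slide.get("headers", []))
            if any(len(row) != width for row in slide.get("rows", [])):
                problems.append(f"slide {i}: row width differs from headers")
    return problems
```

Load the file with `json.load` and pass the resulting dict to `check_deck`; the slide-count message is advisory, since 5-12 is only a recommendation.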

`talkshow.txt`

Each line is tagged with [<SpeakerName>], where <SpeakerName> matches a voice configured in config.yaml → tts.voices. Example:

[HostA] Welcome to the show — today we're chewing on…
[HostB] And by chewing I mean ruthlessly mocking, right?
[HostA] Pretty much.
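
The tagged-line format is simple enough to check before rendering audio. Below is a minimal parser sketch, assuming `voices` is a mapping from speaker name to voice ID (the exact shape of `tts.voices` in config.yaml is an assumption) and that blank lines may be skipped. `parse_talkshow` is a hypothetical helper, not the parser yumnb uses.

```python
import re

# "[Speaker] text": the speaker name is anything up to the first "]".
LINE = re.compile(r"^\[(?P<speaker>[^\]]+)\]\s*(?P<text>.+)$")


def parse_talkshow(script, voices):
    """Split a script into (speaker, text) turns; reject unknown speakers."""
    turns = []
    for number, raw in enumerate(script.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"line {number}: expected '[Speaker] text'")
        speaker = match["speaker"]
        if speaker not in voices:
            raise ValueError(
                f"line {number}: no voice configured for {speaker!r}"
            )
        turns.append((speaker, match["text"]))
    return turns
```

Catching an unknown speaker at this stage is cheaper than discovering it partway through TTS rendering.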

Prerequisites

Python 3.9+
Preferred first-run: ./scripts/bootstrap.sh
Or manual: pip install -r requirements.txt
Plus the AI SDK matching your provider (only one): openai / anthropic / google-generativeai / ollama — or none if you use provider: cli / none.

Notes

edge-tts uses Microsoft's free online voices. No API key required.
The intro/outro jingle is generated procedurally in pure Python — no external assets bundled.
YouTube ingest order is: yt-dlp manual subtitles → yt-dlp auto subtitles → youtube-transcript-api fallback → description-only fallback.
This skill carries no platform/tenant/organization-specific defaults. All endpoints and credentials come from config.yaml or environment variables (YUMNB_*, OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
Direct IM delivery is channel-agnostic: configure deliver.provider: openclaw (or hermes) plus deliver.openclaw.channel + target to send the finished note to Telegram, Discord, Teams, Slack, and other supported surfaces via the local bridge.
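
The YouTube ingest order described above is a "first success wins" fallback chain. The sketch below shows that pattern in isolation. The fetchers are hypothetical callables supplied by the caller; the real calls into yt-dlp and youtube-transcript-api are deliberately not shown, and `first_transcript` is not a yumnb function.

```python
def first_transcript(video_url, fetchers):
    """Try each (name, fetch) pair in order.

    Returns (name, text) for the first fetch that yields non-empty text.
    A fetch that raises or returns nothing falls through to the next one.
    Returns (None, "") if every fetcher fails.
    """
    for name, fetch in fetchers:
        try:
            text = fetch(video_url)
        except Exception:
            continue
        if text:
            return name, text
    return None, ""
```

Passing the fetchers as an ordered list keeps the priority explicit: manual subtitles first, then auto subtitles, then the transcript API, then the video description.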

Language

config.yaml → language (default en). Influences both AI prompt language and the default TTS voice pair. Override per-run with --language en|zh|ja|es|fr|de|....

Built-in default voice pairs (male + female, edge-tts):

| `language` | HostA (male) | HostB (female) |
| --- | --- | --- |
| `en` | `en-US-AndrewNeural` | `en-US-AvaNeural` |
| `zh` | `zh-CN-YunyangNeural` 云扬 | `zh-CN-XiaoxiaoNeural` 晓晓 |
| `ja` | `ja-JP-KeitaNeural` | `ja-JP-NanamiNeural` |
| `es` | `es-ES-AlvaroNeural` | `es-ES-ElviraNeural` |
| `fr` | `fr-FR-HenriNeural` | `fr-FR-DeniseNeural` |
| `de` | `de-DE-ConradNeural` | `de-DE-KatjaNeural` |

Override any pair (or add new languages) under tts.language_voices. Setting tts.voices directly always wins.
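
The precedence stated above (tts.voices first, then tts.language_voices, then the built-in pairs) can be sketched as a small lookup. The shapes used here are assumptions: `language_voices` is taken to map a language code to a two-item list, and unknown languages fall back to `en`. Only two of the built-in pairs are reproduced, and `resolve_voices` is a hypothetical name.

```python
# Two of the built-in pairs from the table, as (HostA, HostB).
BUILTIN = {
    "en": ("en-US-AndrewNeural", "en-US-AvaNeural"),
    "ja": ("ja-JP-KeitaNeural", "ja-JP-NanamiNeural"),
}


def resolve_voices(tts_config, language):
    """Pick the HostA/HostB voices for a run."""
    # 1) An explicit tts.voices mapping always wins.
    if tts_config.get("voices"):
        return dict(tts_config["voices"])
    # 2) tts.language_voices overrides or extends the built-in pairs.
    pairs = {**BUILTIN, **tts_config.get("language_voices", {})}
    # 3) Assumption: an unknown language falls back to the "en" pair.
    host_a, host_b = pairs.get(language, BUILTIN["en"])
    return {"HostA": host_a, "HostB": host_b}
```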

Cloud upload (OneDrive / Google Drive / S3 / Dropbox / …)

config.yaml → upload.provider: rclone makes publish upload the generated mp3 / pptx / summary to your configured cloud and inline the shareable URLs in links.json and the notification payload — so users get one-click mp3 + ppt links instead of local file:// URIs.

upload:
  provider: rclone
  rclone:
    remote: "onedrive:yumnb"   # or gdrive:yumnb, s3:bucket/yumnb, etc.
    share: true

Set it up once with rclone config (see https://rclone.org). yumnb delegates everything to rclone so the same skill works with every backend rclone supports.
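
For readers who want to see what delegating to rclone might look like, here is a sketch that builds `rclone copy` and `rclone link` commands for a list of output files. Both are real rclone subcommands, but whether yumnb invokes exactly these, and how it lays out files under the remote, is an assumption. `rclone_commands` is a hypothetical helper.

```python
import os


def rclone_commands(files, remote, share=True):
    """Build argv lists that upload each file and optionally request a link.

    Assumption: files land directly under remote, with no per-note subfolder.
    """
    cmds = []
    for path in files:
        # "rclone copy <file> <remote:dir>" places the file inside that dir.
        cmds.append(["rclone", "copy", path, remote])
        if share:
            # "rclone link" asks the backend for a public link to the file.
            cmds.append(
                ["rclone", "link", f"{remote}/{os.path.basename(path)}"]
            )
    return cmds
```

Not every rclone backend supports `rclone link`, so a wrapper along these lines would need to tolerate that command failing while still treating the upload as successful.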

Usage Guidance

Before installing, review config.yaml and leave ai.provider, upload.provider, notify.webhook_url, and deliver.provider disabled unless you trust the destination. Do not use provider: cli, --fetcher, custom OpenClaw binaries, rclone extras, or a OneDrive uploader path unless you trust the exact command or file. Disable TTS for confidential notes, since edge-tts sends narration text to Microsoft. Prefer pinning dependencies with a lockfile for repeatable installs.
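
One way to act on this guidance is to list which outbound destinations a loaded config actually enables. The sketch below works on a config already parsed into a dict (for example with a YAML loader). It assumes the dotted keys above map to nested dictionaries and that a missing, empty, or `none` value means disabled. `enabled_destinations` is a hypothetical helper.

```python
def enabled_destinations(config):
    """Return the dotted keys of every destination that looks enabled."""
    checks = [
        ("ai.provider", config.get("ai", {}).get("provider")),
        ("upload.provider", config.get("upload", {}).get("provider")),
        ("notify.webhook_url", config.get("notify", {}).get("webhook_url")),
        ("deliver.provider", config.get("deliver", {}).get("provider")),
    ]
    # Assumption: missing, empty, or "none" all mean "disabled".
    return [
        key for key, value in checks
        if value and str(value).lower() != "none"
    ]
```

An empty result suggests the config keeps everything local, apart from TTS, which is a separate setting and still sends narration text to Microsoft when enabled.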

Capability Tags

crypto
requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The artifacts are coherent: the skill ingests user-supplied URLs, YouTube links, screenshots, files, or text, writes a local notebook folder, and can generate summaries, audio, slides, links, uploads, notifications, and chat delivery. These capabilities match the stated NotebookLM-style purpose.

ℹ Instruction Scope

The README, SKILL.md, and config example disclose most sensitive flows, including external AI providers, Microsoft edge-tts, rclone/cloud upload, webhooks, OpenClaw/Hermes delivery, custom fetchers, and a CLI AI provider. The broad CLI/fetcher hooks are powerful but user-configured rather than hidden or automatic.

ℹ Install Mechanism

Installation is a normal user-invoked Python virtualenv/bootstrap flow, but requirements use lower-bound dependency constraints rather than pinned versions or a lockfile.

ℹ Credentials

Network, local file, subprocess, and credential use are proportionate to the advertised source-capture and publishing workflow, with local-first defaults for AI, upload, notify, and delivery; users should still treat enabled providers as data recipients.

✓ Persistence & Privilege

The skill persists generated notebooks under the configured output directory and does not install background services or privilege escalation mechanisms. Optional cloud/chat delivery and configured local commands run only when selected by config or command-line options.

Version History

v0.1.1

Patch release: local-first NotebookLM alternative tagline, clearer user flow, stronger source extraction messaging, README polish.

v0.1.0

First public release: packaged modules, smoke tests, OpenClaw/Hermes delivery, local-first NotebookLM alternative.

Metadata

Slug yumnb

Version 0.1.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Yum NoteBook?

Yum NoteBook — a local-first NotebookLM alternative for AI agents, built for real source capture. Ingest a web URL, YouTube video, or screenshot, then genera... It is an AI Agent Skill for Claude Code / OpenClaw, with 45 downloads so far.

How do I install Yum NoteBook?

Run "/install yumnb" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Yum NoteBook free?

Yes, Yum NoteBook is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Yum NoteBook support?

Yum NoteBook is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Yum NoteBook?

It is built and maintained by TommyYanPS (@yumyumtum); the current version is v0.1.1.
