Description

Discover, research, script, fact-check, and generate podcast episodes automatically. Multi-source topic discovery, LLM script generation, citation enforcemen...

README (SKILL.md)

Podcast Discovery & Generation

Name: Custom Podcast Discovery & Generation
Author: harshilmathur

Automated end-to-end podcast production pipeline. Discovers trending topics from configurable sources, researches them deeply, generates fact-checked scripts with citations, and produces audio via ElevenLabs TTS.

Triggers

Use this skill when the user asks to:

"Generate a podcast"
"Make a podcast episode"
"Discover podcast topics"
"Create an audio episode about X"
"Find topics for podcast"
"Research and script a podcast"
"Produce a podcast episode"

Quick Start

1. Configure

cd ~/.openclaw/skills/podcast
cp config.example.yaml config.yaml
# Edit config.yaml: add sources, interests, voice, storage

2. Discover Topics

python3 scripts/discover.py --config config.yaml --limit 10

3. Run Pipeline

python3 scripts/pipeline.py --config config.yaml --topic "Your Topic" --mode manual

Configuration

Minimal config.yaml:

sources:
  - type: rss
    url: https://aeon.co/feed.rss
    name: Aeon
  - type: hackernews
    min_points: 200

interests:
  - AI/Tech
  - Science

voice:
  voice_id: "<your-voice-id>"

storage:
  type: local
  path: ./output

Storage options:

type: s3 — Upload to S3 (requires bucket, region)
type: local — Save to local directory
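
A minimal sketch of how a loaded config could be sanity-checked before running the pipeline. It assumes the YAML has already been parsed into a plain dict. The function name `validate_config` and the rule that all four sections of the minimal example are required are illustrative assumptions, not the skill's actual validation logic.

```python
from typing import Any

# Assumption: the four top-level sections of the minimal config are all required.
REQUIRED_SECTIONS = ("sources", "interests", "voice", "storage")


def validate_config(config: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if not config.get(section):
            problems.append(f"missing or empty section: {section}")

    storage = config.get("storage") or {}
    storage_type = storage.get("type")
    if storage_type == "s3":
        # The storage options above say s3 requires bucket and region.
        for key in ("bucket", "region"):
            if not storage.get(key):
                problems.append(f"storage.{key} is required for s3")
    elif storage_type == "local":
        # Assumption: local storage needs a target path, as in the example.
        if not storage.get("path"):
            problems.append("storage.path is required for local")
    elif storage_type is not None:
        problems.append(f"unknown storage.type: {storage_type}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.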

Pipeline Stages

Discovery — Fetch and rank topics from sources
Research — Web search framework (OpenClaw worker populates)
Script — Generate script with LLM, enforce [Source: URL] citations
Verify — Cross-check claims against research sources
Audio — Strip citations, call ElevenLabs TTS
Upload — Save to S3 or local storage

Each stage can run standalone or as part of the full pipeline.
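
The stage ordering and the `--resume-from` behaviour can be pictured with a short sketch. The stage identifiers below follow the order listed above, but only `audio` is confirmed by the usage examples. The other names and the helper `stages_to_run` are assumptions for illustration.

```python
# Assumed stage identifiers, in the order the stages are listed above.
STAGES = ["discover", "research", "script", "verify", "audio", "upload"]


def stages_to_run(resume_from=None):
    """Return the stages to execute, optionally starting at `resume_from`."""
    if resume_from is None:
        return list(STAGES)
    if resume_from not in STAGES:
        raise ValueError(f"unknown stage: {resume_from!r}")
    # Everything before the resume point is assumed to have completed already.
    return STAGES[STAGES.index(resume_from):]
```

Under these assumptions, `stages_to_run("audio")` yields only the audio and upload stages, which matches the intent of resuming after a failed TTS call without repeating research or scripting.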

Usage Examples

Discover only:

python3 scripts/discover.py --config config.yaml --limit 5 --output topics.json

Full pipeline (auto mode):

python3 scripts/pipeline.py --config config.yaml --mode auto

Specific topic:

python3 scripts/pipeline.py --config config.yaml --topic "AI Reasoning" --mode manual

Resume from stage:

python3 scripts/pipeline.py --config config.yaml --resume-from audio

Source Types

Built-in:

rss — Generic RSS/Atom feed (any URL)
hackernews — HN API with point/comment filters
nature — Nature journal (sections: news, research, biotech, medicine)
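
As a rough illustration of the Hacker News point/comment filters, the sketch below filters item dicts that have already been fetched. The `score`, `descendants`, and `type` fields are those returned by the public Hacker News item API. The function name and the `min_comments` parameter are hypothetical, since only `min_points` appears in the config example.

```python
def filter_hn_stories(items, min_points=200, min_comments=0):
    """Keep stories meeting both thresholds, highest score first."""
    kept = [
        item
        for item in items
        if item.get("type") == "story"
        and item.get("score", 0) >= min_points
        and item.get("descendants", 0) >= min_comments  # comment count
    ]
    return sorted(kept, key=lambda item: item.get("score", 0), reverse=True)
```

Lowering `min_points` widens the candidate pool, which is the first thing to try when discovery returns no topics.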

Add custom RSS:

sources:
  - type: rss
    url: https://yourfeed.com/rss
    name: Your Source
    category: Your Category
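
To show what an `rss` source roughly produces, here is a small sketch that turns RSS 2.0 text into topic dicts. It uses the standard library's `xml.etree.ElementTree` rather than whatever parsing the skill itself performs, and it handles only un-namespaced `<item>` elements, so Atom feeds are out of scope. The function name and the output keys are assumptions.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text, source_name, category=None):
    """Extract title/link pairs from an RSS 2.0 document as topic dicts."""
    root = ET.fromstring(xml_text)
    topics = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if title and link:  # skip entries that cannot be cited later
            topics.append(
                {
                    "title": title,
                    "url": link,
                    "source": source_name,
                    "category": category,
                }
            )
    return topics
```

`ElementTree` is not hardened against maliciously constructed XML, so feeds from untrusted origins are better parsed in a restricted environment.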

Output Files

output/
├── discovery-YYYY-MM-DD.json      # Ranked topics
├── research-YYYY-MM-DD-slug.json  # Research data
├── script-YYYY-MM-DD-slug.txt     # Script with citations
├── verification-YYYY-MM-DD.json   # Fact-check report
├── tts-ready-YYYY-MM-DD-slug.txt  # Clean text for TTS
├── episode-YYYY-MM-DD-slug.mp3    # Final audio
└── pipeline-state-YYYY-MM-DD.json # Pipeline state
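
The naming scheme above can be reproduced with a short helper. This is a sketch under assumptions: the real slug rules are not documented here, so `slugify` simply lowercases and hyphenates, and `output_path` is a hypothetical name.

```python
import re
from datetime import date
from pathlib import Path


def slugify(topic: str) -> str:
    """Lowercase the topic and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return slug or "untitled"


def output_path(output_dir, kind, day: date, topic=None, ext="json") -> Path:
    """Build <kind>-YYYY-MM-DD[-slug].<ext> inside output_dir."""
    parts = [kind, day.isoformat()]
    if topic is not None:
        parts.append(slugify(topic))
    return Path(output_dir) / ("-".join(parts) + "." + ext)
```

For example, `output_path("output", "episode", date(2025, 1, 2), "AI Reasoning", "mp3")` gives a file named `episode-2025-01-02-ai-reasoning.mp3`.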

Integration with OpenClaw

For discovery: Run directly (no tools needed)

For full pipeline: Spawn OpenClaw worker with:

web_search() — Research stage
LLM access — Script generation (Claude Sonnet recommended)
elevenlabs_text_to_speech — Audio generation

Worker pattern:

cd ~/.openclaw/skills/podcast
# Source environment if available
[ -f ~/.openclaw/env-init.sh ] && source ~/.openclaw/env-init.sh
python3 scripts/pipeline.py --config config.yaml --mode auto

Citation Enforcement

Every factual claim in a script MUST carry a [Source: URL] citation:

✅ Correct:

The market grew to $10.2 billion in 2025 [Source: https://example.com/report].

❌ Incorrect:

The market grew significantly.

The verify script cross-references citations against research sources and blocks audio generation if unverified claims are found.
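
A minimal sketch of the citation check and the later citation stripping for TTS, assuming citations always take the literal form `[Source: URL]`. The function names are hypothetical. The check only confirms that each cited URL appears among the research sources. It does not detect claims that lack a citation or judge whether a source supports the claim.

```python
import re

# Matches "[Source: <url>]" and captures the URL.
CITATION_RE = re.compile(r"\[Source:\s*(\S+?)\s*\]")


def cited_urls(script):
    """Return every URL cited in the script, in order of appearance."""
    return CITATION_RE.findall(script)


def unverified_citations(script, research_urls):
    """Return cited URLs that are absent from the research sources."""
    known = set(research_urls)
    return [url for url in cited_urls(script) if url not in known]


def strip_citations(script):
    """Remove citation markers so the text reads cleanly when spoken."""
    text = CITATION_RE.sub("", script)
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)  # close the gap left before punctuation
    return re.sub(r"[ \t]{2,}", " ", text).strip()
```

A pipeline following this pattern would refuse to proceed to audio whenever `unverified_citations` returns a non-empty list.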

Cron Integration

Daily discovery (8 AM):

schedule: "0 8 * * *"
payload: |
  cd ~/.openclaw/skills/podcast
  python3 scripts/discover.py --config config.yaml --limit 10 \
    --output data/discovery-$(date +%Y-%m-%d).json

Weekly full pipeline:

schedule: "0 9 * * 1"
payload: |
  cd ~/.openclaw/skills/podcast
  [ -f ~/.openclaw/env-init.sh ] && source ~/.openclaw/env-init.sh
  python3 scripts/pipeline.py --config config.yaml --mode auto

Key Features

✅ Zero vendor lock-in — Use any RSS feed, any storage
✅ No external dependencies — Pure Python stdlib (except ElevenLabs for TTS)
✅ Citation enforcement — Every claim must have source
✅ Fact verification — Cross-check against research
✅ Pluggable sources — Easy to add new topic sources
✅ Resume support — Restart from any stage
✅ Manual or auto — Review each stage or run end-to-end

Troubleshooting

No topics found:

Check RSS URLs are valid
Verify interests match source content
Lower min_points for Hacker News

Verification fails:

Ensure research.json has sources
Check script has [Source: URL] after claims
URLs must match research sources

S3 upload fails:

Verify AWS credentials
Check bucket exists and region matches
Ensure bucket policy allows public read

Files

SKILL.md — This file
README.md — Detailed documentation
config.example.yaml — Configuration template
scripts/ — Pipeline scripts
sources/ — Source implementations
templates/ — Prompt templates

License

MIT — Open source, community-maintained OpenClaw skill

Usage Guidance

This skill appears to implement what it says, but review a few points before installing:

1. Credentials: the pipeline expects ElevenLabs TTS integration (or API key via OpenClaw worker) and optional AWS CLI credentials for S3 uploads — ensure you provide only the credentials you intend and that they are stored/configured safely (aws configure or OpenClaw credentials).
2. Worker privileges: research/script/audio stages are intended to be completed by OpenClaw workers that have web_search(), LLM, and elevenlabs_text_to_speech access — confirm what those workers can access in your environment.
3. Inspect upload.py and any delivery code: ensure no unexpected external endpoints or hardcoded credentials are present (the manifest claims none, but verify the upload implementation and any omitted files).
4. The YAML parsing and RSS parsing use simple regex logic (no pyyaml/feedparser/requests) — this is brittle and may mis-parse malicious or malformed feeds; run in a restricted environment or container and validate feed URLs you add.
5. The pipeline uses subprocess.run with argument lists (no shell=True) which is good, but the pipeline executes many scripts based on user-provided config paths and filenames; avoid running it with configs from untrusted sources.
6. Test in a sandbox (or read-only environment) first, and lock down S3 bucket policies before enabling uploads.

If you want higher assurance, request the missing files (upload.py, verify.py contents were omitted in the provided listing) and I can re-check those specifically.
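
The note above about `subprocess.run` with argument lists can be illustrated with a short sketch. The helper name `run_stage` is hypothetical and this is not the pipeline's actual code. It only shows why passing a list without `shell=True` keeps user-supplied values, such as a topic string, from being interpreted by a shell.

```python
import subprocess
import sys


def run_stage(script_path, config_path, extra_args=()):
    """Run one stage script as a child process without involving a shell."""
    # Each list element reaches the child as exactly one argument, so shell
    # metacharacters inside a topic or path are never interpreted.
    cmd = [sys.executable, script_path, "--config", config_path, *extra_args]
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

A call such as `run_stage("scripts/pipeline.py", "config.yaml", ["--topic", "AI; echo pwned"])` would hand the whole string `AI; echo pwned` to the script as a single `--topic` value.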

Capability Analysis

Type: OpenClaw Skill
Name: custom-podcast-discovery
Version: 1.0.1

The skill provides a legitimate and well-documented automated podcast generation pipeline, including topic discovery, research, and audio production. Security analysis of 'scripts/upload.py' and 'scripts/pipeline.py' shows safe usage of the subprocess module (passing arguments as lists and avoiding shell=True), with explicit validation of S3 configuration parameters to prevent command injection. While the pipeline is theoretically vulnerable to indirect prompt injection if a configured RSS feed contains malicious instructions, there is no evidence of intentional malice, data exfiltration, or unauthorized persistence mechanisms.
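
As an illustration of what validating S3 parameters before calling the AWS CLI might look like, the sketch below builds an `aws s3 cp` argument list. The regular expressions are simplified approximations of the bucket and region naming rules, and the function name is hypothetical. The contents of `scripts/upload.py` are not shown in this listing, so none of this should be read as its implementation.

```python
import re

# Simplified: 3-63 chars, lowercase letters, digits, dots, hyphens.
BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Simplified: shapes such as us-east-1 or ap-southeast-2.
REGION_RE = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def build_s3_upload_command(local_path, bucket, key, region):
    """Return an argument list for `aws s3 cp`, rejecting suspicious input."""
    if not BUCKET_RE.fullmatch(bucket):
        raise ValueError(f"invalid bucket name: {bucket!r}")
    if not REGION_RE.fullmatch(region):
        raise ValueError(f"invalid region: {region!r}")
    if local_path.startswith("-"):
        # A leading dash could be parsed as a CLI option.
        raise ValueError(f"invalid local path: {local_path!r}")
    return ["aws", "s3", "cp", local_path, f"s3://{bucket}/{key}", "--region", region]
```

The returned list is meant to be passed to `subprocess.run` as is, without `shell=True`.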

Capability Assessment

✓ Purpose & Capability

Name/description align with included scripts (discover, research framework, script generation prompt creation, verification scaffolding, TTS prep, upload). The skill legitimately needs sources, research, LLM and a TTS tool (ElevenLabs) and optional S3 for storage — which the docs and scripts reference.

ℹ Instruction Scope

SKILL.md and scripts instruct the agent to run the local Python pipeline and to spawn OpenClaw workers that call web_search(), an LLM, and elevenlabs_text_to_speech. The scripts themselves are mostly frameworks/placeholders that expect worker tools to perform network/LLM/TTS calls; they do not perform direct secret harvesting. However the pipeline and cron examples instruct sourcing ~/.openclaw/env-init.sh and invoking aws CLI (for S3) — which means runtime will rely on environment/config outside the skill. The YAML parsing is custom/regex-based (brittle) and the docs rely on the worker having broad web-search/LLM/TTS access.

✓ Install Mechanism

No install spec / no remote downloads are present in the manifest. The repository includes Python scripts and README/DEPLOYMENT docs. There is no automatic remote code fetch or archive extraction in the install metadata.

ℹ Credentials

skill.json/manifest list no required environment variables, but runtime clearly requires: (a) ElevenLabs API credentials or OpenClaw ElevenLabs tool integration to generate audio, and (b) AWS credentials or aws CLI config for S3 uploads if that storage option is used. Those credentials are referenced in docs but not declared as required in the manifest — this is an operational omission you should be aware of.

✓ Persistence & Privilege

always is false. The skill doesn't request persistent/always-on privileges and does not modify other skills. It runs as user-invoked or via normal autonomous worker invocation; no elevated system privileges requested in code or docs.

Version History

v1.0.1

Initial release: 18 RSS/API sources, Python pipeline (discover→research→script→verify→audio→upload), ElevenLabs TTS, S3/local storage, citation enforcement, zero external dependencies

Metadata

Slug custom-podcast-discovery

Version 1.0.1

License —

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Custom Podcast Discovery & Generation?

Discover, research, script, fact-check, and generate podcast episodes automatically. Multi-source topic discovery, LLM script generation, citation enforcemen... It is an AI Agent Skill for Claude Code / OpenClaw, with 322 downloads so far.

How do I install Custom Podcast Discovery & Generation?

Run "/install custom-podcast-discovery" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Custom Podcast Discovery & Generation free?

Yes, Custom Podcast Discovery & Generation is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Custom Podcast Discovery & Generation support?

Custom Podcast Discovery & Generation is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Custom Podcast Discovery & Generation?

It is built and maintained by Harshil Mathur (@harshilmathur); the current version is v1.0.1.
