Podcast Generation with Microsoft Foundry

Name: Podcast Generation with Microsoft Foundry
Author: thegovind

Description

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation from content, or integrating with Azure OpenAI Realtime API for real audio output. Covers full-stack implementation from React frontend to Python FastAPI backend with WebSocket streaming.
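As a rough illustration of the WebSocket connection mentioned above, the sketch below derives a `wss://` URL from an Azure OpenAI resource endpoint and a deployment name. The helper name `build_realtime_url`, the `/openai/realtime` path layout, and the `api-version` value in the usage line are assumptions for illustration, not taken from the skill's own code; check them against the current Azure OpenAI documentation.

```python
from urllib.parse import urlencode, urlsplit


def build_realtime_url(endpoint: str, deployment: str, api_version: str) -> str:
    """Turn an https:// resource endpoint into a wss:// Realtime URL (assumed layout)."""
    parts = urlsplit(endpoint)
    if not parts.netloc:
        # A bare host without a scheme parses with an empty netloc.
        raise ValueError(f"endpoint must include a scheme and host: {endpoint!r}")
    # The query parameter names are assumptions based on the Azure Realtime URL pattern.
    query = urlencode({"api-version": api_version, "deployment": deployment})
    return f"wss://{parts.netloc}/openai/realtime?{query}"


# Hypothetical resource name, deployment name, and API version.
url = build_realtime_url(
    "https://example.openai.azure.com/", "gpt-realtime-mini", "2024-10-01-preview"
)
```

The API key is deliberately kept out of the URL here; it would normally travel in a request header so that it does not end up in logs.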

Security Recommendations

This skill's code and documentation require an Azure OpenAI Realtime API key, endpoint, and deployment name, and also include example code that reads from and writes to a database, but the registry entry does not declare those requirements. Before installing:

(1) Ask the publisher to update the metadata to list the required environment variables (AZURE_OPENAI_AUDIO_API_KEY, AZURE_OPENAI_AUDIO_ENDPOINT, AZURE_OPENAI_AUDIO_DEPLOYMENT) and any database or configuration needs.
(2) Verify where secrets will be stored and supplied; do not paste production Azure keys into unknown skills.
(3) Audit any database integration code and ensure least-privilege credentials (read-only where possible) and proper access controls.
(4) Run the skill in an isolated test environment first, and rotate keys after verification.
(5) If you cannot obtain clear metadata or a trusted publisher, treat the skill as untrusted and do not install it with production credentials.

Functional Analysis

Type: OpenClaw Skill
Name: podcast-generation
Version: 0.1.0

The skill bundle is benign. It provides a clear, well-documented implementation for generating AI-powered podcast-style audio using Azure OpenAI's Realtime API. The `SKILL.md` and other documentation files are descriptive and do not contain any prompt injection attempts against the OpenClaw agent. The Python script `scripts/pcm_to_wav.py` performs a standard audio format conversion. All code examples and instructions are consistent with the stated purpose, involving legitimate API interactions and local data processing without any evidence of data exfiltration to unauthorized endpoints, malicious execution, or persistence mechanisms.

Capability Assessment

⚠ Purpose & Capability

The skill's name/description (podcast audio via Azure OpenAI Realtime) matches the implementation. However, the SKILL.md and referenced code expect AZURE_OPENAI_AUDIO_API_KEY, AZURE_OPENAI_AUDIO_ENDPOINT, and AZURE_OPENAI_AUDIO_DEPLOYMENT (and a settings object) while the registry metadata lists no required environment variables or config paths. That discrepancy is significant: consumers cannot see from metadata that sensitive credentials are needed.
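Because the metadata does not list these variables, a consumer may want a startup check that fails fast when any of them is absent. The following is a minimal sketch of such a check; the function name `load_audio_settings` and the returned dictionary shape are hypothetical and are not the skill's actual settings object.

```python
import os
from typing import Mapping, Optional

# Variable names as given in the skill's SKILL.md.
REQUIRED_VARS = (
    "AZURE_OPENAI_AUDIO_API_KEY",
    "AZURE_OPENAI_AUDIO_ENDPOINT",
    "AZURE_OPENAI_AUDIO_DEPLOYMENT",
)


def load_audio_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the three required values, or raise naming every missing one."""
    source = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: source[name] for name in REQUIRED_VARS}
```

Passing an explicit mapping instead of relying on `os.environ` keeps the check easy to exercise in an isolated test environment without real credentials.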

⚠ Instruction Scope

The runtime instructions focus on WebSocket to Azure, streaming audio chunks, and PCM→WAV conversion — all within the declared purpose. But the references and example service code also describe fetching content from a database (tags/bookmarks), building prompts from user data, saving audio to a DB, and exposing streaming endpoints. Those broader actions (DB reads/writes and content aggregation) are not surfaced in the skill metadata and expand the scope beyond mere TTS conversion.
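The PCM→WAV step can be sketched with the Python standard library alone. This is not the content of `scripts/pcm_to_wav.py`, which is not reproduced on this page; the function name and the defaults of 24 kHz, mono, 16-bit samples are assumptions that should be matched to whatever format the Realtime session is actually configured to emit.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes, sample_rate: int = 24000, channels: int = 1, sample_width: int = 2
) -> bytes:
    """Wrap raw little-endian PCM samples in a WAV container (assumed defaults)."""
    frame_size = channels * sample_width
    if len(pcm) % frame_size:
        raise ValueError("PCM length is not a whole number of frames")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    # The wave module finalizes the header on close but leaves the buffer open.
    return buffer.getvalue()
```

Rejecting a partial trailing frame surfaces truncated streams early instead of producing a file with a silently shortened final sample.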

✓ Install Mechanism

No install specification is provided (instruction-only + a small utility script). Nothing is downloaded or written to disk by an install step, which lowers supply-chain risk.

⚠ Credentials

The SKILL.md explicitly requires AZURE_OPENAI_AUDIO_API_KEY, AZURE_OPENAI_AUDIO_ENDPOINT, and AZURE_OPENAI_AUDIO_DEPLOYMENT, but the package metadata declares no required environment variables or config paths. Additionally, the code examples assume access to application settings and a database (db operations and settings.*), which would require additional credentials (DB connection string, possibly cloud storage). Leaving these requirements undeclared makes the skill's credential needs opaque to consumers.

✓ Persistence & Privilege

The skill does not request always:true and contains no install hooks or instructions to modify other skills or system-wide config. It does describe saving audio artifacts to a database in examples, which is normal for this application but should be explicitly declared to users.

Version History

v0.1.0

Initial release of podcast-generation skill.
- Generate podcast-style audio narratives from text using Azure OpenAI's GPT Realtime Mini via WebSocket.
- Full-stack example: React frontend for audio playback, Python FastAPI backend for real-time audio synthesis.
- Streams audio and transcript events; outputs base64-encoded WAV audio for immediate playback.
- Includes environment, API usage, and PCM-to-WAV conversion instructions.
- Supports multiple narrator voice options and provides production workflow references.
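To illustrate how streamed audio and transcript events might be gathered before conversion to WAV, here is a small sketch that folds a sequence of already-parsed JSON events into raw PCM bytes and a transcript string. The event type strings and the `delta` field are assumptions modeled on the Realtime API's event style; they vary between API versions and are not confirmed by this listing, and `collect_stream` is a hypothetical helper name.

```python
import base64
from typing import Iterable, Tuple

# Assumed event names; verify against the API version actually in use.
AUDIO_DELTA = "response.audio.delta"
TRANSCRIPT_DELTA = "response.audio_transcript.delta"


def collect_stream(events: Iterable[dict]) -> Tuple[bytes, str]:
    """Accumulate base64 audio chunks and transcript fragments from events."""
    pcm_chunks = []
    transcript_parts = []
    for event in events:
        kind = event.get("type")
        if kind == AUDIO_DELTA:
            # Audio arrives as base64 text; decode each chunk to raw bytes.
            pcm_chunks.append(base64.b64decode(event["delta"]))
        elif kind == TRANSCRIPT_DELTA:
            transcript_parts.append(event["delta"])
        # Any other event type is ignored in this sketch.
    return b"".join(pcm_chunks), "".join(transcript_parts)
```

Collecting chunks in lists and joining once at the end avoids repeatedly copying a growing byte string while the stream is in progress.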

Metadata

Slug podcast-generation

Version 0.1.0

License not specified

Total installs 8

Current installs 8

Number of versions 1

FAQ

What is Podcast Generation with Microsoft Foundry?

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation from content, or integrating with Azure OpenAI Realtime API for real audio output. Covers full-stack implementation from React frontend to Python FastAPI backend with WebSocket streaming. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 2533 cumulative downloads to date.

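How do I install Podcast Generation with Microsoft Foundry?

Run the command "/install podcast-generation" in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no additional configuration, although the skill expects the Azure OpenAI environment variables noted in the security recommendations at runtime.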

Is Podcast Generation with Microsoft Foundry free?

Yes. Podcast Generation with Microsoft Foundry is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Podcast Generation with Microsoft Foundry support?

Podcast Generation with Microsoft Foundry is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Podcast Generation with Microsoft Foundry?

It is developed and maintained by thegovind (@thegovind); the current version is v0.1.0.
