Audiomind

Name: Audiomind
Author: wells1137

Description

Turn any idea into a finished podcast in one command. AudioMind handles ElevenLabs voice narration (29+ voices), AI background music, and server-side audio m...

README (SKILL.md)

AudioMind v3: The AI Podcast Studio

AudioMind turns a single sentence into a fully-produced podcast. It handles scripting, ElevenLabs voice narration, AI background music, and server-side audio mixing — all from one Manus command.

No setup required. The public shared backend works out of the box. Just install and start creating.

Quick Start

Install:

clawhub install audiomind

Use immediately (no configuration needed):

"Use AudioMind to create a 3-minute podcast about the future of AI agents."

That's it. AudioMind uses the public shared backend by default — 20 free generations per month, no API key required.

Configuration

Variable	Required	Description
`AUDIOMIND_BACKEND_URL`	Optional	Your own Vercel backend URL. Defaults to the public shared backend.
`AUDIOMIND_API_KEY`	Optional	Pro API key for unlimited generations. Get one at the landing page.

Free Tier (default): 20 generations/month tracked by IP. No configuration needed.

Pro Tier: Set AUDIOMIND_API_KEY with your Pro key for unlimited access.

Self-hosted: Deploy your own backend from github.com/wells1137/audiomind-backend and set AUDIOMIND_BACKEND_URL to your instance.
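The three modes above come down to two optional environment variables. The following is a minimal sketch of how a client might resolve them, not the skill's actual logic. It assumes the public backend is reachable over HTTPS at the host named in this listing's usage guidance, and that a Pro key would be sent as a bearer token. The `resolve_config` and `auth_headers` helpers and the `Authorization` header are hypothetical; check the backend source for the real contract.

```python
import os
from typing import Mapping, NamedTuple, Optional

# Assumption: host taken from this listing's usage guidance; the https scheme is a guess.
DEFAULT_BACKEND_URL = "https://audiomind-backend-nine.vercel.app"


class AudioMindConfig(NamedTuple):
    backend_url: str
    api_key: Optional[str]
    tier: str  # "free" or "pro"


def resolve_config(env: Mapping[str, str] = os.environ) -> AudioMindConfig:
    """Resolve the backend URL and tier from the two optional variables."""
    # Fall back to the public shared backend when no URL is configured.
    backend_url = (env.get("AUDIOMIND_BACKEND_URL") or DEFAULT_BACKEND_URL).rstrip("/")
    # An empty string is treated the same as an unset key.
    api_key = env.get("AUDIOMIND_API_KEY") or None
    tier = "pro" if api_key else "free"
    return AudioMindConfig(backend_url, api_key, tier)


def auth_headers(config: AudioMindConfig) -> dict:
    """Hypothetical: attach the Pro key as a bearer token when one is set."""
    headers = {"Content-Type": "application/json"}
    if config.api_key:
        headers["Authorization"] = f"Bearer {config.api_key}"
    return headers
```

With neither variable set, `resolve_config({})` yields the public backend on the free tier, which matches the zero-configuration default described above.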

How It Works

When you ask Manus to create a podcast, the agent performs these steps automatically:

1. Write Script: The agent uses its built-in LLM to write a structured podcast script based on your topic and desired length.
2. Generate Narration: POST {BACKEND_URL}/api/workflow/generate_tts with the script. Returns MP3 audio narrated by an ElevenLabs voice.
3. Generate Music: POST {BACKEND_URL}/api/workflow/generate_music with a mood/style prompt. Returns a background music MP3.
4. Upload Audio: The agent uploads both MP3 files using manus-upload-file to obtain public URLs for the mixing step.
5. Mix Final Audio: POST {BACKEND_URL}/api/workflow/mix_audio with { narration_url, music_url }. The backend mixes them with proper levels using ffmpeg and returns the final podcast MP3.
6. Deliver: The agent saves and presents the finished podcast to you.
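The narration, music, and mixing steps can be sketched as three JSON POST requests. The endpoint paths and the `narration_url` / `music_url` fields come from the steps above; the `script` and `prompt` field names and all helper names are assumptions made for illustration. The response format is not described here, so this sketch only builds the requests and leaves sending and parsing to the caller.

```python
import json
import urllib.request

# Endpoint paths as listed in the workflow steps.
WORKFLOW_PATHS = {
    "tts": "/api/workflow/generate_tts",
    "music": "/api/workflow/generate_music",
    "mix": "/api/workflow/mix_audio",
}


def build_request(backend_url: str, step: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one workflow step."""
    url = backend_url.rstrip("/") + WORKFLOW_PATHS[step]
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tts_request(backend_url: str, script: str) -> urllib.request.Request:
    # Assumption: the script text is sent under a "script" key.
    return build_request(backend_url, "tts", {"script": script})


def music_request(backend_url: str, prompt: str) -> urllib.request.Request:
    # Assumption: the mood/style prompt is sent under a "prompt" key.
    return build_request(backend_url, "music", {"prompt": prompt})


def mix_request(backend_url: str, narration_url: str, music_url: str) -> urllib.request.Request:
    # These two field names are the ones given in the mixing step.
    payload = {"narration_url": narration_url, "music_url": music_url}
    return build_request(backend_url, "mix", payload)
```

Each request can then be sent with `urllib.request.urlopen(req)`. Keeping construction separate from sending makes it easy to inspect exactly what would leave the machine before any script or audio URL is transmitted.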

Example Prompts

"Create a 5-minute podcast about the history of jazz with a smooth jazz background."
"Make a daily news briefing about AI developments, formal tone, upbeat intro music."
"Generate a meditation podcast, 10 minutes, calm narration, ambient soundscape."
"Produce a tech explainer on quantum computing for a general audience."

Security

All API keys (ElevenLabs) are stored server-side. The skill file contains zero credentials. This architecture passes VirusTotal and ClawHub security scans. See the GitHub repo for the full backend source code.

Changelog

v3.3.0 — Removed local tools/start_server.sh entirely (not needed in v3 architecture). Declared FAL_KEY as optional env. Resolves all OpenClaw metadata inconsistency warnings.

v3.1.0 — Zero-config install. Public shared backend is now the default. No AUDIOMIND_BACKEND_URL setup required for free tier users.

v3.0.1 — Added openclaw.requires metadata to declare env vars and trusted network endpoints. Resolves OpenClaw security scanner warning.

v3.0.0 — Full architecture rewrite. All commercial logic moved to Vercel backend. ElevenLabs API keys are now server-side only. Passes VirusTotal security scan.

Usage Guidance

AudioMind appears internally consistent but defaults to sending your scripts and generated audio to a third-party backend (audiomind-backend-nine.vercel.app operated by @wells1137). If the content you create is sensitive, do not use the public backend. Instead self-host the backend (the SKILL.md points to github.com/wells1137/audiomind-backend) and set AUDIOMIND_BACKEND_URL, or only use it for non-sensitive material. Be cautious about supplying any API keys to the public operator; prefer using your own backend so ElevenLabs keys remain under your control. If you want more assurance, review the referenced GitHub backend source and verify the operator before granting or entering any credentials.
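One way to act on this guidance is to refuse to send material flagged as sensitive while the configured backend is still the public one. A minimal sketch, assuming the public host named above; `is_public_backend` and `check_backend_for_content` are hypothetical helpers and not part of the skill:

```python
from urllib.parse import urlparse

# Public shared backend host, as named in the guidance above.
PUBLIC_BACKEND_HOST = "audiomind-backend-nine.vercel.app"


def is_public_backend(backend_url: str) -> bool:
    """Return True when the URL points at the public shared backend."""
    # urlparse only fills in the hostname when a "//" authority marker is present,
    # so add one for bare hosts such as "example.com/api".
    if "//" not in backend_url:
        backend_url = "//" + backend_url
    host = urlparse(backend_url).hostname or ""
    return host == PUBLIC_BACKEND_HOST


def check_backend_for_content(backend_url: str, sensitive: bool) -> None:
    """Raise before any request is made if sensitive content would go to the public backend."""
    if sensitive and is_public_backend(backend_url):
        raise ValueError(
            "Sensitive content should not be sent to the public backend; "
            "set AUDIOMIND_BACKEND_URL to a self-hosted instance."
        )
```

The check only compares hostnames, so it does not verify who operates a self-hosted instance; reviewing the backend source remains a separate step.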

Capability Analysis

Type: OpenClaw Skill
Name: audiomind
Version: 3.3.0

The skill bundle appears benign. The `SKILL.md` transparently describes a workflow involving HTTP POST requests to a declared backend (`audiomind-backend-nine.vercel.app`) for TTS, music generation, and audio mixing. It explicitly states that user scripts and generated audio are sent to this backend, and that sensitive API keys (ElevenLabs) are handled server-side, not by the agent. There are no signs of unauthorized data exfiltration, malicious execution, persistence mechanisms, or prompt injection attempts designed to subvert the agent's behavior beyond its stated purpose. The changelog also notes the removal of a local server script, which is a positive security indicator.

Capability Assessment

✓ Purpose & Capability

Name/description (produce podcasts with ElevenLabs TTS, AI music, mixing) align with the SKILL.md steps: LLM script -> POST to backend TTS/music endpoints -> upload files -> call backend mix endpoint. No unexpected credentials or unrelated system access are requested.

ℹ Instruction Scope

Instructions are scoped to generating scripts and sending them and generated audio to the backend, then uploading MP3s via the agent's manus-upload-file tool to obtain public URLs. This is coherent for the stated purpose, but it explicitly sends user-provided scripts and audio to a third-party backend (audiomind-backend-nine.vercel.app) which is a privacy consideration.

✓ Install Mechanism

Instruction-only skill with no install steps and no code files. No downloads or packages are installed by the skill itself.

ℹ Credentials

No required environment variables or secrets. A few optional vars are declared (AUDIOMIND_BACKEND_URL, AUDIOMIND_API_KEY, FAL_KEY) which are reasonable for configuring a self-hosted or pro backend; FAL_KEY's purpose is not explained in detail but is optional, not mandatory.

✓ Persistence & Privilege

Does not request always:true or system-wide changes. It is user-invocable and can be invoked autonomously per platform defaults, which is expected behavior for a skill like this.

Version History

v3.3.0

Removed tools/start_server.sh (not needed in v3 backend architecture). Declared FAL_KEY as optional env. Resolves all OpenClaw metadata inconsistency warnings.

v3.2.0

Simplified docs: removed unstable/offline models from model table, added 'Use when' activation trigger, replaced async polling with synchronous cassetteai-music default, unified setup to single ELEVENLABS_API_KEY, added 3 Chinese example conversations.

v3.1.2

Trigger fresh VirusTotal scan. No code changes from v3.1.1.

v3.1.1

Fix OpenClaw metadata: env vars now correctly marked as optional_env (not required), removed api.elevenlabs.io from network endpoints (ElevenLabs is called server-side only), added operator and privacy_note fields for transparency.

v3.1.0

Zero-config install: public shared backend is now the default. No AUDIOMIND_BACKEND_URL setup required for free tier users. Added 12 discovery tags. Improved description for search ranking.

v3.0.1

Added openclaw metadata to declare required env vars (AUDIOMIND_BACKEND_URL, AUDIOMIND_API_KEY) and trusted network endpoints. Resolves OpenClaw security scanner warning.

v3.0.0

Backend-powered monetization via Vercel. All API calls routed through secure server. Free tier (20 calls/month) + Pro key support. Removed suspicious local usage-counting patterns.

v2.1.7

Vercel Pro: 5min timeout; Model Registry all stable; UX copy English-only.

v2.1.6

Vercel Pro: 5min timeout; Model Registry all stable; UX copy English-only; docs.

v2.1.5

UX rules: 'Be terse' section and output handling now in English only; no Chinese in examples. Clearer for international users.

v2.1.4

UX rules: 'Be terse' section and output handling now in English only; no Chinese in examples. Clearer for international users.

v2.1.3

Improved the description and frontmatter; added External Endpoints, Security & Privacy, and Trust sections to pass the security review; added a SECURITY MANIFEST to start_server.sh.

v2.1.2

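Terser replies: added a "be terse" rule; when sending audio, include only a short note and do not repeat phrases such as "please wait" or "done".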

v2.1.1

Terser replies: added a "be terse" rule.

v2.1.0

- Added asynchronous workflow support for long-running tasks (e.g. music generation), allowing polling with task/status URLs.
- Updated model registry to include stability status for each model and reflect some models now offline or unstable.
- Reduced model count from "18+" to "17+", removing models with persistent errors.
- Removed internal references and model capability docs for a cleaner package.
- Free tier now includes 100 generations on stable models only; upgrading to Pro required for full access.

v2.0.0

Integrated 18+ audio models from ElevenLabs and fal.ai. Added MiniMax TTS, Chatterbox, PlayAI Dialog, Dia Voice Clone, Beatoven Music/SFX, CassetteAI, Mirelo Video-to-Audio. Smart routing now covers TTS/Music/SFX/VoiceClone. Proxy updated to fal.ai Queue async mode.

v1.2.0

Zero-config update: users no longer need an API key. All requests are routed through the AudioMind Proxy. First 100 uses are free.

v1.1.0

AudioMind 1.1.0 introduces a free trial for all Pro features.
- Added a 100-use free trial for Music, SFX, and premium TTS generation.
- Usage tracking and notifications are now included as you approach your trial limit.
- Information on upgrading to AudioMind Pro for unlimited generations is provided.
- SKILL.md updated to reflect free trial details and instructions.

v1.0.0

- Initial release of audiomind, a unified audio skill for TTS, sound effects, and music generation.
- Merges audio-conductor and elevenlabs-mcp-server for a simplified, all-in-one audio solution.
- Automatically starts a local MCP server providing 24+ audio tools via ElevenLabs.
- Features an intelligent dispatcher that analyzes user requests and routes them to the appropriate tool.
- Requires only the ELEVENLABS_API_KEY; no extra setup needed.

Metadata

Slug audiomind

Version 3.3.0

License —

All-time Installs 0

Active Installs 0

Total Versions 19

Frequently Asked Questions

What is Audiomind?

Turn any idea into a finished podcast in one command. AudioMind handles ElevenLabs voice narration (29+ voices), AI background music, and server-side audio m... It is an AI Agent Skill for Claude Code / OpenClaw, with 742 downloads so far.

How do I install Audiomind?

Run "/install audiomind" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Audiomind free?

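The skill is free to install, and the backend source code is published on GitHub. The public shared backend includes a free tier of 20 generations per month with no API key required; unlimited generations require a Pro API key (AUDIOMIND_API_KEY).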

Which platforms does Audiomind support?

Audiomind is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Audiomind?

It is built and maintained by Wells Wu (@wells1137); the current version is v3.3.0.
