Downloads: 5929
Stars: 4
Active Installs: 38
Versions: 13
Install in OpenClaw
/install audio-cog
Description
AI audio generation and text-to-speech powered by CellCog. Voiceover, narration, voice cloning, avatar voices, sound effects, music, podcasts, dialogue. Thre...
Usage Guidance
Install this skill only if you trust CellCog and are comfortable sending prompts and audio-generation requests to its service. Use a controlled API key, monitor usage, avoid submitting secrets, and use cloned voices only with consent and clear disclosure.
Capability Analysis
Type: OpenClaw Skill
Name: audio-cog
Version: 1.0.12
The audio-cog skill provides documentation and usage instructions for an AI audio generation service via the CellCog SDK. The SKILL.md file outlines legitimate features such as text-to-speech, voice cloning, and music generation using providers like OpenAI and ElevenLabs. There are no indicators of malicious intent, data exfiltration, or harmful prompt injection; the instructions are strictly aligned with the stated purpose of professional audio production.
Capability Assessment
Purpose & Capability
The stated purpose and documented capabilities align: AI narration, sound effects, music, and cloned/avatar voices. Voice cloning is disclosed, but users should treat it as a sensitive capability.
Instruction Scope
The skill provides SDK usage snippets and delegates full operational details to the separate CellCog skill/service. This is purpose-aligned, but users should understand what data is sent and how tasks are managed.
Install Mechanism
There is no install script or code file, but the SKILL.md declares a CellCog dependency. Users should verify the CellCog package/skill source if their environment installs or invokes it.
Credentials
The CELLCOG_API_KEY requirement is expected for a CellCog API integration, and the artifacts do not show hardcoded keys or credential leakage.
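As a minimal sketch of keeping the key out of source files, the snippet below reads CELLCOG_API_KEY from the environment and masks it before logging. The helper names `load_cellcog_api_key` and `mask_secret` are hypothetical and are not part of the CellCog SDK; only the variable name comes from the skill's metadata.

```python
import os


def load_cellcog_api_key(env=None):
    """Return the CELLCOG_API_KEY value, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of the CellCog SDK.
    """
    env = os.environ if env is None else env
    key = env.get("CELLCOG_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CELLCOG_API_KEY is not set")
    return key


def mask_secret(value, visible=4):
    """Hide all but the last `visible` characters so the value is safer to log."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.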
Persistence & Privilege
The OpenClaw example uses a fire-and-forget remote chat task, so jobs may continue asynchronously after invocation; this is disclosed and consistent with media-generation workflows.
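To illustrate the difference between a blocking call and a fire-and-forget task in general terms, here is a small sketch using only the Python standard library. It does not call CellCog: `generate_audio_stub` is a hypothetical stand-in for a slow remote job, and the callback stands in for whatever notification mechanism the real service uses.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def generate_audio_stub(prompt):
    """Hypothetical stand-in for a slow remote job; NOT a CellCog call."""
    time.sleep(0.05)
    return f"audio for: {prompt}"


def run_blocking(prompt):
    """Blocking style: the caller waits until the result is ready."""
    return generate_audio_stub(prompt)


def run_fire_and_forget(executor, prompt, on_done):
    """Fire-and-forget style: submit the job and return immediately.

    `on_done` is called with the result once the job finishes.
    """
    future = executor.submit(generate_audio_stub, prompt)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future
```

In the fire-and-forget form the job keeps running after the submitting call returns, which is the behavior the assessment above asks users to be aware of.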
How to Use
- Make sure OpenClaw is installed (local or Docker).
- Run the install command in chat: /install audio-cog
- After installation, invoke the skill by name or use /audio-cog
- Provide the required inputs per the skill's parameter spec and get structured output.
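The v1.0.12 notes list a python3 binary and a CELLCOG_API_KEY environment variable as requirements. A preflight check along the following lines could confirm both before invoking the skill. This is a sketch, not something the skill ships: the function name `missing_prerequisites` is hypothetical, and on some systems (Windows in particular) the interpreter may be installed under a name other than python3.

```python
import os
import shutil

# Requirement names taken from the skill's v1.0.12 metadata notes.
REQUIRED_BINARIES = ("python3",)
REQUIRED_ENV_VARS = ("CELLCOG_API_KEY",)


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all clear.

    Hypothetical helper for illustration. `env` and `which` are injectable
    so the check can be tested without changing the real environment.
    """
    env = os.environ if env is None else env
    problems = []
    for binary in REQUIRED_BINARIES:
        if which(binary) is None:
            problems.append(f"binary not found on PATH: {binary}")
    for name in REQUIRED_ENV_VARS:
        if not env.get(name):
            problems.append(f"environment variable not set: {name}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```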
Version History
v1.0.12
- Added environment requirements for the skill: now specifies needed binaries (python3) and environment variables (CELLCOG_API_KEY) in metadata.
- No user-facing feature or functionality changes; documentation in SKILL.md updated to reflect setup prerequisites.
v1.0.11
- Updated documentation for clarity and accuracy in SKILL.md
- Improved and expanded description, highlighting support for podcasts and dialogue
- Clarified agent usage: now specifies “all agents except OpenClaw” for blocking chat integration example
- Refined instructions and language throughout for easier onboarding and provider selection
- No code or functional changes; documentation update only
v1.0.10
- Improved documentation in SKILL.md with more concise descriptions and clearer usage instructions.
- Expanded SDK usage code sample to explicitly show client initialization.
- Updated skill description to highlight avatar voices and music generation up to 10 minutes.
- Enhanced formatting and clarified steps for using voice providers, avatars, sound effects, and music features.
v1.0.9
- Simplified and clarified the description to emphasize major features and use cases.
- Reorganized usage instructions for easier onboarding, highlighting SDK references up front.
- Added new "If CellCog is not installed" section with agent-specific installation guidance.
- Streamlined sections for voice providers and capabilities; removed duplicative wording.
- Made instructions for different agent types (OpenClaw, Cursor, etc.) more concise and highlighted code output.
- Shortened and focused provider feature explanations for improved readability.
v1.0.8
- Expanded SKILL.md with detailed guidance on provider selection (OpenAI, ElevenLabs, MiniMax) and their strengths.
- Added new tables outlining voice provider scenarios, voice options, customization tips, and emotion tag usage.
- Provided explicit examples for avatar/cloned voices and usage of custom avatars for personalized narration.
- Enhanced sections on sound effects and music generation, including sample prompts and best practice tips.
- Clarified multi-language support and practical agent usage for audio generation.
- Included a new "Tips for Better Audio" section to help users optimize results with the skill.
v1.0.7
- Expanded description and usage details for TTS, music, SFX, and podcast production features
- Clarified role of supported providers: OpenAI, ElevenLabs, and MiniMax, with highlights for multi-voice and avatar voices
- Documented new features: multi-voice dialogue, podcast pipeline, 160+ voices, and output in MP3/WAV
- Updated related skills section for audio, music, podcast, and video generation
- Refined and reorganized documentation for easier discovery of key capabilities
v1.0.6
- Added explicit Python code examples for agent-based and blocking audio generation using the SDK.
- Clarified that OpenClaw agent mode is recommended for long tasks and provided notify_session_key details.
- Referenced the main cellcog skill for advanced SDK usage, delivery modes, and file handling.
- Documentation updates only; no functional or interface changes.
v1.0.5
- Added OS compatibility metadata for Darwin, Linux, and Windows.
- Updated skill description for improved clarity and SEO.
- Added homepage link to CellCog website.
- Improved formatting and metadata structure in SKILL.md.
- No changes to core functionality; documentation and metadata only.
v1.0.4
- Adds support for three voice providers: OpenAI, ElevenLabs, and MiniMax, each with unique capabilities.
- Introduces avatar/cloned voice generation via MiniMax, allowing users to create audio in their own voice.
- Expands features to include standalone sound effects (up to 30 seconds) and longer music generation (up to 10 minutes).
- Clarifies provider recommendations by scenario, with detailed guidance on emotional tags (ElevenLabs) and fine-grained controls (MiniMax).
- Updates documentation with new usage examples, usage tips, and multi-language support across all providers.
v1.0.3
- Added author and dependencies fields to SKILL.md for clearer metadata.
- Updated prerequisite instructions to refer to the cellcog skill by name.
- Minor edits for clarity and consistency in setup and usage guidance.
v1.0.2
- Added detailed documentation of all 8 available CellCog voices, including usage recommendations and voice characteristics.
- Included guidance on choosing voices by content type and how to customize styles (accent, emotion, pacing, etc.).
- Added music licensing statement: all generated music is royalty-free and usable for any commercial purpose.
- Expanded multi-language support list and updated multi-language example prompts.
- Updated example prompts and tips to reflect voice selection and new usage patterns.
v1.0.1
- Adds metadata (including an emoji) for enhanced identification.
- Updates quick-start usage pattern to use `create_chat` with simplified, fire-and-forget execution and notification model (v1.0+).
- Clearly standardizes on `chat_mode="agent"` as optimal for all audio tasks, deprecating previous agent team recommendations.
- Updates guidance to reflect new workflow and best practices.
- No change in feature set or audio capabilities.
v1.0.0
- Initial release of audio-cog: Professional AI audio generation powered by CellCog.
- Supports text-to-speech, voice synthesis, narration, voiceovers, podcast production, music creation, and sound design.
- Offers voice customization (gender, age, emotion, accent, pacing, tone).
- Enables music and background audio generation with detailed control (genre, tempo, mood, instruments, duration).
- Multi-language speech generation and various audio output formats.
- Includes prompt examples, guidance for agent team mode, and detailed usage tips.
Frequently Asked Questions
What is Audio Cog?
Audio Cog is an AI Agent Skill for Claude Code / OpenClaw that provides AI audio generation and text-to-speech powered by CellCog, covering voiceover, narration, voice cloning, avatar voices, sound effects, music, podcasts, and dialogue. It has 5929 downloads so far.
How do I install Audio Cog?
Run "/install audio-cog" in the OpenClaw or Claude Code chat to install it in one step. The skill's metadata also lists python3 and a CELLCOG_API_KEY environment variable as requirements.
Is Audio Cog free?
Yes, the Audio Cog skill itself is free and licensed under MIT-0; you can download, install, and use it at no cost. It does require a CELLCOG_API_KEY for the CellCog service.
Which platforms does Audio Cog support?
Audio Cog is cross-platform and runs anywhere OpenClaw / Claude Code is available (darwin, linux, windows).
Who created Audio Cog?
It is built and maintained by CellCog (@nitishgargiitd); the current version is v1.0.12.