Pronunciation Coach
by Crazybuffon · v1.0.4
Downloads: 622
Stars: 0
Active Installs: 1
Versions: 5
Install in OpenClaw
/install pronunciation-coach
Description
Pronunciation coaching with real voice analysis using Azure Speech Services. Analyzes audio files for phoneme-level accuracy, fluency, prosody, and intonatio...
Usage Guidance
This skill appears to do what it claims: it will read audio files from ~/.openclaw/media/inbound/, convert them (ffmpeg), and upload them to Microsoft Azure Speech for pronunciation assessment. Before installing:
1) Confirm you are comfortable sending users' audio to Microsoft (privacy and billing/usage matters).
2) Provide an Azure Speech key and region via AZURE_SPEECH_KEY and AZURE_SPEECH_REGION.
3) Ensure ffmpeg and Node.js are available in the agent environment.
4) Note that SKILL.md suggests sending results back to users (text and TTS), but the skill does not implement Telegram messaging or TTS; the agent or other skills must provide those capabilities and permissions.
5) Fix the registry metadata mismatch (it should declare the required env vars) and verify the skill's source/homepage if provenance matters.
If you need stronger assurance, review the scripts locally or run them in a sandboxed environment before granting access to real user audio or credentials.
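The prerequisites in steps 2 and 3 can be checked before the skill ever runs. The following is a minimal sketch of such a preflight check, not part of the skill's own scripts; the function names `missing_env`, `missing_tools`, and `preflight` are hypothetical and only the variable and tool names come from the guidance above.

```shell
# Hypothetical preflight helper; NOT one of the skill's shipped scripts.

# Print the name of each required environment variable that is unset or empty.
missing_env() {
  [ -n "${AZURE_SPEECH_KEY:-}" ] || echo "AZURE_SPEECH_KEY"
  [ -n "${AZURE_SPEECH_REGION:-}" ] || echo "AZURE_SPEECH_REGION"
}

# Print the name of each tool (given as arguments) that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report all missing prerequisites at once; return non-zero if any are missing.
preflight() {
  problems="$(missing_env; missing_tools ffmpeg node)"
  if [ -n "$problems" ]; then
    printf 'Missing prerequisites:\n%s\n' "$problems" >&2
    return 1
  fi
  echo "Preflight OK"
}
```

Calling `preflight` in the agent environment before the first assessment reports every missing item in one pass, rather than failing on the first one partway through a conversion or upload.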
Capability Analysis
Type: OpenClaw Skill
Name: pronunciation-coach
Version: 1.0.4
The skill is benign, transparently declaring its purpose to analyze audio using Azure Speech Services. It explicitly requests read access to `~/.openclaw/media/inbound/` and network access to `*.stt.speech.microsoft.com` in `skill.json`. The `pronunciation-assess.sh` script demonstrates good security practices by implementing checks against filename option injection (`case "$AUDIO_FILE" in -*)`, `ffmpeg -i -- "$AUDIO_FILE"`) and sanitizing the `REFERENCE_TEXT` to prevent JSON injection before sending it to Azure. The `SKILL.md` instructions guide the agent to perform its stated function without any evidence of prompt injection attempts to subvert its behavior or access unauthorized data.
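The two defensive checks described above can be illustrated as follows. This is a sketch of the general techniques only, under the assumption that sanitizing means escaping JSON-significant characters; it is not copied from `pronunciation-assess.sh`, and the function names `safe_audio_path` and `json_escape` are hypothetical.

```shell
# Illustrative only; the skill's actual script may implement these checks differently.

# Reject a path that begins with "-" so a downstream command cannot parse it as an option.
safe_audio_path() {
  case "$1" in
    -*)
      printf 'refusing path that starts with "-": %s\n' "$1" >&2
      return 1
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}

# Make text safe to embed inside a JSON string literal:
# drop ASCII control characters, then escape backslashes and double quotes.
json_escape() {
  printf '%s' "$1" | tr -d '\000-\037' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}
```

Backslashes are escaped before double quotes so that the backslash added in front of each quote is not itself escaped a second time.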
Capability Assessment
Purpose & Capability
The name, description, SKILL.md, scripts, and skill.json consistently describe using Azure Speech for pronunciation assessment and reading voice messages from ~/.openclaw/media/inbound/. This matches the capability. However, the top-level registry summary included with the evaluation stated 'Required env vars: none' while SKILL.md and skill.json clearly declare AZURE_SPEECH_KEY and AZURE_SPEECH_REGION as required — a metadata inconsistency that should be corrected.
Instruction Scope
The runtime instructions are narrowly scoped: locate latest .ogg files in ~/.openclaw/media/inbound/, convert to WAV via ffmpeg, call Azure Speech, and produce a human-readable report. These actions are consistent with the stated purpose. Notes: the SKILL.md instructs the agent to 'send a voice message (via TTS) demonstrating the correct pronunciation' and to 'send the text report to the user' but provides no code to perform Telegram messaging or TTS; those actions require the agent to have separate messaging/TTS capabilities or permissions not included in the skill files.
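The first two runtime steps (locate the latest .ogg file, convert it to WAV) might look like the sketch below. It is not the skill's own code: `latest_ogg` and `to_wav` are hypothetical names, and the 16 kHz mono 16-bit PCM output format is an assumption about what the speech service expects rather than something stated in the skill files.

```shell
# Hypothetical helpers illustrating the workflow; NOT the skill's shipped scripts.

# Print the most recently modified .ogg file in the given directory.
# Returns non-zero if the directory contains no .ogg files.
latest_ogg() {
  dir="$1"
  newest=""
  for f in "$dir"/*.ogg; do
    [ -e "$f" ] || continue            # unmatched glob stays literal; skip it
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}

# Convert an audio file to WAV with ffmpeg.
# Assumed target format: mono, 16 kHz, 16-bit signed PCM.
to_wav() {
  ffmpeg -nostdin -loglevel error -y -i "$1" -ac 1 -ar 16000 -c:a pcm_s16le "$2"
}

# Example usage (not executed here):
#   src="$(latest_ogg "$HOME/.openclaw/media/inbound")" || exit 1
#   to_wav "$src" "/tmp/assessment-input.wav"
```

Comparing modification times with `-nt` avoids parsing `ls` output, which breaks on file names containing spaces or newlines.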
Install Mechanism
No install spec is provided (instruction- and script-only). This is low-risk from an installation perspective, but scripts will be executed directly by the agent environment and depend on ffmpeg and Node.js being present on PATH.
Credentials
Only Azure Speech credentials (AZURE_SPEECH_KEY, AZURE_SPEECH_REGION) are required by the scripts and skill.json; this is proportionate to the declared function. The earlier registry metadata that listed no required env vars is inconsistent with the skill's own manifest and SKILL.md. No other unrelated secrets are requested.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system settings. skill.json declares read permission for ~/.openclaw/media/inbound/ and outbound network access to *.stt.speech.microsoft.com, which are consistent with its behavior. Autonomous invocation is permitted (platform default) but not combined with other high-risk factors here.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install pronunciation-coach
- After installation, invoke the skill by name or use /pronunciation-coach
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.4
- Removed support for passing Azure Speech API key and region directly via command line; now requires environment variables.
- Updated SKILL.md to reflect simplified usage and prerequisite instructions.
- scripts/pronunciation-assess.sh no longer accepts key/region as optional arguments.
v1.0.3
- Minor internal changes for maintainability; no user-facing feature changes.
- Documentation content remains unchanged.
v1.0.2
- Updated skill.json with no changes to functionality or documentation.
- No user-facing or workflow changes in this release.
v1.0.1
- Added privacy note specifying that voice messages are transmitted to Azure for analysis.
- Declared environment variables in the skill metadata for easier configuration.
- Streamlined skill description to focus on core functionality.
- No changes to workflows or usage instructions.
v1.0.0
Initial release of Pronunciation Coach – provides actionable English pronunciation feedback using Azure Speech Services.
- Analyzes user voice messages for pronunciation, fluency, prosody, and intonation.
- Offers detailed reports with overall and word-level scores, highlighting problem sounds.
- Supplies specific coaching tips and improvement feedback based on assessment results.
- Generates practice exercises and demonstrations for targeted accent and pronunciation improvement.
Frequently Asked Questions
What is Pronunciation Coach?
Pronunciation coaching with real voice analysis using Azure Speech Services. Analyzes audio files for phoneme-level accuracy, fluency, prosody, and intonatio... It is an AI Agent Skill for Claude Code / OpenClaw, with 622 downloads so far.
How do I install Pronunciation Coach?
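Run "/install pronunciation-coach" in the OpenClaw or Claude Code chat to install it in one step. To run, the skill also needs AZURE_SPEECH_KEY and AZURE_SPEECH_REGION set and ffmpeg and Node.js available in the agent environment.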
Is Pronunciation Coach free?
Yes, the skill itself is free and open-source: you can download, install, and use it at no cost. It sends audio to Azure Speech Services using your own key, so Azure usage may be billed separately.
Which platforms does Pronunciation Coach support?
Pronunciation Coach is cross-platform and runs anywhere OpenClaw / Claude Code is available, provided ffmpeg and Node.js are installed.
Who created Pronunciation Coach?
It is built and maintained by Crazybuffon (@crazybuffon); the current version is v1.0.4.