MAI Transcribe
/install mai-transcribe
MAI-Transcribe-1
Transcribe an audio file via Azure AI Speech using Microsoft's MAI-Transcribe-1 model.
Quick start
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a
Defaults:
- Model: mai-transcribe-1
- Output: <input>.txt
- API version: 2025-10-15
Useful flags
node {baseDir}/scripts/transcribe.js /path/to/audio.ogg --out /tmp/transcript.txt
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a --language en-GB
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a --json --out /tmp/transcript.json
node {baseDir}/scripts/transcribe.js /path/to/audio.wav --model mai-transcribe-1
node {baseDir}/scripts/transcribe.js --help
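The flags above could be parsed along the following lines. This is only a sketch of one plausible approach using Node's built-in `util.parseArgs`; the function name `parseCli` and the returned object shape are hypothetical and are not taken from the skill's actual `transcribe.js`.

```javascript
// Hypothetical flag parsing for a transcribe.js-style CLI (not the skill's real source).
const { parseArgs } = require("node:util");

function parseCli(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    allowPositionals: true, // the audio path is a positional argument
    options: {
      out: { type: "string" },
      language: { type: "string" },
      model: { type: "string" },
      json: { type: "boolean" },
      help: { type: "boolean" },
    },
  });

  const help = Boolean(values.help);
  if (!help && positionals.length !== 1) {
    throw new Error("Expected exactly one audio file path");
  }

  return {
    input: positionals[0],
    out: values.out,
    language: values.language,
    // Default model as documented under "Defaults".
    model: values.model ?? "mai-transcribe-1",
    json: Boolean(values.json),
    help,
  };
}
```

For example, `parseCli(process.argv.slice(2))` would turn the command-line arguments into a plain options object that the rest of a script can consume.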
Required env vars
export AZURE_SPEECH_ENDPOINT="https://YOUR-RESOURCE.cognitiveservices.azure.com"
export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"
How to get the API key
- Go to the Azure portal and open your Speech or Foundry Speech resource.
- Open Keys and Endpoint.
- Copy:
  - the resource endpoint, for example https://your-resource.cognitiveservices.azure.com
  - one of the resource keys
- Export them:
export AZURE_SPEECH_ENDPOINT="https://YOUR-RESOURCE.cognitiveservices.azure.com"
export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"
If copy-pasted values are getting mixed up, the key point is that this skill expects the Speech resource endpoint, not a generic Foundry project URL.
Optional:
export AZURE_SPEECH_API_VERSION="2025-10-15"
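As a rough illustration of how these variables fit together, the sketch below resolves them from an environment object, applies the documented default API version, and rejects an endpoint that looks like a Foundry project URL. The function name `loadConfig` is hypothetical, and the `/api/projects/` path check is an assumed heuristic for spotting project URLs, not something the skill is known to do.

```javascript
// Sketch only: resolves settings from an env-like object (defaults to process.env).
function loadConfig(env = process.env) {
  // Strip trailing slashes so the endpoint can be joined with a path later.
  const endpoint = (env.AZURE_SPEECH_ENDPOINT || "").replace(/\/+$/, "");
  const key = env.AZURE_SPEECH_KEY || "";
  // Optional override; falls back to the documented default.
  const apiVersion = env.AZURE_SPEECH_API_VERSION || "2025-10-15";

  if (!endpoint) throw new Error("AZURE_SPEECH_ENDPOINT is not set");
  if (!key) throw new Error("AZURE_SPEECH_KEY is not set");

  // Assumption: Foundry project URLs carry an /api/projects/ path segment,
  // whereas the Speech resource endpoint is a bare host.
  const { pathname } = new URL(endpoint);
  if (pathname.includes("/api/projects/")) {
    throw new Error("Endpoint looks like a Foundry project URL; use the Speech resource endpoint");
  }

  return { endpoint, key, apiVersion };
}
```

Passing the environment in as a parameter keeps the check easy to exercise without touching real credentials.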
API shape
The script calls:
POST {AZURE_SPEECH_ENDPOINT}/speechtotext/transcriptions:transcribe?api-version=2025-10-15
Headers:
Ocp-Apim-Subscription-Key: {AZURE_SPEECH_KEY}
Multipart form fields:
- audio
- definition
Example definition payload:
{
  "enhancedMode": {
    "enabled": true,
    "model": "mai-transcribe-1"
  }
}
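The request described above might be assembled roughly as follows. This is a minimal sketch, assuming Node 18 or later (for the global `fetch`, `FormData`, and `Blob`); the helper names `buildUrl`, `buildDefinition`, and `transcribe` are hypothetical, and mapping a language hint to a `locales` array in the definition is an assumption rather than confirmed behavior of the script.

```javascript
// Sketch of the documented POST request; not the skill's actual implementation.
const fs = require("node:fs/promises");
const path = require("node:path");

function buildUrl(endpoint, apiVersion) {
  return `${endpoint}/speechtotext/transcriptions:transcribe?api-version=${encodeURIComponent(apiVersion)}`;
}

function buildDefinition(model, language) {
  const definition = { enhancedMode: { enabled: true, model } };
  // Assumption: a language hint is sent as a "locales" array.
  if (language) definition.locales = [language];
  return definition;
}

async function transcribe({ endpoint, key, apiVersion, audioPath, model, language }) {
  const bytes = await fs.readFile(audioPath);

  // The two multipart fields named in the documentation: "audio" and "definition".
  const form = new FormData();
  form.append("audio", new Blob([bytes]), path.basename(audioPath));
  form.append("definition", JSON.stringify(buildDefinition(model, language)));

  const res = await fetch(buildUrl(endpoint, apiVersion), {
    method: "POST",
    headers: { "Ocp-Apim-Subscription-Key": key },
    body: form, // fetch sets the multipart boundary header itself
  });
  if (!res.ok) {
    throw new Error(`Azure Speech returned ${res.status}: ${await res.text()}`);
  }
  return res.json();
}
```

Keeping the URL and definition builders separate from the network call makes the request shape easy to inspect before any audio is uploaded.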
Notes
- This is the same style of skill as the Whisper one: a small documented script wrapper, not a built-in OpenClaw media pipeline.
- Tested successfully against a live Azure Speech resource.
- --json writes the raw Azure response for debugging or downstream processing.
- Audio is uploaded to Microsoft for processing.
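For downstream processing of the --json output, something like the sketch below could pull plain text out of the saved response. It assumes the response follows the usual Azure fast transcription shape, with a `combinedPhrases` array and a per-segment `phrases` array whose entries each carry a `text` field; that shape is an assumption here and should be checked against a real response file. The function name `extractText` is hypothetical.

```javascript
// Sketch: extract plain text from a saved --json response.
// Assumed shape: { combinedPhrases: [{ text }], phrases: [{ text, ... }] }
function extractText(response) {
  const combined = response.combinedPhrases ?? [];
  if (combined.length > 0) {
    // One entry per channel in the assumed shape; join them line by line.
    return combined.map((p) => p.text).join("\n");
  }
  // Fall back to stitching the individual phrase segments together.
  return (response.phrases ?? []).map((p) => p.text).join(" ");
}
```

A typical use would be `extractText(JSON.parse(fs.readFileSync("/tmp/transcript.json", "utf8")))` after running the script with `--json --out /tmp/transcript.json`.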
- Make sure OpenClaw is installed (local or Docker).
- Run the install command in chat: /install mai-transcribe
- After installation, invoke the skill by name or use /mai-transcribe
- Provide the required inputs per the skill's parameter spec and get structured output.
What is MAI Transcribe?
Transcribe audio with Microsoft's MAI-Transcribe-1 model via Azure AI Speech. It is an AI Agent Skill for Claude Code / OpenClaw, with 98 downloads so far.
How do I install MAI Transcribe?
Run "/install mai-transcribe" in the OpenClaw or Claude Code chat to install it in one step. Transcription itself still requires the AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY environment variables to be set.
Is MAI Transcribe free?
Yes, the skill itself is free and licensed under MIT-0, so you can download, install and use it at no cost. It calls your own Azure Speech resource, so any Azure usage is governed by that resource's pricing.
Which platforms does MAI Transcribe support?
MAI Transcribe is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created MAI Transcribe?
It is built and maintained by robotsbuildrobots (@robotsbuildrobots); the current version is v0.1.1.