MAI Transcribe
Author: robotsbuildrobots · v0.1.1 · MIT-0
Total downloads: 98 · Favorites: 0 · Current installs: 0 · Versions: 2
Install in OpenClaw
/install mai-transcribe
Description
Transcribe audio with Microsoft's MAI-Transcribe-1 model via Azure AI Speech.
Usage (SKILL.md)
MAI-Transcribe-1
Transcribe an audio file via Azure AI Speech using Microsoft's MAI-Transcribe-1 model.
Quick start
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a
Defaults:
- Model: mai-transcribe-1
- Output: <input>.txt
- API version: 2025-10-15
Useful flags
node {baseDir}/scripts/transcribe.js /path/to/audio.ogg --out /tmp/transcript.txt
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a --language en-GB
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a --json --out /tmp/transcript.json
node {baseDir}/scripts/transcribe.js /path/to/audio.wav --model mai-transcribe-1
node {baseDir}/scripts/transcribe.js --help
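To make the flag semantics concrete, here is a minimal sketch of how the options above could be parsed. It is not the skill's actual code: `parseArgs` is a hypothetical helper, and the reading of `<input>.txt` as ".txt appended to the input path" is an assumption (the real script may replace the audio extension instead).

```javascript
// Hypothetical parser for the flags shown above; not taken from transcribe.js.
function parseArgs(argv) {
  const opts = {
    input: null,
    out: null,
    language: null,
    model: "mai-transcribe-1", // documented default
    json: false,
    help: false,
  };
  // Flags that consume the following argument as their value.
  const valueFlags = new Map([
    ["--out", "out"],
    ["--language", "language"],
    ["--model", "model"],
  ]);

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--json") {
      opts.json = true;
    } else if (arg === "--help") {
      opts.help = true;
    } else if (valueFlags.has(arg)) {
      const value = argv[++i];
      if (value === undefined) throw new Error(`Missing value for ${arg}`);
      opts[valueFlags.get(arg)] = value;
    } else if (arg.startsWith("--")) {
      throw new Error(`Unknown flag: ${arg}`);
    } else if (opts.input === null) {
      opts.input = arg; // first positional argument is the audio file
    } else {
      throw new Error(`Unexpected extra argument: ${arg}`);
    }
  }

  // Assumption: the default output is the input path with ".txt" appended.
  if (opts.input !== null && opts.out === null) {
    opts.out = `${opts.input}.txt`;
  }
  return opts;
}
```

In a real script this would be called as parseArgs(process.argv.slice(2)).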
Required env vars
export AZURE_SPEECH_ENDPOINT="https://YOUR-RESOURCE.cognitiveservices.azure.com"
export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"
How to get the API key
- Go to the Azure portal and open your Speech or Foundry Speech resource.
- Open Keys and Endpoint.
- Copy:
  - the resource endpoint, for example https://your-resource.cognitiveservices.azure.com
  - one of the resource keys
- Export them:
export AZURE_SPEECH_ENDPOINT="https://YOUR-RESOURCE.cognitiveservices.azure.com"
export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"
If the copy-pasting gets confusing, the key point is that this skill expects the Speech resource endpoint, not a generic Foundry project URL.
Optional:
export AZURE_SPEECH_API_VERSION="2025-10-15"
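The settings above can be checked up front so that a missing key or a wrong URL fails early. The following is a sketch only: `loadConfig` and the `looksLikeSpeechResource` flag are hypothetical names, and the hostname check is a heuristic based on the example endpoints in this document, not a rule enforced by Azure or by the skill.

```javascript
// Hypothetical helper: read and validate the documented environment variables.
function loadConfig(env) {
  // Trim whitespace and trailing slashes so the path can be appended cleanly.
  const endpoint = (env.AZURE_SPEECH_ENDPOINT || "").trim().replace(/\/+$/, "");
  const key = (env.AZURE_SPEECH_KEY || "").trim();

  if (!endpoint) throw new Error("AZURE_SPEECH_ENDPOINT is not set");
  if (!key) throw new Error("AZURE_SPEECH_KEY is not set");

  const url = new URL(endpoint); // throws on a malformed URL
  if (url.protocol !== "https:") {
    throw new Error("AZURE_SPEECH_ENDPOINT must use https");
  }

  return {
    endpoint,
    key,
    // Optional override, falling back to the documented default.
    apiVersion: env.AZURE_SPEECH_API_VERSION || "2025-10-15",
    // Heuristic (assumption): the example endpoints here all live under
    // cognitiveservices.azure.com; a different host may be a project URL.
    looksLikeSpeechResource: url.hostname.endsWith(".cognitiveservices.azure.com"),
  };
}
```

A caller would pass process.env and could print a warning, rather than fail, when looksLikeSpeechResource is false.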
API shape
The script calls:
POST {AZURE_SPEECH_ENDPOINT}/speechtotext/transcriptions:transcribe?api-version=2025-10-15
Headers:
Ocp-Apim-Subscription-Key: {AZURE_SPEECH_KEY}
Multipart form fields:
- audio
- definition
Example definition payload:
{
  "enhancedMode": {
    "enabled": true,
    "model": "mai-transcribe-1"
  }
}
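Putting the URL, header, and form fields together, the request could be assembled roughly as follows. This is a sketch under stated assumptions, not the contents of scripts/transcribe.js: `buildTranscribeRequest` is a hypothetical name, and the code relies on the global FormData and Blob that ship with modern Node releases.

```javascript
// Hypothetical helper: build the request described above without sending it.
function buildTranscribeRequest({
  endpoint,
  key,
  audioBytes, // Buffer or Uint8Array holding the audio file contents
  fileName,
  apiVersion = "2025-10-15",
  model = "mai-transcribe-1",
}) {
  const base = endpoint.replace(/\/+$/, "");
  const url =
    `${base}/speechtotext/transcriptions:transcribe` +
    `?api-version=${encodeURIComponent(apiVersion)}`;

  // Same shape as the example definition payload.
  const definition = { enhancedMode: { enabled: true, model } };

  const form = new FormData();
  form.append("audio", new Blob([audioBytes]), fileName);
  form.append("definition", JSON.stringify(definition));

  return {
    url,
    init: {
      method: "POST",
      // No Content-Type header here: fetch sets the multipart boundary itself.
      headers: { "Ocp-Apim-Subscription-Key": key },
      body: form,
    },
  };
}
```

Sending the request would then be a single call such as fetch(request.url, request.init), followed by a check of the HTTP status before parsing the JSON body.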
Notes
- This is the same style of skill as the Whisper one: a small documented script wrapper, not a built-in OpenClaw media pipeline.
- Tested successfully against a live Azure Speech resource.
- --json writes the raw Azure response for debugging or downstream processing.
- Audio is uploaded to Microsoft for processing.
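For downstream processing of the file written by --json, a small extraction step is usually enough. The sketch below assumes the response carries a `combinedPhrases` array (and a `phrases` array) whose entries have a `text` field, as in Azure's fast transcription responses. That shape is an assumption here and should be checked against a real response from this model. `extractTranscript` is a hypothetical helper.

```javascript
// Hypothetical helper: pull plain text out of a raw transcription response.
// Assumed shape: { combinedPhrases: [{ text }], phrases: [{ text }] }.
function extractTranscript(response) {
  const texts = (list) =>
    (Array.isArray(list) ? list : [])
      .map((item) => (item ? item.text : undefined))
      .filter((text) => typeof text === "string" && text.length > 0);

  // Prefer the combined text; fall back to joining individual phrases.
  const combined = texts(response ? response.combinedPhrases : undefined);
  if (combined.length > 0) return combined.join("\n");

  return texts(response ? response.phrases : undefined).join(" ");
}
```

The input would typically come from JSON.parse on the contents of the --json output file.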
Safe usage advice
This skill is coherent and implements a straightforward transcription CLI. Before installing, confirm that you are comfortable with audio being uploaded to Microsoft: the script posts audio to the Azure Speech endpoint. Provide a Speech resource key with the least privilege possible, and rotate or revoke the key if needed. Make sure your runtime has a compatible Node version, because the FormData, Blob, and fetch usage may require a modern Node release. Avoid uploading highly sensitive recordings unless your Azure policy allows it.
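The Node compatibility point can be checked before running the skill. The snippet below is an illustrative sketch, not part of the skill: `missingRuntimeFeatures` is a hypothetical helper that reports which of the three globals named above are absent from the current runtime.

```javascript
// Hypothetical helper: list the required globals that the runtime lacks.
function missingRuntimeFeatures(scope = globalThis) {
  return ["fetch", "FormData", "Blob"].filter(
    (name) => typeof scope[name] !== "function"
  );
}

const missing = missingRuntimeFeatures();
if (missing.length > 0) {
  console.error(`This Node runtime lacks: ${missing.join(", ")}`);
}
```

An empty result means the runtime exposes all three globals the script depends on.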
Functional analysis
Type: OpenClaw Skill
Name: mai-transcribe
Version: 0.1.1
The mai-transcribe skill is a legitimate tool for transcribing audio using Microsoft's Azure AI Speech service. The implementation in scripts/transcribe.js and scripts/common.js follows standard practices, using user-provided environment variables (AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY) to interact with the official Azure API. No evidence of data exfiltration, malicious execution, or prompt injection was found.
Capability assessment
Purpose & Capability
Name/description (MAI Transcribe) match the requested resources and code. The skill only asks for AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY, requires node, and contains a small CLI that posts audio to the documented Speech API. Nothing requested appears unrelated to transcription.
Instruction Scope
SKILL.md and scripts instruct the agent to run a local Node script that reads a single audio file, uploads it to the configured AZURE_SPEECH_ENDPOINT, and writes a transcript file. The instructions do not request unrelated files, other environment variables, or unexpected external endpoints. The README and SKILL.md explicitly note that audio is uploaded to Microsoft.
Install Mechanism
This is an instruction-only skill with no install spec (lowest risk). The included code files are small, documented, and use standard Node runtime behavior; there are no downloads from arbitrary URLs or extraction steps.
Credentials
Required env vars are AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY (primaryEnv). Those are appropriate and sufficient for calling Azure Speech. No unrelated secrets or config paths are requested. An optional AZURE_SPEECH_API_VERSION is allowed for compatibility.
Persistence & Privilege
always is false and the skill does not request persistent/global agent privileges or modify other skill configs. Autonomous invocation is allowed by default but is not combined with broad or unrelated credential access.
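How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install mai-transcribe
- Once installation finishes, invoke the Skill by name or trigger it with /mai-transcribe
- Provide the required inputs as described in the Skill's parameter documentation to get structured output.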
Version history
v0.1.1
Add Azure Speech key and endpoint setup instructions
v0.1.0
Initial release: skill to use MAI-Transcribe as an alternative to Whisper
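FAQ
What is MAI Transcribe?
Transcribe audio with Microsoft's MAI-Transcribe-1 model via Azure AI Speech. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 98 downloads to date.
How do I install MAI Transcribe?
Run the command "/install mai-transcribe" in the OpenClaw or Claude Code chat box to install it in one step. Before first use, set AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY as described under Required env vars.
Is MAI Transcribe free?
Yes. MAI Transcribe is free under the MIT-0 license and can be downloaded, installed, and used freely.
Which platforms does MAI Transcribe support?
MAI Transcribe is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed MAI Transcribe?
It is developed and maintained by robotsbuildrobots (@robotsbuildrobots). The current version is v0.1.1.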