MAI Transcribe

Name: MAI Transcribe
Author: robotsbuildrobots

By robotsbuildrobots · GitHub · v0.1.1 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install mai-transcribe

Description

Transcribe audio with Microsoft's MAI-Transcribe-1 model via Azure AI Speech.

Usage (SKILL.md)

MAI-Transcribe-1

Transcribe an audio file via Azure AI Speech using Microsoft's MAI-Transcribe-1 model.

Quick start

node {baseDir}/scripts/transcribe.js /path/to/audio.m4a

Defaults:

Model: mai-transcribe-1
Output: <input>.txt
API version: 2025-10-15

Useful flags

node {baseDir}/scripts/transcribe.js /path/to/audio.ogg --out /tmp/transcript.txt
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a --language en-GB
node {baseDir}/scripts/transcribe.js /path/to/audio.m4a --json --out /tmp/transcript.json
node {baseDir}/scripts/transcribe.js /path/to/audio.wav --model mai-transcribe-1
node {baseDir}/scripts/transcribe.js --help
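
To make the flag surface concrete, here is a minimal sketch of how flags like these could be parsed with Node's built-in `util.parseArgs`. It is not the skill's actual scripts/transcribe.js: the function name `parseCli` and the shape of the returned object are assumptions made for illustration, and only the flag names and the default model come from the examples above.

```javascript
// Hypothetical sketch, NOT the skill's real parser.
const { parseArgs } = require("node:util");

// Parse the documented flags; the first positional is the audio file path.
function parseCli(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    allowPositionals: true,
    options: {
      out: { type: "string" },
      language: { type: "string" },
      model: { type: "string" },
      json: { type: "boolean" },
      help: { type: "boolean" },
    },
  });
  return {
    input: positionals[0], // undefined when only --help is passed
    out: values.out,
    language: values.language,
    model: values.model ?? "mai-transcribe-1", // documented default model
    json: values.json ?? false,
    help: values.help ?? false,
  };
}

console.log(parseCli(["/path/to/audio.m4a", "--json", "--out", "/tmp/transcript.json"]));
```

With `parseArgs` in its default strict mode, an unknown flag throws instead of being silently ignored, which is usually the safer behavior for a small CLI.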

Required env vars

export AZURE_SPEECH_ENDPOINT="https://YOUR-RESOURCE.cognitiveservices.azure.com"
export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"

How to get the API key

Go to the Azure portal and open your Speech or Foundry Speech resource.
Open Keys and Endpoint.
Copy:
- the resource endpoint, for example https://your-resource.cognitiveservices.azure.com
- one of the resource keys
Export them:

export AZURE_SPEECH_ENDPOINT="https://YOUR-RESOURCE.cognitiveservices.azure.com"
export AZURE_SPEECH_KEY="YOUR_SPEECH_RESOURCE_KEY"

If copy-pasted values get mixed up, the key point is that this skill expects the Speech resource endpoint, not a generic Foundry project URL.

Optional:

export AZURE_SPEECH_API_VERSION="2025-10-15"
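
As an illustration of how these variables fit together, the sketch below reads them into a config object and fails early when a required one is missing. The function name `loadConfig`, the error message, and the trailing-slash trimming are assumptions for this example, not a description of what scripts/common.js does.

```javascript
// Hypothetical sketch: collect the documented env vars into one config object.
function loadConfig(env = process.env) {
  const missing = ["AZURE_SPEECH_ENDPOINT", "AZURE_SPEECH_KEY"].filter(
    (name) => !env[name]
  );
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    // Trim trailing slashes so the endpoint can be joined with a path (assumed nicety).
    endpoint: env.AZURE_SPEECH_ENDPOINT.replace(/\/+$/, ""),
    key: env.AZURE_SPEECH_KEY,
    // Optional override; falls back to the documented default API version.
    apiVersion: env.AZURE_SPEECH_API_VERSION || "2025-10-15",
  };
}
```

Passing `env` as a parameter keeps the function easy to test without touching the real process environment.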

API shape

The script calls:

POST {AZURE_SPEECH_ENDPOINT}/speechtotext/transcriptions:transcribe?api-version=2025-10-15

Headers:

Ocp-Apim-Subscription-Key: {AZURE_SPEECH_KEY}

Multipart form fields:

audio
definition

Example definition payload:

{
  "enhancedMode": {
    "enabled": true,
    "model": "mai-transcribe-1"
  }
}
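
The request described above can be sketched with the `fetch`, `FormData`, and `Blob` globals that ship with modern Node. This is an illustrative reconstruction from the API shape listed here, not the skill's actual code: the function name `buildTranscribeRequest` and its parameter object are assumptions, and the request is only built, not sent.

```javascript
// Hypothetical sketch: assemble the multipart POST described in "API shape".
function buildTranscribeRequest({
  endpoint,
  key,
  audio, // Buffer or Uint8Array holding the audio file contents
  filename,
  apiVersion = "2025-10-15",
  model = "mai-transcribe-1",
}) {
  const url =
    `${endpoint}/speechtotext/transcriptions:transcribe` +
    `?api-version=${encodeURIComponent(apiVersion)}`;

  // The "definition" field carries the JSON payload shown in the example.
  const definition = { enhancedMode: { enabled: true, model } };

  const form = new FormData();
  form.append("audio", new Blob([audio]), filename);
  form.append("definition", JSON.stringify(definition));

  return {
    url,
    init: {
      method: "POST",
      headers: { "Ocp-Apim-Subscription-Key": key },
      body: form, // fetch sets the multipart Content-Type and boundary itself
    },
  };
}
```

Sending it would then be `const res = await fetch(url, init)`, followed by a check of `res.ok` before the body is read. Separating construction from sending makes the request easy to inspect without uploading any audio.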

Notes

This is the same style of skill as the Whisper one: a small documented script wrapper, not a built-in OpenClaw media pipeline.
Tested successfully against a live Azure Speech resource.
--json writes the raw Azure response for debugging or downstream processing.
Audio is uploaded to Microsoft for processing.
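
For downstream processing of the --json output, a small helper along these lines could pull plain text out of the saved response. The response shape is an assumption here: it supposes a top-level `combinedPhrases` array whose entries carry a `text` field, so check it against a real --json file before relying on it. The function name `extractTranscript` is hypothetical.

```javascript
// Hypothetical sketch: extract plain text from a saved --json response.
// ASSUMPTION: the response has combinedPhrases: [{ text: "..." }, ...].
function extractTranscript(response) {
  const phrases = Array.isArray(response?.combinedPhrases)
    ? response.combinedPhrases
    : [];
  return phrases
    .map((phrase) => phrase.text ?? "")
    .join("\n")
    .trim();
}
```

A caller could load the file with `JSON.parse(fs.readFileSync("/tmp/transcript.json", "utf8"))` and pass the result in. The helper returns an empty string rather than throwing when the expected field is absent.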

Security recommendations

This skill is coherent and implements a straightforward transcription CLI. Before installing, confirm you are comfortable with audio being uploaded to Microsoft (the script posts audio to the Azure Speech endpoint). Provide a Speech resource key with the least privilege possible, and rotate or revoke the key if needed. Ensure your runtime has a compatible Node version (FormData/Blob/fetch usage may require modern Node). Avoid uploading highly sensitive recordings unless your Azure policy allows it.
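
One quick way to check the Node compatibility point is to test for the three globals directly. The sketch below is an assumption-laden illustration and not part of the skill: the function name `missingWebApis` is hypothetical. Node 18 and later provide `fetch`, `FormData`, and `Blob` as globals, so an empty result is expected there.

```javascript
// Hypothetical sketch: report which required web APIs the runtime lacks.
function missingWebApis(scope = globalThis) {
  return ["fetch", "FormData", "Blob"].filter(
    (name) => typeof scope[name] !== "function"
  );
}

const missing = missingWebApis();
if (missing.length > 0) {
  console.error(`This Node runtime lacks: ${missing.join(", ")}. Try a newer Node version.`);
}
```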

Functional analysis

Type: OpenClaw Skill
Name: mai-transcribe
Version: 0.1.1

The mai-transcribe skill is a legitimate tool for transcribing audio using Microsoft's Azure AI Speech service. The implementation in scripts/transcribe.js and scripts/common.js follows standard practices, using user-provided environment variables (AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY) to interact with the official Azure API. No evidence of data exfiltration, malicious execution, or prompt injection was found.

Capability assessment

✓ Purpose & Capability

Name/description (MAI Transcribe) match the requested resources and code. The skill only asks for AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY, requires node, and contains a small CLI that posts audio to the documented Speech API. Nothing requested appears unrelated to transcription.

✓ Instruction Scope

SKILL.md and scripts instruct the agent to run a local Node script that reads a single audio file, uploads it to the configured AZURE_SPEECH_ENDPOINT, and writes a transcript file. The instructions do not request unrelated files, other environment variables, or unexpected external endpoints. The README and SKILL.md explicitly note that audio is uploaded to Microsoft.

✓ Install Mechanism

This is an instruction-only skill with no install spec (lowest risk). The included code files are small, documented, and use standard Node runtime behavior; there are no downloads from arbitrary URLs or extraction steps.

✓ Credentials

Required env vars are AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY (primaryEnv). Those are appropriate and sufficient for calling Azure Speech. No unrelated secrets or config paths are requested. An optional AZURE_SPEECH_API_VERSION is allowed for compatibility.

✓ Persistence & Privilege

always is false and the skill does not request persistent/global agent privileges or modify other skill configs. Autonomous invocation is allowed by default but is not combined with broad or unrelated credential access.

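How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install mai-transcribe
Once installed, invoke the skill by name or trigger it with /mai-transcribe.
Provide the inputs described in the skill's parameter documentation to get structured output.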

Version history

v0.1.1

Add Azure Speech key and endpoint setup instructions

v0.1.0

Initial release: skill to use MAI-Transcribe as an alternative to Whisper

Metadata

Slug mai-transcribe

Version 0.1.1

License MIT-0

Total installs 0

Current installs 0

Versions 2

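FAQ

What is MAI Transcribe?

Transcribe audio with Microsoft's MAI-Transcribe-1 model via Azure AI Speech. It is an AI agent skill plugin for Claude Code / OpenClaw, with 98 total downloads to date.

How do I install MAI Transcribe?

Run "/install mai-transcribe" in the OpenClaw or Claude Code chat box to install it in one step. To run it, you still need to set AZURE_SPEECH_ENDPOINT and AZURE_SPEECH_KEY.

Is MAI Transcribe free?

Yes. The skill itself is free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does MAI Transcribe support?

MAI Transcribe is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed and Node is available.

Who developed MAI Transcribe?

It is developed and maintained by robotsbuildrobots (@robotsbuildrobots); the current version is v0.1.1.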