moss-transcribe-diarize
by helloeveryworlds · v1.0.5 · MIT-0
364 downloads · 0 stars · 1 active install · 7 versions
Install in OpenClaw
/install moss-transcribe-diarize
Description
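MOSS multi-speaker transcription skill. Accepts audio as a URL, a local file, or Base64, and outputs structured transcripts with timestamps and speaker labels (JSON, per-segment text, and a per-speaker summary). Intended for meeting minutes, interview recordings, and multi-party conversation cleanup. Requires API credentials (environment variable: MOSS_API_KEY; also accepts MOSI_TTS_API_KEY / MOS...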
Usage Guidance
This skill appears to do exactly what it says: it uploads audio to https://studio.mosi.cn and saves structured, speaker-labelled transcripts. Before installing, verify you trust the remote service (studio.mosi.cn) because your audio and the API key are sent to that endpoint. Ensure the runtime has python3 and the 'requests' package installed (the skill does not declare a package install step). Avoid running it on highly sensitive audio unless you are comfortable with the service's privacy practices, and consider using a disposable or least-privilege API key (rotate the key later) for testing. If you need a guarantee that no external network calls occur, do not install this skill.
Capability Analysis
Type: OpenClaw Skill
Name: moss-transcribe-diarize
Version: 1.0.5
The skill is a legitimate tool for audio transcription and speaker diarization using the MOSS API. The script `scripts/transcribe.py` correctly handles local file reading, base64 encoding, and communication with the designated endpoint (studio.mosi.cn) using provided environment variables for authentication. No evidence of data exfiltration, malicious execution, or prompt injection was found.
Capability Assessment
Purpose & Capability
Name/description match the code and runtime instructions: the script uploads audio (URL, local file, or data URL) to a fixed transcription endpoint and returns structured, speaker-labelled output. Small implementation note: the script imports the Python 'requests' library but the skill only declares 'python3' as a required binary and provides no install spec for Python packages; this is an operational mismatch (not a security misalignment).
Instruction Scope
SKILL.md instructs the agent to run scripts/transcribe.py and the script only performs tasks required for transcription: read a local file (if provided), base64-encode it, POST JSON with the audio to the hard-coded endpoint, and write three output files. It does not read unrelated system files or other environment variables. Important privacy note: the script transmits the audio bytes and sends the API key in an Authorization header to https://studio.mosi.cn; that is expected behavior but users should be aware audio plus the key are sent off-host.
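The flow described above (read a local file, base64-encode it, build a JSON body, attach the key in an Authorization header) can be sketched roughly as follows. This is an illustration, not the contents of `scripts/transcribe.py`: the JSON field name `audio` and the `Bearer` scheme are assumptions, and the request path is deliberately omitted because the listing names only the host. Nothing is sent over the network.

```python
import base64
import json
from pathlib import Path

# Host named in the listing. The request path is not documented there,
# so this sketch stops before the actual POST.
BASE_URL = "https://studio.mosi.cn"


def to_audio_field(source):
    """Pass a URL or data URL through unchanged; base64-encode a local file."""
    if source.startswith(("https://", "http://", "data:")):
        return source
    raw = Path(source).read_bytes()
    return base64.b64encode(raw).decode("ascii")


def build_request(source, api_key):
    """Return (headers, json_body) for a transcription request."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed scheme
        "Content-Type": "application/json",
    }
    body = json.dumps({"audio": to_audio_field(source)})  # assumed field name
    return headers, body
```

Building the request separately from sending it makes it easy to inspect exactly what would leave the host, which is the privacy concern raised above.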
Install Mechanism
There is no install spec (instruction-only + included script). Nothing is downloaded or written by an installer. The only runtime requirement is python3 and the presence of the 'requests' Python package (not declared). No remote install URLs or archive extraction are used.
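Because the 'requests' dependency is undeclared, a preflight check before the first run may save a confusing ImportError. The helper below is a hypothetical sketch using only the standard library; `missing_modules` is not part of the skill.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules(["requests"])` returns an empty list when the package is importable; otherwise it returns `["requests"]`, in which case the package would need to be installed manually (for example with pip).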
Credentials
The skill requires one of three API keys (MOSS_API_KEY as primary, or MOSI_TTS_API_KEY / MOSI_API_KEY as fallbacks). These map directly to the Authorization header used by the script and are proportional to the stated purpose. No unrelated credentials or secrets are requested.
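The fallback behaviour can be pictured as a first-match lookup over the three variables. The sketch below assumes the order MOSS_API_KEY, then MOSI_TTS_API_KEY, then MOSI_API_KEY; the listing confirms only that MOSS_API_KEY is primary, so the relative order of the two fallbacks is a guess, and `resolve_api_key` is a hypothetical name.

```python
import os

# Primary key first, then the two fallbacks (fallback order is assumed).
KEY_ENV_VARS = ("MOSS_API_KEY", "MOSI_TTS_API_KEY", "MOSI_API_KEY")


def resolve_api_key(env=None):
    """Return (variable_name, value) for the first non-empty key, or None."""
    env = os.environ if env is None else env
    for name in KEY_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None
```

Returning the variable name alongside the value lets a caller report which credential was picked up without printing the secret itself.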
Persistence & Privilege
The skill does not request always: true, and it does not modify other skills or system configuration. It creates output files in the working directory as expected for a transcription tool. Autonomous invocation remains enabled by default on the platform, but that is not a property of this skill specifically.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install moss-transcribe-diarize
- After installation, invoke the skill by name or use /moss-transcribe-diarize
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.5
Add user-selectable segment output formats (json/compact/text) with speaker fields; fix segment parsing for asr_transcription_result.segments.
v1.0.4
Declare credential metadata explicitly (primaryEnv + requires.env) to match runtime API-key usage.
v1.0.3
Simplify API surface to match moss-tts style: fixed endpoint, removed extra API params, keep minimal required arguments.
v1.0.2
Harden endpoint policy (HTTPS + studio.mosi.cn allowlist), clarify credential requirement, and reduce scanner false positives.
v1.0.1
Clarify required API credentials in metadata/description; align key env fallback; cleanup packaging.
v0.1.1
Improve reliability: add source validation, request timeout handling, HTTP/JSON checks, output path safety, and clearer error exits.
v0.1.0
Initial release: high-confidence diarized ASR workflow from docs, URL/file/base64 input support, structured outputs (JSON/segments/by-speaker).
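The v1.0.2 endpoint policy (HTTPS plus a studio.mosi.cn allowlist) could be enforced with a check along these lines. This is a minimal sketch of the idea, not the skill's actual code; `is_allowed_endpoint` and `ALLOWED_HOSTS` are hypothetical names.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"studio.mosi.cn"}  # host named in the listing


def is_allowed_endpoint(url):
    """Accept only HTTPS URLs whose host exactly matches the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Matching the parsed hostname exactly, rather than searching the URL string for the domain, rejects lookalikes such as `https://studio.mosi.cn.evil.example/`.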
Frequently Asked Questions
What is moss-transcribe-diarize?
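A MOSS multi-speaker transcription skill. It accepts audio as a URL, a local file, or Base64, and outputs structured transcripts with timestamps and speaker labels (JSON, per-segment text, and a per-speaker summary). It is intended for meeting minutes, interview recordings, and multi-party conversation cleanup, and requires API credentials (environment variable: MOSS_API_KEY; also accepts MOSI_TTS_API_KEY / MOS... It is an AI Agent Skill for Claude Code / OpenClaw, with 364 downloads so far.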
How do I install moss-transcribe-diarize?
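Run "/install moss-transcribe-diarize" in the OpenClaw or Claude Code chat. The skill declares no package install step, so make sure python3 and the 'requests' package are available and that an API key (MOSS_API_KEY) is set before first use.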
Is moss-transcribe-diarize free?
Yes. The skill itself is licensed under MIT-0 and can be downloaded, installed, and used at no cost. It does require an API key for the MOSS service at studio.mosi.cn.
Which platforms does moss-transcribe-diarize support?
moss-transcribe-diarize runs anywhere OpenClaw / Claude Code is available, provided python3 and the 'requests' package are present.
Who created moss-transcribe-diarize?
It is built and maintained by helloeveryworlds (@helloeveryworlds); the current version is v1.0.5.