/install audio-transcribe-summarize
Audio/Video Transcription & Summarization
Transcribe audio/video files using the SenseASR API (api.senseaudio.cn), then summarize the content into structured notes.
{baseDir} refers to this skill's directory.
Prerequisites
- Environment variable SENSEAUDIO_API_KEY configured (get your key at https://senseaudio.cn/platform/api-key)
- Python 3.8+ with requests installed
- For large files (>10MB): ffmpeg installed for splitting (macOS: brew install ffmpeg; Windows: download from ffmpeg.org and add it to PATH; Linux: apt install ffmpeg)
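The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of the skill itself: the function name check_prerequisites is hypothetical, and it only tests for the presence of each item, not whether the API key is valid.

```python
import importlib.util
import os
import shutil
import sys


def check_prerequisites(env=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    if not env.get("SENSEAUDIO_API_KEY"):
        problems.append("SENSEAUDIO_API_KEY is not set")
    if importlib.util.find_spec("requests") is None:
        problems.append("the requests package is not installed")
    if shutil.which("ffmpeg") is None:
        # ffmpeg is only needed when a file has to be split (>10MB)
        problems.append("ffmpeg is not on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```

A missing ffmpeg is reported as a warning rather than a hard failure, since small files can be transcribed without it.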
Quick Start
- Run the transcription script:
python {baseDir}/scripts/transcribe.py <audio_file> [--model sense-asr-pro] [--language zh] [--speakers] [--sentiment] [--translate en]
- The script outputs a transcript .txt file alongside the source file
- Read the transcript and generate a summary (see Summary Format below)
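To locate the transcript after a run, something like the sketch below may help. It rests on an assumption: the source only says the .txt file is written alongside the source file, so the exact file name used here (same stem, .txt suffix) is a guess, and expected_transcript_path is a hypothetical helper.

```python
from pathlib import Path


def expected_transcript_path(audio_file):
    """Guess the transcript location: same directory and stem, .txt suffix.

    Assumption: the real script may name the file differently.
    """
    return Path(audio_file).with_suffix(".txt")


if __name__ == "__main__":
    print(expected_transcript_path("talks/lecture.mp3"))
```

If the script turns out to use a different naming scheme, listing the .txt files in the source directory is a safer fallback.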
Workflow
Step 1: Assess the Audio File
Check file size and format:
- Supported formats: wav, mp3, ogg, flac, aac, m4a, mp4
- Max file size per request: 10MB
- If file > 10MB, the script auto-splits using ffmpeg
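The checks in this step can be sketched in a few lines of Python. This is an illustration only: assess_audio_file is a hypothetical helper, the format is inferred from the file extension rather than the file contents, and "10MB" is read as 10 × 1024 × 1024 bytes, which may differ from the limit the API actually enforces.

```python
from pathlib import Path

SUPPORTED_FORMATS = {"wav", "mp3", "ogg", "flac", "aac", "m4a", "mp4"}
MAX_REQUEST_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


def assess_audio_file(path, size_bytes=None):
    """Report the format, whether it is supported, and whether splitting is needed."""
    path = Path(path)
    if size_bytes is None:
        size_bytes = path.stat().st_size  # read the size from disk when not supplied
    extension = path.suffix.lower().lstrip(".")
    return {
        "format": extension,
        "supported": extension in SUPPORTED_FORMATS,
        "needs_split": size_bytes > MAX_REQUEST_BYTES,
    }
```

Passing size_bytes explicitly makes the function usable without touching the file system, for example when the size is already known from an upload.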
Step 2: Choose the Right Model
| Model | Use When |
|---|---|
| sense-asr-lite | Quick batch transcription, simple audio, cost-sensitive |
| sense-asr | General transcription, need speaker separation or timestamps |
| sense-asr-pro | High accuracy needed: meetings, interviews, complex audio |
| sense-asr-deepthink | Noisy audio, dialects, heavy jargon, speech-to-clean-text |
Default to sense-asr-pro for best quality.
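The table can be turned into a small selection helper. The sketch below is one possible reading of it: the function choose_model, its keyword arguments, and the priority order among them are assumptions made for illustration, while the model names and the default come from the table.

```python
def choose_model(*, noisy=False, cost_sensitive=False, high_accuracy=True):
    """Pick a SenseASR model name from coarse traits of the job.

    The priority order (noisy, then cost, then accuracy) is an assumption.
    """
    if noisy:
        return "sense-asr-deepthink"  # noisy audio, dialects, heavy jargon
    if cost_sensitive:
        return "sense-asr-lite"  # quick batch work on simple audio
    if high_accuracy:
        return "sense-asr-pro"  # meetings, interviews, complex audio
    return "sense-asr"  # general transcription
```

Called with no arguments, the helper returns sense-asr-pro, matching the recommended default.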
Step 3: Transcribe
Run the transcription script. Key options:
# Basic transcription
python {baseDir}/scripts/transcribe.py recording.mp3
# Meeting with multiple speakers + emotion
python {baseDir}/scripts/transcribe.py meeting.wav \
--model sense-asr-pro \
--speakers --max-speakers 4 \
--sentiment \
--timestamps segment
# Transcribe and translate to English
python {baseDir}/scripts/transcribe.py lecture.mp3 \
--model sense-asr \
--translate en
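When the script is driven from another Python program, the command line can be assembled as a list and passed to subprocess.run. The following sketch uses only the flags shown in the examples above; build_transcribe_command is a hypothetical helper, and the accepted values for each flag are not verified against the script.

```python
import subprocess
import sys


def build_transcribe_command(script_path, audio_file, model=None, language=None,
                             speakers=False, max_speakers=None, sentiment=False,
                             timestamps=None, translate=None):
    """Assemble the argument list for transcribe.py from keyword options."""
    command = [sys.executable, str(script_path), str(audio_file)]
    if model:
        command += ["--model", model]
    if language:
        command += ["--language", language]
    if speakers:
        command.append("--speakers")
    if max_speakers is not None:
        command += ["--max-speakers", str(max_speakers)]
    if sentiment:
        command.append("--sentiment")
    if timestamps:
        command += ["--timestamps", timestamps]
    if translate:
        command += ["--translate", translate]
    return command


def run_transcription(command):
    """Run the assembled command and raise if the script exits with an error."""
    subprocess.run(command, check=True)
```

Building the command as a list avoids shell quoting problems with file names that contain spaces.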
Step 4: Summarize
After transcription, read the transcript file and produce a summary using the format below.
Summary Format
Generate summaries in this structure:
# [Title - inferred from content]
**Source**: filename.mp3
**Duration**: X min Y sec
**Date**: YYYY-MM-DD
**Speakers**: [if speaker diarization was used]
## Key Points
- Point 1
- Point 2
- ...
## Detailed Summary
[2-4 paragraph summary of the content organized by topic/chronology]
## Action Items
- [ ] Action item 1 (assigned to Speaker X, if applicable)
- [ ] Action item 2
## Notable Quotes
> "Direct quote from transcript" — Speaker X, [timestamp if available]
## Full Transcript
<details>
<summary>Click to expand full transcript</summary>
[Full transcript text here, with speaker labels and timestamps if available]
</details>
Adapt the template based on content type:
- Meeting: emphasize action items, decisions, speaker contributions
- Lecture/Talk: emphasize key concepts, learning points, structure
- Interview: emphasize Q&A pairs, key responses
- Podcast: emphasize topics discussed, interesting insights
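The metadata header of the template is mechanical enough to generate in code. The sketch below renders only those header lines; format_duration and render_summary_header are hypothetical helpers, and the body sections (Key Points, Detailed Summary, and so on) are left to the summarization step.

```python
def format_duration(total_seconds):
    """Format a duration in seconds as 'X min Y sec'."""
    minutes, seconds = divmod(int(total_seconds), 60)
    return f"{minutes} min {seconds} sec"


def render_summary_header(title, source, duration_seconds, date, speakers=None):
    """Render the title and metadata lines of the summary template."""
    lines = [
        f"# {title}",
        f"**Source**: {source}",
        f"**Duration**: {format_duration(duration_seconds)}",
        f"**Date**: {date}",
    ]
    if speakers:
        # Only include the Speakers line when diarization was used
        lines.append("**Speakers**: " + ", ".join(speakers))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_summary_header("Weekly Sync", "meeting.wav", 1865, "2024-05-01",
                                speakers=["Speaker 1", "Speaker 2"]))
```

The Speakers line is omitted when no speaker list is supplied, mirroring the "if speaker diarization was used" note in the template.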
API Reference
For full SenseASR API parameters and response formats, see api-reference.md.
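- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box:
/install audio-transcribe-summarize
- Once installation finishes, invoke the Skill by name or trigger it with /audio-transcribe-summarize
- Provide the required inputs described in the Skill's parameter notes to get structured output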
What is audio-transcribe-summarize?
Transcribe audio/video files to text and generate structured summaries using SenseAudio ASR API. Use when the user asks to transcribe, summarize, or take not... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 238 downloads to date.
How do I install audio-transcribe-summarize?
Run the command "/install audio-transcribe-summarize" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.
Is audio-transcribe-summarize free?
Yes. audio-transcribe-summarize is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.
Which platforms does audio-transcribe-summarize support?
audio-transcribe-summarize is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed audio-transcribe-summarize?
It is developed and maintained by q1lin570 (@q1lin570); the current version is v1.0.1.