Audio/Video Transcription & Summarization
Transcribe audio/video files using the SenseASR API (api.senseaudio.cn), then summarize the content into structured notes.
{baseDir} refers to this skill's directory.
Prerequisites
- Environment variable SENSEAUDIO_API_KEY configured (get your key at https://senseaudio.cn/platform/api-key)
- Python 3.8+ with requests installed
- For large files (>10MB): ffmpeg installed for splitting (macOS: brew install ffmpeg; Windows: download from ffmpeg.org and add it to PATH; Linux: apt install ffmpeg)
Quick Start
- Run the transcription script:
python {baseDir}/scripts/transcribe.py <audio_file> [--model sense-asr-pro] [--language zh] [--speakers] [--sentiment] [--translate en]
- The script outputs a transcript .txt file alongside the source file
- Read the transcript and generate a summary (see Summary Format below)
Workflow
Step 1: Assess the Audio File
Check file size and format:
- Supported formats: wav, mp3, ogg, flac, aac, m4a, mp4
- Max file size per request: 10MB
- If file > 10MB, the script auto-splits using ffmpeg
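The checks above can be sketched in a few lines of Python. This is an illustrative helper, not code taken from transcribe.py: the name assess_audio_file is hypothetical, and treating "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import os

# Formats listed in Step 1.
SUPPORTED_FORMATS = {"wav", "mp3", "ogg", "flac", "aac", "m4a", "mp4"}

# Assumption: the 10MB per-request limit is read as 10 MiB.
MAX_REQUEST_BYTES = 10 * 1024 * 1024


def assess_audio_file(path):
    """Return the format, size, and whether the file exceeds the per-request limit."""
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {ext or '(none)'}")
    size = os.path.getsize(path)
    return {
        "format": ext,
        "size_bytes": size,
        "needs_split": size > MAX_REQUEST_BYTES,
    }
```

A result with needs_split set to True indicates that ffmpeg must be available before running the script.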
Step 2: Choose the Right Model
| Model | Use When |
|---|---|
| sense-asr-lite | Quick batch transcription, simple audio, cost-sensitive |
| sense-asr | General transcription, need speaker separation or timestamps |
| sense-asr-pro | High accuracy needed: meetings, interviews, complex audio |
| sense-asr-deepthink | Noisy audio, dialects, heavy jargon, speech-to-clean-text |
Default to sense-asr-pro for best quality.
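If the model name comes from user input, it may help to validate it against the table before calling the script. The sketch below only encodes the table and the documented default; MODEL_GUIDE and resolve_model are hypothetical names, not part of the skill.

```python
# Model names and guidance copied from the table in Step 2.
MODEL_GUIDE = {
    "sense-asr-lite": "Quick batch transcription, simple audio, cost-sensitive",
    "sense-asr": "General transcription, need speaker separation or timestamps",
    "sense-asr-pro": "High accuracy needed: meetings, interviews, complex audio",
    "sense-asr-deepthink": "Noisy audio, dialects, heavy jargon, speech-to-clean-text",
}

DEFAULT_MODEL = "sense-asr-pro"


def resolve_model(requested=None):
    """Return a known model name, falling back to the documented default."""
    if requested is None:
        return DEFAULT_MODEL
    if requested not in MODEL_GUIDE:
        known = ", ".join(sorted(MODEL_GUIDE))
        raise ValueError(f"Unknown model {requested!r}; expected one of: {known}")
    return requested
```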
Step 3: Transcribe
Run the transcription script. Key options:
# Basic transcription
python {baseDir}/scripts/transcribe.py recording.mp3
# Meeting with multiple speakers + emotion
python {baseDir}/scripts/transcribe.py meeting.wav \
--model sense-asr-pro \
--speakers --max-speakers 4 \
--sentiment \
--timestamps segment
# Transcribe and translate to English
python {baseDir}/scripts/transcribe.py lecture.mp3 \
--model sense-asr \
--translate en
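When the script is driven from another Python program rather than a shell, the same invocations can be assembled as an argument list. This is a minimal sketch that uses only the flags shown above; build_transcribe_command is a hypothetical helper, and base_dir stands in for {baseDir}.

```python
import os
import sys


def build_transcribe_command(base_dir, audio_path, model=None, language=None,
                             speakers=False, max_speakers=None,
                             sentiment=False, timestamps=None, translate=None):
    """Assemble the argument list for scripts/transcribe.py."""
    script = os.path.join(base_dir, "scripts", "transcribe.py")
    cmd = [sys.executable, script, audio_path]
    if model:
        cmd += ["--model", model]
    if language:
        cmd += ["--language", language]
    if speakers:
        cmd.append("--speakers")
    if max_speakers is not None:
        cmd += ["--max-speakers", str(max_speakers)]
    if sentiment:
        cmd.append("--sentiment")
    if timestamps:
        cmd += ["--timestamps", timestamps]
    if translate:
        cmd += ["--translate", translate]
    return cmd


# Example (not executed here):
#   import subprocess
#   subprocess.run(build_transcribe_command(base_dir, "meeting.wav",
#                                           speakers=True, max_speakers=4),
#                  check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with file names that contain spaces.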
Step 4: Summarize
After transcription, read the transcript file and produce a summary using the format below.
Summary Format
Generate summaries in this structure:
# [Title - inferred from content]
**Source**: filename.mp3
**Duration**: X min Y sec
**Date**: YYYY-MM-DD
**Speakers**: [if speaker diarization was used]
## Key Points
- Point 1
- Point 2
- ...
## Detailed Summary
[2-4 paragraph summary of the content organized by topic/chronology]
## Action Items
- [ ] Action item 1 (assigned to Speaker X, if applicable)
- [ ] Action item 2
## Notable Quotes
> "Direct quote from transcript" — Speaker X, [timestamp if available]
## Full Transcript
<details>
<summary>Click to expand full transcript</summary>
[Full transcript text here, with speaker labels and timestamps if available]
</details>
Adapt the template based on content type:
- Meeting: emphasize action items, decisions, speaker contributions
- Lecture/Talk: emphasize key concepts, learning points, structure
- Interview: emphasize Q&A pairs, key responses
- Podcast: emphasize topics discussed, interesting insights
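The mechanical parts of the template (header fields, duration formatting, the collapsible transcript) can be pre-filled before the summary itself is written. The sketch below is one possible way to do that; render_summary_skeleton and format_duration are hypothetical helpers, and the "..." placeholders mark the sections that still need to be written from the transcript.

```python
def format_duration(total_seconds):
    """Format a duration as 'X min Y sec', matching the template."""
    minutes, seconds = divmod(int(total_seconds), 60)
    return f"{minutes} min {seconds} sec"


def render_summary_skeleton(title, source, duration_seconds, date,
                            transcript, speakers=None):
    """Return the summary template with header and transcript filled in."""
    lines = [
        f"# {title}",
        f"**Source**: {source}",
        f"**Duration**: {format_duration(duration_seconds)}",
        f"**Date**: {date}",
    ]
    # The Speakers line only applies when speaker diarization was used.
    if speakers:
        lines.append(f"**Speakers**: {', '.join(speakers)}")
    lines += [
        "## Key Points",
        "- ...",
        "## Detailed Summary",
        "...",
        "## Action Items",
        "- [ ] ...",
        "## Notable Quotes",
        '> "..."',
        "## Full Transcript",
        "<details>",
        "<summary>Click to expand full transcript</summary>",
        transcript,
        "</details>",
    ]
    return "\n".join(lines)
```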
API Reference
For full SenseASR API parameters and response formats, see api-reference.md.
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install audio-transcribe-summarize
- After installation, invoke the skill by name or use /audio-transcribe-summarize
- Provide required inputs per the skill's parameter spec and get structured output
What is audio-transcribe-summarize?
Transcribe audio/video files to text and generate structured summaries using SenseAudio ASR API. Use when the user asks to transcribe, summarize, or take not... It is an AI Agent Skill for Claude Code / OpenClaw, with 238 downloads so far.
How do I install audio-transcribe-summarize?
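Run "/install audio-transcribe-summarize" in the OpenClaw or Claude Code chat to install it in one step. Installation itself needs no extra setup; running the skill still requires the prerequisites listed above (SENSEAUDIO_API_KEY, Python 3.8+ with requests, and ffmpeg for files over 10MB).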
Is audio-transcribe-summarize free?
Yes, audio-transcribe-summarize is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does audio-transcribe-summarize support?
audio-transcribe-summarize is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created audio-transcribe-summarize?
It is built and maintained by q1lin570 (@q1lin570); the current version is v1.0.1.