Real-time Transcription Skill
Capture any audio, get a structured summary. Real-time transcription powered by SenseVoice/FunASR.
Features
- Real-time transcription: stream audio from system (BlackHole) or microphone
- Auto summary: on stop, generate title + structured summary
- Date-based archival: results saved to archive/YYYY/MM/DD-HHMM-title.md
- Idle detection: auto-stops after 60s of silence (configurable)
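Idle detection can be pictured as a timer that resets whenever a chunk of audio is loud enough. The sketch below is only an illustration and is not taken from realtime_asr.py: the `IdleDetector` class, the RMS threshold of 0.01, and the chunk format (a list of float samples in the range -1 to 1) are all assumptions.

```python
import math


class IdleDetector:
    """Tracks how long the input has been silent (hypothetical helper)."""

    def __init__(self, timeout_s=60.0, rms_threshold=0.01):
        self.timeout_s = timeout_s          # 0 disables the check
        self.rms_threshold = rms_threshold  # assumed loudness cutoff
        self.silent_s = 0.0

    def feed(self, samples, sample_rate):
        """Consume one chunk; return True once the idle timeout is reached."""
        if not samples:
            return False
        rms = math.sqrt(sum(s * s for s in samples) / len(samples))
        if rms >= self.rms_threshold:
            self.silent_s = 0.0  # sound heard: reset the timer
        else:
            self.silent_s += len(samples) / sample_rate
        return self.timeout_s > 0 and self.silent_s >= self.timeout_s


detector = IdleDetector(timeout_s=2.0)
silence = [0.0] * 16000                # one second of silence at 16 kHz
print(detector.feed(silence, 16000))   # after 1 s of silence
print(detector.feed(silence, 16000))   # after 2 s of silence
```

With a 2-second timeout the first call reports that the stream is still active and the second reports that the timeout has been reached. Setting `timeout_s=0` mirrors the documented way of disabling the check.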
Skill Location
All files are in ~/.openclaw/skills/realtime-transcription/:
realtime-transcription/
├── SKILL.md # This file
├── realtime_asr.py # Background transcription process
├── summary_prompt.py # LLM prompt builder & response parser
├── archiver.py # Markdown archival module
├── references/
│ └── module-reference.md # Module API reference
├── .tmp/ # Runtime temp files
└── archive/ # Archived outputs
Prerequisites
Python Dependencies
pip3 install sounddevice librosa funasr torch numpy
Or use the built-in installer with progress output:
cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --install-deps
System Audio (optional, macOS)
For macOS system audio capture, install BlackHole: brew install blackhole-2ch
ASR Model
Download the SenseVoice model: modelscope download --model gongjy/SenseVoiceSmall --local_dir ./model/SenseVoiceSmall
Quick Start
Check Dependencies
cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --check-deps
Expected output (Chinese for "All dependencies are installed."):
✅ 所有依赖已安装。
What each dependency is for:
- sounddevice: PortAudio binding for microphone/system audio capture
- librosa: Audio resampling and preprocessing
- funasr: SenseVoice ASR model framework
- torch: PyTorch deep learning runtime
- numpy: Numerical array processing
If dependencies are missing, run python3 realtime_asr.py --install-deps to install them one by one with progress output.
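For readers curious how such a check can work, here is a minimal sketch that looks up each package without importing it. It is an assumption about the approach, not the actual logic behind --check-deps; the `find_missing` helper is hypothetical, and the package list is copied from the pip3 line above.

```python
import importlib.util

# Package list taken from the pip3 install line in Prerequisites.
REQUIRED = ["sounddevice", "librosa", "funasr", "torch", "numpy"]


def find_missing(modules):
    """Return the modules that cannot be located on the import path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All dependencies are installed.")
```

Using `importlib.util.find_spec` avoids paying the import cost of heavy packages such as torch just to learn whether they are present.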
Start Transcription
System audio (BlackHole):
cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source blackhole
Microphone:
cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source mic
With custom idle timeout (5 minutes):
cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source mic --idle-timeout 300
Disable idle timeout:
cd ~/.openclaw/skills/realtime-transcription
python3 realtime_asr.py --source mic --idle-timeout 0
Stop Transcription
Press Ctrl+C in the terminal, or:
kill $(cat .tmp/asr.pid 2>/dev/null) 2>/dev/null; rm -f .tmp/asr.pid
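The same stop step can be done from Python, which is handy when an agent drives the skill programmatically. This is a sketch under assumptions: the `stop_from_pid_file` helper is hypothetical, and it assumes the PID file holds a single integer, as the shell one-liner implies.

```python
import os
import signal
from pathlib import Path


def stop_from_pid_file(pid_file=".tmp/asr.pid"):
    """Send SIGTERM to the PID stored in pid_file, then delete the file.

    Returns True if the signal was delivered, False otherwise.
    """
    path = Path(pid_file)
    delivered = False
    try:
        pid = int(path.read_text().strip())
        if pid <= 0:
            # Guard: os.kill with 0 or a negative PID targets process groups.
            raise ValueError("not a valid PID")
        os.kill(pid, signal.SIGTERM)  # same default signal as `kill`
        delivered = True
    except (FileNotFoundError, ValueError, ProcessLookupError):
        pass  # no PID file, unparsable content, or process already gone
    path.unlink(missing_ok=True)  # mirrors `rm -f .tmp/asr.pid`
    return delivered


print(stop_from_pid_file(".tmp/asr.pid"))
```

Like the shell version, it stays quiet when nothing is running and always clears a stale PID file.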
After Stopping — Summary & Archive
1. Read the transcript:
```bash
cat .tmp/transcript.txt
```
2. Build the LLM prompt:
```bash
cd ~/.openclaw/skills/realtime-transcription
python3 -c "
from summary_prompt import build_summary_prompt
print(build_summary_prompt(open('.tmp/transcript.txt').read()))
"
```
3. Send the prompt to yourself (the LLM) to generate TITLE + SUMMARY
4. Parse and archive:
```bash
cd ~/.openclaw/skills/realtime-transcription
python3 -c "
from summary_prompt import parse_summary_response
from archiver import archive
transcript = open('.tmp/transcript.txt').read()
result = parse_summary_response('YOUR_LLM_RESPONSE_HERE')
path = archive(transcript, result['title'], result['summary'], 'blackhole')
print(f'Archived to: {path}')
"
```
CLI Reference
| Flag | Default | Description |
|---|---|---|
| --source | blackhole | blackhole (system) or mic |
| --output | .tmp/transcript.txt | Transcript file path |
| --state | .tmp/asr.pid | PID file for process management |
| --model | ./model/SenseVoiceSmall | SenseVoice model directory |
| --idle-timeout | 60 | Auto-stop after N seconds of silence (0=disable) |
| --device | auto | Audio device ID override |
| --check-deps | - | Check dependencies and exit |
| --install-deps | - | Install missing dependencies with progress output |
| --list-devices | - | List available audio input devices |
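To make the flag table concrete, the following sketch declares the same flags and defaults with argparse. It is not the parser from realtime_asr.py: the `build_parser` function is hypothetical, and the argument types (an integer timeout, a string device) and the restriction of --source to two choices are assumptions read off the table.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flag table (a sketch, not realtime_asr.py)."""
    p = argparse.ArgumentParser(prog="realtime_asr.py")
    p.add_argument("--source", choices=["blackhole", "mic"], default="blackhole")
    p.add_argument("--output", default=".tmp/transcript.txt")
    p.add_argument("--state", default=".tmp/asr.pid")
    p.add_argument("--model", default="./model/SenseVoiceSmall")
    p.add_argument("--idle-timeout", type=int, default=60)  # 0 disables
    p.add_argument("--device", default="auto")
    p.add_argument("--check-deps", action="store_true")
    p.add_argument("--install-deps", action="store_true")
    p.add_argument("--list-devices", action="store_true")
    return p


args = build_parser().parse_args(["--source", "mic", "--idle-timeout", "300"])
print(args.source, args.idle_timeout, args.output)  # mic 300 .tmp/transcript.txt
```

Flags that are not passed fall back to the defaults in the table, which is why `args.output` still points at .tmp/transcript.txt here.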
Trigger Words
| User says | Action |
|---|---|
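| "开始转录" (start transcription) / "transcribe" / "启动转录" (launch transcription) | Check deps → ask source → start |
| "停止" (stop) / "stop" | Stop process → summary → archive |
| "当前转录内容" (current transcript content) | Show .tmp/transcript.txt |
| "检查依赖" (check dependencies) | Run --check-deps |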
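If the trigger phrases were matched in code rather than by the agent itself, a lookup table would be one simple option. The sketch below is purely illustrative: the phrases are copied from the table, while the action names and the substring-matching rule in the hypothetical `match_action` are assumptions.

```python
# Phrases copied from the table; the action names are hypothetical labels.
TRIGGERS = {
    "开始转录": "start",
    "启动转录": "start",
    "transcribe": "start",
    "停止": "stop",
    "stop": "stop",
    "当前转录内容": "show_transcript",
    "检查依赖": "check_deps",
}


def match_action(message):
    """Return the action for the first trigger phrase found in message."""
    text = message.strip().lower()
    for phrase, action in TRIGGERS.items():
        if phrase in text:
            return action
    return None


print(match_action("请开始转录"))  # start
print(match_action("Stop"))        # stop
```

Substring matching is deliberately loose; a production matcher would likely need word boundaries or intent detection to avoid false positives.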
Output Format
Transcript (.tmp/transcript.txt)
[14:30:00] 你好今天我们来讨论一下AI的发展
[14:30:05] AI技术在各个领域都有广泛应用
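Because every transcript line starts with a bracketed timestamp, the file is easy to post-process. The following sketch parses the `[HH:MM:SS] text` layout shown above into time and text pairs; the `parse_transcript` helper is hypothetical and assumes each entry sits on its own line.

```python
import re
from datetime import datetime

LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")


def parse_transcript(text):
    """Split '[HH:MM:SS] text' lines into (time, text) pairs; skip other lines."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S").time()
            entries.append((stamp, m.group(2)))
    return entries


sample = (
    "[14:30:00] 你好今天我们来讨论一下AI的发展\n"
    "[14:30:05] AI技术在各个领域都有广泛应用\n"
)
entries = parse_transcript(sample)
print(len(entries), entries[0][0], entries[-1][0])  # 2 14:30:00 14:30:05
```

The first and last timestamps are enough to derive values such as the time range and duration that appear in the archive front matter.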
Archive (archive/YYYY/MM/DD-HHMM-title.md)
---
title: "AI发展趋势讨论"
date: 2025-05-16
time: "14:30 - 14:38"
source: blackhole
duration: 8m
---
## 摘要
- AI在医疗、金融、教育领域广泛应用
- 未来将更智能和普及
## 完整转录
[14:30:00] 你好今天我们来讨论一下AI的发展
...
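The archive location follows the archive/YYYY/MM/DD-HHMM-title.md pattern. As a sketch of how such a path could be derived from a start time and a title, consider the hypothetical `archive_path` helper below; the title sanitization rule is an assumption and may differ from what archiver.py does.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath


def archive_path(started, title, root="archive"):
    """Build archive/YYYY/MM/DD-HHMM-title.md for a start time and title."""
    # Assumed sanitization: collapse whitespace and path separators to '-'.
    safe = re.sub(r"[\s/\\]+", "-", title.strip()) or "untitled"
    name = f"{started:%d}-{started:%H%M}-{safe}.md"
    return PurePosixPath(root) / f"{started:%Y}" / f"{started:%m}" / name


print(archive_path(datetime(2025, 5, 16, 14, 30), "AI发展趋势讨论"))
# archive/2025/05/16-1430-AI发展趋势讨论.md
```

Grouping by year and month keeps each directory small, and the DD-HHMM prefix makes files sort chronologically within a month.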
Error Handling
| Scenario | Behavior |
|---|---|
| Missing dependencies | Refuse to start, show install instructions |
| BlackHole not found | Suggest --source mic |
| Process crashes | PID file gone → offer to recover |
| Empty transcript | Warn user, skip summary, no archive |
| No sound for N seconds | Exit code 42, ask user to continue |
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Normal stop |
| 1 | Dependency check failed |
| 42 | Idle timeout; ask the user: "⏸️ 已 N 秒没有检测到声音,是否继续录音?(y/n)" ("No sound detected for N seconds. Continue recording? (y/n)") |
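A caller that launches the transcription process can branch on these exit codes. The sketch below shows one way to do that; the `next_step` policy names and the `run_once` wrapper are hypothetical, and the demo uses a stand-in child process rather than realtime_asr.py itself.

```python
import subprocess
import sys

# Meanings copied from the exit-code table.
EXIT_MEANINGS = {
    0: "normal stop",
    1: "dependency check failed",
    42: "idle timeout",
}


def next_step(code):
    """Map an exit code to what a caller might do next (assumed policy)."""
    if code == 0:
        return "summarize"
    if code == 1:
        return "install-deps"
    if code == 42:
        return "ask-continue"
    return "report-error"


def run_once(argv):
    """Run a command, wait for it to exit, and return the suggested next step."""
    code = subprocess.run(argv).returncode
    print(f"exit {code}: {EXIT_MEANINGS.get(code, 'unexpected')}")
    return next_step(code)


# Demo with a stand-in child process that exits with code 42.
print(run_once([sys.executable, "-c", "import sys; sys.exit(42)"]))
```

In real use the argument list would be something like `["python3", "realtime_asr.py", "--source", "mic"]`, and an "ask-continue" result would lead to the y/n prompt from the table.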
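- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install realtime-transcription
- Once installation finishes, invoke the Skill by name or trigger it with /realtime-transcription.
- Provide the inputs described in the Skill's parameter documentation to get structured output.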
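What is realtime-transcription?
Real-time transcription of system or microphone audio with automatic summary generation and date-based Markdown archival after stopping or idle timeout. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 41 times to date.
How do I install realtime-transcription?
Run the command "/install realtime-transcription" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is realtime-transcription free?
Yes. realtime-transcription is completely free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does realtime-transcription support?
realtime-transcription is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops realtime-transcription?
It is developed and maintained by Lee (@leeleoo). The current version is v1.0.0.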