Douyin Transcriber
Transcribe audio/video files to text using a local Whisper ASR service running in Docker.
Quick Start
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@/path/to/video.mp4"
The container bundles ffmpeg, so video files can be posted directly and the audio is extracted automatically.
Prerequisites
| Tool | Purpose | Install |
|---|---|---|
| Docker | Whisper ASR | Docker Desktop |
| ffmpeg | Audio extraction | winget install Gyan.FFmpeg |
Deploy Whisper ASR:
docker run -d -p PORT:PORT -e ASR_MODEL=small -e ASR_ENGINE=faster_whisper --name whisper-asr onerahmet/openai-whisper-asr-webservice:latest
Workflow
Step 1: Extract Audio from Video
ffmpeg -i video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav -y
Parameters:
- -ar 16000: 16 kHz sample rate
- -ac 1: Mono channel
- -c:a pcm_s16le: 16-bit PCM
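The extraction step can also be driven from a script. The following is a minimal sketch that assumes ffmpeg is on the PATH; the function names `build_extract_command` and `extract_audio` are hypothetical helpers, not part of the Skill.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_extract_command(video_path: str, wav_path: str) -> List[str]:
    """Return an ffmpeg argument list using the options from Step 1."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", video_path,
        "-ar", "16000",       # 16 kHz sample rate
        "-ac", "1",           # mono channel
        "-c:a", "pcm_s16le",  # 16-bit PCM
        wav_path,
    ]


def extract_audio(video_path: str, wav_path: Optional[str] = None) -> str:
    """Run ffmpeg and return the path of the WAV file it wrote."""
    if wav_path is None:
        # Default (an assumption): same name as the video, with a .wav suffix.
        wav_path = str(Path(video_path).with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_extract_command(video_path, wav_path), check=True)
    return wav_path
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.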
Step 2: Transcribe
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav"
Optional: specify language
curl -X POST "http://localhost:PORT/asr?language=zh" -F "audio_file=@audio.wav"
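The same request can be sent from Python with only the standard library. This sketch assumes the service reads `language` and `output` from the query string, as the upstream whisper-asr-webservice documents, and that `output=json` yields a JSON body; `build_asr_url`, `encode_multipart`, and `transcribe` are hypothetical names, and the base URL (including the port) must be supplied by the caller.

```python
import json
import urllib.parse
import urllib.request
import uuid
from pathlib import Path
from typing import Optional, Tuple


def build_asr_url(base_url: str, language: Optional[str] = None,
                  output: Optional[str] = None) -> str:
    """Build the /asr URL, adding only the query parameters that are set."""
    params = {k: v for k, v in (("language", language), ("output", output)) if v}
    url = base_url.rstrip("/") + "/asr"
    return url + ("?" + urllib.parse.urlencode(params) if params else "")


def encode_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(base_url: str, audio_path: str, language: Optional[str] = None,
               timeout: float = 600.0) -> dict:
    """POST the file as the audio_file form field and parse the JSON reply."""
    path = Path(audio_path)
    body, content_type = encode_multipart("audio_file", path.name, path.read_bytes())
    request = urllib.request.Request(
        build_asr_url(base_url, language, output="json"),  # assumes JSON output
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The generous default timeout is a guess meant to cover the longer processing times of the larger models; adjust it to the model in use.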
Step 3: Parse Result
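Response format (JSON; the upstream webservice returns plain text unless the request adds output=json to the query string):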
{
"text": "Transcribed content...",
"segments": [
{"start": 0.0, "end": 2.5, "text": "First sentence"},
{"start": 2.5, "end": 5.0, "text": "Second sentence"}
],
"language": "zh"
}
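A response of this shape can be turned into timestamped lines with a few lines of Python. This is a sketch that assumes the `segments` entries carry `start`, `end`, and `text` exactly as in the sample above; `format_timestamp` and `segments_to_lines` are hypothetical helper names.

```python
from typing import List


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_lines(response: dict) -> List[str]:
    """Return one '[start - end] text' line per segment in the response."""
    lines = []
    for segment in response.get("segments", []):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"[{start} - {end}] {segment['text'].strip()}")
    return lines
```

For the sample response, the first line produced is `[00:00:00.000 - 00:00:02.500] First sentence`. The full transcript is available directly from the `text` field when timestamps are not needed.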
Model Selection
| Model | Size | Processing time (5-min video) | Accuracy |
|---|---|---|---|
| tiny | 75MB | ~30s | Fair |
| base | 142MB | ~1min | Good |
| small | 466MB | ~3min | Better (recommended) |
| medium | 1.5GB | ~8min | Best |
Change model via environment variable: -e ASR_MODEL=medium
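The table can be used to pick a model for a given time budget. The sketch below assumes processing time scales linearly with video length, which the table does not state, and takes the port as a parameter because the source leaves it as a placeholder; `pick_model` and `docker_run_args` are hypothetical helpers.

```python
from typing import List

# Approximate minutes needed for a 5-minute video, taken from the table above.
# The entries are ordered from least to most accurate.
MINUTES_PER_5MIN = {"tiny": 0.5, "base": 1.0, "small": 3.0, "medium": 8.0}


def pick_model(video_minutes: float, budget_minutes: float) -> str:
    """Return the most accurate model whose estimated time fits the budget.

    Assumes linear scaling with video length; falls back to "tiny".
    """
    best = "tiny"
    for model, minutes in MINUTES_PER_5MIN.items():
        if minutes * video_minutes / 5.0 <= budget_minutes:
            best = model
    return best


def docker_run_args(model: str, port: int, name: str = "whisper-asr") -> List[str]:
    """Return the deploy command from this page with the chosen model."""
    if model not in MINUTES_PER_5MIN:
        raise ValueError(f"unknown model: {model}")
    return [
        "docker", "run", "-d",
        "-p", f"{port}:{port}",
        "-e", f"ASR_MODEL={model}",
        "-e", "ASR_ENGINE=faster_whisper",
        "--name", name,
        "onerahmet/openai-whisper-asr-webservice:latest",
    ]
```

Environment variables are fixed when a container is created, so switching models means removing the existing whisper-asr container and running the command again.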
Supported Formats
Video: mp4, mkv, avi, mov, flv, wmv, webm, m4v
Audio: wav, m4a, mp3, aac, ogg, flac, wma, opus
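A script can use these lists to decide whether a file needs the audio extraction step first. This is a small sketch based only on the file extension; `media_kind` is a hypothetical helper and does not inspect the file contents.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {"mp4", "mkv", "avi", "mov", "flv", "wmv", "webm", "m4v"}
AUDIO_EXTENSIONS = {"wav", "m4a", "mp3", "aac", "ogg", "flac", "wma", "opus"}


def media_kind(path: str) -> str:
    """Classify a file as 'video', 'audio', or 'unsupported' by extension."""
    extension = Path(path).suffix.lower().lstrip(".")
    if extension in VIDEO_EXTENSIONS:
        return "video"
    if extension in AUDIO_EXTENSIONS:
        return "audio"
    return "unsupported"
```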
Troubleshooting
| Issue | Solution |
|---|---|
| Docker not available | Install Docker Desktop |
| Container start fails | Check port availability |
| Transcription timeout | Use smaller model or split audio |
| ffmpeg not found | winget install Gyan.FFmpeg |
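Two of these checks are easy to script. The sketch below looks for the required tools on the PATH and builds an ffmpeg command that splits a WAV file into fixed-length chunks using ffmpeg's segment muxer; the helper names, the 300-second chunk length, and the `chunk_%03d.wav` pattern are assumptions chosen for illustration.

```python
import shutil
from typing import List, Sequence


def missing_tools(tools: Sequence[str] = ("docker", "ffmpeg")) -> List[str]:
    """Return the tools from the list that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def build_split_command(audio_path: str, chunk_seconds: int = 300,
                        pattern: str = "chunk_%03d.wav") -> List[str]:
    """Return an ffmpeg command that splits audio into fixed-length chunks."""
    return [
        "ffmpeg",
        "-i", audio_path,
        "-f", "segment",                      # write a numbered series of files
        "-segment_time", str(chunk_seconds),  # target length of each chunk
        "-c", "copy",                         # copy the audio without re-encoding
        pattern,
    ]
```

Segment timestamps in each chunk's transcript restart at zero, so the chunk's offset has to be added back when the results are merged.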
Related Modules
- douyin-fetcher - Video download
- douyin-analyzer - Content analysis
- douyin-orchestrator - Workflow coordination
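Installation and Usage
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install douyin-transcriber
- Once installed, invoke the Skill by name or trigger it with /douyin-transcriber
- Provide the inputs described in the Skill's parameter documentation to get structured output.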
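What is Douyin Transcriber?
Douyin Transcriber transcribes speech from audio or video files: it automatically extracts the audio and converts it to text using a Docker-hosted Whisper ASR service, aimed at Douyin/TikTok media. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 116 times to date.
How do I install Douyin Transcriber?
Run the command /install douyin-transcriber in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is Douyin Transcriber free?
Yes. Douyin Transcriber is completely free and released under the MIT-0 license; you can download, install, and use it freely.
Which platforms does Douyin Transcriber support?
Douyin Transcriber is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Douyin Transcriber?
It is developed and maintained by Don Li (@don068589). The current version is v1.0.5.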