Volcengine Ata Subtitle
/install doubao-ata-subtitle
Volcengine ATA Subtitle (automatic subtitle timing)
Generate subtitles with automatic time alignment using Volcengine's ATA (Automatic Time Alignment) API.
Prerequisites
Set the following environment variables or create a config file:
Option A: Environment Variables
export VOLC_ATA_APP_ID="your-app-id"
export VOLC_ATA_TOKEN="your-access-token"
export VOLC_ATA_API_BASE="https://openspeech.bytedance.com"
Option B: Config File
Create ~/.volcengine_ata.conf:
[credentials]
appid = your-app-id
access_token = your-access-token
secret_key = your-secret-key
[api]
base_url = https://openspeech.bytedance.com
submit_path = /api/v1/vc/ata/submit
query_path = /api/v1/vc/ata/query
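The two options above can be combined into a single lookup. The following is a minimal sketch, not the logic used by volc_ata.py: the function name load_ata_settings is hypothetical, and giving environment variables priority over the config file is an assumption.

```python
import configparser
import os
from pathlib import Path

DEFAULT_BASE_URL = "https://openspeech.bytedance.com"


def load_ata_settings(env=None, conf_path=None):
    """Return ATA settings from environment variables, else from the config file."""
    env = os.environ if env is None else env
    conf_path = Path(conf_path) if conf_path else Path.home() / ".volcengine_ata.conf"

    # Option A: environment variables win in this sketch (an assumption).
    if env.get("VOLC_ATA_APP_ID") and env.get("VOLC_ATA_TOKEN"):
        return {
            "appid": env["VOLC_ATA_APP_ID"],
            "access_token": env["VOLC_ATA_TOKEN"],
            "base_url": env.get("VOLC_ATA_API_BASE", DEFAULT_BASE_URL),
        }

    # Option B: INI-style file with [credentials] and [api] sections.
    parser = configparser.ConfigParser()
    if not parser.read(conf_path, encoding="utf-8"):
        raise RuntimeError(f"No credentials in environment and {conf_path} not found")
    return {
        "appid": parser.get("credentials", "appid"),
        "access_token": parser.get("credentials", "access_token"),
        "base_url": parser.get("api", "base_url", fallback=DEFAULT_BASE_URL),
    }
```

Calling load_ata_settings() with no arguments reads os.environ and ~/.volcengine_ata.conf; passing env and conf_path explicitly makes the lookup easy to test.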
Execution (Python CLI Tool)
A Python CLI tool is provided at ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py.
Quick Examples
# Basic usage: audio + text → SRT subtitle
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
--audio storage/audio.wav \
--text storage/subtitle.txt \
--output storage/subtitles/final.srt
# Specify output format (srt or ass)
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
--audio storage/audio.wav \
--text storage/subtitle.txt \
--output storage/subtitles/final.ass \
--format ass
Input Requirements
Audio File
- Format: WAV (PCM)
- Sample Rate: 16000 Hz (16kHz)
- Channels: 1 (mono)
- Encoding: 16-bit PCM (pcm_s16le)
Extract from video:
ffmpeg -i input.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
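Before submitting audio, it can help to confirm that a file actually meets the requirements listed above. This is a minimal sketch using Python's standard wave module; the helper name check_wav is hypothetical and is not part of volc_ata.py.

```python
import wave


def check_wav(path):
    """Return a list of problems; an empty list means the file meets the requirements."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000 Hz")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16-bit
            problems.append(f"{wav.getsampwidth() * 8}-bit samples, expected 16-bit")
    return problems
```

An empty result from check_wav("audio.wav") suggests the file is ready to submit; otherwise each entry names one mismatch to fix with ffmpeg.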
Text File
- Format: Plain text (UTF-8)
- Structure: One sentence per line
- No punctuation: ATA handles it automatically
- No timestamps: Pure text only
Example:
主人闹钟没响睡过头了
我们俩轮流用鼻子拱他脸
他以为地震了抱着枕头就跑
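If the source transcript still contains punctuation or blank lines, it can be cleaned up before it is passed to the tool. The sketch below is one possible approach, assuming that stripping every Unicode punctuation character is acceptable for your text; prepare_lines is a hypothetical helper name.

```python
import unicodedata


def prepare_lines(raw_text):
    """Return one cleaned sentence per line, with punctuation and blank lines removed."""
    cleaned = []
    for line in raw_text.splitlines():
        # Drop every character whose Unicode category starts with "P" (punctuation).
        stripped = "".join(
            ch for ch in line if not unicodedata.category(ch).startswith("P")
        )
        stripped = " ".join(stripped.split())  # collapse runs of whitespace
        if stripped:
            cleaned.append(stripped)
    return cleaned
```

Note that this also removes apostrophes and hyphens, so English contractions such as "don't" become "dont"; adjust the filter if that matters for your text.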
Output Formats
SRT (SubRip)
1
00:00:00,000 --> 00:00:02,500
第一句字幕
2
00:00:02,500 --> 00:00:05,000
第二句字幕
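The SRT layout above can be produced from aligned cues with a few lines of code. This is a minimal sketch that assumes each cue is a (start_seconds, end_seconds, text) tuple; that shape is an illustration only, since the ATA response format is not described here.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start_seconds, end_seconds, text) tuples as an SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

For example, to_srt([(0.0, 2.5, "first line"), (2.5, 5.0, "second line")]) yields two numbered cues with the same timestamps as the sample above.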
ASS (Advanced Substation Alpha)
[Script Info]
Title: ATA Subtitles
ScriptType: v4.00+
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.00,0:00:02.50,Default,,0,0,0,,第一句字幕
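ASS timestamps differ from SRT: they use a single-digit hour field and centiseconds after a period. The following sketch formats one Dialogue line in the layout shown above; the helper names are hypothetical and the cue values are assumed to be in seconds.

```python
def ass_timestamp(seconds):
    """Format a time in seconds as H:MM:SS.cc (centiseconds)."""
    total_cs = round(seconds * 100)
    hours, rest = divmod(total_cs, 360_000)
    minutes, rest = divmod(rest, 6_000)
    secs, cs = divmod(rest, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def ass_dialogue(start, end, text, style="Default"):
    """Render one Dialogue line matching the Format row of the [Events] section."""
    text = text.replace("\n", "\\N")  # ASS uses \N for a hard line break
    return (
        f"Dialogue: 0,{ass_timestamp(start)},{ass_timestamp(end)},"
        f"{style},,0,0,0,,{text}"
    )
```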
Rules
- Always check that credentials are configured before making API calls.
- Audio must be 16kHz mono PCM - convert if necessary with ffmpeg.
- Text should be plain - no timestamps, no punctuation.
- Default format: SRT (most compatible).
- Handle errors gracefully - display clear error messages.
Troubleshooting
Invalid Sample Rate
Error: Invalid sample rate, expected 16000Hz
Fix:
ffmpeg -i input.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
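When the conversion needs to run from a script rather than by hand, the same ffmpeg invocation can be wrapped in Python. This is a sketch that assumes ffmpeg is on the PATH; ffmpeg_command and convert are hypothetical helper names, and the -y flag (overwrite without prompting) is an addition not present in the command above.

```python
import subprocess


def ffmpeg_command(source, target="audio.wav"):
    """Build the ffmpeg argument list for 16 kHz mono 16-bit PCM output."""
    return [
        "ffmpeg", "-y",          # -y overwrites the target without prompting
        "-i", source,
        "-vn",                   # drop any video stream
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        "-ar", "16000",          # 16 kHz sample rate
        "-ac", "1",              # mono
        target,
    ]


def convert(source, target="audio.wav"):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_command(source, target), check=True)
    return target
```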
Authorization Failed
Error: Authorization failed
Fix: Check the token format. The value should be Bearer; {token} (with a semicolon after Bearer).
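Because the semicolon is easy to get wrong, it can be worth building the value in one place. The sketch below assumes the value is sent in an HTTP header named Authorization, which is not stated above; auth_header is a hypothetical helper name.

```python
def auth_header(access_token):
    """Return a header dict using the 'Bearer; {token}' form."""
    token = access_token.strip()
    # Avoid doubling the prefix if the stored value already includes it.
    if token.lower().startswith("bearer;"):
        token = token[len("bearer;"):].strip()
    if not token:
        raise ValueError("access token is empty")
    # Header name "Authorization" is an assumption in this sketch.
    return {"Authorization": f"Bearer; {token}"}
```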
Related Documents
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install doubao-ata-subtitle
- After installation, invoke the skill by name or use /doubao-ata-subtitle
- Provide required inputs per the skill's parameter spec and get structured output
What is Volcengine Ata Subtitle?
Generate subtitles with automatic time alignment using the Volcengine ATA API. Use when the user wants to: (1) add time-aligned subtitles to videos, (2) convert... It is an AI Agent Skill for Claude Code / OpenClaw, with 324 downloads so far.
How do I install Volcengine Ata Subtitle?
Run "/install doubao-ata-subtitle" in the OpenClaw or Claude Code chat to install it in one step; no extra setup is required.
Is Volcengine Ata Subtitle free?
Yes, Volcengine Ata Subtitle is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Volcengine Ata Subtitle support?
Volcengine Ata Subtitle is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Volcengine Ata Subtitle?
It is built and maintained by BlackEight4752 (@blackeight4752); the current version is v0.1.0.