sariel2018

Audio SRT Workflow

Author: Sariel2018 · GitHub · v0.1.2 · MIT-0
cross-platform · ✓ Security check passed
Total downloads: 113 · Favorites: 1 · Current installs: 0 · Versions: 3
Install in OpenClaw
/install audio-srt-workflow
Description
Generate or align SRT subtitles from audio using this repository. Use when the user asks for subtitle generation, transcript-to-audio alignment, timing clean...
Usage (SKILL.md)

Audio SRT Workflow

Use this skill for end-to-end subtitle work.

This package is self-contained for runtime entrypoints:

  • scripts/align_to_srt.py
  • scripts/gui_app.py
  • scripts/srt_stats.py
  • scripts/make_preview_mp4.py
  • scripts/requirements.txt

Scope

  • Mode A: audio + reference text -> aligned SRT
  • Mode B: audio only -> auto subtitle SRT
  • Timing QA with srt_stats.py
  • Burned preview generation with make_preview_mp4.py

Inputs To Collect First

  1. Audio path (wav, mp3, m4a, ...)
  2. Whether a reference transcript is available
  3. Output SRT path (or output directory)
  4. Language hint (zh, en, ...)
  5. Preferred run style: CLI, GUI, or Python API

Decision Rule

  • If transcript exists, run Mode A (align_to_srt.py --text ...).
  • If transcript does not exist, run Mode B via GUI or Python API (run_auto_subtitle_pipeline).
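
The decision rule can be sketched as a small helper. This is a minimal illustration, not code from the package: `choose_mode` is a hypothetical name, and treating a missing or unreadable transcript path as "no transcript" is an assumption.

```python
from pathlib import Path
from typing import Optional


def choose_mode(transcript_path: Optional[str]) -> str:
    """Return "A" when a transcript file exists on disk, otherwise "B".

    Hypothetical helper; the packaged scripts do not necessarily expose it.
    """
    if transcript_path and Path(transcript_path).is_file():
        return "A"  # audio + reference text -> aligned SRT
    return "B"  # audio only -> auto subtitle SRT
```

Checking the file on disk, rather than only checking that a path was supplied, means a mistyped transcript path selects Mode B here; a stricter caller may prefer to raise an error in that case instead.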

Workflow

  1. Validate environment and paths.
  2. Choose Mode A or Mode B by transcript availability.
  3. Run subtitle generation from packaged scripts.
  4. Run timing diagnostics (srt_stats.py).
  5. If needed, render a preview mp4 with burned subtitles.

Resolve Skill Script Path

Set a local variable to your installed skill directory.

Codex default path:

SKILL_DIR="${CODEX_HOME:-$HOME/.codex}/skills/audio-srt-workflow"

OpenClaw/ClawHub install path example:

SKILL_DIR="<your-workdir>/skills/audio-srt-workflow"

Environment Checks

Run these checks before execution:

python3 --version
ffmpeg -version
python3 -c "import faster_whisper; print('ok')"

If faster-whisper import fails:

# Review dependencies before installing:
cat "$SKILL_DIR/scripts/requirements.txt"
pip install -r "$SKILL_DIR/scripts/requirements.txt"
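
The same checks can be scripted so that an agent gets a list of problems instead of reading three command outputs. The sketch below is an assumption-laden illustration: `check_environment` is a hypothetical name, and the Python 3.10 minimum is taken from the requirements stated elsewhere on this page.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        found = sys.version.split()[0]
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required, found {found}")
    # shutil.which mirrors what the shell would resolve for a bare `ffmpeg`.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # find_spec locates the package without importing it.
    if importlib.util.find_spec("faster_whisper") is None:
        problems.append("faster_whisper is not importable")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("PROBLEM:", problem)
```

Using `find_spec` avoids loading the package just to confirm it is installed, so the check stays fast.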

Mode A Command Template (Audio + Transcript)

python3 "$SKILL_DIR/scripts/align_to_srt.py" \
  --audio "<input_audio>" \
  --text "<transcript_txt>" \
  --output "<output_srt>" \
  --model small \
  --language zh
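
When the template is driven from Python rather than a shell, passing an argument list avoids quoting problems with paths that contain spaces. This is a sketch using only the flags shown in the template above; `build_align_command` and `run_alignment` are hypothetical helper names, not part of the package.

```python
import subprocess
import sys
from pathlib import Path


def build_align_command(skill_dir, audio, text, output, model="small", language="zh"):
    """Build the argv list for align_to_srt.py using only the flags from the template."""
    script = Path(skill_dir) / "scripts" / "align_to_srt.py"
    return [
        sys.executable, str(script),
        "--audio", str(audio),
        "--text", str(text),
        "--output", str(output),
        "--model", model,
        "--language", language,
    ]


def run_alignment(skill_dir, audio, text, output, **kwargs):
    """Run the alignment script; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_align_command(skill_dir, audio, text, output, **kwargs), check=True)
```

Using `sys.executable` runs the script with the same interpreter (and virtualenv) as the caller, which may or may not be the `python3` found on PATH.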

Mode B Command Template (Audio Only)

GUI:

python3 "$SKILL_DIR/scripts/gui_app.py"

Or use Python API in scripts:

  • Build config with build_alignment_config(...)
  • Run run_auto_subtitle_pipeline(...)

See command details in references/command-templates.md.

QA And Preview

Timing stats:

python3 "$SKILL_DIR/scripts/srt_stats.py" --srt "<output_srt>"
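
To make concrete what a timing check looks at, here is a standalone sketch that parses SRT timing lines and reports cue durations and overlaps. It is not the implementation of srt_stats.py, whose exact metrics are not documented here; `cue_times` and `timing_summary` are hypothetical names.

```python
import re

# Matches an SRT timing line such as "00:00:01,200 --> 00:00:03,000".
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def cue_times(srt_text):
    """Return (start, end) pairs in seconds for each timing line in an SRT string."""
    cues = []
    for match in TIME_RE.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
        start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
        end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
        cues.append((start, end))
    return cues


def timing_summary(srt_text):
    """Summarize cue count, shortest and longest duration, and overlapping neighbours."""
    cues = cue_times(srt_text)
    durations = [end - start for start, end in cues]
    # A cue overlaps its predecessor when it starts before the predecessor ends.
    overlaps = sum(1 for (_, prev_end), (start, _) in zip(cues, cues[1:]) if start < prev_end)
    return {
        "cues": len(cues),
        "min_duration": min(durations, default=0.0),
        "max_duration": max(durations, default=0.0),
        "overlaps": overlaps,
    }
```

Very short cues and overlapping neighbours are common symptoms of timing drift, which is why they are the two signals picked for this sketch.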

Preview video:

python3 "$SKILL_DIR/scripts/make_preview_mp4.py" \
  --audio "<input_audio>" \
  --srt "<output_srt>" \
  --output "<preview_mp4>"

Output Conventions

  • Default output uses .srt extension.
  • Prefer dated naming for batch runs (for example output_YYYYMMDD.srt).
  • Keep intermediate checks in a separate folder from final delivery files.
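
The dated naming convention can be generated rather than typed by hand. A minimal sketch, assuming the `output_YYYYMMDD.srt` pattern from the example above; `dated_srt_path` is a hypothetical helper name.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def dated_srt_path(out_dir, stem="output", day: Optional[date] = None):
    """Return <out_dir>/<stem>_YYYYMMDD.srt for the given day (today by default)."""
    day = day or date.today()
    return Path(out_dir) / f"{stem}_{day:%Y%m%d}.srt"
```

For example, `dated_srt_path("final", day=date(2024, 3, 5))` yields a path named `output_20240305.srt` inside `final`.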

Notes

  • For Chinese output (zh), the pipeline strips commas/periods only.
  • If timings look off, inspect waveform snap related arguments before changing model size.
  • This skill requires explicit invocation (allow_implicit_invocation: false).
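
As an illustration of the first note, stripping only commas and periods might look like the sketch below. The exact character set the pipeline removes is not documented here, so the four characters used (full-width and ASCII comma and period) are an assumption, and `strip_commas_periods` is a hypothetical name.

```python
# Assumed character set: full-width and ASCII commas and periods only.
_STRIP_TABLE = str.maketrans("", "", "，。,.")


def strip_commas_periods(text):
    """Remove commas and periods while leaving all other punctuation untouched."""
    return text.translate(_STRIP_TABLE)
```

Under this assumption, question marks, exclamation marks, and quotation marks remain in the subtitle text.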
Security recommendations
This package appears to be what it says: an offline toolset to align or generate SRTs and render preview videos. Before installing or running it:

  • Run it in a dedicated Python virtualenv to avoid affecting your system Python.
  • Confirm ffmpeg is the binary you expect (ffmpeg in PATH, or pass --ffmpeg-bin). The preview tool shells out to ffmpeg to burn subtitles.
  • To avoid network downloads, pre-download or place Whisper model files in a local directory and set FASTER_WHISPER_MODEL_DIR (the code checks this env var) so faster-whisper won't fetch large weights at runtime.
  • Be aware that when faster-whisper does need to fetch models, it may use Hugging Face tooling, which can pick up HF tokens from your environment. Avoid running this skill in an environment that contains credentials you don't want used.
  • The GUI requires tkinter; the code will exit if tkinter is unavailable.
  • Inspect the included scripts (they are small and readable) and test with non-sensitive audio first.

If you need higher assurance, run in an isolated environment or container to observe any network activity during model loading.
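
Before an offline run, it can help to confirm that FASTER_WHISPER_MODEL_DIR actually points at an existing directory. The sketch below only validates the variable; how the packaged scripts interpret it (for example, the expected sub-directory layout) is not documented here and is left as an assumption. `local_model_dir` is a hypothetical helper name.

```python
import os
from pathlib import Path


def local_model_dir(env=None):
    """Return the directory named by FASTER_WHISPER_MODEL_DIR if it exists, else None."""
    env = os.environ if env is None else env
    value = env.get("FASTER_WHISPER_MODEL_DIR", "").strip()
    if value and Path(value).is_dir():
        return Path(value)
    return None


if __name__ == "__main__":
    found = local_model_dir()
    if found is None:
        print("No usable local model directory; model weights may be downloaded at runtime.")
    else:
        print(f"Using local model directory: {found}")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.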
Functional analysis
Type: OpenClaw Skill
Name: audio-srt-workflow
Version: 0.1.2

The skill bundle provides a legitimate and well-structured workflow for generating and aligning SRT subtitles using the faster-whisper library and ffmpeg. Analysis of the Python scripts (align_to_srt.py, gui_app.py, make_preview_mp4.py) and the SKILL.md instructions reveals no evidence of malicious intent, data exfiltration, or prompt injection. The code uses standard practices for audio processing and subprocess management, and the environment checks and installation steps are consistent with the stated purpose of the tool.
Capability assessment
Purpose & Capability
Name/description match the included scripts and requirements: align_to_srt.py, gui_app.py, srt_stats.py, make_preview_mp4.py and a requirements.txt listing faster-whisper. Required tools referenced in SKILL.md (Python 3.10+, ffmpeg, faster-whisper) are appropriate for subtitle/transcription work. One minor mismatch: the code reads an optional environment variable FASTER_WHISPER_MODEL_DIR to locate models but this env var is not documented in the registry's required env list.
Instruction Scope
SKILL.md gives concrete invocation templates and environment checks (python version, ffmpeg, faster_whisper import) and directs the agent to run only the included scripts on user-supplied audio/text files. The instructions do not ask for unrelated files, secrets, or to transmit outputs to unusual external endpoints. Note: faster-whisper/model usage may download model weights from remote hosts when a model path is not local — this is expected for ASR but is a network behavior to be aware of.
Install Mechanism
No install spec is present (instruction-only), and dependencies are limited to a pinned faster-whisper package in scripts/requirements.txt. No arbitrary URL downloads or archive extraction are performed by the skill itself. The README suggests using a venv and pip install -r requirements.txt which is standard.
Credentials
The skill declares no required credentials (none in registry) which matches the code. However, the code optionally consults FASTER_WHISPER_MODEL_DIR to locate local model files (not declared in requires.env). Also, faster-whisper / underlying HF tooling may use existing Hugging Face credentials (e.g., HF_TOKEN) from the environment if the user has them configured when fetching private models; that credential access is implicit and not declared. Overall requested privileges are minimal and appropriate for the task, but be aware of the implicit model-download behavior and any credentials present in your environment.
Persistence & Privilege
The agent metadata sets always: false and allow_implicit_invocation: false. The skill does not request permanent system presence, nor does it modify other skills or system-wide configs. It only runs local scripts and writes output files specified by the user.
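How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install audio-srt-workflow
  3. After installation, invoke the skill by name or trigger it with /audio-srt-workflow.
  4. Provide the required inputs according to the skill's parameter description to get structured output.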
Version history
v0.1.2
Security hardening: require explicit invocation, pin faster-whisper version, and add dependency-review step before install.
v0.1.1
Make package self-contained: include runtime scripts and requirements in skill bundle; update instructions to use packaged script paths.
v0.1.0
Initial release of audio-srt-workflow skill.

  • Generate or align SRT subtitles from audio, with or without a reference transcript.
  • Supports SRT timing cleanup, quality checks, and subtitle preview video rendering.
  • Offers CLI, GUI, and Python API run styles.
  • Requires Python 3.10+, ffmpeg, and faster-whisper.
  • Includes workflows for audio with text (alignment) and audio only (auto-generation).
Metadata
Slug: audio-srt-workflow
Version: 0.1.2
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 3
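FAQ

What is Audio SRT Workflow?

Generate or align SRT subtitles from audio using this repository. Use when the user asks for subtitle generation, transcript-to-audio alignment, timing clean... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 113 total downloads so far.

How do I install Audio SRT Workflow?

Run the command /install audio-srt-workflow in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Audio SRT Workflow free?

Yes. Audio SRT Workflow is completely free and is released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Audio SRT Workflow support?

Audio SRT Workflow is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Audio SRT Workflow?

It is developed and maintained by Sariel2018 (@sariel2018). The current version is v0.1.2.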
