
Audio2srtlocal

Name: Audio2srtlocal
Author: yun520-1

Author: yun520-1 · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious


Install in OpenClaw

/install audio2srtlocal

Description

Generate and deploy the Audio2SRT project locally: an MLX Whisper audio transcription and translation web GUI for Apple Silicon Macs. Unlike audio2srt-deploy (...

Usage (SKILL.md)

audio2srt-localgen

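Generate and deploy the Audio2SRT project locally from scratch: an MLX Whisper audio transcription and translation web GUI tool.

Unlike audio2srt-deploy (which clones from Gitee), this Skill embeds all source-code templates and generates the complete project directly in the target directory at run time, so the source code is obtained without any network access. Installing dependencies and downloading models still require a network connection.

Trigger conditions

The following keywords or intents trigger this Skill:

Trigger phrase (Chinese)	Trigger phrase (English)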
本地生成audio2srt	generate Audio2SRT locally
本地搭建转录工具	create Audio2SRT project
生成音频转录项目	local deploy mlx whisper
不用克隆部署audio2srt	setup whisper tool without git
从零搭建音频转录	build whisper GUI from scratch

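Choosing between audio2srt-deploy and audio2srt-localgen:

The user mentions "克隆" (clone), "下载" (download), or "Gitee" → use audio2srt-deploy
The user mentions "本地生成" (generate locally), "从零搭建" (build from scratch), "不用克隆" (no cloning), or "离线部署" (offline deployment) → use this Skill (audio2srt-localgen)
The user only says "部署 audio2srt" (deploy audio2srt) and no project exists yet → default to this Skill (local generation is more reliable)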
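
The routing rules above can be sketched as a small keyword matcher. This is only an illustration under stated assumptions: the function name `choose_skill` and the `None` return value for undecided cases are hypothetical, and the real matching is done by the agent reading the request, not by code like this.

```python
# Keywords copied from the routing rules; "gitee" is lowercased for matching.
DEPLOY_HINTS = ("克隆", "下载", "gitee")
LOCALGEN_HINTS = ("本地生成", "从零搭建", "不用克隆", "离线部署")


def choose_skill(request, has_existing_project=False):
    """Return the skill name suggested by the routing rules, or None if undecided."""
    text = request.lower()
    # Check local-generation hints first: "不用克隆" contains "克隆",
    # so testing the deploy hints first would misroute that request.
    if any(hint in text for hint in LOCALGEN_HINTS):
        return "audio2srt-localgen"
    if any(hint in text for hint in DEPLOY_HINTS):
        return "audio2srt-deploy"
    # A bare deploy request with no existing project defaults to local generation.
    if not has_existing_project:
        return "audio2srt-localgen"
    return None  # assumption: the rules do not cover this case, so ask the user
```

The order of the two checks matters because one local-generation phrase contains a deploy keyword as a substring.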

Prerequisites

macOS 14.0+ on Apple Silicon (M1/M2/M3/M4)
Python 3.10+ installed
Node.js 18+ installed
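
A minimal sketch of how these prerequisites could be checked before generating anything. The function names are hypothetical and not part of the Skill; the checks take plain values so they can be exercised on any machine.

```python
def parse_version(text):
    """Turn '14.5' or 'v18.19.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.lstrip("v").split(".") if part.isdigit())


def check_prerequisites(system, machine, macos_version, python_version, node_version):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # platform.machine() reports "arm64" on Apple Silicon
    # (a Python running under Rosetta reports "x86_64" instead).
    if system != "Darwin" or machine != "arm64":
        problems.append("requires macOS on Apple Silicon")
    elif parse_version(macos_version) < (14, 0):
        problems.append("requires macOS 14.0+")
    if tuple(python_version[:2]) < (3, 10):
        problems.append("requires Python 3.10+")
    if node_version is None or parse_version(node_version) < (18,):
        problems.append("requires Node.js 18+")
    return problems
```

On a real machine the arguments could come from `platform.system()`, `platform.machine()`, `platform.mac_ver()[0]`, `sys.version_info`, and the output of `node --version` (or `None` when `shutil.which("node")` finds nothing).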

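Workflow

Step 1: Confirm the target directory

Ask the user which directory the project should be generated in. The default is ~/audio2srt.

If the directory already exists and is not empty, ask whether to overwrite it (clear and rebuild) or choose a different directory.

Important: do not generate files directly in the user's working directory or on the Desktop; use ~/audio2srt or a path the user explicitly specifies.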
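
The decision in Step 1 can be sketched as follows. The function name and the status strings are hypothetical, and treating "directly in the working directory or on the Desktop" as "the target equals one of those directories" is an interpretation, not something the Skill spells out.

```python
from pathlib import Path


def classify_target(target, cwd, desktop):
    """Classify a candidate target directory according to Step 1.

    Returns one of: "discouraged", "create", "not-a-directory",
    "ask-overwrite", "use".
    """
    target = Path(target).expanduser().resolve()
    # Generating straight into the working directory or the Desktop is discouraged.
    if target in (Path(cwd).resolve(), Path(desktop).expanduser().resolve()):
        return "discouraged"
    if not target.exists():
        return "create"
    if not target.is_dir():
        return "not-a-directory"
    # A non-empty directory requires asking the user before clearing it.
    if any(target.iterdir()):
        return "ask-overwrite"
    return "use"
```

For the default location this would be called as `classify_target("~/audio2srt", Path.cwd(), "~/Desktop")`.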

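Step 2: Generate the project files

Use the Write tool to generate every file, one at a time, following the directory structure below. All file contents come from the template files in this Skill's references directory.

Target directory structure: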

TARGET_DIR/
├── server/
│   ├── transcribe_server.py   # Python backend (MLX Whisper + mlx-lm)
│   └── requirements.txt       # Python dependencies
├── src/
│   ├── components/
│   │   ├── DropZone.tsx       # Drag-and-drop file upload
│   │   ├── FileCard.tsx       # Task card
│   │   ├── FileList.tsx       # File list
│   │   ├── Header.tsx         # Top bar
│   │   ├── ResultPanel.tsx    # Transcription result panel
│   │   ├── SrtTranslatePage.tsx # SRT translation page
│   │   ├── StatsBar.tsx       # Statistics panel
│   │   └── TranscriptionSettings.tsx # Transcription settings
│   ├── store/
│   │   └── queueStore.ts      # Zustand state management
│   ├── types/
│   │   └── index.ts           # TypeScript type definitions
│   ├── utils/
│   │   └── helpers.ts         # Utility functions and API client
│   ├── App.tsx                # Main application component
│   ├── main.tsx               # Entry point
│   └── index.css              # Global styles
├── models/                    # Model directory (gitignored, downloaded at run time)
├── start.sh                   # One-step startup script
├── package.json               # Node.js dependencies
├── index.html                 # HTML entry point
├── vite.config.ts             # Vite configuration
├── tsconfig.json              # TypeScript configuration
├── tsconfig.node.json         # TypeScript Node configuration
├── tailwind.config.js         # Tailwind CSS configuration
├── postcss.config.js          # PostCSS configuration
├── .gitignore                 # Git ignore rules
└── LICENSE                    # MIT license

Generation order (by dependency):

Configuration files: package.json, tsconfig.json, tsconfig.node.json, vite.config.ts, tailwind.config.js, postcss.config.js, index.html, .gitignore, LICENSE
Backend files: server/requirements.txt, server/transcribe_server.py
Frontend types and utilities: src/types/index.ts, src/utils/helpers.ts, src/index.css
Frontend components: src/store/queueStore.ts, src/components/*.tsx, src/App.tsx, src/main.tsx
Startup script: start.sh

Key requirements:

Read the template files under references/ and write each one to its target path with the Write tool
After writing start.sh, run chmod +x on it to make it executable
Make sure the models/ directory exists (mkdir -p TARGET_DIR/models)
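
For readers who want to see the three key requirements in script form, here is a rough equivalent. It is a sketch under a strong assumption: that the references/ directory mirrors the target layout file for file, which the Skill does not state. The Skill itself writes files with the agent's Write tool in the order listed, not with a script like this, and `generate_project` is a hypothetical name.

```python
import shutil
import stat
from pathlib import Path


def generate_project(references, target):
    """Copy every template under references/ into target and finish the setup.

    Assumes references/ mirrors the target layout (hypothetical).
    Returns the list of files written.
    """
    references, target = Path(references), Path(target)
    written = []
    for template in sorted(p for p in references.rglob("*") if p.is_file()):
        destination = target / template.relative_to(references)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(template, destination)
        written.append(destination)
    # Equivalent of: mkdir -p TARGET_DIR/models
    (target / "models").mkdir(parents=True, exist_ok=True)
    # Equivalent of: chmod +x start.sh
    start = target / "start.sh"
    if start.exists():
        start.chmod(start.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return written
```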

Step 3: Install Python dependencies

cd TARGET_DIR
pip3 install -r server/requirements.txt

If pip fails:

Try pip3 install --user -r server/requirements.txt
Or suggest a virtual environment: python3 -m venv venv && source venv/bin/activate && pip install -r server/requirements.txt
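
The first two attempts can be chained so that the fallback only runs when the plain install fails. This is a sketch, not the Skill's own logic; `install_with_fallback` and its `run` parameter are hypothetical names, and the virtual-environment route is left out because it needs an activated shell.

```python
import subprocess

REQUIREMENTS = "server/requirements.txt"

# Attempts in the order given in Step 3: plain install first, then --user.
ATTEMPTS = [
    ["pip3", "install", "-r", REQUIREMENTS],
    ["pip3", "install", "--user", "-r", REQUIREMENTS],
]


def install_with_fallback(target_dir, run=subprocess.run):
    """Run each attempt inside target_dir; return the command that succeeded, or None."""
    for command in ATTEMPTS:
        result = run(command, cwd=target_dir)
        if result.returncode == 0:
            return command
    return None  # both attempts failed: suggest a virtual environment instead
```

Passing `run` as a parameter keeps the function easy to exercise without touching the real Python environment.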

Step 4: Install Node.js dependencies

cd TARGET_DIR
npm install

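Step 5: Start the services

cd TARGET_DIR
./start.sh

start.sh automatically:

Checks whether models/whisper-large-v3-turbo and models/Qwen2.5-3B-Instruct-4bit exist
Downloads any missing models from ModelScope (about 5 to 10 minutes on first run, roughly 4 GB or more in total)
Starts the Python backend (port 8765)
Starts the Vite frontend (port 3000)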
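
The model check that start.sh performs can be approximated in Python as below. This is an illustrative sketch, not the contents of start.sh; the mapping uses the ModelScope IDs and local paths listed in this document, and `missing_models` is a hypothetical name.

```python
from pathlib import Path

# ModelScope ID -> local directory, relative to the project root.
MODEL_DIRS = {
    "mlx-community/whisper-large-v3-turbo-4bit": "models/whisper-large-v3-turbo",
    "mlx-community/Qwen2.5-3B-Instruct-4bit": "models/Qwen2.5-3B-Instruct-4bit",
}


def missing_models(project_dir):
    """Return the ModelScope IDs whose local directory does not exist yet."""
    project = Path(project_dir)
    return [
        model_id
        for model_id, relative in MODEL_DIRS.items()
        if not (project / relative).is_dir()
    ]
```

An empty result means both model directories are present and no download should be needed.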

Step 6: Open the application

Once startup succeeds, open http://localhost:3000 in a browser.

Model sources

Model	ModelScope ID	Local path
Whisper (transcription)	`mlx-community/whisper-large-v3-turbo-4bit`	`models/whisper-large-v3-turbo`
Qwen (translation)	`mlx-community/Qwen2.5-3B-Instruct-4bit`	`models/Qwen2.5-3B-Instruct-4bit`

The models total roughly 4 GB or more; the first download takes about 5 to 10 minutes.

Service ports

Service	Port	URL
Frontend (Vite + React)	3000	http://localhost:3000
Backend (Python aiohttp)	8765	http://localhost:8765

Troubleshooting

Model download fails

Make sure the ModelScope CLI is available:

pip3 install modelscope
python3 -m modelscope.cli.download --model mlx-community/whisper-large-v3-turbo-4bit --local_dir models/whisper-large-v3-turbo
python3 -m modelscope.cli.download --model mlx-community/Qwen2.5-3B-Instruct-4bit --local_dir models/Qwen2.5-3B-Instruct-4bit
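
The same download commands can be built programmatically, which helps when only one model needs to be fetched again. The sketch mirrors the command lines above and assumes nothing else about the ModelScope CLI; `download_command` and `download` are hypothetical helper names.

```python
import subprocess


def download_command(model_id, local_dir, python="python3"):
    """Build the argument list for one ModelScope download, as shown above."""
    return [
        python, "-m", "modelscope.cli.download",
        "--model", model_id,
        "--local_dir", local_dir,
    ]


def download(model_id, local_dir, project_dir="."):
    """Run the download from the project root; raises CalledProcessError on failure."""
    subprocess.run(download_command(model_id, local_dir), cwd=project_dir, check=True)
```

For example, `download("mlx-community/whisper-large-v3-turbo-4bit", "models/whisper-large-v3-turbo")` re-fetches only the Whisper model.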

Port already in use

lsof -ti:3000 | xargs kill -9
lsof -ti:8765 | xargs kill -9
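
Because kill -9 terminates whatever process holds the port, it can be worth confirming first that the port is actually occupied. A minimal sketch using only the standard library; `port_in_use` is a hypothetical helper, not part of the generated project.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(3000)` and `port_in_use(8765)` before start.sh shows whether the frontend or backend port is already taken.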

M4A/MP3 files are not recognized

The tool converts them automatically with macOS afconvert. Make sure the Xcode Command Line Tools are installed.
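
To test a conversion by hand, an afconvert invocation can be built like this. The output format chosen here (16-bit little-endian PCM WAV, 16 kHz, mono) is an assumption for illustration; the document does not say which parameters the generated backend passes to afconvert, and both function names are hypothetical.

```python
import shutil
from pathlib import Path


def afconvert_available():
    """Return True if the afconvert binary is on PATH."""
    return shutil.which("afconvert") is not None


def afconvert_command(source, output_dir):
    """Build an afconvert command that writes <stem>.wav into output_dir.

    Assumed target format: WAVE container, 16-bit little-endian PCM
    at 16 kHz, one channel.
    """
    source = Path(source)
    output = Path(output_dir) / (source.stem + ".wav")
    return [
        "afconvert",
        "-f", "WAVE",         # output file format
        "-d", "LEI16@16000",  # data format and sample rate
        "-c", "1",            # number of channels
        str(source),
        str(output),
    ]
```

If `afconvert_available()` returns False, the conversion step cannot run, which would explain M4A or MP3 inputs being rejected.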

npm install fails

Delete node_modules and package-lock.json, then retry:

rm -rf node_modules package-lock.json
npm install
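
The cleanup half of this fix, written so that it only ever touches the two named entries inside the project directory. This is a sketch with a hypothetical function name; running `npm install` afterwards is left to the shell.

```python
import shutil
from pathlib import Path


def clean_node_artifacts(project_dir):
    """Remove node_modules/ and package-lock.json; return the names removed."""
    project = Path(project_dir)
    removed = []
    node_modules = project / "node_modules"
    if node_modules.is_dir():
        shutil.rmtree(node_modules)
        removed.append("node_modules")
    lock_file = project / "package-lock.json"
    if lock_file.is_file():
        lock_file.unlink()
        removed.append("package-lock.json")
    return removed
```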

Security recommendations

Install only if you are comfortable with a local app that writes a project directory, installs dependencies, downloads several GB of models, and runs web services. Run it on a trusted machine and network, prefer a fresh empty target directory, and change the backend to bind to localhost only or otherwise restrict access before using it with private audio.
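
One way to follow the advice about binding to localhost is to start the aiohttp application on a loopback address only. The sketch below is an assumption about how that could look, not the generated server's actual code: `run_backend` and `is_loopback` are hypothetical names, while `web.run_app(app, host=..., port=...)` is aiohttp's standard entry point.

```python
import ipaddress

BACKEND_PORT = 8765  # backend port documented for this project


def is_loopback(host):
    """Return True when host refers only to the local machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # not an IP literal; treat unknown names as non-local


def run_backend(app, host="127.0.0.1"):
    """Start an aiohttp application, refusing non-loopback bind addresses."""
    if not is_loopback(host):
        raise ValueError(f"refusing to bind backend to {host!r}")
    # Imported lazily so is_loopback can be used without aiohttp installed.
    from aiohttp import web
    web.run_app(app, host=host, port=BACKEND_PORT)
```

With this guard, a configuration that still says 0.0.0.0 fails loudly instead of silently exposing the service on every interface.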

Capability assessment

ℹ Purpose & Capability

The core behavior is coherent for a local Audio2SRT generator: it writes a project from embedded templates, installs Python/Node dependencies, downloads ML models, and runs a web UI for transcription and translation.

⚠ Instruction Scope

The skill discloses ports and setup steps, but the generated Python backend binds to 0.0.0.0 with permissive CORS and accepts file_path requests, which is broader network/API exposure than the localhost-focused instructions describe.
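
A common mitigation for an endpoint that accepts a file_path is to resolve the path and reject anything outside an allowed directory. The following is a hedged sketch of that idea; `resolve_allowed` and the notion of a single allowed root are assumptions, since the generated backend's request handling is not shown in this document.

```python
from pathlib import Path


def resolve_allowed(file_path, allowed_root):
    """Resolve file_path and ensure it stays inside allowed_root.

    Raises PermissionError for paths that escape the root, including
    "../" traversal and unrelated absolute paths.
    """
    root = Path(allowed_root).resolve()
    candidate = Path(file_path)
    if not candidate.is_absolute():
        candidate = root / candidate
    candidate = candidate.resolve()  # collapses ".." and follows symlinks
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{file_path!r} is outside {str(root)!r}")
    return candidate
```

Resolving before comparing is the important step: a purely textual prefix check would accept paths such as `uploads/../../etc/passwd`.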

ℹ Install Mechanism

pip install, npm install, chmod +x, and ModelScope model downloads are high-impact local setup actions, but they are visible in the skill instructions and are aligned with deploying the app.

⚠ Credentials

A local transcription tool can handle sensitive audio and transcripts; exposing the backend on all interfaces without authentication or clear warning is not proportionate to the documented local-only workflow.

ℹ Persistence & Privilege

The skill creates a persistent project directory, dependencies, models, and an executable start script, but it does not add autostart, credential access, or hidden background persistence. It instructs asking before clearing a non-empty target directory.

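How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install audio2srtlocal
Once installed, invoke the Skill by name or trigger it with /audio2srtlocal
Provide the required inputs described in the Skill's parameter documentation to get structured output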

Version history

v1.0.0

- Initial release of audio2srt-localgen: a fully local Audio2SRT project generator for Apple Silicon Macs, requiring zero network access for source code.
- Generates the entire frontend (Vite + React) and backend (Python aiohttp, MLX Whisper) from embedded templates instead of cloning from remote repositories.
- Installs all necessary Python and Node.js dependencies, prepares model directories, and auto-downloads MLX models from ModelScope.
- Provides a one-click startup script that launches both backend (port 8765) and frontend (port 3000).
- Designed to be triggered by keywords/requests indicating the user wants a zero-clone, locally generated audio2srt deployment.

Metadata

Slug audio2srtlocal

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

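FAQ

What is Audio2srtlocal?

Generate and deploy the Audio2SRT project locally: an MLX Whisper audio transcription and translation web GUI for Apple Silicon Macs. Unlike audio2srt-deploy (... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 26 times to date.

How do I install Audio2srtlocal?

Run the command "/install audio2srtlocal" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Audio2srtlocal free?

Yes. Audio2srtlocal is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.

Which platforms does Audio2srtlocal support?

The Skill is listed as cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed. The project it generates, however, requires macOS 14.0+ on Apple Silicon.

Who developed Audio2srtlocal?

Developed and maintained by yun520-1 (@yun520-1); the current version is v1.0.0.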