Description

Extracts audio tracks from video files and splits long audio into timed segments using Volcengine LAS. Audio extraction and separation from video — pull audi...

README (SKILL.md)

LAS Audio Extraction and Splitting (`las_audio_extract_and_split`)

Name: Byted Las Long Video Understand
Author: volcengine-skills

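Extracts the audio track from a video or audio file and splits it into multiple segments of fixed duration. Supports video formats such as mp4, wmv, avi, and mkv.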

Design Patterns

This skill mainly uses:

Tool Wrapper: wraps calls to the lasutil CLI
Pipeline: a sequential workflow running from Step 0 to Step N

Core API and Configuration

Operator ID: las_audio_extract_and_split
API: synchronous (process)
Environment variable: LAS_API_KEY (required)

See references/api.md for detailed parameters and interface definitions.

Gotchas

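Path template is required: output_path_template must contain the {index} variable; otherwise every segment is written to the same file.
Key security: if the chat box masks the key, have the user create env.sh in the current directory containing export LAS_API_KEY="..."; the SDK reads it automatically.
Disclaimer: the final reply must include: "All charges quoted here are estimates and may differ from the actual cost; the actual cost is determined by the bill Volcengine generates after the run. For billing details, see Volcengine LAS pricing." Never use the words "actual cost" to describe the estimated price.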
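
The {index} requirement can be checked locally before a job is submitted. The following is a minimal sketch, not part of the skill: `has_index_placeholder` and `expand_template` are hypothetical helper names, and the expansion only illustrates why the placeholder matters. How the LAS service actually numbers segments (starting index, zero padding) is not stated here.

```shell
# Returns 0 when the template contains the literal {index} placeholder.
has_index_placeholder() {
  case "$1" in
    *'{index}'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustration only: substitute a segment number for the first {index}.
# Call this only after has_index_placeholder has succeeded.
expand_template() {
  template=$1
  index=$2
  placeholder='{index}'
  prefix=${template%%"$placeholder"*}
  suffix=${template#*"$placeholder"}
  printf '%s%s%s\n' "$prefix" "$index" "$suffix"
}
```

Without the placeholder, every expansion yields the same path, which is the overwrite problem described above.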

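Workflow (follow the steps strictly in order)

Copy this checklist and track progress:

Progress:
- [ ] Step 0: Pre-flight checks
- [ ] Step 1: Initialization and preparation
- [ ] Step 2: Price estimation
- [ ] Step 3: Run/submit the task
- [ ] Step 4: Result presentation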

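Step 0: Pre-flight checks (⚠️ must be completed in the first conversation turn)

After accepting the user's task, do not start executing immediately. First perform the following environment checks:

Check LAS_API_KEY and LAS_REGION: confirm that they are configured in the environment or in .env.
- If not, ask the user for them immediately (hint: LAS_REGION is commonly cn-beijing).
- Note: LAS_REGION must exactly match the region of your API key and TOS bucket. If the user switches regions midway, remind them that the TOS bucket must be changed to match; otherwise permission errors or upload failures will result.
Check the input path:
- If the user wants to process a local file, it must first be uploaded to TOS through the File API (only LAS_API_KEY is needed; no extra TOS credentials).
- If the operator's output is stored on TOS and the user needs to download it locally, VOLCENGINE_ACCESS_KEY and VOLCENGINE_SECRET_KEY are required. When only the input file needs uploading, TOS credentials are not required.
Only after everything is confirmed: proceed to the next step.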
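
The environment-variable part of this check can be sketched as a small shell helper. This is an illustration under stated assumptions: `missing_env_vars` is a hypothetical name, and it inspects only the current shell environment, not a .env file.

```shell
# Print the names of any unset or empty variables among the arguments.
# Returns non-zero if at least one is missing.
missing_env_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup via eval keeps this POSIX-compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    printf 'Missing:%s\n' "$missing"
    return 1
  fi
  return 0
}
```

A call such as `missing_env_vars LAS_API_KEY LAS_REGION` would then decide whether the agent needs to ask the user for credentials before continuing.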

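Step 1: Initialization and preparation

Environment initialization (required for the agent):

# Run the unified environment init and update script (automatically creates/activates the virtual environment and checks for updates)
source "$(dirname "$0")/scripts/env_init.sh" las_audio_extract_and_split
workdir=$LAS_WORKDIR

If a network problem causes the update to fail, the script skips the check and continues with the locally installed SDK.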

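For local files: check the format and duration locally first, estimate the price, and upload only after the user confirms:

# Check the container format up front (avoids parameter errors)
./scripts/check_format.sh <local_path>
# Get the duration locally with ffprobe (the price can be estimated without uploading)
duration_sec=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <local_path>)

Calculate the estimated price and wait for user confirmation before uploading:

# After the user confirms, upload to TOS
lasutil file-upload <local_path>

On success the upload returns JSON; take its tos_uri (format tos://bucket/key) and pass it to the operator as the input path.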
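
Before handing the URI to the operator, it can be useful to confirm that it really has the tos://bucket/key shape. The sketch below uses only shell parameter expansion; `parse_tos_uri`, `tos_bucket`, and `tos_key` are hypothetical names. It does not extract tos_uri from the upload response, whose exact JSON layout is not given here.

```shell
# Split a URI of the form tos://bucket/key into bucket and key.
# On success sets tos_bucket and tos_key; on failure returns 1.
parse_tos_uri() {
  uri=$1
  case "$uri" in
    tos://?*/?*) ;;
    *) echo "not a tos://bucket/key URI: $uri" >&2; return 1 ;;
  esac
  rest=${uri#tos://}
  tos_bucket=${rest%%/*}   # everything before the first slash
  tos_key=${rest#*/}       # everything after the first slash
}
```

Because the bucket's region has to match LAS_REGION, surfacing the bucket name early gives the user a chance to spot a mismatch before the job runs.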

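Step 2: Price estimation (⚠️ user confirmation required)

Read references/prices.md for the latest billing rates.

Prefer getting the duration locally (avoids unnecessary uploads):

# Get the duration locally with ffprobe
duration_sec=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <local_path>)

If ffprobe fails, fall back to fetching the duration remotely with lasutil:

lasutil media-duration <input_url>

Calculate the total from the duration and the model's unit price, report both the unit price and the estimated total to the user, then pause execution and explicitly wait for the user's confirmation. Until the user clearly replies with an approval such as "continue" or "confirm", moving on to the next step (running/submitting the task) is strictly forbidden. Reminder: the estimate is for reference only; the Volcengine bill is authoritative. For billing details, see Volcengine LAS pricing.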
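
The arithmetic and the confirmation gate can be sketched as follows. Everything numeric is an assumption: the real unit price and billing unit come from references/prices.md, which is not reproduced here, so the sketch assumes per-minute billing and takes the rate as a parameter. `estimate_price` and `is_confirmation` are hypothetical helper names.

```shell
# estimate_price DURATION_SEC PRICE_PER_MINUTE
# Assumes per-minute billing; prints the estimate with four decimals.
estimate_price() {
  awk -v d="$1" -v p="$2" 'BEGIN { printf "%.4f\n", d / 60 * p }'
}

# is_confirmation REPLY
# Returns 0 only for an explicit approval; anything else means "keep waiting".
is_confirmation() {
  case "$1" in
    continue|confirm|yes) return 0 ;;
    *) return 1 ;;
  esac
}
```

For a 600-second file at a made-up rate of 0.5 per minute, `estimate_price 600 0.5` prints `5.0000`. The agent would show that figure together with the unit price and proceed only once `is_confirmation` succeeds on the user's reply.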

Step 3: Run the split (Process)

Build the basic data.json:

{
  "input_path": "<presigned_url>",
  "output_path_template": "tos://bucket/output/{index}.wav",
  "split_duration": 30,
  "output_format": "wav"
}

Run the command:

data=$(cat "$workdir/data.json")
lasutil process las_audio_extract_and_split "$data" > "$workdir/result.json"
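
As a sanity check on the result, the expected number of segments can be estimated from the duration and split_duration. The sketch assumes split_duration is in seconds and that a trailing remainder shorter than split_duration is emitted as its own segment; neither is confirmed by the text above. `expected_segments` is a hypothetical helper name.

```shell
# expected_segments DURATION_SEC SPLIT_DURATION_SEC
# Prints ceil(duration / split_duration); fails if split_duration <= 0.
expected_segments() {
  awk -v d="$1" -v s="$2" 'BEGIN {
    if (s <= 0) exit 1
    n = int(d / s)
    if (n * s < d) n++   # count the shorter final segment
    print n
  }'
}
```

With a 95.5-second input and split_duration of 30, this prints `4` (three full segments plus a 5.5-second remainder). A large gap between this number and the count of output paths in result.json is worth surfacing to the user.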

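Step 4: Result presentation

Use the script to generate the result display automatically (the billing disclaimer is included automatically):

./scripts/generate_result.md.sh "$workdir/result.json" <estimated_price>

The generated content includes:

Task information card
Automatically generated table of split results
Automatically included billing disclaimer ✅

Manual extraction:

total=$(jq '.data.output_paths | length' "$workdir/result.json")
echo "${total} segments in total"
jq -r '.data.output_paths[] | "  - " + .' "$workdir/result.json"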

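Review Criteria

After execution, the agent should self-check:

Are the environment variables configured correctly?
Was the input file uploaded successfully?
Was the output presented to the user correctly?
Is the billing disclaimer included?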

Usage Guidance

This skill appears to do what it says (extract/split audio) but has a few red flags you should address before running it:

1) Confirm and require the SKILL metadata to declare LAS_API_KEY and any VOLCENGINE_* keys (the SKILL.md expects them but the registry shows none).
2) Inspect the remote wheel and manifest URLs (https://las-ai-cn-beijing-online.tos-cn-beijing.volces.com/...) and ask for a checksum or official release link (prefer GitHub releases or PyPI over an opaque host).
3) Be cautious about storing keys in env.sh in working directories; prefer passing credentials via secure vaults or ephemeral prompts.
4) Note the scripts call ffprobe and lasutil; ensure those binaries are available and legitimate.
5) If you will run this in a production environment, run it first in an isolated sandbox where pip-installed remote code cannot access sensitive systems.

If the skill author can provide: (a) an explicit list of required env vars and binaries in the registry metadata, (b) a signed/checked URL or hash for the wheel, or (c) a vetted install from PyPI/GitHub, your confidence in installing it would increase.

Capability Analysis

Type: OpenClaw Skill
Name: byted-las-long-video-understand
Version: 1.0.1

The skill bundle performs remote code execution by downloading and installing a Python wheel file and a manifest from a remote URL (volces.com) within `scripts/env_init.sh`. While the domain appears to be the official Volcengine (ByteDance) cloud service, fetching and installing binaries directly from a remote bucket instead of a verified package repository is a high-risk supply chain behavior. Additionally, `SKILL.md` instructs the agent to execute this initialization script automatically, which could be leveraged if the remote endpoint is compromised.

Capability Tags

requires-sensitive-credentials

Capability Assessment

⚠ Purpose & Capability

The SKILL.md requires LAS_API_KEY (and may require VOLCENGINE_ACCESS_KEY/VOLCENGINE_SECRET_KEY when downloading results) and expects lasutil/ffprobe, but the registry metadata lists no required env vars or binaries. Asking for cloud credentials (VOLCENGINE_*) and using a TOS upload flow is consistent with a Volcengine LAS integration, but the skill manifest did not declare these requirements; this mismatch is inconsistent and surprising to an installer.

ℹ Instruction Scope

Instructions are prescriptive and stay within the stated task (check keys, optionally upload local files to TOS, estimate cost, call lasutil process, present results). However they also instruct creating/reading an env.sh in the current directory for keys and sourcing scripts that will fetch and install remote SDK code. The agent is instructed to request missing credentials from the user (expected) but also to read local env files and to run remote-updating logic—this expands the skill's runtime scope beyond simple API calls.

⚠ Install Mechanism

No formal install spec is provided, but scripts/env_init.sh will curl a remote manifest and then pip install a wheel from https://las-ai-cn-beijing-online.tos-cn-beijing.volces.com/... (non-PyPI host). That causes execution of code fetched at runtime from a third-party URL (not a well-known, reviewed release host). This is higher risk than an instruction-only skill that uses only preinstalled binaries.

⚠ Credentials

SKILL.md requires LAS_API_KEY (explicit) and may require VOLCENGINE_ACCESS_KEY/VOLCENGINE_SECRET_KEY for certain flows, yet the registry lists no required environment variables. The skill also encourages storing an API key in env.sh in the working directory. Requesting cloud API keys for uploading results is plausible for this purpose, but the omission from metadata plus instructions to create/read local env files (which may contain secrets) is disproportionate/unexpected and should be explicitly called out to users.

✓ Persistence & Privilege

always is false and the skill does not request permanent platform-wide privileges. It creates a virtualenv in project root or current directory and may install an SDK into that environment at runtime; it does not claim to modify other skills or system-wide agent settings.

Version History

v1.0.1

Audio extraction and splitting skill, initial public release.
- Extracts audio tracks from various video formats (mp4, wmv, avi, mkv, mov, flv).
- Splits long audio into fixed-duration segments with customizable length and indexed file naming.
- Implements a strict step-by-step workflow with environment and input checks, price estimation, and explicit user confirmation before processing.
- Includes clear billing information and usage disclaimers referencing Volcengine LAS pricing.
- Requires environment variables (`LAS_API_KEY`, `LAS_REGION`) and supports both local and TOS input/output handling.

Metadata

Slug byted-las-long-video-understand

Version 1.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Byted Las Long Video Understand?

Extracts audio tracks from video files and splits long audio into timed segments using Volcengine LAS. Audio extraction and separation from video — pull audi... It is an AI Agent Skill for Claude Code / OpenClaw, with 60 downloads so far.

How do I install Byted Las Long Video Understand?

Run "/install byted-las-long-video-understand" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Byted Las Long Video Understand free?

Yes, Byted Las Long Video Understand is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Byted Las Long Video Understand support?

Byted Las Long Video Understand is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Byted Las Long Video Understand?

It is built and maintained by volcengine-skills (@volcengine-skills); the current version is v1.0.1.
