Description

Analyzes and understands video content using Volcengine LAS Doubao vision-language models (VLM). Multimodal AI video analysis, video comprehension, and visua...

README (SKILL.md)

LAS Video Content Understanding (`las_vlm_video`)

Name: Byted Las Vlm Video
Author: volcengine-skills

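Uses the Doubao vision model to understand and analyze video: object recognition, action analysis, scene description, summarization, and subtitle generation. Videos are automatically compressed to under 50 MB before inference.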

Design patterns

This skill mainly uses:

Tool Wrapper: wraps calls to the lasutil CLI
Pipeline: a sequential workflow running from Step 0 to Step N

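Core API and configuration

Operator ID: las_vlm_video
API: synchronous (process)
Environment variable: LAS_API_KEY (required)
Video limits: the video must be reachable from the public internet or the Volcengine intranet and be no larger than 1 GiB; audio content is not yet analyzed.

See references/api.md for detailed parameters and interface definitions.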
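
As an illustration of the 1 GiB limit, a local file could be screened before any upload is attempted. This is a minimal sketch: `check_size_limit` and `MAX_VIDEO_BYTES` are hypothetical names that are not part of lasutil, and it assumes the documented limit applies to the file as stored on disk.

```shell
# Sketch: reject local files above the documented 1 GiB limit before uploading.
# check_size_limit and MAX_VIDEO_BYTES are hypothetical names, not part of lasutil.
MAX_VIDEO_BYTES=1073741824  # 1 GiB = 1024^3 bytes

check_size_limit() (
  file=$1
  [ -f "$file" ] || { echo "not a file: $file" >&2; exit 2; }
  # wc -c gives the byte count portably (stat flags differ between GNU and BSD)
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$MAX_VIDEO_BYTES" ]; then
    echo "file is $size bytes, above the $MAX_VIDEO_BYTES byte limit" >&2
    exit 1
  fi
  echo "$size"
)
```

On success the function prints the size in bytes; it returns non-zero when the file is missing or too large.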

Gotchas

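Cost cannot be estimated precisely: billing is per token, and the token count after video compression varies, so the billing model must be explained clearly to the user.
Key security: if the chat box masks the key, have the user create env.sh in the current directory containing export LAS_API_KEY="..."; the SDK reads it automatically.
Disclaimer: the final reply must include: "All charges quoted here are estimates and may differ from the actual cost; the actual cost is whatever appears on the Volcengine bill after the run. For billing details, see Volcengine LAS pricing." Never use the words "actual cost" to describe an estimated price.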
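
The env.sh convention above could be handled along these lines. This is only a sketch: `load_las_env` is a hypothetical helper (the source says the SDK reads env.sh by itself), and tightening the file mode to 600 is an added precaution, not something the skill is documented to do.

```shell
# Sketch: load LAS_API_KEY from env.sh only when it is not already set.
# load_las_env is a hypothetical helper, not part of the LAS SDK.
load_las_env() {
  env_file=${1:-./env.sh}
  if [ -n "${LAS_API_KEY:-}" ]; then
    return 0
  fi
  if [ -f "$env_file" ]; then
    chmod 600 "$env_file"   # assumption: keep the key file private to the owner
    . "$env_file"
  fi
  # Succeeds only if the key is now available
  [ -n "${LAS_API_KEY:-}" ]
}
```

Calling `load_las_env` with no argument looks for ./env.sh; a non-zero return means the key still has to be requested from the user.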

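Workflow (follow the steps strictly)

Copy this checklist and track progress:

Progress:
- [ ] Step 0: Pre-flight checks
- [ ] Step 1: Initialization and preparation
- [ ] Step 2: Price estimate
- [ ] Step 3: Run/submit the task
- [ ] Step 4: Present results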

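Step 0: Pre-flight checks (⚠️ must be completed in the first turn of the conversation)

After accepting the user's task, do not start executing immediately. First run the following environment checks:

Check LAS_API_KEY and LAS_REGION: confirm they are set in the environment or in .env.
- If either is missing, ask the user for it immediately (hint: LAS_REGION is commonly cn-beijing).
- Note: LAS_REGION must exactly match the region of your API key and TOS bucket. If the user switches region midway, remind them that the TOS bucket must change too; otherwise permission errors or upload failures will follow.
Check the input path:
- If the user wants to process a local file, upload it to TOS through the File API first (this needs only LAS_API_KEY, no extra TOS credentials).
- If the operator's output is stored on TOS and the user needs to download it locally, VOLCENGINE_ACCESS_KEY and VOLCENGINE_SECRET_KEY are required. TOS credentials are no longer required when only input files are uploaded.
Only after everything checks out may you move on to the next step.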
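
The checks in Step 0 can be sketched as a small shell routine. Both `classify_input` and `preflight` are hypothetical helper names, and treating tos://, http:// and https:// prefixes as "remote" is an assumption about how inputs are told apart.

```shell
# classify_input: hypothetical helper that labels an input as remote, local, or missing.
classify_input() {
  case $1 in
    tos://*|http://*|https://*) echo remote ;;
    *) if [ -f "$1" ]; then echo local; else echo missing; fi ;;
  esac
}

# preflight: hypothetical helper; reports what is missing and returns non-zero,
# otherwise prints the input kind so the caller knows whether an upload is needed.
preflight() (
  status=0
  for var in LAS_API_KEY LAS_REGION; do
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing $var" >&2
      status=1
    fi
  done
  kind=$(classify_input "$1")
  if [ "$kind" = missing ]; then
    echo "input not found: $1" >&2
    status=1
  fi
  [ "$status" -eq 0 ] && echo "$kind"
  exit "$status"
)
```

A result of `local` means the file still has to go through the File API upload; `remote` means it can be passed on directly.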

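Step 1: Initialization and preparation

Environment initialization (required for the agent):

# Run the unified environment init/update script (creates/activates the virtualenv and checks for updates)
source "$(dirname "$0")/scripts/env_init.sh" las_vlm_video
workdir=$LAS_WORKDIR

If a network problem makes the update fail, the script skips the check and continues with the locally installed SDK.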

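When processing a local file, check the format and duration locally first, report the estimate, and upload only after the user confirms:

# Check the video format up front (avoids parameter errors)
./scripts/check_format.sh <local_path>
# Get the duration locally with ffprobe (tokens can be estimated without uploading)
duration_sec=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <local_path>)

Estimate the token count from the duration, wait for the user's confirmation, then upload:

# After the user confirms, upload to TOS
lasutil file-upload <local_path>

A successful upload returns JSON; pass its tos_uri field (format tos://bucket/key) to the operator as the input path.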
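
Before handing the returned URI to the operator, its tos://bucket/key shape could be sanity-checked. The sketch below uses only POSIX parameter expansion; `split_tos_uri` is a hypothetical helper, not a lasutil command.

```shell
# split_tos_uri: hypothetical helper; prints "<bucket> <key>" for tos://bucket/key
# and fails for anything that does not have both a bucket and a key.
split_tos_uri() (
  uri=$1
  case $uri in
    tos://?*/?*) ;;
    *) echo "not a tos://bucket/key URI: $uri" >&2; exit 1 ;;
  esac
  rest=${uri#tos://}      # strip the scheme
  bucket=${rest%%/*}      # everything before the first slash
  key=${rest#*/}          # everything after the first slash
  printf '%s %s\n' "$bucket" "$key"
)
```

For example, `split_tos_uri tos://my-bucket/videos/a.mp4` prints `my-bucket videos/a.mp4`.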

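Step 2: Price estimate (⚠️ user confirmation required)

This skill is billed per token, so the cost cannot be estimated precisely before submission. Share the unit price table with the user and let them decide whether to continue.

Read references/prices.md for the current billing rates (or use the basic reference below).
Prefer getting the video duration locally to help with the estimate (avoids unnecessary uploads):
```
# Get it locally with ffprobe
duration_sec=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <local_path>)
```
After estimating the token count from the duration, tell the user that access over the tos:// intranet is cheaper. Continue only after the user confirms. Note: the estimate is for reference only; the Volcengine bill is authoritative. For billing details, see Volcengine LAS pricing.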
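
The duration-based estimate could look roughly like the sketch below. It assumes, purely for illustration, that token usage grows linearly with duration; the source gives no formula. `estimate_cost` is a hypothetical helper, and both rates are caller-supplied placeholders that should come from references/prices.md.

```shell
# estimate_cost: hypothetical helper. Both rates are placeholders supplied by the
# caller; real values belong in references/prices.md, not in this sketch.
#   $1 duration in seconds
#   $2 assumed tokens per second of video
#   $3 assumed price per 1000 tokens
estimate_cost() {
  awk -v d="$1" -v tps="$2" -v price="$3" 'BEGIN {
    tokens = d * tps
    printf "estimated_tokens=%d estimated_cost=%.4f\n", tokens, tokens * price / 1000
  }'
}
```

Called as `estimate_cost "$duration_sec" <tokens_per_sec> <price_per_1k>`, it prints a single line that can be shown to the user together with the billing disclaimer.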

Step 3: Run video understanding (Process)

Build a basic data.json (see references/api.md for detailed parameters and interface definitions):

{
  "messages": [
    {"role": "user", "content": [
      {"type": "video_url", "video_url": {"url": "<presigned_url>"}},
      {"type": "text", "text": "Analyze the video content and output a list of key points"}
    ]}
  ],
  "model_name": "doubao-seed-1.6-vision"
}
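
The request body above could be generated from shell variables instead of being edited by hand. In this sketch `json_escape` and `build_data_json` are hypothetical helpers; the escaping covers only backslashes and double quotes, so prompts containing newlines or control characters would need a real JSON tool.

```shell
# json_escape: escapes backslashes and double quotes (newlines are not handled).
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# build_data_json: hypothetical helper that prints the Step 3 request body.
#   $1 video URL, $2 prompt text, $3 optional model name
build_data_json() {
  url=$(json_escape "$1")
  prompt=$(json_escape "$2")
  model=${3:-doubao-seed-1.6-vision}
  cat <<EOF
{
  "messages": [
    {"role": "user", "content": [
      {"type": "video_url", "video_url": {"url": "$url"}},
      {"type": "text", "text": "$prompt"}
    ]}
  ],
  "model_name": "$model"
}
EOF
}
```

Redirecting the output, for example `build_data_json "$video_url" "$prompt" > "$workdir/data.json"`, produces the file that the process step reads.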

Run the command:

data=$(cat "$workdir/data.json")
lasutil process las_vlm_video "$data" > "$workdir/result.json"

Output parsing: the model's text response is at result.data.vlm_result.choices[0].message.content.

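Step 4: Present results

Process the results:

# Save the results locally
mkdir -p "./output/{task_id}"
cp "$workdir/result.json" "./output/{task_id}/result.json"
# Extract the model's text response
jq -r '.data.vlm_result.choices[0].message.content' "./output/{task_id}/result.json" > "./output/{task_id}/summary.txt"
jq '.data.events' "./output/{task_id}/result.json" > "./output/{task_id}/events.json"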

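Upload the result files (optional):

# Upload the summary and the event list
lasutil file-upload "./output/{task_id}/summary.txt"
lasutil file-upload "./output/{task_id}/events.json"

Show the user:

Video summary
Key event list (time, description)
Local file path: ./output/{task_id}/
Signed download link (if the upload succeeded)
Billing disclaimer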
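
One way to make sure the billing disclaimer is never dropped is to assemble the final reply through a single function and verify it afterwards. This is a sketch only: `render_reply` and `has_disclaimer` are hypothetical helpers, and `DISCLAIMER` holds an English rendering of the notice required in the Gotchas section.

```shell
# English rendering of the required billing notice (see Gotchas).
DISCLAIMER='All charges quoted here are estimates and may differ from the actual cost; the actual cost is whatever appears on the Volcengine bill after the run. For billing details, see Volcengine LAS pricing.'

# render_reply: hypothetical helper. $1 summary text, $2 output directory.
render_reply() {
  printf 'Video summary:\n%s\n\n' "$1"
  printf 'Local files: %s\n\n' "$2"
  printf '%s\n' "$DISCLAIMER"
}

# has_disclaimer: succeeds only when the reply contains the notice verbatim.
has_disclaimer() {
  case $1 in
    *"$DISCLAIMER"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Running `has_disclaimer "$reply"` before sending gives a cheap self-check for the last item in the list above.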

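Review criteria

After execution, the agent should self-check:

Are the environment variables configured correctly?
Was the input file uploaded successfully?
Were the results presented to the user correctly?
Is the billing disclaimer included?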

Usage Guidance

This skill appears to implement Volcengine LAS video analysis, but there are several inconsistencies and potentially risky runtime actions you should consider before installing:

- Metadata vs runtime mismatch: The registry claims no required env vars, yet SKILL.md and scripts require LAS_API_KEY and LAS_REGION (and may need VOLCENGINE_ACCESS_KEY/VOLCENGINE_SECRET_KEY for certain downloads). Ask the publisher to correct the manifest so requirements are explicit.
- Remote code install: The env_init.sh script downloads a manifest and pip-installs a wheel from a custom TOS URL. Treat this like installing third-party code: verify the URL/hosting is legitimate (official Volcengine domain), prefer a signed/official release, and review the wheel contents if possible. If you cannot verify, run the skill only in an isolated environment (sandbox/container) rather than on a production host.
- Secrets handling: The skill instructs creating/sourcing local env files containing API keys. Do not paste keys into chat. Prefer short-lived keys or scoped credentials, and store them in a secure location. Confirm the skill will not transmit secrets to unexpected endpoints.
- Operational impact: The scripts may create a local virtualenv (.las_venv) and write files under /tmp or the project root. If that is undesirable, run in a disposable workspace.

What would reduce risk: require the publisher to (1) update registry metadata to list required env vars and binaries (ffprobe, lasutil), (2) provide a trusted, signed install mechanism or an option to use an existing SDK rather than auto-downloading, and (3) document precisely what endpoints the SDK contacts and how secrets are used. If you cannot obtain those assurances, treat this skill as untrusted and run it only in isolation.

Capability Analysis

Type: OpenClaw Skill
Name: byted-las-vlm-video
Version: 1.0.1

The skill bundle contains a high-risk initialization pattern in `scripts/env_init.sh`, which downloads and installs a Python wheel file directly from a remote URL (volces.com) using `pip install`. This represents a significant supply chain risk. Additionally, `SKILL.md` instructs the agent to encourage users to store sensitive API keys in a local `env.sh` file if the UI masks them, which is a poor security practice for secret management. While these behaviors are plausibly intended for the stated purpose of integrating with Volcengine LAS, they constitute high-risk capabilities and vulnerabilities.

Capability Tags

requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The name/description (Volcengine LAS VLM video analysis) aligns with the actions described (upload video, call LAS operator, parse results). However the registry metadata claims no required environment variables or credentials while SKILL.md and scripts explicitly require LAS_API_KEY and LAS_REGION and optionally VOLCENGINE_ACCESS_KEY/VOLCENGINE_SECRET_KEY for certain flows — a direct mismatch between declared requirements and actual instructions.

⚠ Instruction Scope

SKILL.md instructs the agent to source scripts/env_init.sh, call ffprobe, lasutil, upload local files, and to read/create .env or env.sh containing LAS_API_KEY. Those actions are within the video-analysis purpose, but the instructions also cause the agent to auto-update/install an SDK and fetch a remote manifest (network operations) which go beyond purely 'read and call API' and introduce risk. The skill also tells the user to store secrets in a local env file and the scripts will source it automatically — this increases the blast radius if the environment is shared or if files are mislocated.

⚠ Install Mechanism

There is no install spec in the registry, but scripts/env_init.sh fetches a remote manifest via curl and pip-installs a wheel from a specific TOS URL (https://las-ai-cn-beijing-online.tos-cn-beijing.volces.com/...). This means the agent downloads and executes remote code at runtime (pip install of a wheel). Installing code from a custom URL, rather than from a standard public release host, and with no signature check present, is higher risk and should be explicitly declared.
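
One way a reviewer might reduce this risk is to compare a downloaded wheel against a digest obtained through a trusted channel before installing it. The sketch below is an assumption-laden illustration: nothing in the source says the publisher provides such a digest, and `sha256_of` and `verify_sha256` are hypothetical helpers, not part of env_init.sh.

```shell
# sha256_of: prints the SHA-256 hex digest of a file using whichever tool exists
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# verify_sha256: hypothetical helper. $1 is the downloaded file, $2 the expected
# digest obtained out of band. Returns non-zero on any mismatch.
verify_sha256() {
  [ "$(sha256_of "$1")" = "$2" ]
}
```

A guarded install would then read `verify_sha256 "$wheel" "$expected" && pip install "$wheel"`, with both variables supplied by the reviewer.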

⚠ Credentials

SKILL.md requires LAS_API_KEY and LAS_REGION (and conditionally VOLCENGINE_ACCESS_KEY / VOLCENGINE_SECRET_KEY) but the skill metadata lists no required env vars. The requested credentials are plausible for a Volcengine integration (LAS API key for auth, TOS keys for object download), but the mismatch between declared and actual required secrets is problematic. The skill also instructs users to place keys in local files which the scripts will source automatically — this is convenient but increases accidental exposure if the working directory is not private.

ℹ Persistence & Privilege

The skill creates local artifacts at runtime: it creates a temporary workdir, may create/activate a .las_venv virtualenv in the project root, and pip-installs the SDK into that venv. The always flag is false (no forced global presence). This behavior is typical for CLI-based skills but should be noted because it writes files and installs packages in the environment where it runs.

Version History

v1.0.1

Version 1.0.1 of byted-las-vlm-video
- Added scripts for environment initialization, format checking, output generation, and background polling to enhance workflow automation.
- Introduced evaluation and reference documentation, including a price list for more transparent user communication.
- Updated the SKILL.md with a strict, stepwise workflow emphasizing pre-execution checks, price estimation, and detailed output instructions.
- Marked the main script (skill.py) for removal, shifting from Python-centric to modular shell-script orchestration.
- Strengthened environment/configuration validation, user interaction steps, and explicit disclosure of pricing methodology.
- Improved documentation for both technical setup and user-facing best practices.

v1.0.0

Initial release of byted-las-vlm-video.
- Provides video content understanding via Doubao models.
- Supports analyzing, summarizing, and Q&A about video content using natural language prompts.
- Accepts public or intranet-accessible video URLs as input.
- Returns model-generated responses and video compression metadata.
- Requires LAS_API_KEY for authentication and supports region selection.
- Includes script for easy execution and workflow integration.

Metadata

Slug byted-las-vlm-video

Version 1.0.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Byted Las Vlm Video?

Analyzes and understands video content using Volcengine LAS Doubao vision-language models (VLM). Multimodal AI video analysis, video comprehension, and visua... It is an AI Agent Skill for Claude Code / OpenClaw, with 146 downloads so far.

How do I install Byted Las Vlm Video?

Run "/install byted-las-vlm-video" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Byted Las Vlm Video free?

Yes, Byted Las Vlm Video is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Byted Las Vlm Video support?

Byted Las Vlm Video is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Byted Las Vlm Video?

It is built and maintained by volcengine-skills (@volcengine-skills); the current version is v1.0.1.
