
Byted Las Asr Pro

Author: volcengine-skills · GitHub · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
149 total downloads · 0 favorites · 0 current installs · 2 versions
Install in OpenClaw
/install byted-las-asr-pro
Description
ASR / STT / speech recognition / voice recognition engine powered by Volcengine LAS. Transcribes and converts speech to text from audio and video files — ext...
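Usage (SKILL.md)

LAS Speech Recognition (las_asr_pro)

Transcribes audio and video to text, with optional speaker diarization, emotion and gender recognition, and automatic multilingual detection.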

Design patterns

This skill mainly uses:

  • Tool Wrapper: wraps calls to the lasutil CLI
  • Pipeline: a sequential workflow running from Step 0 to Step N

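Core API and configuration

  • Operator ID: las_asr_pro
  • API: asynchronous (submit, then poll)
  • Environment variable: LAS_API_KEY (required)

For detailed parameters and interface definitions, see references/api.md.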

Gotchas

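  • Format pitfalls: audio.format must be a container format (wav/mp3/m4a), not a codec name. When unsure, confirm with ffprobe.
  • Key safety: if the chat box masks the key, have the user create env.sh in the current directory containing export LAS_API_KEY="..."; the SDK reads it automatically.
  • Disclaimer: the final reply must include: "All charges quoted here are estimates and may differ from the actual charges; the actual charges are those on the bill Volcengine generates after the run. For billing details, see Volcengine LAS pricing." Never use the words "actual cost" to describe an estimated price.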
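
The container-format gotcha above can be checked mechanically. The following is a minimal sketch, not part of the skill: the helper name container_from_format_name is hypothetical, and mapping ffprobe's MP4-family demuxer name to m4a is an assumption made for illustration.

```shell
# Hypothetical helper: map ffprobe's format_name to an audio.format value.
# Only wav/mp3/m4a are handled because those are the containers named above.
container_from_format_name() {
  case "$1" in
    wav) echo "wav" ;;
    mp3) echo "mp3" ;;
    # ffprobe reports the MP4 family as one comma-separated demuxer name
    # (assumption: any name containing "m4a" is treated as m4a).
    *m4a*) echo "m4a" ;;
    *) echo "unsupported container: $1" >&2; return 1 ;;
  esac
}

# Usage (requires ffprobe; <local_path> is a placeholder):
#   name=$(ffprobe -v error -show_entries format=format_name \
#            -of default=noprint_wrappers=1:nokey=1 <local_path>)
#   container_from_format_name "$name"
```

A codec name such as aac is rejected on purpose: an AAC stream usually lives inside an m4a container, and the container is what audio.format expects.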

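Workflow (follow the steps strictly in order)

Copy this checklist and track progress:

Progress:
- [ ] Step 0: Pre-flight checks
- [ ] Step 1: Initialization and preparation
- [ ] Step 2: Price estimate
- [ ] Step 3: Submit the task
- [ ] Step 4: Asynchronous polling
- [ ] Step 5: Present results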

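Step 0: Pre-flight checks (⚠️ must be completed in the first conversation turn)

After accepting the user's task, do not start executing right away. First run the following environment checks:

  1. Check LAS_API_KEY and LAS_REGION: confirm they are set as environment variables or in .env.
    • If either is missing, ask the user for it immediately (hint: LAS_REGION is commonly cn-beijing).
    • Note: LAS_REGION must match the region of the API key and of the TOS bucket exactly. If the user switches regions midway, remind them that the TOS bucket must change accordingly; otherwise permission errors or upload failures will follow.
  2. Check the input path:
    • If the user wants a local file processed, upload it to TOS through the File API first (this needs only LAS_API_KEY, no extra TOS credentials).
    • If the operator's output is stored on TOS and the user needs it downloaded locally, VOLCENGINE_ACCESS_KEY and VOLCENGINE_SECRET_KEY are required. When only input files are uploaded, TOS credentials are no longer required.
  3. Only after everything checks out, move on to the next step.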
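
The variable check in item 1 can be sketched as a small shell function. This is an illustration only: the function name missing_las_vars and its output format are hypothetical, and it inspects only the current shell environment, not a .env file.

```shell
# Hypothetical pre-flight check: report which required variables are unset.
# LAS_API_KEY and LAS_REGION are the names used in Step 0.
missing_las_vars() {
  missing=""
  [ -n "${LAS_API_KEY:-}" ] || missing="$missing LAS_API_KEY"
  [ -n "${LAS_REGION:-}" ] || missing="$missing LAS_REGION"
  # Print the missing names (space-separated) and fail if any are missing.
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Usage: missing_las_vars || echo "ask the user for the values listed above"
```

Returning a non-zero status lets the caller stop before Step 1 instead of discovering the missing key at submit time.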

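Step 1: Initialization and preparation

Environment initialization (required for the agent)

# Run the unified environment init/update script (creates or activates the virtual environment and checks for updates)
source "$(dirname "$0")/scripts/env_init.sh" las_asr_pro
workdir=$LAS_WORKDIR

If the update fails because of network problems, the script skips the check and continues with the locally installed SDK.

  • For local files: check the format and duration locally, estimate the price, and upload only after the user confirms:
    # Check the container format up front (avoids parameter errors)
    ./scripts/check_format.sh <local_path>
    # Get the duration locally with ffprobe (lets you estimate the price without uploading)
    ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <local_path>
    
    Compute the estimated price, wait for the user's confirmation, then upload:
    # After the user confirms, upload to TOS
    lasutil file-upload <local_path>
    
    A successful upload returns JSON; pass its presigned_url (an HTTPS presigned download link valid for 24 hours) to the operator as the input URL.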

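Step 2: Price estimate (⚠️ user confirmation required)

  1. Read references/prices.md for the latest billing rates.
  2. Prefer getting the duration locally (avoids an unnecessary upload):
    # Get the duration locally with ffprobe
    duration_sec=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <local_path>)
    
    If ffprobe fails, fall back to fetching the duration remotely with lasutil:
    lasutil media-duration <presigned_url>
    
  3. Compute the total from the duration and the model's unit price, tell the user both the unit price and the estimated total, then pause execution and wait explicitly for the user's reply. Until the user clearly answers with an approval such as "continue" or "confirm", never proceed to the next step (running or submitting the task). Reminder: the estimate is for reference only, and the Volcengine bill is authoritative. For billing details, see Volcengine LAS pricing.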
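
The arithmetic in item 3 is a single multiplication. The sketch below assumes a per-hour unit price, which is a guess made for illustration; use whatever unit references/prices.md actually states. The helper name estimate_price is hypothetical.

```shell
# Hypothetical helper: estimated cost = duration in hours x unit price per hour.
# The per-hour unit is an assumption; check references/prices.md for the real unit.
estimate_price() {
  # $1 = duration in seconds (as printed by ffprobe), $2 = price per hour
  awk -v sec="$1" -v price="$2" 'BEGIN { printf "%.4f\n", sec / 3600 * price }'
}

# Usage: half an hour of audio at a unit price of 2 per hour.
estimate_price 1800 2
```

awk is used because plain shell arithmetic is integer-only, while ffprobe prints fractional seconds.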

Step 3: Submit the task

Build a basic data.json (required fields only; add others as needed):

{
  "audio": {"url": "<presigned_url>", "format": "wav"},
  "request": {"model_name": "bigmodel"}
}

data=$(cat "$workdir/data.json")
lasutil submit las_asr_pro "$data" > "$workdir/submit.json"

Record the returned metadata.task_id.
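
Reading the task ID back out of the saved response might look like the sketch below. It assumes jq is installed and that submit.json has the shape {"metadata": {"task_id": "..."}}, inferred from the field name above; the helper name task_id_from_submit is hypothetical.

```shell
# Hypothetical helper: read metadata.task_id from a saved submit response.
# Assumes the response is JSON shaped like {"metadata": {"task_id": "..."}}.
task_id_from_submit() {
  # -r prints the raw string; -e makes jq exit non-zero when the field
  # is null or missing, so a malformed response is not silently accepted.
  jq -er '.metadata.task_id' "$1"
}

# Usage: task_id=$(task_id_from_submit "$workdir/submit.json")
```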

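Step 4: Asynchronous polling

⚠️ Constraints on asynchronous tasks and background polling

  • If your environment supports background tasks or long-running asynchronous work: use that capability (for example, start a background polling job) and proactively return the result to the user once the task finishes.
  • If your environment does not support long-running background tasks (for example, an ordinary single-turn sandbox) and a direct sleep loop would cause a timeout crash: never wait in an endless loop in code! Instead, output the task ID to the user immediately, end the current turn, and tell the user: "The task has been submitted; please ask me about progress later."

Use the optimized background polling script (dynamic intervals plus automatic result extraction):

mkdir -p "./output/{task_id}"
./scripts/poll_background.sh {task_id} "./output/{task_id}" & disown

Script features:

  • Dynamic intervals: 30s for the first 5 polls, 60s for polls 5-10, 120s after 10 polls, which cuts unnecessary polling
  • Automatic extraction: on completion, generates transcript.txt / utterances.json / utterances.csv
  • Logging: the full polling history is saved to poll.log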
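
The dynamic-interval schedule can be written as a small function. This is a sketch of the idea, not the code in poll_background.sh, which is not shown here; the function name poll_interval is hypothetical, and the exact boundaries (attempts 1-5, 6-10, 11 and later) are an assumption because the description leaves them ambiguous.

```shell
# Hypothetical helper: seconds to wait before the given poll attempt (1-based).
# Boundary handling is an assumption: attempts 1-5 -> 30s, 6-10 -> 60s, 11+ -> 120s.
poll_interval() {
  if [ "$1" -le 5 ]; then
    echo 30
  elif [ "$1" -le 10 ]; then
    echo 60
  else
    echo 120
  fi
}

# Usage: sleep "$(poll_interval "$attempt")"
```

Backing off this way keeps short jobs responsive while long jobs generate fewer requests.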

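Manual query example:

lasutil poll las_asr_pro {task_id} > "./output/{task_id}/result.json"
  • COMPLETED → results have been extracted automatically and saved to ./output/{task_id}/
  • RUNNING/PENDING → keep waiting for the background poller
  • FAILED → return the error.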
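
The three status branches above map naturally onto a case statement. The sketch below is illustrative only: the function name next_action and the action labels are hypothetical, and how the status string is read out of result.json is left out because the response layout is not given here.

```shell
# Hypothetical helper: translate a task status into the next action.
# The status names come from the list above; the action labels are made up.
next_action() {
  case "$1" in
    COMPLETED) echo "present-results" ;;
    RUNNING|PENDING) echo "keep-waiting" ;;
    FAILED) echo "report-error" ;;
    # Anything else is treated as an error rather than guessed at.
    *) echo "unknown-status"; return 1 ;;
  esac
}

# Usage: action=$(next_action "$status")
```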

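Step 5: Present results

Process the results (the background poller has already done the extraction):

# Generate the result-display markdown automatically (includes the mandatory billing disclaimer)
./scripts/generate_result.md.sh {task_id} "./output/{task_id}" <estimated_price>

The generated content includes:

  • Task information card
  • Recognition statistics (duration / language / character count)
  • Text preview (first 500 characters)
  • The billing disclaimer, included automatically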

Output file layout (already generated by poll_background.sh):

./output/{task_id}/
├── result.json       # full API response
├── transcript.txt    # full recognized text
├── utterances.json   # raw per-utterance data (if enabled)
└── utterances.csv    # per-utterance speaker CSV (if enabled)

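Upload result files (optional):

# Upload the text files so the user can download them
lasutil file-upload "./output/{task_id}/transcript.txt"
lasutil file-upload "./output/{task_id}/utterances.csv"

Show the user:

  1. Use the generated markdown template
  2. Show the first 500 characters of the transcript
  3. Provide the local file paths
  4. Provide the signed download links (if the upload succeeded)
  5. The billing disclaimer is already included in the template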

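Review criteria

After execution, the agent should self-check:

  1. Whether the environment variables are configured correctly
  2. Whether the input file was uploaded successfully
  3. Whether the output was presented to the user correctly
  4. Whether the billing disclaimer is included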
Security recommendations
This skill appears to implement a Volcengine LAS transcribe pipeline, but there are several red flags you should address before installing or running it:
  • Metadata mismatch: The registry lists no required env vars or binaries, but SKILL.md and scripts require LAS_API_KEY, LAS_REGION and rely on lasutil, ffprobe, jq, python3, pip, etc. Treat the metadata as incomplete until corrected.
  • Runtime code fetch: The env_init.sh script fetches a manifest and pip-installs a wheel from a remote URL at runtime. That downloads and runs third-party code on your system; only proceed if you trust the exact host and can verify the wheel (checksums/signature).
  • Secrets exposure: The scripts source a project-level .env file; do not keep unrelated secrets in that .env. The skill may also request additional VOLCENGINE_* credentials for downloading results; provide them only when absolutely necessary and preferably in a scoped/testing account.
  • Persistence: The skill creates a virtualenv and can install/upgrade packages and spawn background pollers. Run it first in a disposable or isolated environment (container, VM) to observe behavior.
Concrete steps before use:
  1. Ask the publisher for provenance: where does the manifest/wheel come from, and can they provide a signed release or checksum? Does this skill have an official homepage or vendor contact?
  2. Request that registry metadata be corrected to list required env vars and binaries.
  3. Inspect the remote manifest and wheel before allowing env_init.sh to install them; prefer manual installation from a vetted source.
  4. Run the skill in an isolated environment (container) and avoid putting other secrets in project .env.
  5. If you must supply production credentials, consider using least-privilege or temporary keys and monitoring billing/usage tightly.
If the publisher can supply a verifiable release (GitHub release or signed package), and the metadata is corrected to list env/binary requirements, the assessment could move toward benign; without that, treat the skill as suspicious.
Functional analysis
Type: OpenClaw Skill
Name: byted-las-asr-pro
Version: 1.0.1
The skill bundle contains a high-risk initialization script `scripts/env_init.sh` that downloads and installs a Python wheel file and a manifest directly from a remote URL (volces.com). While the domain is associated with the legitimate Volcengine service described in the documentation, fetching and executing remote artifacts during environment setup is a significant supply chain risk. No clear evidence of intentional malice or data exfiltration was found, but the automated remote installation of code makes this bundle suspicious.
Capability tags
requires-sensitive-credentials
Capability assessment
Purpose & Capability
The SKILL.md and scripts clearly expect Volcengine credentials (LAS_API_KEY, LAS_REGION) and optional TOS credentials (VOLCENGINE_ACCESS_KEY, VOLCENGINE_SECRET_KEY) as part of normal operation, and call out use of lasutil/ffprobe/jq/python3, but the registry metadata declared no required env vars and no required binaries. That mismatch between declared requirements and what the skill actually needs is incoherent and could mislead users about what secrets/tools are necessary.
Instruction Scope
Runtime instructions direct the agent to: source scripts/env_init.sh (which fetches a remote manifest and may install/update a Python wheel), read local .env files, create/activate a virtualenv in the project, call lasutil/ffprobe, upload local files to TOS, and optionally spawn a background poller. The instructions read files from the project root (./.env) and can cause network calls to vendor-hosted endpoints; they also instruct the agent to auto-upload user files to TOS. These actions go beyond simple 'call an API' guidance and include network fetch and install steps plus access to local environment files.
Install Mechanism
There is no static install spec in the registry, but scripts/env_init.sh fetches a remote manifest via curl and unconditionally pip-installs a wheel from https://las-ai-cn-beijing-online.tos-cn-beijing.volces.com/... — a runtime download-and-install of an archive from an external URL. Dynamic installation of code from an external host (extract/install) is higher-risk and should be declared explicitly and verifiable (signatures, known host).
Credentials
The skill legitimately needs LAS_API_KEY and LAS_REGION for Volcengine API access and may need TOS access keys if results are to be pulled back — these are appropriate for the described purpose. However the registry metadata fails to declare these env vars, and the scripts source a project-level .env (potentially exposing any other secrets present). That omission plus the ability to read project .env makes the requested environment access disproportionate to what the registry advertises and increases risk of inadvertent secret exposure.
Persistence & Privilege
The skill will create/activate a virtualenv (.las_venv) in the project root and may pip install or upgrade SDK packages on first run, and the poller can be disowned to run in background. While always:false (no forced global install), these behaviors modify the local environment and persist artifacts (venv, temporary LAS_WORKDIR, output files). Auto-updating/installing packages and spawning background processes are meaningful privileges and should be made explicit to users.
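How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install byted-las-asr-pro
  3. Once installed, invoke the skill by name or trigger it with /byted-las-asr-pro
  4. Provide the inputs described in the skill's parameter notes to get structured output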
Version history
v1.0.1
byted-las-asr-pro 1.0.1
  • Overhauled documentation with detailed step-by-step workflow, environment checks, safety notes, and explicit billing disclaimer requirements.
  • Enhanced feature list: emotion recognition, sentiment/gender detection, and robust support for batch/large-scale jobs.
  • Added new scripts for environment setup, format checking, background polling, and markdown result generation.
  • Introduced evaluation references and price info for accurate billing estimates and user transparency.
  • Removed monolithic script (`skill.py`) in favor of modular, script-driven workflow.
  • Reinforced compliance and user confirmation steps, especially for sensitive operations and cost-related actions.
v1.0.0
byted-las-asr-pro 1.0.0
  • Initial release of the audio transcription skill.
  • Supports transcription of audio files with speaker diarization, language detection, and multiple formats (wav, mp3, m4a, aac, flac).
  • Wraps LAS-ASR-PRO async API (`submit` and `poll`) into a reusable script workflow.
  • Requires environment variable LAS_API_KEY for authentication.
  • Includes CLI usage examples for submitting and polling transcription tasks.
Metadata
Slug: byted-las-asr-pro
Version: 1.0.1
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 2
FAQ

What is Byted Las Asr Pro?

An ASR / STT / speech recognition / voice recognition engine powered by Volcengine LAS that transcribes speech to text from audio and video files. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 149 total downloads to date.

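How do I install Byted Las Asr Pro?

Run the command "/install byted-las-asr-pro" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration; running the skill still requires LAS_API_KEY, as the usage notes describe.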

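Is Byted Las Asr Pro free?

Yes. The skill itself is free under the MIT-0 license and can be downloaded, installed, and used freely. Volcengine LAS usage is billed separately, as the billing disclaimer in the usage notes explains.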

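Which platforms does Byted Las Asr Pro support?

Byted Las Asr Pro is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Byted Las Asr Pro?

It is developed and maintained by volcengine-skills (@volcengine-skills); the current version is v1.0.1.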
