
Volcengine STT

Name: Volcengine STT
Author: reed1898

by Reed · GitHub ↗ · v0.2.1

cross-platform ⚠ suspicious

Downloads: 506

Install in OpenClaw

/install volcengine-stt

Description

Transcribe audio to text using Volcano Engine (Volcengine/ARK) speech-to-text APIs. Use when the user wants to replace Whisper/OpenAI STT with Volcengine, tr...

Usage Guidance

Do not assume this skill uses Volcengine/ARK based on its name or SKILL.md. The bundled script actually uploads audio to openspeech.bytedance.com and expects VOLC_APP_ID / VOLC_ACCESS_TOKEN (or reads them from ~/.openclaw/openclaw.json). This mismatch may be accidental or intentional. Before installing:

1) Ask the publisher which provider the skill is intended for and request corrected docs or code.
2) If you must test it, run the script in a sandbox or isolated account, using non-sensitive test audio.
3) Do not provide production credentials until the provider/credential mapping is clarified; if you already supplied keys, consider rotating them.
4) If you expect Volcengine/ARK, either obtain a version that actually calls the ARK endpoints or modify the script accordingly.
5) Be aware that the script transmits local audio and may read the OpenClaw config for secrets. Only run it where you trust that destination and have reviewed the code.
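One way to review the code before running it is to list the hosts and credential variables a script mentions. The following is a minimal sketch, not part of the skill: `audit_script` is a hypothetical helper, and the regular expressions only assume that credential names start with `ARK_` or `VOLC_`, as reported in this listing.

```python
import re

# Captures the host part of any http(s) URL in the text.
URL_HOST = re.compile(r"https?://([A-Za-z0-9.-]+)")
# Captures upper-case variable names with the two prefixes named in this listing.
CRED_VAR = re.compile(r"\b((?:ARK|VOLC)_[A-Z0-9_]+)\b")


def audit_script(text: str) -> dict:
    """Return the sorted, de-duplicated hosts and credential names found in text."""
    return {
        "hosts": sorted(set(URL_HOST.findall(text))),
        "credential_vars": sorted(set(CRED_VAR.findall(text))),
    }


if __name__ == "__main__":
    # Hypothetical excerpt for illustration; it is not the real transcribe.sh.
    sample = 'curl "https://openspeech.bytedance.com/api/v3/auc/bigmodel/" -H "k: $VOLC_APP_ID"'
    print(audit_script(sample))
```

A static scan like this only surfaces literal strings; it does not replace reading the script or running it in a sandbox.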

Capability Analysis

Type: OpenClaw Skill
Name: volcengine-stt
Version: 0.2.1

The skill is designed for transcribing audio to text using Volcengine/ByteDance APIs, a legitimate function. The `SKILL.md` provides clear instructions without any prompt injection attempts. The `transcribe.sh` script uses standard tools (`curl`, `jq`, `base64`) safely, handles API keys from environment variables or `~/.openclaw/openclaw.json` securely via HTTP headers, and connects to legitimate Volcengine/ByteDance API endpoints (e.g., `https://openspeech.bytedance.com/api/v3/auc/bigmodel/`). There is no evidence of malicious intent such as data exfiltration, unauthorized execution, persistence mechanisms, or obfuscation. The script's use of `jq -n --arg` for constructing JSON payloads mitigates injection risks.
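The injection point about `jq -n --arg` can be illustrated in Python, where `json.dumps` plays the same role of escaping values instead of splicing them into a string. This is a sketch of the general technique only; the field names `request_id` and `audio` are hypothetical and are not taken from the script or the API.

```python
import json


def build_payload_unsafe(audio_b64: str, request_id: str) -> str:
    # String interpolation: a quote inside request_id can add or alter JSON fields.
    return '{"request_id": "%s", "audio": "%s"}' % (request_id, audio_b64)


def build_payload_safe(audio_b64: str, request_id: str) -> str:
    # The serializer escapes every value, much as jq --arg does for shell variables.
    return json.dumps({"request_id": request_id, "audio": audio_b64})


if __name__ == "__main__":
    hostile = 'x", "admin": "true'
    print(build_payload_unsafe("QUJD", hostile))  # an extra "admin" field appears
    print(build_payload_safe("QUJD", hostile))    # the value stays a single string
```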

Capability Assessment

⚠ Purpose & Capability

SKILL.md and the skill name promise Volcengine (ARK) STT and list ARK_API_KEY / ARK_BASE_URL, but the runnable script posts base64 audio to openspeech.bytedance.com endpoints and uses VOLC_APP_ID / VOLC_ACCESS_TOKEN / VOLC_RESOURCE_ID headers. This is a clear mismatch: either the README is wrong or the script implements a different provider.

⚠ Instruction Scope

The runtime script will read credentials from environment variables or from ~/.openclaw/openclaw.json (via jq), base64-encode local audio, and upload it to external endpoints (openspeech.bytedance.com). SKILL.md does not document the config-file fallback or the actual network endpoints used, so users may be unaware their audio and local config will be transmitted to Bytedance servers.
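The undocumented fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual logic: the config key path `skills.entries.volcengine-stt.env.<NAME>` comes from the v0.2.1 release note, while the environment-first ordering and the helper names `lookup`, `resolve`, and `load_config` are assumptions.

```python
import json
import os
from collections.abc import Mapping
from pathlib import Path
from typing import Optional

CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"


def lookup(config: Mapping, name: str) -> Optional[str]:
    """Read skills.entries.volcengine-stt.env.<name> from a parsed config."""
    node = config
    for key in ("skills", "entries", "volcengine-stt", "env", name):
        if not isinstance(node, Mapping) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, str) else None


def resolve(name: str, env: Mapping, config: Mapping) -> Optional[str]:
    """Prefer a non-empty environment variable, then the config file value."""
    return env.get(name) or lookup(config, name)


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Parse the config file, or return an empty dict if it does not exist."""
    return json.loads(path.read_text()) if path.is_file() else {}


if __name__ == "__main__":
    config = load_config()
    for var in ("VOLC_APP_ID", "VOLC_ACCESS_TOKEN", "VOLC_RESOURCE_ID"):
        # Report only whether a value was found; never print the secret itself.
        print(var, "found" if resolve(var, os.environ, config) else "missing")
```

Running a check like this shows which secrets a fallback of this shape would pick up from the local machine, which is useful to know before any audio is uploaded.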

✓ Install Mechanism

There is no install spec (instruction-only with an included script). No additional packages are automatically downloaded or extracted. The script requires common system tools (curl, jq, base64, uuidgen or /proc UUID) but does not perform external installs.
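Because nothing is installed automatically, it may help to confirm the listed tools exist before invoking the skill. The sketch below assumes that "/proc UUID" refers to the Linux file `/proc/sys/kernel/random/uuid`; the function name `missing_tools` and its injectable parameters are hypothetical.

```python
import shutil
from pathlib import Path

REQUIRED_TOOLS = ("curl", "jq", "base64")
# Assumed meaning of "/proc UUID": the Linux kernel's random UUID file.
PROC_UUID = Path("/proc/sys/kernel/random/uuid")


def missing_tools(which=shutil.which, proc_uuid_exists=None) -> list:
    """List required tools that cannot be found on this system."""
    if proc_uuid_exists is None:
        proc_uuid_exists = PROC_UUID.exists()
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    # A UUID source is satisfied by either uuidgen or the /proc file.
    if which("uuidgen") is None and not proc_uuid_exists:
        missing.append("uuidgen or /proc UUID")
    return missing


if __name__ == "__main__":
    print(missing_tools() or "all required tools found")
```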

⚠ Credentials

SKILL.md declares ARK_API_KEY (and ARK_* env vars) as required, but the script actually requires VOLC_APP_ID and VOLC_ACCESS_TOKEN (and optionally VOLC_RESOURCE_ID or values from ~/.openclaw/openclaw.json). The skill therefore asks for credentials that don't match the code, and it also accesses a user config file path not mentioned in the docs.
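The mismatch between documented and required credentials can be checked against a given environment. This is a minimal sketch based only on the variable names reported in this assessment; `credential_report` is a hypothetical helper.

```python
import os

DOCUMENTED = ("ARK_API_KEY", "ARK_BASE_URL")               # named in SKILL.md
REQUIRED_BY_SCRIPT = ("VOLC_APP_ID", "VOLC_ACCESS_TOKEN")  # per this assessment


def credential_report(env) -> dict:
    """Compare the variables that are set with what the docs and script expect."""
    set_names = {key for key, value in env.items() if value}
    return {
        # Set because the docs asked for them, but not read by the script.
        "documented_but_unused": [n for n in DOCUMENTED if n in set_names],
        # Needed by the script, but absent from the environment.
        "required_but_missing": [n for n in REQUIRED_BY_SCRIPT if n not in set_names],
    }


if __name__ == "__main__":
    print(credential_report(os.environ))
```

A user who followed SKILL.md alone would typically see both lists populated: ARK keys exported for nothing, and the VOLC values still missing.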

✓ Persistence & Privilege

The skill does not request permanent 'always' inclusion and does not modify other skills or system-wide settings. Its only elevated access is reading a local OpenClaw config fallback file (~/.openclaw/openclaw.json) to obtain credentials.

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install volcengine-stt
3) After installation, invoke the skill by name or use /volcengine-stt
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v0.2.1

Fix config fallback to prioritize skills.entries.volcengine-stt.env.VOLC_*; keep standard API flow stable

v0.2.0

Switch default to Volcengine standard AUC submit/query mode; add flash mode option; config fallback for appId/accessToken/resourceId

v0.1.0

Initial release: reusable Volcengine/ARK speech-to-text skill for OpenClaw agents

Metadata

Slug volcengine-stt

Version 0.2.1

License —

All-time Installs 5

Active Installs 5

Total Versions 3

Frequently Asked Questions

What is Volcengine STT?

Transcribe audio to text using Volcano Engine (Volcengine/ARK) speech-to-text APIs. Use when the user wants to replace Whisper/OpenAI STT with Volcengine, tr... It is an AI Agent Skill for Claude Code / OpenClaw, with 506 downloads so far.

How do I install Volcengine STT?

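Run "/install volcengine-stt" in the OpenClaw or Claude Code chat to install it in one step. Installation itself needs no extra setup, but per the Capability Assessment the script requires VOLC_APP_ID and VOLC_ACCESS_TOKEN before it can transcribe.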

Is Volcengine STT free?

Yes, Volcengine STT is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Volcengine STT support?

Volcengine STT is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Volcengine STT?

It is built and maintained by Reed (@reed1898); the current version is v0.2.1.
