reed1898

Volcengine STT

by Reed · GitHub ↗ · v0.2.1
cross-platform ⚠ suspicious
506 Downloads · 2 Stars · 5 Active Installs · 3 Versions
Install in OpenClaw
/install volcengine-stt
Description
Transcribe audio to text using Volcano Engine (Volcengine/ARK) speech-to-text APIs. Use when the user wants to replace Whisper/OpenAI STT with Volcengine, tr...
Usage Guidance
Do not assume this skill uses Volcengine/ARK based on its name or SKILL.md. The bundled script actually uploads audio to openspeech.bytedance.com and expects VOLC_APP_ID / VOLC_ACCESS_TOKEN (or reads ~/.openclaw/openclaw.json). This mismatch may be accidental or intentional. Before installing:
  1. Ask the publisher which provider the skill is intended for and request corrected docs or code.
  2. If you must test it, run the script in a sandbox or isolated account with non-sensitive test audio.
  3. Don't provide production credentials until the provider/credential mapping is clarified; if you already supplied keys, consider rotating them.
  4. If you expect Volcengine/ARK, either obtain a version that actually calls the ARK endpoints or modify the script accordingly.
  5. Be aware that the script transmits local audio and may read the OpenClaw config for secrets. Only run it where you trust that destination and have reviewed the code.
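One way to support the code review recommended above is to list every hostname a script mentions before running it. The following is a minimal sketch, not part of the skill: the helper names and the `sample` excerpt are hypothetical, and a real review would read the bundled script's text from disk instead.

```python
import re
from typing import Set

# Matches the host part of http(s) URLs embedded in a script's text.
URL_HOST = re.compile(r"https?://([A-Za-z0-9.-]+)")


def hosts_in_script(text: str) -> Set[str]:
    """Return the distinct hostnames that appear in URLs inside `text`."""
    return {host.lower().rstrip(".") for host in URL_HOST.findall(text)}


def unexpected_hosts(text: str, allowed: Set[str]) -> Set[str]:
    """Return hosts found in the script that are not on the reviewer's allow-list."""
    return hosts_in_script(text) - allowed


# Hypothetical excerpt for illustration; it is not copied from the real script.
sample = 'curl -s "https://openspeech.bytedance.com/api/v3/auc/bigmodel/" -H "..."'

if __name__ == "__main__":
    print(sorted(hosts_in_script(sample)))
```

A reviewer would pass in the set of hosts they expect the skill to contact; any host left in the result deserves a closer look before credentials are supplied.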
Capability Analysis
Type: OpenClaw Skill
Name: volcengine-stt
Version: 0.2.1
The skill is designed for transcribing audio to text using Volcengine/ByteDance APIs, a legitimate function. The `SKILL.md` provides clear instructions without any prompt injection attempts. The `transcribe.sh` script uses standard tools (`curl`, `jq`, `base64`) safely, handles API keys from environment variables or `~/.openclaw/openclaw.json` securely via HTTP headers, and connects to legitimate Volcengine/ByteDance API endpoints (e.g., `https://openspeech.bytedance.com/api/v3/auc/bigmodel/`). There is no evidence of malicious intent such as data exfiltration, unauthorized execution, persistence mechanisms, or obfuscation. The script's use of `jq -n --arg` for constructing JSON payloads mitigates injection risks.
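The injection point made above rests on building JSON with a serializer rather than by string concatenation. The sketch below shows the same idea in Python with `json.dumps`; it is an illustration only, and the field names (`audio`, `format`, `data`) are placeholders rather than the provider's actual request schema.

```python
import base64
import json


def build_payload(audio_bytes: bytes, audio_format: str) -> str:
    """Serialize a request body; json.dumps escapes quotes and backslashes."""
    body = {
        # Placeholder field names, assumed for illustration only.
        "audio": {
            "format": audio_format,
            "data": base64.b64encode(audio_bytes).decode("ascii"),
        }
    }
    return json.dumps(body)


# A value crafted to break out of a hand-concatenated JSON string.
tricky = 'wav", "injected": "x'
payload = build_payload(b"\x00\x01", tricky)
parsed = json.loads(payload)

if __name__ == "__main__":
    # The tricky value survives as one string instead of adding a new key.
    print(parsed["audio"]["format"] == tricky)
```

Because the value is passed as data and escaped by the serializer, the crafted string stays a single field value and cannot add keys to the payload.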
Capability Assessment
Purpose & Capability
SKILL.md and the skill name promise Volcengine (ARK) STT and list ARK_API_KEY / ARK_BASE_URL, but the runnable script posts base64 audio to openspeech.bytedance.com endpoints and uses VOLC_APP_ID / VOLC_ACCESS_TOKEN / VOLC_RESOURCE_ID headers. This is a clear mismatch: either the README is wrong or the script implements a different provider.
Instruction Scope
The runtime script will read credentials from environment variables or from ~/.openclaw/openclaw.json (via jq), base64-encode local audio, and upload it to external endpoints (openspeech.bytedance.com). SKILL.md does not document the config-file fallback or the actual network endpoints used, so users may be unaware that their audio, along with credentials read from their local config, will be sent to ByteDance servers.
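The credential lookup described above can be pictured as a two-step resolution. This is a rough sketch under stated assumptions, not the script's code: it assumes environment variables are checked first and the config file second, and it takes the config path `skills.entries.volcengine-stt.env` from the v0.2.1 changelog entry. The function names are hypothetical.

```python
import json
from collections.abc import Mapping
from pathlib import Path
from typing import Optional

CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Read the OpenClaw config, returning {} if it is missing or invalid."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def from_config(config: Mapping, name: str) -> Optional[str]:
    """Look up skills.entries.volcengine-stt.env.<name>, or None if absent."""
    node = config
    for key in ("skills", "entries", "volcengine-stt", "env", name):
        if not isinstance(node, Mapping) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, str) and node else None


def resolve(name: str, env: Mapping, config: Mapping) -> Optional[str]:
    """Prefer the environment variable (assumed order), then the config file."""
    value = env.get(name)
    return value if value else from_config(config, name)
```

Calling `resolve("VOLC_APP_ID", os.environ, load_config())` (with `import os`) shows which source would supply a value, which helps a user see whether a secret stored in the config file would be picked up without being set explicitly.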
Install Mechanism
There is no install spec (instruction-only with an included script). No additional packages are automatically downloaded or extracted. The script requires common system tools (curl, jq, base64, uuidgen or /proc UUID) but does not perform external installs.
Credentials
SKILL.md declares ARK_API_KEY (and ARK_* env vars) as required, but the script actually requires VOLC_APP_ID and VOLC_ACCESS_TOKEN (and optionally VOLC_RESOURCE_ID or values from ~/.openclaw/openclaw.json). The skill therefore asks for credentials that don't match the code, and it also accesses a user config file path not mentioned in the docs.
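A small preflight check can make this mismatch visible before anything is uploaded. The sketch below is hypothetical and not part of the skill; it uses only the variable names mentioned in this assessment and reports which script-required variables are unset and which documented-only variables are set but would go unused.

```python
from typing import Dict, List, Mapping

# Names taken from the assessment above: what the script reads...
REQUIRED = ("VOLC_APP_ID", "VOLC_ACCESS_TOKEN")
OPTIONAL = ("VOLC_RESOURCE_ID",)
# ...versus what SKILL.md documents but the script reportedly does not use.
DOCUMENTED_ONLY = ("ARK_API_KEY", "ARK_BASE_URL")


def preflight(env: Mapping[str, str]) -> Dict[str, List[str]]:
    """Summarize which credential variables are missing or set but unused."""
    return {
        "missing": [name for name in REQUIRED if not env.get(name)],
        "optional_unset": [name for name in OPTIONAL if not env.get(name)],
        "unused": [name for name in DOCUMENTED_ONLY if env.get(name)],
    }


if __name__ == "__main__":
    import os

    print(preflight(os.environ))
```

A user who followed SKILL.md and set only `ARK_API_KEY` would see both required `VOLC_*` names under `missing` and `ARK_API_KEY` under `unused`. The check covers environment variables only and does not account for values supplied through the config-file fallback.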
Persistence & Privilege
The skill does not request permanent 'always' inclusion and does not modify other skills or system-wide settings. Its only elevated access is reading a local OpenClaw config fallback file (~/.openclaw/openclaw.json) to obtain credentials.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install volcengine-stt
  3. After installation, invoke the skill by name or use /volcengine-stt
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.2.1
Fix config fallback to prioritize skills.entries.volcengine-stt.env.VOLC_*; keep standard API flow stable
v0.2.0
Switch default to Volcengine standard AUC submit/query mode; add flash mode option; config fallback for appId/accessToken/resourceId
v0.1.0
Initial release: reusable Volcengine/ARK speech-to-text skill for OpenClaw agents
Metadata
Slug volcengine-stt
Version 0.2.1
License
All-time Installs 5
Active Installs 5
Total Versions 3
Frequently Asked Questions

What is Volcengine STT?

Transcribe audio to text using Volcano Engine (Volcengine/ARK) speech-to-text APIs. Use when the user wants to replace Whisper/OpenAI STT with Volcengine, tr... It is an AI Agent Skill for Claude Code / OpenClaw, with 506 downloads so far.

How do I install Volcengine STT?

Run "/install volcengine-stt" in the OpenClaw or Claude Code chat to install it in one step. The bundled script still needs VOLC_APP_ID and VOLC_ACCESS_TOKEN credentials at run time.

Is Volcengine STT free?

Yes, Volcengine STT is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Volcengine STT support?

Volcengine STT is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Volcengine STT?

It is built and maintained by Reed (@reed1898); the current version is v0.2.1.
