Description

Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs.

README (SKILL.md)

asr-claw

Name: Asr Claw
Author: dionren

Speech recognition CLI for AI agent automation. Transcribe audio streams from stdin, files, or URLs with multiple ASR engines — local and cloud.

Triggers

User wants to transcribe audio, speech, or voice to text
User needs speech recognition or ASR
User wants to convert audio/voice recordings to text
User wants to monitor live audio / livestream speech
User asks about 语音识别、语音转文字、转写、直播语音
adb-claw audio capture output needs to be transcribed
User wants subtitles (SRT/VTT) generated from audio

Binary

The asr-claw binary is located at ${CLAUDE_PLUGIN_ROOT}/bin/asr-claw.

If it does not exist, the SessionStart hook will build or download it automatically.

Setup

Quick Start (Mac)

# Install the qwen-asr engine (builds C binary + downloads 0.6B model ~1.9GB)
asr-claw engines install qwen-asr

# Verify
asr-claw engines list
asr-claw doctor

OpenClaw Setup

After installing the skill via ClawHub, configure settings:

# Set default language (default: zh)
claw config set asr-claw.default_lang en

# Use a larger model
claw config set asr-claw.model Qwen/Qwen3-ASR-1.7B

# For China users — set HuggingFace mirror
claw config set asr-claw.hf_mirror https://hf-mirror.com

# Custom model path (e.g., shared NAS)
claw config set asr-claw.model_path /mnt/models/Qwen3-ASR-0.6B

# Re-run install after changing model settings
asr-claw engines install qwen-asr

Settings are stored in ~/.asr-claw/config.yaml:

default:
  engine: qwen-asr
  lang: zh
  format: json

engines:
  qwen-asr:
    binary: ~/.asr-claw/bin/qwen-asr
    model_path: ~/.asr-claw/models/Qwen3-ASR-0.6B

Cloud Engines (no local model needed)

# OpenAI Whisper API
export OPENAI_API_KEY=sk-...
asr-claw transcribe --file audio.wav --engine openai

# Volcengine Doubao (火山引擎)
export DOUBAO_API_KEY=...
asr-claw transcribe --file audio.wav --engine doubao

# Deepgram (native streaming)
export DEEPGRAM_API_KEY=...
asr-claw transcribe --file audio.wav --engine deepgram

Commands

transcribe — Core: audio to text

# File transcription
asr-claw transcribe --file meeting.wav --lang zh

# Pipe from stdin
cat audio.wav | asr-claw transcribe --lang zh

# Streaming (real-time, from adb-claw or ffmpeg)
adb-claw audio capture --stream --duration 60000 | asr-claw transcribe --stream --lang zh

# Subtitle output
asr-claw transcribe --file lecture.wav --format srt > lecture.srt
asr-claw transcribe --file lecture.wav --format vtt > lecture.vtt

# Specify engine
asr-claw transcribe --file audio.wav --engine whisper --lang en

Flags:

Flag	Default	Description
`--file \x3Cpath>`	stdin	Input audio file
`--stream`	false	Streaming mode (real-time)
`--lang \x3Ccode>`	zh	Language code
`--engine \x3Cname>`	auto	ASR engine
`--format \x3Cfmt>`	json	Output: json, text, srt, vtt
`--chunk \x3Csec>`	0	Fixed-time chunking (disables VAD)
`--rate \x3Chz>`	16000	Sample rate for raw PCM input

engines — Manage ASR engines

asr-claw engines list                    # List all engines + status
asr-claw engines install qwen-asr       # Install local engine (Mac)
asr-claw engines info qwen-asr          # Engine details
asr-claw engines start qwen3-asr        # Start vLLM service engine
asr-claw engines stop qwen3-asr         # Stop service engine
asr-claw engines status                  # Running engines

doctor — Environment check

asr-claw doctor    # Check platform, engines, dependencies

Engine Matrix

Engine	Type	Mac	GPU	Streaming	Install
qwen-asr	Local CLI	Yes	No (Accelerate)	VAD	`engines install qwen-asr`
qwen3-asr	vLLM Service	No	Yes (CUDA)	Native	`engines start qwen3-asr`
whisper	Local CLI	Yes	No	VAD	Manual
doubao	Cloud API	Yes	—	No	Set DOUBAO_API_KEY
openai	Cloud API	Yes	—	No	Set OPENAI_API_KEY
deepgram	Cloud API	Yes	—	Native	Set DEEPGRAM_API_KEY

Output Format

All commands output JSON envelope:

{
  "ok": true,
  "command": "transcribe",
  "data": {
    "segments": [{"index": 0, "start": 0.0, "end": 2.5, "text": "..."}],
    "full_text": "...",
    "engine": "qwen-asr",
    "audio_duration_sec": 5.5
  },
  "duration_ms": 1230,
  "timestamp": "2026-03-13T10:00:00Z"
}

Use -o text for plain text, -o quiet for silent.

With adb-claw

# Real-time transcription from Android device
adb-claw audio capture --stream --duration 60000 | asr-claw transcribe --stream --lang zh

# Record then transcribe
adb-claw audio capture --duration 30000 --file recording.wav
asr-claw transcribe --file recording.wav --lang zh

# Save audio + transcribe simultaneously
adb-claw audio capture --stream --duration 0 | tee backup.wav | asr-claw transcribe --stream

Usage Guidance

This skill appears to be a legitimate speech-to-text tool, but before installing you should: (1) verify the binary release and checksum on the GitHub repo (prefer a pinned release rather than 'latest'), (2) be prepared for large model downloads (GBs) and possible local builds, (3) only provide cloud API keys (OpenAI/Deepgram/Doubao) if you trust the service and the plugin, (4) avoid pointing model_path to directories with sensitive files, and (5) confirm whether the platform will actually perform the auto-download/installation described in SKILL.md since the registry metadata omitted an install spec. If you need higher assurance, run the plugin in an isolated environment or review the released binary artifact contents first.

Capability Analysis

Type: OpenClaw Skill Name: asr-claw Version: 1.1.1 The asr-claw skill is a legitimate speech-to-text utility providing integration with local models (Qwen-ASR) and cloud APIs (OpenAI, Deepgram). The installation process uses a standard binary download from a GitHub repository (llm-net/asr-claw) with checksum verification, and the requested permissions and environment variables (API keys) are consistent with its stated purpose of audio transcription.

Capability Assessment

ℹ Purpose & Capability

The name/description (ASR CLI for transcription) aligns with the commands and engines listed. Requiring an asr-claw binary and supporting local/cloud ASR engines is expected. However, the registry metadata claims 'no install spec' while the SKILL.md includes an install block (download from GitHub releases) and auto-install behavior, which is an inconsistency worth noting.

⚠ Instruction Scope

SKILL.md instructs the agent to download/run binaries, install large local models (~1.9–3.4GB), read/write settings in ~/.asr-claw/config.yaml, and use cloud APIs. It also shows examples that rely on OPENAI_API_KEY, DEEPGRAM_API_KEY, DOUBAO_API_KEY and other env vars — but those credentials are not declared in the registry metadata. The instructions allow pointing model_path at arbitrary local directories, which grants the skill access to user files in that path. These behaviors expand the runtime scope beyond what the registry summary stated.

ℹ Install Mechanism

SKILL.md specifies a download from GitHub releases (https://github.com/llm-net/asr-claw/releases/latest/download/...), dest bin/asr-claw, and a checksum_url. GitHub releases is a normal source, and a checksum is provided, but the use of the 'latest' release (rather than a pinned version) increases upgrade/attack surface. The registry itself omitted an install spec even though SKILL.md contains one — an inconsistency to verify.

⚠ Credentials

Cloud-engine examples require provider API keys (OpenAI, Deepgram, Doubao) which are reasonable for cloud transcription, but the registry lists no required env vars. The SKILL.md therefore expects undeclared credentials. Also the settings allow specifying arbitrary model_path and binary_path locations, which can expose local filesystem contents if misused. The requested access is plausible but not properly declared.

✓ Persistence & Privilege

The skill is not marked 'always:true' and does not request system-wide privileges. It may create files under its own plugin directory and ~/.asr-claw (configs and models) as part of normal operation, which is expected for a local ASR tool.

Version History

v1.1.1

Add GitHub Pages deployment with custom domain (asr-claw.llm.net)

v1.1.0

Fix ClawHub install: use kind:download with SHA256 checksums instead of bash script. Add homepage and requires.bins metadata.

v1.0.0

v1.0.0: Official release with pre-built qwen-asr binaries, bilingual website, zero build dependencies for engine install

Metadata

Slug asr-claw

Version 1.1.1

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 3

Frequently Asked Questions

What is Asr Claw?

Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs. It is an AI Agent Skill for Claude Code / OpenClaw, with 299 downloads so far.

How do I install Asr Claw?

Run "/install asr-claw" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Asr Claw free?

Yes, Asr Claw is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Asr Claw support?

Asr Claw is cross-platform and runs anywhere OpenClaw / Claude Code is available (darwin, linux).

Who created Asr Claw?

It is built and maintained by 任嘉 (@dionren); the current version is v1.1.1.

More Skills

Asr Claw