dionren

Asr Claw

by 任嘉 · GitHub ↗ · v1.1.1 · MIT-0
darwin, linux · ⚠ suspicious
299 Downloads · 0 Stars · 0 Active Installs · 3 Versions
Install in OpenClaw
/install asr-claw
Description
Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs.
README (SKILL.md)

asr-claw

Speech recognition CLI for AI agent automation. Transcribe audio streams from stdin, files, or URLs with multiple ASR engines — local and cloud.

Triggers

  • User wants to transcribe audio, speech, or voice to text
  • User needs speech recognition or ASR
  • User wants to convert audio/voice recordings to text
  • User wants to monitor live audio / livestream speech
  • User asks in Chinese about 语音识别 (speech recognition), 语音转文字 (speech to text), 转写 (transcription), or 直播语音 (livestream audio)
  • adb-claw audio capture output needs to be transcribed
  • User wants subtitles (SRT/VTT) generated from audio

Binary

The asr-claw binary is located at ${CLAUDE_PLUGIN_ROOT}/bin/asr-claw.

If it does not exist, the SessionStart hook will build or download it automatically.
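A caller that wants to confirm the binary is in place before invoking it could use a check like the following. This is a minimal sketch: the function name asr_claw_bin is hypothetical, and only the path ${CLAUDE_PLUGIN_ROOT}/bin/asr-claw comes from the text above.

```shell
# Hypothetical helper: print the expected asr-claw path if it is executable.
# Only the path layout is taken from the README; the rest is illustrative.
asr_claw_bin() {
  local bin
  if [ -z "${CLAUDE_PLUGIN_ROOT:-}" ]; then
    echo "CLAUDE_PLUGIN_ROOT is not set" >&2
    return 1
  fi
  bin="$CLAUDE_PLUGIN_ROOT/bin/asr-claw"
  if [ ! -x "$bin" ]; then
    echo "not found or not executable: $bin" >&2
    return 1
  fi
  printf '%s\n' "$bin"
}

# Usage (not executed here):
#   bin=$(asr_claw_bin) && "$bin" doctor
```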

Setup

Quick Start (Mac)

# Install the qwen-asr engine (builds C binary + downloads 0.6B model ~1.9GB)
asr-claw engines install qwen-asr

# Verify
asr-claw engines list
asr-claw doctor

OpenClaw Setup

After installing the skill via ClawHub, configure settings:

# Set default language (default: zh)
claw config set asr-claw.default_lang en

# Use a larger model
claw config set asr-claw.model Qwen/Qwen3-ASR-1.7B

# For China users — set HuggingFace mirror
claw config set asr-claw.hf_mirror https://hf-mirror.com

# Custom model path (e.g., shared NAS)
claw config set asr-claw.model_path /mnt/models/Qwen3-ASR-0.6B

# Re-run install after changing model settings
asr-claw engines install qwen-asr

Settings are stored in ~/.asr-claw/config.yaml:

default:
  engine: qwen-asr
  lang: zh
  format: json

engines:
  qwen-asr:
    binary: ~/.asr-claw/bin/qwen-asr
    model_path: ~/.asr-claw/models/Qwen3-ASR-0.6B

Cloud Engines (no local model needed)

# OpenAI Whisper API
export OPENAI_API_KEY=sk-...
asr-claw transcribe --file audio.wav --engine openai

# Volcengine Doubao (火山引擎)
export DOUBAO_API_KEY=...
asr-claw transcribe --file audio.wav --engine doubao

# Deepgram (native streaming)
export DEEPGRAM_API_KEY=...
asr-claw transcribe --file audio.wav --engine deepgram
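
Since each cloud engine depends on its own environment variable, a script can pick an engine from whichever key is exported. The sketch below assumes nothing beyond the three variable names and engine names shown above; the function name pick_cloud_engine and the preference order are arbitrary choices for illustration.

```shell
# Hypothetical helper: print the first cloud engine whose API key is set.
# Variable and engine names come from the README examples; the order
# (openai, doubao, deepgram) is an assumption, not tool behavior.
pick_cloud_engine() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo openai
  elif [ -n "${DOUBAO_API_KEY:-}" ]; then
    echo doubao
  elif [ -n "${DEEPGRAM_API_KEY:-}" ]; then
    echo deepgram
  else
    echo "no cloud API key set" >&2
    return 1
  fi
}

# Usage (not executed here):
#   engine=$(pick_cloud_engine) && asr-claw transcribe --file audio.wav --engine "$engine"
```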

Commands

transcribe — Core: audio to text

# File transcription
asr-claw transcribe --file meeting.wav --lang zh

# Pipe from stdin
cat audio.wav | asr-claw transcribe --lang zh

# Streaming (real-time, from adb-claw or ffmpeg)
adb-claw audio capture --stream --duration 60000 | asr-claw transcribe --stream --lang zh

# Subtitle output
asr-claw transcribe --file lecture.wav --format srt > lecture.srt
asr-claw transcribe --file lecture.wav --format vtt > lecture.vtt

# Specify engine
asr-claw transcribe --file audio.wav --engine whisper --lang en

Flags:

Flag              Default   Description
--file <path>     stdin     Input audio file
--stream          false     Streaming mode (real-time)
--lang <code>     zh        Language code
--engine <name>   auto      ASR engine
--format <fmt>    json      Output: json, text, srt, vtt
--chunk <sec>     0         Fixed-time chunking (disables VAD)
--rate <hz>       16000     Sample rate for raw PCM input
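
The srt and vtt formats carry the same information as JSON segments (start and end in seconds, plus text) but write times as clock timestamps. The sketch below shows that mapping for SRT under stated assumptions: srt_time and srt_cue are hypothetical helpers, not part of asr-claw, and numbering cues from 1 follows the usual SRT convention rather than anything the README states.

```shell
# Hypothetical helper: convert seconds (e.g. 2.5) to an SRT timestamp.
srt_time() {
  awk -v s="$1" 'BEGIN {
    ms = int(s * 1000 + 0.5)                 # round to whole milliseconds
    h = int(ms / 3600000); ms -= h * 3600000
    m = int(ms / 60000);   ms -= m * 60000
    sec = int(ms / 1000);  ms -= sec * 1000
    printf "%02d:%02d:%02d,%03d\n", h, m, sec, ms
  }'
}

# Hypothetical helper: print one SRT cue from a 0-based segment index,
# start seconds, end seconds, and text. SRT cues are numbered from 1.
srt_cue() {
  printf '%d\n%s --> %s\n%s\n\n' \
    "$(( $1 + 1 ))" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}

# Usage: srt_cue 0 0.0 2.5 "hello"
```

WebVTT timestamps use a period instead of a comma before the milliseconds, so the same arithmetic applies with a different separator.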

engines — Manage ASR engines

asr-claw engines list                    # List all engines + status
asr-claw engines install qwen-asr       # Install local engine (Mac)
asr-claw engines info qwen-asr          # Engine details
asr-claw engines start qwen3-asr        # Start vLLM service engine
asr-claw engines stop qwen3-asr         # Stop service engine
asr-claw engines status                  # Running engines

doctor — Environment check

asr-claw doctor    # Check platform, engines, dependencies

Engine Matrix

Engine      Type           Mac   GPU               Streaming   Install
qwen-asr    Local CLI      Yes   No (Accelerate)   VAD         engines install qwen-asr
qwen3-asr   vLLM Service   No    Yes (CUDA)        Native      engines start qwen3-asr
whisper     Local CLI      Yes   No                VAD         Manual
doubao      Cloud API      Yes   -                 No          Set DOUBAO_API_KEY
openai      Cloud API      Yes   -                 No          Set OPENAI_API_KEY
deepgram    Cloud API      Yes   -                 Native      Set DEEPGRAM_API_KEY

Output Format

All commands output a JSON envelope:

{
  "ok": true,
  "command": "transcribe",
  "data": {
    "segments": [{"index": 0, "start": 0.0, "end": 2.5, "text": "..."}],
    "full_text": "...",
    "engine": "qwen-asr",
    "audio_duration_sec": 5.5
  },
  "duration_ms": 1230,
  "timestamp": "2026-03-13T10:00:00Z"
}

Use -o text for plain text, -o quiet for silent.
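
When a script consumes the JSON envelope, it usually wants the transcript only if ok is true. The following is a minimal sketch that assumes jq is installed (the README does not require it); envelope_text is a hypothetical helper, and it relies only on the ok and data.full_text fields shown in the envelope above.

```shell
# Hypothetical helper: read a JSON envelope on stdin and print data.full_text.
# Returns non-zero when "ok" is not true, because jq -e sets a failing exit
# status if the filter produces no output.
envelope_text() {
  jq -er 'select(.ok == true) | .data.full_text'
}

# Usage (not executed here):
#   asr-claw transcribe --file meeting.wav --lang zh | envelope_text
```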

With adb-claw

# Real-time transcription from Android device
adb-claw audio capture --stream --duration 60000 | asr-claw transcribe --stream --lang zh

# Record then transcribe
adb-claw audio capture --duration 30000 --file recording.wav
asr-claw transcribe --file recording.wav --lang zh

# Save audio + transcribe simultaneously
adb-claw audio capture --stream --duration 0 | tee backup.wav | asr-claw transcribe --stream
Usage Guidance
This skill appears to be a legitimate speech-to-text tool, but before installing you should: (1) verify the binary release and checksum on the GitHub repo (prefer a pinned release rather than 'latest'), (2) be prepared for large model downloads (GBs) and possible local builds, (3) only provide cloud API keys (OpenAI/Deepgram/Doubao) if you trust the service and the plugin, (4) avoid pointing model_path to directories with sensitive files, and (5) confirm whether the platform will actually perform the auto-download/installation described in SKILL.md since the registry metadata omitted an install spec. If you need higher assurance, run the plugin in an isolated environment or review the released binary artifact contents first.
Capability Analysis
Type: OpenClaw Skill · Name: asr-claw · Version: 1.1.1
The asr-claw skill is a legitimate speech-to-text utility providing integration with local models (Qwen-ASR) and cloud APIs (OpenAI, Deepgram). The installation process uses a standard binary download from a GitHub repository (llm-net/asr-claw) with checksum verification, and the requested permissions and environment variables (API keys) are consistent with its stated purpose of audio transcription.
Capability Assessment
Purpose & Capability
The name/description (ASR CLI for transcription) aligns with the commands and engines listed. Requiring an asr-claw binary and supporting local/cloud ASR engines is expected. However, the registry metadata claims 'no install spec' while the SKILL.md includes an install block (download from GitHub releases) and auto-install behavior, which is an inconsistency worth noting.
Instruction Scope
SKILL.md instructs the agent to download/run binaries, install large local models (~1.9–3.4GB), read/write settings in ~/.asr-claw/config.yaml, and use cloud APIs. It also shows examples that rely on OPENAI_API_KEY, DEEPGRAM_API_KEY, DOUBAO_API_KEY and other env vars — but those credentials are not declared in the registry metadata. The instructions allow pointing model_path at arbitrary local directories, which grants the skill access to user files in that path. These behaviors expand the runtime scope beyond what the registry summary stated.
Install Mechanism
SKILL.md specifies a download from GitHub releases (https://github.com/llm-net/asr-claw/releases/latest/download/...), dest bin/asr-claw, and a checksum_url. GitHub releases is a normal source, and a checksum is provided, but the use of the 'latest' release (rather than a pinned version) increases upgrade/attack surface. The registry itself omitted an install spec even though SKILL.md contains one — an inconsistency to verify.
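Checking a downloaded artifact against a published SHA-256 digest can be done by hand along these lines. This is a generic sketch, not the skill's own install logic: verify_sha256 is a hypothetical helper, and the file name and digest in the usage comment are placeholders to be taken from the actual release.

```shell
# Hypothetical helper: succeed only if FILE's SHA-256 matches EXPECTED (hex).
# Uses sha256sum (GNU coreutils) when present, else shasum -a 256 (macOS).
verify_sha256() {
  local file=$1 expected=$2 actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  [ "$actual" = "$expected" ]
}

# Usage (placeholders, not real values):
#   verify_sha256 ./asr-claw "<digest from the release checksum file>" \
#     || echo "checksum mismatch" >&2
```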
Credentials
Cloud-engine examples require provider API keys (OpenAI, Deepgram, Doubao) which are reasonable for cloud transcription, but the registry lists no required env vars. The SKILL.md therefore expects undeclared credentials. Also the settings allow specifying arbitrary model_path and binary_path locations, which can expose local filesystem contents if misused. The requested access is plausible but not properly declared.
Persistence & Privilege
The skill is not marked 'always:true' and does not request system-wide privileges. It may create files under its own plugin directory and ~/.asr-claw (configs and models) as part of normal operation, which is expected for a local ASR tool.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install asr-claw
  3. After installation, invoke the skill by name or use /asr-claw
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.1.1
Add GitHub Pages deployment with custom domain (asr-claw.llm.net)
v1.1.0
Fix ClawHub install: use kind:download with SHA256 checksums instead of bash script. Add homepage and requires.bins metadata.
v1.0.0
Official release with pre-built qwen-asr binaries, a bilingual website, and zero build dependencies for engine install.
Metadata
Slug asr-claw
Version 1.1.1
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 3
Frequently Asked Questions

What is Asr Claw?

Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs. It is an AI Agent Skill for Claude Code / OpenClaw, with 299 downloads so far.

How do I install Asr Claw?

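Run "/install asr-claw" in the OpenClaw or Claude Code chat to install the skill in one step. Local engines are installed separately (for example, asr-claw engines install qwen-asr), and cloud engines need the corresponding API key.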

Is Asr Claw free?

Yes, Asr Claw is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Asr Claw support?

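Asr Claw is listed for darwin and linux and runs wherever OpenClaw / Claude Code is available on those platforms. Engine support varies: per the engine matrix, qwen3-asr needs a CUDA GPU and does not run on Mac.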

Who created Asr Claw?

It is built and maintained by 任嘉 (@dionren); the current version is v1.1.1.
