← Back to Skills Marketplace
juan-oy

Local speech to text Qwen3-ASR w/ OpenVINO (no API key)

by juan-OY · GitHub ↗ · v1.0.3 · MIT-0
cross-platform ⚠ suspicious
202
Downloads
0
Stars
0
Active Installs
4
Versions
Install in OpenClaw
/install local-qwen3-asr-aipc
Description
Local offline ASR on Windows — no cloud, no API cost, full privacy. Qwen3-ASR 0.6B + Intel OpenVINO, GPU-accelerated inference. NETWORK: required for first-t...
README (SKILL.md)

\r \r

Local Speech Recognition (Windows · Qwen3-ASR · OpenVINO)\r

\r Model: snake7gun/Qwen3-ASR-0.6B-fp16-ov (ModelScope FP16) \r SKILL_VERSION: 'v1.0.3'\r \r

First time? Before using this skill, run these two scripts once in a terminal:\r

python setup.py          # [SYSTEM PYTHON OK] creates venv, installs deps (~5 min)\r
python download_model.py # [SYSTEM PYTHON OK] downloads the model (~2 GB, resumable)\r
```\r
Both scripts are in the skill directory alongside this SKILL.md.\r

\r ---\r \r

Agent Routing\r

\r Always use acoustic_pipeline.py as the entry point, called with VENV_PY (obtained from check_env.py output). It handles all cases:\r \r

# VENV_PY = value from check_env.py output, e.g. C:\intel_openvino\venv\Scripts\python.exe\r
\r
# Single file\r
& "\x3CVENV_PY>" "\x3Cskill_dir>\acoustic_pipeline.py" --file "\x3CFILE_PATH>" --language auto\r
\r
# Single file + save transcript\r
& "\x3CVENV_PY>" "\x3Cskill_dir>\acoustic_pipeline.py" --file "\x3CFILE_PATH>" --language auto --archive json\r
\r
# Watch folder\r
& "\x3CVENV_PY>" "\x3Cskill_dir>\acoustic_pipeline.py" --watch "\x3CDIR_PATH>" --language auto --archive both\r
\r
# Batch folder\r
& "\x3CVENV_PY>" "\x3Cskill_dir>\acoustic_pipeline.py" --batch "\x3CDIR_PATH>" --language auto --archive json\r
```\r
\r
> **Never run `acoustic_pipeline.py` with system `python`.** It imports model packages (`openvino`, `qwen_asr`) that are only installed in the venv.\r
\r
Use `transcribe.py` directly only when called internally by `acoustic_pipeline.py` — do not invoke it as a standalone entry point.\r
\r
---\r
\r
## Skill Contract (Input / Output)\r
\r
### Accepted Inputs\r
\r
Any agent should treat this skill as a local audio/video transcription skill.\r
\r
1. Single file path\r
* Audio: .wav, .mp3, .flac, .m4a, .ogg, .aac, .wma, .opus\r
* Video: .mp4, .mkv, .webm, .flv, .mov, .avi, .mts, .m2ts, .ts, .m3u8\r
2. Folder path\r
* Watch mode: continuously process new files in folder\r
* Batch mode: process existing files in folder recursively\r
3. Runtime options\r
* language: auto or explicit language name\r
* archive: none | txt | json | both\r
* archive_dir: optional output folder for transcript files\r
* auto_bootstrap: initialize ASR automatically when environment is missing\r
\r
### Output On Success\r
\r
The result should be returned as a JSON object (or equivalent dictionary) with:\r
\r
* text: transcription content\r
* language: detected or requested language\r
* source_file: original input path\r
* source_format: source extension\r
* confidence: optional confidence value (if available)\r
* archive_files: optional object containing txt/json output paths\r
\r
Example shape:\r
\r
```json\r
{\r
    "text": "...",\r
    "language": "Chinese",\r
    "source_file": "C:\\demo\\meeting.mp4",\r
    "source_format": ".mp4",\r
    "confidence": null,\r
    "archive_files": {\r
        "json": "C:\\demo\	ranscripts\\meeting_20260326_120000.json"\r
    }\r
}\r
```\r
\r
### Output On Failure\r
\r
The agent should return a short structured error summary including:\r
\r
* error: human-readable failure reason\r
* stage: bootstrap | extract_audio | transcribe | archive\r
* source_file: input path (if known)\r
* recoverable: true if retry is reasonable\r
\r
---\r
\r
## ⚠️ Agent instructions\r
\r
1. **Windows / PowerShell only.** Never use Linux commands (`ls`, `rm`, `cat`). Never use `&&` or `call`.\r
2. **Every step reads `state.json` itself** — do not pass paths between steps manually.\r
3. **Use `VENV_PY` from state.json for inference calls** — never use system python for inference. `check_env.py` and `setup.py` are the only scripts intentionally run with system python (they create or validate the venv, so they must not depend on it). `check_env.py` validates the *venv's* packages, not system Python's — `PACKAGES_MISSING` from check_env.py means the venv is incomplete, not the system installation.\r
4. `transcribe.py` is automatically deployed to `ASR_DIR` when `setup.py` runs. If it is missing, re-run `setup.py`.\r
5. **If `transcribe.py` fails at runtime**, do NOT edit it manually. Re-run `setup.py` to redeploy the original from the repository.\r
6. **Never diagnose "model issues" without running Step 1's check script first.** The model layout may use a `thinker/` subdirectory — the scripts handle this automatically.\r
7. **Do not generate helper scripts in agent workspace directories.** Use repository-managed `check_env.py` instead.\r
8. **Goal**: transcribe the audio file and send the result to the conversation.\r
\r
**Auto-recovery policy — try before asking user:**\r
⚠️ Network handling (proxy-aware)\r
\r
When running setup.py or download_model.py:\r
\r
1. Try to read system proxy settings:\r
\r
   * Environment variables: HTTP_PROXY / HTTPS_PROXY\r
   * Windows WinHTTP proxy (netsh winhttp show proxy)\r
2. If proxy is detected:\r
\r
   * Automatically apply it\r
   * Continue installation / download\r
3. If no proxy is detected:\r
\r
   * Continue normally (direct connection)\r
4. If download/setup fails:\r
\r
   * Inform the user whether a proxy was detected\r
   * If no proxy was detected, suggest configuring one\r
   * Then retry\r
IMPORTANT:\r
* Many networks (corporate / China mainland / campus) require proxy\r
* Download supports resume — safe to retry after fixing network\r
* If `STATE=MISSING`, `VENV_PY=BROKEN`, or `PACKAGES_MISSING`: automatically run `setup.py` (up to 3 attempts). Only ask user to run it manually if all 3 attempts fail. `setup.py` is idempotent — it skips any step already completed (venv, packages, repo clone, asr_engine.py, transcribe.py). Re-running a working environment is safe and fast.\r
* If `MODEL_STATUS=MISSING`: automatically run `download_model.py` (up to 3 attempts). Only ask user to run it manually if all 3 attempts fail or if a single attempt runs longer than 8 minutes without completing (likely a slow connection — download_model.py supports resume so partial progress is not lost). `download_model.py` exits immediately if the model is already complete — re-running the script on an already-downloaded model is safe.\r
* Always announce what you are doing before each attempt: "⚙️ Automatically setting up the environment (attempt N/3)…"\r
\r
\r
## Pre-flight: Verify Runtime Requirements (Required on First Use)\r
\r
> 🔍 Pre-flight: Checking Python and git...\r
\r
### Check Python Version\r
\r
```powershell\r
python --version\r
```\r
\r
**Interpretation:**\r
\r
| Output | Action |\r
|--------|--------|\r
| `Python 3.10.x` or higher | `PYTHON_OK` — set `PYTHON_EXE=python`, continue to git check |\r
| `Python 3.8 / 3.9` | Version too low; upgrade required (see below) |\r
| `'python' is not recognized...` | Python missing; install required (see below) |\r
| `Python was not found; run without arguments...` | **Windows Store alias** — run `py --version` instead (see below) |\r
\r
**If output contains "run without arguments to install from the Microsoft Store"**, the Windows Store App Execution Alias is shadowing the real Python. Do NOT ask the user to change settings, and do NOT write helper scripts. Use `where.exe` (native Windows tool, works in both PowerShell and cmd) to find the real Python:\r
\r
> ⚠️ **Do NOT write any helper .ps1 scripts. Run the command below directly in the terminal.**\r
\r
```powershell\r
where.exe python 2>$null | Where-Object { $_ -notlike "*WindowsApps*" } | Select-Object -First 1\r
```\r
\r
* A path is printed → **record this literal string as `SYSTEM_PYTHON`** (this is the system-level Python, used only to run `setup.py`, `check_env.py`, and `download_model.py`). For every command marked `[SYSTEM PYTHON]`, substitute the full literal path for `python`. Example: `python "\x3Cskill_dir>\check_env.py"` becomes `"C:\Users\intel\AppData\Local\Programs\Python\Python312\python.exe" "\x3Cskill_dir>\check_env.py"`. Do NOT use this path for inference — inference must always use `VENV_PY` (the venv Python path printed by `check_env.py`). Do NOT rely on a `$variable` across tool calls — each call is a new shell process; always embed the literal path directly.\r
* Nothing printed → Python is not installed — install it (see below).\r
**If Python is missing or outdated**, run this one-command silent installer in PowerShell (recommended, no admin required):\r
\r
**```powershell\r
$f = "$env:TEMP\\python-installer.exe"\r
Invoke-WebRequest "https://www.python.org/ftp/python/3.12.10/python-3.12.10-amd64.exe" -OutFile $f\r
Start-Process $f -ArgumentList "/quiet InstallAllUsers=0 PrependPath=1 Include_pip=1" -Wait\r
Remove-Item $f\r
```\r
\r
> `PrependPath=1` adds Python to PATH automatically; `Include_pip=1` installs pip; `InstallAllUsers=0` avoids requiring administrator privileges.\r
\r
After installation, **restart the terminal**, then run `python --version` and confirm it reports `Python 3.12.x`.\r
\r
If you prefer manual installation: download **https://www.python.org/ftp/python/3.12.10/python-3.12.10-amd64.exe** and make sure to check **"Add python.exe to PATH"** during setup.\r
\r
### Check git\r
\r
```bat\r
git --version\r
```\r
\r
**Interpretation:**\r
\r
**| Output | Action |**\r
**|------|------|**\r
| `git version 2.x.x` | ✅ `GIT_OK`, Pre-flight passed |\r
| `'git' is not recognized as an internal or external command` |  git is not installed; install is required (see below) |\r
\r
**If git is missing**, run this one-command silent installer in PowerShell:\r
\r
```powershell\r
$f = "$env:TEMP\\git-installer.exe"\r
Invoke-WebRequest "https://github.com/git-for-windows/git/releases/download/v2.49.0.windows.1/Git-2.49.0-64-bit.exe" -OutFile $f\r
Start-Process $f -ArgumentList "/VERYSILENT /NORESTART /NOCANCEL /SP- /CLOSEAPPLICATIONS /RESTARTAPPLICATIONS /COMPONENTS=icons,ext\\reg\\shellhere,assoc,assoc_sh" -Wait\r
Remove-Item $f\r
```\r
\r
After installation, **restart the terminal**, then run `git --version` to confirm.\r
\r
If you prefer manual installation: open **https://git-scm.com/download/win**, download the installer, and proceed with default options.\r
\r
> git is required for the `git+https://` dependency in `requirements_imagegen.txt`; without git, `pip install` will fail with `git: command not found`.\r
\r
**Pre-flight pass criteria**: `python --version` is >= 3.10 and `git --version` returns a valid version string.\r
\r
Status message: `✅ Python and git are ready. Starting main workflow.`\r
\r
**Pipeline — follow exactly in order, no skipping:**\r
\r
```\r
Step 0: parse request       → AUDIO_PATH, LANGUAGE, TOPIC\r
Step 1: verify environment  → run check_env.py → record VENV_PY and ASR_DIR\r
         ↳ if STATE=MISSING or VENV_BROKEN or PACKAGES_MISSING: auto-run setup.py (3 attempts)\r
         ↳ if SCRIPTS_STALE=...: auto-run setup.py to redeploy runtime scripts\r
         ↳ if MODEL_STATUS=MISSING: auto-run download_model.py (3 attempts)\r
Step 2: transcribe + send   → run acoustic_pipeline.py using VENV_PY from Step 1\r
         ↳ fallback: run transcribe.py directly using VENV_PY and ASR_DIR from Step 1\r
```\r
\r
\---\r
\r
## Step 0: parse request (LLM only — no tools)\r
\r
Extract from the user's message:\r
\r
|Field|Default|Notes|\r
|-|-|-|\r
|`AUDIO_PATH`|required|Absolute path to audio/video file (wav/mp3/flac/m4a/ogg/aac/wma/opus/mp4/mkv/webm/flv/mov/avi/mts/m2ts/ts/m3u8)|\r
|`LANGUAGE`|auto-detect|Optional: `Chinese`, `English`, `Japanese`, etc.|\r
|`TOPIC`|English snake_case from context|Used for output filename|\r
\r
If no audio file provided, ask the user before continuing.\r
\r
---\r
\r
## Step 1: verify environment and model\r
\r
> Step 1/3: checking environment and model...\r
\r
```powershell\r
# [SYSTEM PYTHON] check_env.py creates/validates the venv — must NOT use venv python here\r
python "\x3Cskill_dir>\check_env.py"\r
```\r
\r
> `check_env.py` and `setup.py` intentionally run with system Python — they create and validate the venv, so they cannot depend on it.\r
\r
**On success**: record `VENV_PY` and `ASR_DIR` from output, proceed to Step 2.\r
\r
> `check_env.py` also prints `SCRIPTS_STALE=\x3Cold>->\x3Cnew>` when the deployed `transcribe.py` is outdated.\r
> If this line appears, treat it as a mandatory **auto-update** — run `setup.py` before Step 2 (see below).\r
\r
**On failure — auto-recovery (try before asking user):**\r
\r
### If SCRIPTS_STALE=... → auto-run setup.py to redeploy runtime scripts\r
\r
`SCRIPTS_STALE` means the venv and model are both OK, but the deployed `transcribe.py` (and/or `asr_engine.py`) in `ASR_DIR` is an older version. `setup.py` is idempotent — it skips the venv, package, and model steps and only redeploys the outdated files.\r
\r
```\r
⚙️ Runtime scripts are outdated. Redeploying (attempt 1/3)...\r
```\r
\r
```powershell\r
# [SYSTEM PYTHON] — redeploys transcribe.py and asr_engine.py to ASR_DIR\r
python "\x3Cskill_dir>\setup.py"\r
```\r
After running, re-run `check_env.py` to confirm `SCRIPTS_STALE` no longer appears, then proceed to Step 2.\r
\r
### If STATE=MISSING or VENV_PY=BROKEN or PACKAGES_MISSING → auto-run setup.py\r
\r
`PACKAGES_MISSING` means the venv exists but required packages (e.g. `openvino`, `qwen_asr`) are not installed. Re-running `setup.py` re-installs only what is missing; it does not recreate the venv or re-clone the repo.\r
\r
Announce and run (up to 3 attempts):\r
\r
```\r
⚙️ Environment is not initialized. Running automatic setup (attempt 1/3)...\r
```\r
\r
```powershell\r
# [SYSTEM PYTHON] setup.py creates the venv — must NOT use venv python here\r
python "\x3Cskill_dir>\setup.py"\r
```\r
After each attempt, re-run `check_env.py` to verify. If all 3 attempts fail, show manual fallback below.\r
\r
### If MODEL_STATUS=MISSING → auto-run download_model.py\r
\r
Announce and run (up to 3 attempts, stop if a single attempt exceeds 8 minutes):\r
\r
```\r
📥 Model not found. Starting automatic download (attempt 1/3)...\r
   Estimated time: ~3 minutes at 100 Mbps, ~5 minutes at 50 Mbps\r
   Download supports resume; rerun safely after interruption.\r
```\r
\r
```powershell\r
# [SYSTEM PYTHON] download_model.py runs before venv — must NOT use venv python here\r
python "\x3Cskill_dir>\download_model.py"\r
```\r
After each attempt, re-run `check_env.py` to verify. If all 3 attempts fail, show manual fallback below.\r
\r
### Manual fallback (only if all 3 auto-attempts fail)\r
\r
Show user this message:\r
\r
```\r
⚠️ Automatic setup failed. Manual steps are required.\r
\r
Open a Windows terminal (PowerShell or Command Prompt) and run the following in order:\r
\r
1) Install environment (if not installed yet):\r
   python "\x3Cskill_dir>\setup.py"          # [SYSTEM PYTHON OK]\r
   Expected duration: about 5 minutes, fully automated.\r
\r
2) Download model (about 2 GB):\r
   python "\x3Cskill_dir>\download_model.py" # [SYSTEM PYTHON OK]\r
   Download supports resume; rerun safely after interruption.\r
   Estimated time: ~3 minutes at 100 Mbps, ~5 minutes at 50 Mbps.\r
\r
After completion, return here and resend your request.\r
```\r
\r
---\r
\r
## Step 2: transcribe and send result\r
\r
> Step 2/2: transcribing...\r
\r
**Preferred path** — run `acoustic_pipeline.py` with the **venv Python** obtained from Step 1:\r
\r
```powershell\r
# [VENV PYTHON] VENV_PY = value printed by check_env.py, e.g. C:\intel_openvino\venv\Scripts\python.exe\r
# Never use system python here — openvino/qwen_asr are only in the venv\r
& "\x3CVENV_PY>" "\x3Cskill_dir>\acoustic_pipeline.py" --file "AUDIO_PATH" --language auto --archive json\r
```\r
\r
If the user specified a language:\r
\r
```powershell\r
# [VENV PYTHON]\r
& "\x3CVENV_PY>" "\x3Cskill_dir>\acoustic_pipeline.py" --file "AUDIO_PATH" --language "LANGUAGE" --archive json\r
```\r
\r
Return the JSON result directly to the conversation and include any `archive_files` paths.\r
\r
**Fallback** — only if `acoustic_pipeline.py` exits with a non-zero code:\r
\r
```powershell\r
# [VENV PYTHON] fallback — both VENV_PY and ASR_DIR come from check_env.py (Step 1)\r
& "\x3CVENV_PY>" "\x3CASR_DIR>	ranscribe.py" --audio "AUDIO_PATH" --language "LANGUAGE"\r
```\r
\r
> Never use system python here.\r
\r
**Pass**: `$LASTEXITCODE -eq 0`. Stdout is a single line of JSON with this shape:\r
\r
```json\r
{"text": "...", "language": "Chinese", "time_elapsed": 12.3, "audio_path": "C:\\audio\\file.wav"}\r
```\r
\r
Parse with `$result = $stdout | ConvertFrom-Json`. Record `TRANSCRIPT` from `$result.text` and `LANG` from `$result.language`.\r
\r
**Fail**: if `$LASTEXITCODE -ne 0`, do not attempt to parse stdout. Show the stderr output to the user.\r
\r
Send via `message` tool:\r
\r
```\r
action: "send"  message: "✅ LANG\
\
TRANSCRIPT"\r
```\r
\r
---\r
\r
## Troubleshooting\r
\r
|Error|Fix|\r
|-|-|\r
|`Python was not found; run without arguments...`|Windows Store alias blocking `python`. Run: `where.exe python 2>$null | Where-Object { $_ -notlike "*WindowsApps*" } | Select-Object -First 1`. Record the printed literal path and substitute it for `python` in every subsequent command (do NOT use a `$variable` — each tool call is a new shell).|\r
|`STATE=MISSING`|Run `python "\x3Cskill_dir>\setup.py"`|\r
|`VENV_PY=BROKEN`|Re-run `python "\x3Cskill_dir>\setup.py"` — it will rebuild the venv|\r
|`PACKAGES_MISSING: ...`|Re-run `python "\x3Cskill_dir>\setup.py"` — re-installs missing venv packages only; skips steps already done|\r
|`MODEL_STATUS=MISSING`|Run `python "\x3Cskill_dir>\download_model.py"` — exits immediately if model is already complete|\r
|`[ERROR] Audio not found`|Verify the file path is correct and the file exists|\r
|`[ERROR] Model incomplete`|Re-run `python "\x3Cskill_dir>\download_model.py"` — supports resume|\r
|`[ERROR] state.json not found`|Re-run Step 1 (`check_env.py`)|\r
|`SCRIPTS_STALE=v1.0.1->v1.0.2`|Deployed runtime scripts are outdated. Run `python "\x3Cskill_dir>\setup.py"` — it redeploys only the changed files, skipping venv/packages/model (fast).|\r
|`RuntimeError` on GPU|Run `check_env.py` first to confirm model is READY. If model is OK, try adding `--language auto` to remove language mismatch. If still failing, re-run `setup.py` to upgrade OpenVINO and redeploy runtime files.|\r
\r
---\r
\r
## LLM API Usage\r
\r
For agents that prefer a Python import interface over shell commands, run this inside a subprocess using **VENV_PY**:\r
\r
```powershell\r
# Must be run with VENV_PY, not system python\r
& "\x3CVENV_PY>" -c "\r
import sys; sys.path.insert(0, r'\x3Cskill_dir>')\r
from acoustic_pipeline import AcousticPipeline\r
pipeline = AcousticPipeline()\r
result = pipeline.transcribe(r'C:\\meeting.mp4', language='auto', archive_mode='json')\r
print(result['text'])\r
"\r
```\r
\r
Or if your agent framework already runs inside the venv Python process:\r
\r
```python\r
from acoustic_pipeline import AcousticPipeline\r
\r
pipeline = AcousticPipeline()\r
result = pipeline.transcribe("C:\\meeting.mp4", language="auto", archive_mode="json")\r
print(result["text"])\r
print(result.get("archive_files"))\r
```\r
\r
---\r
\r
## Workspace Hygiene\r
\r
* Use repository-managed `check_env.py` — it is versioned, auditable, and repeatable.\r
* If the execution environment forces a temporary file, treat it as disposable and remove it after the command completes.\r
\r
\r
\r
Usage Guidance
This skill appears to do what it says: it will (1) create a shared directory under C:\{USERNAME}_openvino, (2) create a Python venv and pip-install packages (including OpenVINO native libs), (3) clone the Qwen3-ASR repo from GitHub, and (4) download ~2GB of model files from ModelScope. Before installing: ensure you have enough disk space, run setup.py in a terminal you control (PowerShell on Windows), and inspect setup.py/download_model.py if you have concerns. If you want extra safety, run the setup and download steps in an isolated VM or dedicated user account. Note the installer scans all drives to auto-detect previous installs and writes state.json at a top-level user directory — if you prefer a different location, modify the scripts before running. The skill does not request cloud credentials and uses only GitHub/ModelScope for downloads; if you need fully offline setup, the scripts provide manual download hints you can follow instead of allowing network access.
Capability Analysis
Type: OpenClaw Skill Name: local-qwen3-asr-aipc Version: 1.0.3 The skill provides local ASR capabilities but includes high-risk automation for environment setup. Specifically, `SKILL.md` contains instructions for the AI agent to silently download and install Python (from python.org) and Git (from github.com) using PowerShell if they are not detected. Additionally, `setup.py` dynamically generates and writes the inference engine code (`asr_engine.py`) to the filesystem and clones an external repository (`https://github.com/QwenLM/Qwen3-ASR.git`). While these behaviors are aligned with the stated purpose of providing a self-contained local ASR environment, the automated execution of system-level installers and the dynamic dropping of executable files represent a significant attack surface.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
Name/description (local Qwen3-ASR + OpenVINO, Windows-only, network only for setup) match the files and scripts: setup.py creates a venv and installs packages, download_model.py downloads the ModelScope model, transcribe.py and acoustic_pipeline.py perform local inference and audio extraction. Required actions (git clone, pip install, modelscope snapshot) are expected for this purpose.
Instruction Scope
Runtime instructions stay within the stated purpose (run setup & download_model once, then use acoustic_pipeline/transcribe for local inference). The scripts do scan all drives for a {USERNAME}_openvino/asr state.json and venv locations, and read/write state.json and deploy files into that directory — this is consistent with auto-discovery but can be invasive (broad filesystem scanning and writing to top-level user-root paths). The skill reads common proxy env vars (HTTP_PROXY/HTTPS_PROXY) for setup retries; no unexpected external endpoints other than GitHub and ModelScope are present.
Install Mechanism
No bundled binary download from unknown hosts; setup uses git clone (GitHub) and pip to install packages, and download_model.py uses modelscope.snapshot_download (ModelScope). These are standard sources for ML models and code. It will install native OpenVINO packages and other dependencies which is expected but can be large.
Credentials
The skill declares no special credentials and the code only reads the USERNAME environment variable and optional proxy variables; it does not request cloud keys or unrelated secrets. That is proportionate for creating a per-user install under C:\{username}_openvino and for making network requests to authorized model hosts.
Persistence & Privilege
The skill creates persistent artifacts: a shared venv, cloned Qwen3-ASR repo, a model directory (~2GB), a state.json in a top-level per-user folder (e.g., C:\{username}_openvino\asr\state.json), and it writes an asr_engine.py in that directory. This is expected for an on-disk local ASR install, but it does require write access to drive roots and adds long-lived files — review and choose install location if you prefer isolation.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install local-qwen3-asr-aipc
  3. After installation, invoke the skill by name or use /local-qwen3-asr-aipc
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.3
Version 1.0.3 - Bumped SKILL_VERSION from 'v1.0.2' to 'v1.0.3'. - Dependency librosa install
v1.0.2
- Removed the file `asr_api.py` and its legacy direct API. - Dropped support for `from asr_api import transcribe_audio`; now the entry point is always `acoustic_pipeline.py`, called via its `AcousticPipeline` interface. - Updated documentation and agent instructions for simplified, more robust routing: always use `acoustic_pipeline.py` via the venv python; never call `transcribe.py` or legacy APIs directly. - Clarified environment verification and auto-recovery policy. - Updated documentation to reflect these changes and ensure correct usage.
v1.0.1
local-qwen3-asr-aipc v1.0.1 - Added acoustic_pipeline.py, enabling unified entry for video/audio, batch, folder watch, and transcript archiving. - Added asr_api.py: Python function API for local transcription integration. - Added check_env.py: standardized script to verify/install environment and model presence. - Added transcribe.py: core local transcription script (deployed by setup.py). - Skill now supports direct video (mp4/mkv/webm/etc.) transcription and batch/folder workflows, with automatic environment/model bootstrapping. - Improved agent routing and instructions; prefer acoustic_pipeline.py for multi-modal/advanced use cases.
v1.0.0
- Initial release of local-qwen3-asr-aipc for Windows, providing fully offline speech-to-text transcription using Qwen3-ASR-0.6B via Intel OpenVINO. - Supports 30 languages and 22 Chinese dialects; accepts WAV, MP3, and FLAC audio files. - Prioritizes Intel iGPU (Xe / Arc) for acceleration, with automatic CPU fallback. - Simple one-time setup: run setup.py and download_model.py. - Includes auto-recovery for setup/model download and proxy detection for network handling. - All inference and transcription are performed locally; no network is needed after setup.
Metadata
Slug local-qwen3-asr-aipc
Version 1.0.3
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 4
Frequently Asked Questions

What is Local speech to text Qwen3-ASR w/ OpenVINO (no API key)?

Local offline ASR on Windows — no cloud, no API cost, full privacy. Qwen3-ASR 0.6B + Intel OpenVINO, GPU-accelerated inference. NETWORK: required for first-t... It is an AI Agent Skill for Claude Code / OpenClaw, with 202 downloads so far.

How do I install Local speech to text Qwen3-ASR w/ OpenVINO (no API key)?

Run "/install local-qwen3-asr-aipc" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Local speech to text Qwen3-ASR w/ OpenVINO (no API key) free?

Yes, Local speech to text Qwen3-ASR w/ OpenVINO (no API key) is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Local speech to text Qwen3-ASR w/ OpenVINO (no API key) support?

Local speech to text Qwen3-ASR w/ OpenVINO (no API key) is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Local speech to text Qwen3-ASR w/ OpenVINO (no API key)?

It is built and maintained by juan-OY (@juan-oy); the current version is v1.0.3.

💬 Comments