Aliyun Qwen Asr

Author: cinience · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 97
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install aliyun-qwen-asr
Description
Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`)....
Usage instructions (SKILL.md)

Category: provider

Model Studio Qwen ASR (Non-Realtime)

Validation

mkdir -p output/aliyun-qwen-asr
python -m py_compile skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py && echo "py_compile_ok" > output/aliyun-qwen-asr/validate.txt

Pass criteria: command exits 0 and output/aliyun-qwen-asr/validate.txt is generated.

Output And Evidence

  • Store transcripts and API responses under output/aliyun-qwen-asr/.
  • Keep one command log or sample response per run.

Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs.

Critical model names

Use one of these exact model strings:

  • qwen3-asr-flash
  • qwen3-asr-flash-2026-02-10
  • qwen-audio-asr
  • qwen3-asr-flash-filetrans
  • qwen3-asr-flash-filetrans-2025-11-17

Selection guidance:

  • Use qwen3-asr-flash, qwen3-asr-flash-2026-02-10, or qwen-audio-asr for short/normal recordings (sync).
  • Use qwen3-asr-flash-filetrans or qwen3-asr-flash-filetrans-2025-11-17 for long-file transcription (async task workflow).
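
The selection guidance above can be expressed as a small lookup. This is a minimal sketch, not part of the bundled script: the function name `is_async_model` and the two set names are hypothetical, and only the model strings come from the list above.

```python
# Model strings copied from "Critical model names"; the grouping follows
# the selection guidance (sync call vs. async task workflow).
SYNC_MODELS = {
    "qwen3-asr-flash",
    "qwen3-asr-flash-2026-02-10",
    "qwen-audio-asr",
}
ASYNC_MODELS = {
    "qwen3-asr-flash-filetrans",
    "qwen3-asr-flash-filetrans-2025-11-17",
}


def is_async_model(model: str) -> bool:
    """Return True if the model uses the async long-file workflow.

    Hypothetical helper: rejects anything that is not one of the exact
    model strings, since the API expects them verbatim.
    """
    if model in ASYNC_MODELS:
        return True
    if model in SYNC_MODELS:
        return False
    raise ValueError(f"unknown Qwen ASR model string: {model!r}")
```

Rejecting unknown strings early avoids sending a misspelled model name to the API.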

Prerequisites

  • No SDK packages are required (the script uses only the Python standard library); a virtual environment is optional but keeps runs isolated:
python3 -m venv .venv
. .venv/bin/activate
  • Set DASHSCOPE_API_KEY in environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
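
The credential lookup order described above might look like the sketch below. It is not the bundled script's code. It assumes that `~/.alibabacloud/credentials` is an INI-style file with a `[default]` section; the function name `resolve_api_key` and its `profile` parameter are hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None,
                    profile: str = "default") -> Optional[str]:
    """Return the DashScope API key, preferring the environment variable.

    Assumption: the credentials file is INI-formatted and the key is
    stored as `dashscope_api_key` inside the given profile section.
    """
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(path)  # read() silently skips files that do not exist
    if parser.has_section(profile):
        return parser.get(profile, "dashscope_api_key", fallback=None)
    return None
```

A caller would typically stop with a clear error message when this returns `None`, rather than sending an unauthenticated request.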

Normalized interface (asr.transcribe)

Request

  • audio (string, required): public URL or local file path.
  • model (string, optional): default qwen3-asr-flash.
  • language_hints (array<string>, optional): e.g. zh, en.
  • sample_rate (number, optional)
  • vocabulary_id (string, optional)
  • disfluency_removal_enabled (bool, optional)
  • timestamp_granularities (array<string>, optional): e.g. sentence.
  • async (bool, optional): default false for sync models, true for qwen3-asr-flash-filetrans.

Response

  • text (string): normalized transcript text.
  • task_id (string, optional): present for async submission.
  • status (string): SUCCEEDED or submission status.
  • raw (object): original API response.
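
As an illustration of the normalized response, the sketch below maps two raw API shapes onto the fields listed above. The raw shapes are assumptions, not taken from this document: an OpenAI-compatible body with the transcript as a string at `choices[0].message.content` for sync calls, and a DashScope body with `output.task_id` and `output.task_status` for async submissions. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TranscribeResult:
    """Normalized asr.transcribe response (fields from the list above)."""
    text: str
    status: str
    raw: Dict[str, Any]
    task_id: Optional[str] = None


def normalize_sync(raw: Dict[str, Any]) -> TranscribeResult:
    # Assumed shape: {"choices": [{"message": {"content": "<transcript>"}}]}
    choices = raw.get("choices") or [{}]
    content = (choices[0].get("message") or {}).get("content") or ""
    return TranscribeResult(text=content, status="SUCCEEDED", raw=raw)


def normalize_async_submission(raw: Dict[str, Any]) -> TranscribeResult:
    # Assumed shape: {"output": {"task_id": "...", "task_status": "PENDING"}}
    output = raw.get("output") or {}
    return TranscribeResult(
        text="",  # no transcript exists yet at submission time
        status=output.get("task_status", "UNKNOWN"),
        raw=raw,
        task_id=output.get("task_id"),
    )
```

Keeping `raw` on the result means callers can still reach provider-specific fields that the normalized view drops.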

Quick start (official HTTP API)

Sync transcription (OpenAI-compatible protocol):

curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": {
              "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
            }
          }
        ]
      }
    ],
    "stream": false,
    "asr_options": {
      "enable_itn": false
    }
  }'
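
The same sync request can be built with the Python standard library. This is a sketch that mirrors the curl payload above; the function name `build_sync_request` is hypothetical and it is not the bundled script's implementation.

```python
import json
import urllib.request

SYNC_ENDPOINT = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"


def build_sync_request(audio: str, api_key: str,
                       model: str = "qwen3-asr-flash",
                       enable_itn: bool = False) -> urllib.request.Request:
    """Build (but do not send) the sync transcription request.

    `audio` is a public URL or a data URI, placed in input_audio.data
    exactly as in the curl example.
    """
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "input_audio", "input_audio": {"data": audio}}
                ],
            }
        ],
        "stream": False,
        "asr_options": {"enable_itn": enable_itn},
    }
    return urllib.request.Request(
        SYNC_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen(request, timeout=60)` and parse the body with `json.load`. Building the request separately from sending it makes the payload easy to inspect or log.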

Async long-file transcription (DashScope protocol):

curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'X-DashScope-Async: enable' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    }
  }'

Poll task result:

curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY"
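
A bounded polling loop for the task endpoint could be sketched as follows. The names `poll_task` and `TERMINAL_STATUSES` are hypothetical. The sketch assumes the task response carries its state at `output.task_status` and that `SUCCEEDED` and `FAILED` are the terminal values; check the API reference for the full set. The HTTP call is injected as `fetch` so the loop can be tested without network access.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these are the terminal task states; others may exist.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def poll_task(fetch: Callable[[str], Dict[str, Any]],
              task_id: str,
              interval_s: float = 5.0,
              max_attempts: int = 60,
              sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch(task_id) until a terminal status or the attempt cap.

    `fetch` is expected to GET /api/v1/tasks/<task_id> and return the
    parsed JSON body.
    """
    for attempt in range(max_attempts):
        response = fetch(task_id)
        status = (response.get("output") or {}).get("task_status")
        if status in TERMINAL_STATUSES:
            return response
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait only when another attempt follows
    raise TimeoutError(
        f"task {task_id} not finished after {max_attempts} attempts"
    )
```

The defaults follow the 5-20 s interval recommended under Operational guidance, and the attempt cap acts as the retry guard.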

Local helper script

Use the bundled script for URL/local-file input and optional async polling:

python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash \
  --language-hints zh,en \
  --print-response

Long-file mode:

python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash-filetrans \
  --async \
  --wait

Operational guidance

  • For local files, use input_audio.data (data URI) when direct URL is unavailable.
  • Keep language_hints minimal to reduce recognition ambiguity.
  • For async tasks, poll every 5-20 s and cap the number of attempts.
  • Save normalized outputs under output/aliyun-qwen-asr/transcripts/.
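
For the local-file case in the first point above, a data URI can be produced with the standard library. This is a sketch: the helper name `audio_to_data_uri` is hypothetical, and the fallback MIME type is an assumption rather than something the API is known to require.

```python
import base64
import mimetypes
from pathlib import Path


def audio_to_data_uri(path: Path) -> str:
    """Encode a local audio file as a base64 data URI for input_audio.data."""
    mime, _ = mimetypes.guess_type(path.name)
    mime = mime or "application/octet-stream"  # assumed fallback type
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 inflates the payload by roughly one third, so this route suits short recordings; long files are better served by a URL and the async workflow.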

Output location

  • Default output: output/aliyun-qwen-asr/transcripts/
  • Override base dir with OUTPUT_DIR.

Workflow

  1. Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
  2. Run one minimal read-only query first to verify connectivity and permissions.
  3. Execute the target operation with explicit parameters and bounded scope.
  4. Verify results and save output/evidence files.

References

  • references/api_reference.md
  • references/sources.md
  • Realtime synthesis is provided by skills/ai/audio/aliyun-qwen-tts-realtime/.
Security recommendations
This skill mostly does what it says (transcribe audio via Alibaba Cloud Qwen ASR), but there are a few red flags you should consider before installing or running it:

  • Missing declared credential: The registry metadata does not list DASHSCOPE_API_KEY (or a primary credential), yet both SKILL.md and the script require it or a ~/.alibabacloud/credentials entry. Treat this as an omission and ensure you supply only a scoped API key.
  • .env and repo-dotenv loading: The script will load .env from the current working directory and from a repository root discovered by searching upward for a .git directory, and will populate environment variables for any key=value lines. That can inadvertently read other secrets (database passwords, other API keys). Before running, inspect any .env files in the project and your repo root, or run in an isolated environment with a controlled .env.
  • Review ~/.alibabacloud/credentials: The script will read this file to extract dashscope_api_key. If you keep multiple credentials or sensitive tokens in that file, consider creating a dedicated profile with only the ASR key.
  • Run in a disposable virtualenv/container: Use the suggested venv and run validation (the provided py_compile check) in an isolated environment first. Consider running on a machine that does not contain unrelated secrets.
  • Audit the code: The Python helper is short and straightforward; scan it yourself (or have someone you trust do so) before trusting it with private audio.
  • Operational precautions: Use a least-privilege DASHSCOPE_API_KEY, set OUTPUT_DIR to a safe location, and avoid running the script from repositories that have sensitive .env files unless you explicitly control them.

Given these inconsistencies (metadata vs runtime behavior), I mark the skill as suspicious rather than benign. If the author updates the metadata to declare required env vars and documents the .env loading behavior, and you confirm the script only reads intended files, the risk would be reduced.
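
To act on the advice to inspect .env files before running, the sketch below lists the variable names that such a loader could pick up. It follows the search described in this review (the current directory, plus the nearest ancestor containing `.git`) and is not the script's actual code. The function names are hypothetical, and only key names are reported, never values.

```python
from pathlib import Path
from typing import Dict, List, Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that has .git."""
    for candidate in [start, *start.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None


def list_dotenv_keys(start: Path) -> Dict[Path, List[str]]:
    """Map each reachable .env file to the key names it defines."""
    locations = [start]
    root = find_repo_root(start)
    if root is not None and root != start:
        locations.append(root)
    found: Dict[Path, List[str]] = {}
    for directory in locations:
        env_file = directory / ".env"
        if not env_file.is_file():
            continue
        keys = []
        for line in env_file.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines that are not key=value.
            if not line or line.startswith("#") or "=" not in line:
                continue
            keys.append(line.split("=", 1)[0].strip())
        found[env_file] = keys
    return found
```

Running `list_dotenv_keys(Path.cwd())` before invoking the helper script shows at a glance whether unrelated secrets sit in a file it would read.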
Functional analysis
Type: OpenClaw Skill
Name: aliyun-qwen-asr
Version: 1.0.0
The skill provides a legitimate integration for Alibaba Cloud's Qwen ASR (speech-to-text) service. The primary script, `scripts/transcribe_audio.py`, uses the Python standard library to interact with official Alibaba Cloud endpoints (dashscope.aliyuncs.com) and correctly handles credentials via environment variables or the standard `~/.alibabacloud/credentials` file. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found in the code or documentation.
Capability assessment
Purpose & Capability
The name, description, endpoints, and bundled script all align with a non-realtime Alibaba Cloud Qwen ASR transcription skill (sync and async flows). However, the skill metadata declares no required environment variables or primary credential even though the SKILL.md and script require a DASHSCOPE_API_KEY (or credentials file). This omission is an inconsistency.
Instruction Scope
The SKILL.md and the script instruct the agent to read/save files under output/aliyun-qwen-asr and to use DASHSCOPE_API_KEY or ~/.alibabacloud/credentials. The script additionally loads .env from the current working directory and from a repo root discovered by searching parent directories for a .git folder, and it will inject any key=value pairs into the process environment if not already present. That behavior is broader than the SKILL.md explicitly documents and could cause unrelated local secrets to be read into the environment.
Install Mechanism
This is an instruction-only skill with a Python helper script and no install spec. No external archives or installers are fetched by the skill itself, which keeps install risk low.
Credentials
Although asking for a DashScope/Alibaba API key is appropriate for the stated purpose, the skill's registry metadata does not declare DASHSCOPE_API_KEY or the local credentials path as required. The script also reads arbitrary .env files and will set environment variables from them; that expands the effective scope of secrets accessed beyond the single API key. The code also honors ALIBABA_CLOUD_PROFILE/ALICLOUD_PROFILE environment variables, which is reasonable, but again not declared in metadata.
Persistence & Privilege
always:false (no forced always-on presence). The skill does not request to modify other skills or system-wide agent configs. It writes outputs to an output/ directory (documented).
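How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the dialog: /install aliyun-qwen-asr
  3. After installation, call the skill by name or trigger it with /aliyun-qwen-asr.
  4. Provide the required inputs as described in the skill's parameters to get structured output.
Version history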
v1.0.0
Initial release of aliyun-qwen-asr for non-realtime audio transcription:

  • Supports Alibaba Cloud Qwen ASR models for transcribing audio files to text, including transcript generation with timestamps.
  • Compatible with both synchronous (short/normal audio) and asynchronous (long-file) workflows.
  • Provides a straightforward Python script and curl examples for submitting and polling transcription jobs.
  • Normalized interface for consistent request and response handling across different model modes.
  • Clear operational and validation guidance, including output storage conventions and polling recommendations.
Metadata
Slug aliyun-qwen-asr
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Number of versions 1
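FAQ

What is Aliyun Qwen Asr?

Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`).... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 97 total downloads so far.

How do I install Aliyun Qwen Asr?

Run `/install aliyun-qwen-asr` in the OpenClaw or Claude Code dialog to install it in one step. The install itself needs no extra configuration, but running the skill requires DASHSCOPE_API_KEY (see Prerequisites).

Is Aliyun Qwen Asr free?

Yes. Aliyun Qwen Asr is released under the MIT-0 license and is free to download, install, and use.

Which platforms does Aliyun Qwen Asr support?

Aliyun Qwen Asr is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Aliyun Qwen Asr?

It is developed and maintained by cinience (@cinience); the current version is v1.0.0.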
