
qwencloud-vision

Name: qwencloud-vision
Author: cuixiaoyang123

by Cuixiaoyang123 · v0.2.0 · MIT-0

cross-platform ⚠ suspicious

140 Downloads


Install in OpenClaw

/install qwencloud-vision

Description

[QwenCloud] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos...

Usage Guidance

This skill appears to be a normal QwenCloud vision client (OCR, visual reasoning, video analysis) and its bundled Python scripts use only the standard library, but there are a few points to consider before installing or running it:
- Registry metadata mismatch: the skill manifest in the registry does not list the required API key(s), yet the SKILL.md and scripts expect DASHSCOPE_API_KEY / QWEN_API_KEY. Treat this as a red flag: confirm with the skill author or source why the registry omitted required credentials.
- Credentials: the scripts will look for your DASHSCOPE_API_KEY (or QWEN_API_KEY) in the environment or a .env file. Do not paste your real API key into chat. If you create a .env file via guidance in the docs, put a placeholder first and replace it locally; prefer setting the env var in a secure process. Verify the code never prints raw keys (the skill states it masks keys, but you should review the scripts yourself if you plan to run them).
- File and config changes: the skill provides an agent-compatibility helper that can search for and append entries to project/agent config files (e.g., CLAUDE.md, AGENTS.md). It claims to ask before modifying files, but ensure you approve any changes and back up your configs first.
- Local file uploads: the scripts can upload local files to provider-managed temp storage (oss://) instead of base64-embedding them. Only enable uploading if you are comfortable sending those files to QwenCloud/DashScope. For sensitive images/videos, use local base64 or avoid uploading.
- Run in a controlled environment first: if you are unsure, run the scripts in a sandbox or review the bundled Python files (they are included) before executing. Check any network endpoints used (defaults are DashScope/QwenCloud endpoints like dashscope-intl.aliyuncs.com).

If you require tighter assurance, request the skill author to update registry metadata to declare required env vars and document any config file modifications.
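One way to carry out the "review the bundled Python files" advice is to list every line that mentions the key variables before running anything. The sketch below is a hypothetical helper, not part of the skill: the function name `find_key_references` and the choice to scan only `*.py` files are assumptions made for illustration; only the two variable names come from this listing.

```python
import re
from pathlib import Path

# Variable names as given in the listing; the regex is an illustrative choice.
KEY_NAMES = re.compile(r"DASHSCOPE_API_KEY|QWEN_API_KEY")


def find_key_references(root):
    """Return (relative_path, line_number, stripped_line) for every line
    in a .py file under `root` that mentions one of the key variables."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*.py")):
        # errors="replace" keeps the scan going on files with odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if KEY_NAMES.search(line):
                rel = path.relative_to(root).as_posix()
                hits.append((rel, number, line.strip()))
    return hits
```

Calling `find_key_references("path/to/skill")` on the unpacked skill directory gives a short list of places to read by hand, for example to check whether a key is ever passed to `print` or a logger.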

Capability Analysis

Type: OpenClaw Skill
Name: qwencloud-vision
Version: 0.2.0

The qwencloud-vision skill bundle provides a robust interface for interacting with Qwen vision models for image/video analysis and OCR. The implementation in `qwencloud_lib.py` and `vision_lib.py` uses standard Python libraries to handle API requests, file uploads to temporary storage, and environment variable management. The bundle includes a specialized update-check mechanism (`gossamer.py`) that emits signals to the agent's stderr to suggest installing updates or sibling skills, but the instructions in `SKILL.md` explicitly require user consent before taking action. Security measures are included to prevent the accidental leakage of API keys in plaintext.
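The listing does not say how the skill masks keys. As a rough illustration of what "preventing plaintext leakage" usually means in logging code, the sketch below replaces all but the last few characters of a key with asterisks. The function `mask_key` and its `visible` parameter are hypothetical and are not taken from `qwencloud_lib.py`.

```python
def mask_key(key, visible=4):
    """Return a display-safe form of `key`.

    Hypothetical helper: everything except the last `visible` characters
    is replaced with '*'. Keys no longer than `visible` are fully masked.
    """
    if not key:
        return "(not set)"
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

When reviewing the bundled scripts, the thing to look for is that any key value reaching `print`, a logger, or stderr passes through a function of this kind first.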

Capability Tags

crypto · can-make-purchases · requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The name and description (Qwen vision models) match the included scripts and docs: OCR, visual reasoning, streaming, video handling, and use of DashScope/QwenCloud endpoints. However, the registry metadata declares no required environment variables or primary credential, while the SKILL.md and Python scripts clearly require an API key (DASHSCOPE_API_KEY / QWEN_API_KEY) and optionally QWEN_BASE_URL/QWEN_REGION. The missing credential declaration in the registry metadata is an inconsistency.

⚠ Instruction Scope

SKILL.md and the scripts instruct the agent to load .env files, read script and reference files, resolve local file paths and optionally upload local files to provider-managed storage. The agent-compatibility guide includes steps to discover and append markers to project/agent config files (CLAUDE.md, AGENTS.md, etc.) — the doc says to 'Ask the user before modifying any file' but the capability to locate and modify multiple user config locations is present. These actions are within a vision client's plausible scope but are broad and affect user files; the skill also performs environment checks and may upload local files to remote storage, which the user should explicitly authorize.
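For files that should not be uploaded to provider-managed storage, the usual alternative is to embed the file in the request as a base64 data URL. The sketch below shows that encoding step with the standard library only. The function name `to_data_url` is hypothetical, and whether the skill's scripts build their payloads exactly this way is an assumption; the listing only states that base64 embedding is the alternative to uploading.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path):
    """Encode a local file as a `data:<mime>;base64,<payload>` URL.

    Illustrative helper: the MIME type is guessed from the file name and
    falls back to application/octet-stream when it cannot be determined.
    """
    path = Path(path)
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None:
        mime = "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 increases the payload size by roughly a third, so this route suits single images better than long videos; in both cases the file contents still reach the remote API as part of the request.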

✓ Install Mechanism

No install spec is included; the skill bundles Python scripts that use the standard library only. There are no download/install steps from external, untrusted URLs in the manifest. Running code from the included scripts will execute on the user's machine (no package install), which is expected for bundled scripts but still requires user trust.

⚠ Credentials

The skill legitimately needs a QwenCloud/DashScope API key (DASHSCOPE_API_KEY or QWEN_API_KEY), plus optional QWEN_BASE_URL/QWEN_REGION for custom endpoints; these are proportionate to the stated purpose. However, the skill registry lists no required env vars (metadata mismatch). The scripts also load .env files, and the docs describe creating a placeholder .env. The mismatch between the declared registry requirements and the credentials actually required is confusing and risky for non-expert users.
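The lookup described here (environment first, then a .env file) can be sketched as follows. This is an illustration, not the skill's code: the function names, the minimal .env parsing, and the precedence order (environment before file, DASHSCOPE_API_KEY before QWEN_API_KEY) are all assumptions. Only the two variable names come from this listing.

```python
import os
from pathlib import Path

# Variable names from the listing; the order of preference is an assumption.
KEY_NAMES = ("DASHSCOPE_API_KEY", "QWEN_API_KEY")


def parse_env_file(path):
    """Parse simple NAME=value lines; skip blanks, comments, malformed lines."""
    values = {}
    path = Path(path)
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop surrounding single or double quotes, if any.
        values[name.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(env=None, env_file=".env"):
    """Return (variable_name, value) for the first key found, else None."""
    env = os.environ if env is None else env
    for name in KEY_NAMES:
        if env.get(name):
            return name, env[name]
    file_values = parse_env_file(env_file)
    for name in KEY_NAMES:
        if file_values.get(name):
            return name, file_values[name]
    return None
```

Returning the variable name along with the value lets a caller report which source was used without ever printing the key itself.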

ℹ Persistence & Privilege

always: false (normal). The skill includes guidance and code to append markers to agent/project config files, which registers skills for agents that don't auto-load them; the docs say to ask the user before modifying files. This behavior is potent (it can alter user configs) but is documented and conditional on user consent. It is still worth flagging to users.
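A consent-gated, backed-up, idempotent append is the safe shape for this kind of config change. The sketch below illustrates that shape under stated assumptions: the function `append_marker`, the `.bak` backup naming, and the `approved` flag are hypothetical and do not describe the skill's actual helper.

```python
import shutil
from pathlib import Path


def append_marker(config_path, marker, approved):
    """Append `marker` to a config file once, after making a backup.

    Hypothetical helper. Returns True only if the file was changed.
    Nothing is touched unless `approved` is True (i.e. the user said yes).
    """
    if not approved:
        return False
    path = Path(config_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if marker in existing:
        return False  # already registered; keep the operation idempotent
    if path.exists():
        # Keep a copy next to the original, e.g. AGENTS.md.bak
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    separator = "" if not existing or existing.endswith("\n") else "\n"
    path.write_text(existing + separator + marker + "\n", encoding="utf-8")
    return True
```

Checking for the marker before writing means repeated runs do not pile up duplicate entries, and the backup gives a one-step way to undo the change.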

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install qwencloud-vision
After installation, invoke the skill by name or use /qwencloud-vision
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.2.0

No visible file changes detected in this version.
- No changes were made to the skill's files or documentation.
- Functionality and usage remain unchanged from the previous version.

v0.1.1

qwencloud-vision v0.1.1
- Updated default and preferred model to `qwen3.6-plus` and added it as the new flagship option.
- Revised model selection guide and descriptions to reflect updated model lineup.
- Added a note with direct URLs to official model detail pages for more information.
- No code or file changes; documentation only.

v0.1.0

Initial release of qwencloud-vision: Qwen Vision Models for advanced image and video understanding.
- Supports analysis, description, and extraction of information from images and videos, including OCR, chart/table reading, visual reasoning, multi-image comparison, and video comprehension.
- Provides scripts for image/video analysis, visual reasoning (chain-of-thought/streaming), and OCR extraction, compatible with Qwen VL/QVQ models.
- Integrated model selection logic with up-to-date model list and clear guidance on standard QwenCloud API key usage.
- Detailed, secure guidance for agent setup, API key handling, and script execution without exposing secrets.
- Includes comprehensive directory with references for execution guides, prompts, compatibility notes, and script usage.
- Python 3.9+ required; no external pip dependencies.

Metadata

Slug qwencloud-vision

Version 0.2.0

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 3

Frequently Asked Questions

What is qwencloud-vision?

[QwenCloud] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos... It is an AI Agent Skill for Claude Code / OpenClaw, with 140 downloads so far.

How do I install qwencloud-vision?

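Run "/install qwencloud-vision" in the OpenClaw or Claude Code chat to install it in one step. The bundled scripts also expect an API key (DASHSCOPE_API_KEY or QWEN_API_KEY) in the environment or a .env file before they can call the service.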

Is qwencloud-vision free?

Yes, qwencloud-vision is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does qwencloud-vision support?

qwencloud-vision is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created qwencloud-vision?

It is built and maintained by Cuixiaoyang123 (@cuixiaoyang123); the current version is v0.2.0.
