← Back to Skills Marketplace
1727
Downloads
0
Stars
4
Active Installs
1
Versions
Install in OpenClaw
/install ms-qwen-vl
Description
调用魔搭社区(ModelScope)Qwen3-VL 多模态 API 进行视觉解析。使用 OpenAI SDK 兼容方式调用,支持图片内容描述、OCR 文字提取、视觉问答、对象检测等功能。用户提到"魔搭"、"ModelScope"、"Qwen-VL"、"多模态视觉"、"解析图片"等关键词时应触发。
Usage Guidance
Key points before installing:
- Metadata inconsistency: the registry claims no required env vars, but the skill and scripts require MODELSCOPE_API_KEY (and optionally model-related env vars). Expect to provide a ModelScope API key. Verify where you will store that key (scripts/.env vs .env mismatch in docs).
- Data exfiltration risk: the script base64-encodes local image files and sends them to https://api-inference.modelscope.cn/v1. If you run this skill (or let an agent run it) and point it at files on your computer, those images will be transmitted to a third-party service. Do not use it on sensitive images unless you trust the ModelScope service and your network.
- Execution behavior: SKILL.md encourages running the bundled Python script to handle local files. If you enable autonomous agent invocation, the agent could execute the script and thereby read local image paths you mention. Consider limiting agent autonomy or running the tool manually in a controlled environment.
- Dependency & path mismatches: the bundle includes requirements.txt (openai, Pillow, python-dotenv) but no automated installer; follow README to install dependencies. The README and SKILL.md reference different .env file paths (root vs scripts/); confirm which path you will use and where the API key is loaded from.
- Verify the endpoint & code: if you require stronger guarantees, review the ms_qwen_vl.py code (it is included) and confirm the base_url and request behavior meet your security and privacy requirements. If needed, run the tool in an isolated sandbox or on non-sensitive sample images first.
Capability Analysis
Type: OpenClaw Skill
Name: ms-qwen-vl
Version: 0.1.0
The skill is classified as suspicious due to its inherent high-risk capabilities, specifically local file read and write access within `scripts/ms_qwen_vl.py`. While these capabilities (reading local image files for analysis and writing results to local files) are plausibly needed for the skill's stated purpose of visual analysis, they present a potential attack surface. There is no clear evidence of intentional malicious behavior such as data exfiltration to unauthorized endpoints, persistence mechanisms, or explicit prompt injection attempts against the OpenClaw agent in `SKILL.md` to perform actions beyond its stated purpose. The API key is handled via environment variables and sent to a legitimate ModelScope API endpoint.
Capability Assessment
Purpose & Capability
Name/description, README, SKILL.md and the Python script all consistently implement a ModelScope (Qwen3-VL) multimodal image analysis skill using an OpenAI-SDK-compatible client and support describe/ocr/ask/detect/chart tasks — the requested capabilities align with the stated purpose.
Instruction Scope
Runtime instructions explicitly tell the agent (and users) to run scripts that read local image files (e.g., Desktop screenshots), encode them as base64, and send them to the remote ModelScope API. This is necessary for the stated functionality, but it is also a direct data-exfiltration vector for any sensitive local images. The SKILL.md examples instruct the assistant to execute local commands, which is expected but increases privacy risk.
Install Mechanism
There is no install spec in the registry (instruction-only), but the bundle includes requirements.txt and a Python script that depends on openai, Pillow, and python-dotenv. Lack of an automated install step is low risk, but users must install dependencies manually; nothing is downloaded from unknown external installers in the manifest.
Credentials
The registry metadata lists no required environment variables or primary credential, but SKILL.md/README and the script require MODELSCOPE_API_KEY (and optionally MODELSCOPE_MODEL / MODELSCOPE_MODEL_PRECISE). This mismatch is an inconsistency that could mislead users about credentials the skill needs. The script will fail without an API key and will send the provided API key to the ModelScope endpoint — so a required secret is present but not declared in metadata.
Persistence & Privilege
Flags show always:false and no special OS restrictions. The skill does not request persistent system-wide privileges, does not modify other skills, and does not require being always-included; no elevated persistence is requested.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install ms-qwen-vl - After installation, invoke the skill by name or use
/ms-qwen-vl - Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
- Initial release of ms-qwen-vl skill for multi-modal visual analysis via ModelScope Qwen3-VL API.
- Supports image description, OCR text extraction, visual question answering, object detection, and chart analysis.
- Compatible with OpenAI SDK, with sample Python and CLI usage provided.
- Handles both local images (auto-converted to base64) and online image URLs.
- Offers two model modes: fast (30B) and precise (235B).
- Detailed task options and usage instructions included in the documentation.
Metadata
Frequently Asked Questions
What is Ms Qwen Vl?
调用魔搭社区(ModelScope)Qwen3-VL 多模态 API 进行视觉解析。使用 OpenAI SDK 兼容方式调用,支持图片内容描述、OCR 文字提取、视觉问答、对象检测等功能。用户提到"魔搭"、"ModelScope"、"Qwen-VL"、"多模态视觉"、"解析图片"等关键词时应触发。 It is an AI Agent Skill for Claude Code / OpenClaw, with 1727 downloads so far.
How do I install Ms Qwen Vl?
Run "/install ms-qwen-vl" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Ms Qwen Vl free?
Yes, Ms Qwen Vl is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Ms Qwen Vl support?
Ms Qwen Vl is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).
Who created Ms Qwen Vl?
It is built and maintained by crocketc (@crocketc); the current version is v0.1.0.
More Skills