← 返回 Skills 市场
huruilizhen

Vision Tool

作者 Ruilizhen Hu · GitHub ↗ · v1.1.3 · MIT-0
cross-platform ✓ 安全检测通过
120
总下载
0
收藏
0
当前安装
5
版本数
在 OpenClaw 中安装
/install vision-tool
功能描述
Image recognition using Ollama + qwen3.5:4b with think=False for reliable content extraction.
安全使用建议
This skill appears to do exactly what it claims: it reads a local image file, Base64-encodes it, and POSTs it to an Ollama /api/chat endpoint on localhost. Before installing or running it, ensure you: 1) run a trusted Ollama instance locally (ollama serve) and have pulled qwen3.5:4b, 2) confirm the Ollama service is not proxying/forwarding requests to an untrusted remote endpoint (if you change the default URL the skill will send images to wherever that URL points), and 3) review and run the included tests in a safe environment. Because the skill does not request secrets or remote installs and the code is readable, there are no incoherent or disproportionate requests — but always verify you trust the Ollama server you will use (local vs remote).
功能分析
Type: OpenClaw Skill Name: vision-tool Version: 1.1.3 The vision-tool skill bundle is a legitimate implementation for image recognition using a local Ollama instance. The core logic in `scripts/vision_core.py` uses the `requests` library to communicate with the Ollama API on localhost (127.0.0.1:11434) and handles image data via standard Base64 encoding. No evidence of data exfiltration, unauthorized network calls, or malicious execution was found in `main.py` or the documentation files.
能力评估
Purpose & Capability
Name/description match the implementation: the code reads an image, Base64-encodes it, and posts to an Ollama /api/chat endpoint using model qwen3.5:4b. Required binaries (ollama, python3) are appropriate and no unrelated credentials or tools are requested.
Instruction Scope
Runtime instructions only run local Python code and call the Ollama API at http://127.0.0.1:11434/api/chat; they read the provided image file and send its Base64 payload. This is coherent with image-analysis purpose, but be aware that if the Ollama service URL is changed from the default, image data could be sent to a remote host — the code itself does not exfiltrate to external endpoints by default.
Install Mechanism
No install spec that downloads external artifacts; included code is pure Python using the requests library. There are no archive downloads or external installers declared in the skill metadata.
Credentials
The skill declares no required environment variables or credentials. It uses sensible defaults (local Ollama URL). No secret or cloud credentials are requested, which is proportionate for a local-model vision tool.
Persistence & Privilege
always:false and user-invocable:true (defaults) — the skill does not request forced persistent inclusion or elevated platform privileges. It does not modify other skills or system-wide configs.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install vision-tool
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /vision-tool 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.1.3
- Adds think=False to all API calls for more reliable and clean content extraction. - Updated documentation to reflect improved extraction approach and latest usage details. - Version bump to 1.1.3.
v1.1.2
## vision-tool v1.1.2 - Documentation improvements and minor edits to SKILL.md. - No changes to core code logic or features. - Ensures up-to-date instructions for installation, usage, and troubleshooting.
v1.1.1
- Internal code improvements in vision_core.py - Documentation updated for consistency and clarity - Version bumped to 1.1.1
v1.1.0
Vision Tool v1.1.0 introduces a streamlined approach for image analysis by switching to the Ollama /api/chat endpoint. - Now uses the /api/chat endpoint for direct extraction from the content field, improving output clarity. - Removed complex thinking field processing and unnecessary regex dependencies for simpler, more maintainable code. - Default analysis prompt is now in English: "Describe this image". - Performance guidance and troubleshooting updated for new architecture. - Codebase restructured; core analysis logic is now in scripts/vision_core.py.
v1.0.0
Vision Tool v1.0.0 - Initial release - Provides image recognition using Ollama + qwen3.5:4b - Supports all OpenClaw channels (WeChat, Telegram, Discord, etc.) - Automatically cleans and formats analysis output - Includes full error handling and reporting - Supports both standard and JSON output modes
元数据
Slug vision-tool
版本 1.1.3
许可证 MIT-0
累计安装 0
当前安装数 0
历史版本数 5
常见问题

Vision Tool 是什么?

Image recognition using Ollama + qwen3.5:4b with think=False for reliable content extraction. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 120 次。

如何安装 Vision Tool?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install vision-tool」即可一键安装,无需额外配置。

Vision Tool 是免费的吗?

是的,Vision Tool 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

Vision Tool 支持哪些平台?

Vision Tool 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 Vision Tool?

由 Ruilizhen Hu(@huruilizhen)开发并维护,当前版本 v1.1.3。

💬 留言讨论