
ollama-vision

Name: ollama-vision
Author: lzm2023

by LZM2023 · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

367 Downloads

Install in OpenClaw

/install ollama-vision

Description

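Calls the Ollama qwen3-vl:4b model locally to automatically compress and analyze images. Supports image description, OCR text extraction, and custom information extraction.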

Usage Guidance

This skill appears to do what it claims: it compresses local images, calls a locally running Ollama qwen3-vl:4b model through the local API, and returns text results. Before installing or running:
1) Ensure you trust the Ollama build you run locally (the skill posts image data to http://localhost:11434).
2) Be aware that the first use will invoke 'ollama pull qwen3-vl:4b' to download a large model over the network.
3) The included test script reads %USERPROFILE%\.openclaw\media\inbound to find sample images. This is for testing only; review or edit the path if you do not want that directory scanned.
4) The code has a small quality issue: test_skill.py calls analyze_image, but the main analysis function is named analyze_image_api (a bug, not malicious).
5) Run check_env.py first, and consider running in an isolated environment if you are cautious about large model downloads.
If you need higher assurance, verify the Ollama CLI binary and its network behavior independently, and review or modify the test script's file paths before use.
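As an illustration of the kind of pre-flight check recommended above, the following is a minimal sketch that asks the local Ollama server which models it has and reports whether qwen3-vl:4b is among them. It is not the skill's actual check_env.py; the function names `model_in_tags` and `check_ollama` are hypothetical, and the sketch assumes Ollama's standard `/api/tags` endpoint on the default port.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama listens on its default local port.
TAGS_URL = "http://localhost:11434/api/tags"


def model_in_tags(tags: dict, model: str) -> bool:
    """Return True if `model` appears in an /api/tags response body."""
    return any(m.get("name") == model for m in tags.get("models", []))


def check_ollama(model: str = "qwen3-vl:4b", timeout: float = 5.0) -> str:
    """Report whether the local server is reachable and has the model."""
    try:
        with urllib.request.urlopen(TAGS_URL, timeout=timeout) as resp:
            tags = json.load(resp)
    except (urllib.error.URLError, OSError) as exc:
        return f"Ollama not reachable at {TAGS_URL}: {exc}"
    if model_in_tags(tags, model):
        return "ready"
    return f"model missing; run: ollama pull {model}"


if __name__ == "__main__":
    print(check_ollama())
```

Running a check like this before the first analysis makes the large model download an explicit, deliberate step rather than a side effect of the first call.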

Capability Analysis

Type: OpenClaw Skill
Name: ollama-vision
Version: 1.0.0

The ollama-vision skill bundle provides a legitimate tool for local image analysis using the Ollama API and the qwen3-vl:4b model. The implementation in analyze_image.py includes robust features like automatic image compression using Pillow and safe subprocess handling for environment checks. The code communicates exclusively with a local service (localhost:11434) and lacks any indicators of data exfiltration, malicious execution, or prompt injection.
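To make the local-only communication concrete, here is a rough sketch of what a request to the local Ollama service could look like. It is not taken from analyze_image.py: the names `build_payload` and `analyze` and the prompt wording are hypothetical, and the sketch assumes Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` array.

```python
import base64
import json
import urllib.request

# Assumption: default local Ollama endpoint; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen3-vl:4b"

# Hypothetical prompts for the describe and ocr modes.
PROMPTS = {
    "describe": "Describe this image in detail.",
    "ocr": "Extract all text visible in this image.",
}


def build_payload(image_bytes: bytes, mode: str = "describe",
                  custom_prompt: str = "") -> dict:
    """Build the JSON body for one non-streaming vision request."""
    if mode == "extract":
        if not custom_prompt:
            raise ValueError("extract mode needs a custom prompt")
        prompt = custom_prompt
    elif mode in PROMPTS:
        prompt = PROMPTS[mode]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return {
        "model": MODEL,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def analyze(image_path: str, mode: str = "describe",
            custom_prompt: str = "", timeout: float = 120.0) -> str:
    """POST the image to the local server and return the model's text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), mode, custom_prompt)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

A call such as `analyze("receipt.jpg", "extract", "List the invoice total.")` would send the image only to localhost, which matches the behavior the analysis above describes.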

Capability Assessment

✓ Purpose & Capability

Name/description (local Ollama vision analysis) match the included code and SKILL.md. The code calls local Ollama, compresses images, and invokes a local API; no unrelated credentials, binaries, or remote endpoints are requested.

ℹ Instruction Scope

SKILL.md and code confine runtime actions to image compression, local Ollama model checks/pulls, and POSTs to localhost:11434. Note: model download (ollama pull) will perform network I/O to fetch ~2–3GB model data. test_skill.py also attempts to read %USERPROFILE%\.openclaw\media\inbound to locate test images — reasonable for local testing but it does access a user directory (only for tests).

✓ Install Mechanism

No install spec; instruction-only skill. The only network activity is via the Ollama CLI (ollama pull) to download the model — expected for a local model-based skill. There are no obscure download URLs or extracted archives in the skill itself.

ℹ Credentials

The skill requests no environment variables or credentials (good). It writes temporary compressed images to the OS temp directory and the test script scans a local inbound media folder for images; this is proportional for its purpose but users should be aware the test script reads a user path.

✓ Persistence & Privilege

Skill does not request 'always: true', does not modify other skills or agent-wide config, and has no special persistence privileges. Autonomous invocation is allowed by default but is not combined with other concerning privileges.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install ollama-vision
After installation, invoke the skill by name or use /ollama-vision
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

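Image analysis with a local Ollama vision model, supporting OCR, text extraction, image description, and more. The default model is qwen3-vl:4b; remember to download and run it in Ollama first. Features:
Automatic compression: images larger than 2 MB are compressed before analysis
Multiple analysis modes: describe (description), ocr (text extraction), extract (custom extraction)
Temporary file cleanup: temporary files created during compression are deleted automatically
Quality first: JPEG quality is lowered first, and dimensions are reduced only when necessary
Recommendation: at least 4-6 GB of VRAM to run the model
Tags: vision, ollama, image-analysis, 本地 (local)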
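The quality-first compression described in these release notes could be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's code: the function names, the quality steps, and the scale steps are all hypothetical, and only the 2 MB threshold and the "lower JPEG quality before shrinking" order come from the notes. The search is separated from the encoder so it can be exercised without Pillow installed.

```python
import io
import os
from typing import Callable, Sequence

MAX_BYTES = 2 * 1024 * 1024  # 2 MB threshold from the release notes


def compress_quality_first(
    encode: Callable[[int, float], bytes],
    max_bytes: int = MAX_BYTES,
    qualities: Sequence[int] = (85, 75, 65, 55, 45),   # hypothetical steps
    scales: Sequence[float] = (1.0, 0.75, 0.5, 0.25),  # hypothetical steps
) -> bytes:
    """Try every quality at full size before moving to a smaller scale."""
    smallest = None
    for scale in scales:
        for quality in qualities:
            data = encode(quality, scale)
            if len(data) <= max_bytes:
                return data
            if smallest is None or len(data) < len(smallest):
                smallest = data
    return smallest  # nothing fit; return the smallest attempt


def pillow_encoder(path: str) -> Callable[[int, float], bytes]:
    """Build a JPEG encoder for one image (requires Pillow)."""
    from PIL import Image  # imported lazily so the search works without it

    img = Image.open(path).convert("RGB")

    def encode(quality: int, scale: float) -> bytes:
        w, h = img.size
        size = (max(1, int(w * scale)), max(1, int(h * scale)))
        resized = img if scale == 1.0 else img.resize(size)
        buf = io.BytesIO()
        resized.save(buf, format="JPEG", quality=quality)
        return buf.getvalue()

    return encode


def load_for_analysis(path: str) -> bytes:
    """Return the file unchanged if it is small enough, else compress it."""
    if os.path.getsize(path) <= MAX_BYTES:
        with open(path, "rb") as f:
            return f.read()
    return compress_quality_first(pillow_encoder(path))
```

This sketch encodes in memory, so it has no temporary files to clean up. An implementation that writes compressed copies to the OS temp directory, as the skill is reported to do, would need to delete them after the request completes.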

Metadata

Slug ollama-vision

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is ollama-vision?

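ollama-vision calls the Ollama qwen3-vl:4b model locally to automatically compress and analyze images, with support for image description, OCR text extraction, and custom information extraction. It is an AI Agent Skill for Claude Code / OpenClaw, with 367 downloads so far.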

How do I install ollama-vision?

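Run "/install ollama-vision" in the OpenClaw or Claude Code chat to install it in one step. The skill itself needs no extra setup, but it expects a locally running Ollama and downloads the qwen3-vl:4b model on first use.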

Is ollama-vision free?

Yes, ollama-vision is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does ollama-vision support?

ollama-vision is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created ollama-vision?

It is built and maintained by LZM2023 (@lzm2023); the current version is v1.0.0.
