447992399

Llava Vision

by Jh-server · GitHub ↗ · v1.0.1 · MIT-0
linux · darwin · win32 · ⚠ suspicious
84 Downloads · 0 Stars · 1 Active Installs · 2 Versions
Install in OpenClaw
/install llava-vision
Description
Call a local llama.cpp server with the LLaVA model to analyze images.
README (SKILL.md)

LLaVA Vision Skill

This skill forwards an image to a locally running llama.cpp server that hosts a LLaVA model and returns the model’s text description of the image. It accepts either a local file path or a remote image URL.

Usage

clawhub llava-vision --image /path/to/photo.jpg
# or
clawhub llava-vision --image https://example.com/photo.jpg

The skill uses the built-in vision_analyze tool, which expects an image file path. If the image cannot be read or the server is unreachable, the skill returns an error message.
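The two input forms and the error cases described above could be handled along the following lines. This is a minimal sketch, not the skill's actual source: isRemoteUrl and loadImageBytes are hypothetical names, and it assumes Node.js 18 or later for the global fetch.

```javascript
const fs = require('node:fs/promises');

// Hypothetical helper: treat only http(s) strings as remote URLs.
function isRemoteUrl(input) {
  return /^https?:\/\//i.test(input);
}

// Hypothetical helper: return the raw image bytes as a Buffer,
// or throw an Error with a readable message.
async function loadImageBytes(input) {
  if (isRemoteUrl(input)) {
    const res = await fetch(input); // global fetch, Node.js 18+
    if (!res.ok) {
      throw new Error(`Failed to fetch image: HTTP ${res.status}`);
    }
    return Buffer.from(await res.arrayBuffer());
  }
  try {
    return await fs.readFile(input);
  } catch (err) {
    throw new Error(`Cannot read image file "${input}": ${err.message}`);
  }
}
```

Anything that does not start with http:// or https:// is treated as a local path, so relative paths such as ./cat.png go through the file branch.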

Dependencies

  • Node.js (the skill itself)
  • A local llama.cpp server with the LLaVA model exposed at the default endpoint.

Example

$ clawhub run llava-vision --image ./cat.png
The image contains a cat sitting on a windowsill, looking out at a sunny garden.
Usage Guidance
This skill appears to do what it says: read an image (local path or remote URL), encode it, and POST it to a local llama.cpp/LLaVA server at 127.0.0.1:8081. Before installing or using:
  1. Ensure you trust the local server (it will receive the raw image bytes).
  2. Avoid passing paths or URLs that contain sensitive data (the skill will read local files and fetch arbitrary URLs you give it).
  3. Be aware the code will fetch remote images from any URL you provide; do not pass internal-management URLs you don't want accessed.
  4. Note a minor interoperability issue: index.js uses require('./tool') while tool.js uses ES export syntax (this may fail on some Node setups).
If you need higher assurance, review or run the code locally in an isolated environment and confirm the local llama server is secure and not forwarding model inputs externally.
Capability Analysis
Type: OpenClaw Skill
Name: llava-vision
Version: 1.0.1

The skill is designed to analyze images by sending them to a local llama.cpp server (127.0.0.1:8081). The code in `tool.js` and `index.js` correctly implements the described functionality, allowing users to provide either a local file path or a remote URL. It uses standard Node.js modules and lacks any indicators of malicious intent, such as data exfiltration to external domains, obfuscation, or unauthorized command execution.
Capability Assessment
Purpose & Capability
The skill's description says it will call a local llama.cpp server hosting LLaVA; tool.js posts an image (as a data: URI) to http://127.0.0.1:8081/v1/chat/completions using model 'llava'. The ability to accept a local file path or remote image URL matches the SKILL.md.
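The request described above can be sketched as follows. The endpoint and model name come from the assessment text; the message layout (a text part plus an image_url part carrying the data: URI) and the choices[0].message.content response path are assumptions based on the OpenAI-style chat format, and toDataUri, buildRequestBody, and describeImage are hypothetical names rather than the skill's actual functions.

```javascript
// Endpoint and model name as quoted in the assessment above.
const ENDPOINT = 'http://127.0.0.1:8081/v1/chat/completions';

// Hypothetical helper: wrap image bytes in a base64 data: URI.
function toDataUri(bytes, mimeType) {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}

// Hypothetical helper: build an OpenAI-style chat request body (assumed layout).
function buildRequestBody(dataUri, prompt) {
  return {
    model: 'llava',
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: prompt },
          { type: 'image_url', image_url: { url: dataUri } },
        ],
      },
    ],
  };
}

// Hypothetical helper: POST the image and return the model's text reply.
async function describeImage(bytes, mimeType, prompt = 'Describe this image.') {
  const res = await fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildRequestBody(toDataUri(bytes, mimeType), prompt)),
  });
  if (!res.ok) throw new Error(`Server returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content; // assumed response shape
}
```

Keeping the payload construction separate from the network call makes it possible to inspect exactly what would be sent before any image bytes leave the process.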
Instruction Scope
Instructions and code stay within the skill's stated scope: they read a local image file or fetch a remote image URL, base64-encode it, and POST it to the local server. Two practical notes: (1) the skill will perform arbitrary outbound fetch() requests when given remote image URLs (so it can connect to any URL you pass), and (2) it will read any local path you provide, so passing sensitive paths will expose their contents to the local LLaVA server. The skill itself does not forward data to other remote endpoints, but the local server could, so verify that server's behavior and trustworthiness.
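Because the skill will fetch any URL it is given, a caller may want to screen URLs before passing them in. The following is a hypothetical pre-check, not part of the skill: looksInternal is an invented name, and it inspects only the literal hostname, so it does not catch public names that resolve to internal addresses.

```javascript
// Hypothetical pre-check: flag URLs whose literal host is loopback,
// private (RFC 1918), or link-local. It does not resolve DNS.
function looksInternal(urlString) {
  // Strip the brackets that surround IPv6 literals in URL hostnames.
  const host = new URL(urlString).hostname.replace(/^\[|\]$/g, '').toLowerCase();
  if (host === 'localhost' || host.endsWith('.localhost') || host === '::1') {
    return true;
  }
  const m = host.match(/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (!m) return false; // not an IPv4 literal
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 ||                          // loopback
    a === 10 ||                           // 10.0.0.0/8
    (a === 192 && b === 168) ||           // 192.168.0.0/16
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 169 && b === 254)              // link-local
  );
}
```

A wrapper script could call looksInternal on each remote URL and refuse to invoke the skill when it returns true.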
Install Mechanism
There is no install spec; the package is instruction/code-only (no downloads or external installers). This is low-risk from an install mechanism perspective.
Credentials
The skill requests no environment variables, no credentials, and no config paths. That is proportional to its purpose.
Persistence & Privilege
The skill is not forced-always, is user-invocable, and does not attempt to modify other skills or system-wide config. It does not request elevated persistence privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install llava-vision
  3. After installation, invoke the skill by name or use /llava-vision
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
  • Skill name changed from "llava-vision" to "llava-vision-local"
  • No other functional or usage changes documented
v1.0.0
Initial release of llava-vision skill.
  • Sends local image files or remote image URLs to a local llama.cpp server running a LLaVA model for image analysis.
  • Returns a text description of the image as provided by the model.
  • Supports Linux, macOS, and Windows platforms.
  • Requires a locally running llama.cpp server with the LLaVA model and Node.js.
Metadata
Slug llava-vision
Version 1.0.1
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 2
Frequently Asked Questions

What is Llava Vision?

Llava Vision calls a local llama.cpp server with the LLaVA model to analyze images. It is an AI Agent Skill for Claude Code / OpenClaw, with 84 downloads so far.

How do I install Llava Vision?

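Run "/install llava-vision" in the OpenClaw or Claude Code chat to install it in one step. The skill itself needs no further setup, but it depends on Node.js and a locally running llama.cpp server with the LLaVA model.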

Is Llava Vision free?

Yes, Llava Vision is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Llava Vision support?

Llava Vision is cross-platform and runs anywhere OpenClaw / Claude Code is available (linux, darwin, win32).

Who created Llava Vision?

It is built and maintained by Jh-server (@447992399); the current version is v1.0.1.
