
Vision Tagger

Name: Vision Tagger
Author: sagarjhaa

by Sagar Jha · GitHub ↗ · v1.0.0

cross-platform ✓ Security Clean

Downloads 1368

Install in OpenClaw

/install vision-tagger

Description

Tag and annotate images using Apple Vision framework (macOS only). Detects faces, bodies, hands, text (OCR), barcodes, objects, scene labels, and saliency re...

Usage Guidance

This skill appears to do local image analysis using Apple Vision and Pillow; it requires macOS 12+ and Xcode CLI tools. Before installing:

(1) Only run it on macOS as intended (SKILL.md requires macOS).
(2) Review the included Swift and Python source, and only run the compile/install steps if you trust the source.
(3) Be aware that the setup compiles a binary in the skill folder and installs Pillow via pip.
(4) Run the tool on non-sensitive images first to confirm its behavior.
(5) For extra caution, run the setup and annotation inside a sandboxed account or VM, and inspect the compiled binary with standard tools.
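The platform requirement above can be checked before running any setup step. The following is a minimal sketch, not part of the skill itself; the function names and the `MIN_MACOS_MAJOR` constant are hypothetical, and only the "macOS 12+" threshold comes from the guidance above.

```python
import platform

MIN_MACOS_MAJOR = 12  # assumption taken from the guidance above: macOS 12+


def parse_version(text):
    """Turn '12.6.1' into (12, 6, 1); empty or non-numeric parts are skipped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def is_supported(system, mac_version):
    """Return True only for macOS with a major version of at least 12."""
    if system != "Darwin":  # platform.system() reports "Darwin" on macOS
        return False
    parts = parse_version(mac_version)
    return bool(parts) and parts[0] >= MIN_MACOS_MAJOR


if __name__ == "__main__":
    # platform.mac_ver()[0] is an empty string on non-macOS systems.
    ok = is_supported(platform.system(), platform.mac_ver()[0])
    print("supported" if ok else "unsupported: needs macOS 12+")
```

Keeping the check as a pure function of two strings makes it easy to verify without access to a Mac.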

Capability Analysis

Type: OpenClaw Skill
Name: vision-tagger
Version: 1.0.0

The OpenClaw skill 'vision-tagger' is benign. All files (SKILL.md, setup.sh, annotate_image.py, image_tagger.swift) align with the stated purpose of local image analysis using Apple's Vision framework. The `SKILL.md` and `setup.sh` contain standard commands for installing Xcode CLI tools and Python's Pillow library, and for compiling the Swift binary. The Python script `annotate_image.py` uses `subprocess.run` to execute the Swift binary, but passes arguments as a list, mitigating shell injection risks. The core Swift source `image_tagger.swift` uses only Apple's Vision and AppKit frameworks for image processing, without any network calls, access to sensitive user data, persistence mechanisms, or obfuscation. There is no evidence of prompt injection or intentional harmful behavior.
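To illustrate why passing arguments as a list matters, here is a minimal sketch of the pattern. It is not the code from `annotate_image.py`; the helper names and the `./image_tagger` binary path are hypothetical. The demonstration uses the Python interpreter as a stand-in program so it runs without the compiled Swift binary.

```python
import subprocess
import sys


def tagger_command(binary, image_path):
    """Build an argv list; each element reaches the program as one argument."""
    return [binary, image_path]


def run_tool(argv):
    """Run argv without a shell and return its stdout as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # A file name containing shell metacharacters (hypothetical example).
    hostile = "photo; echo pwned.jpg"

    # Hypothetical binary path; the real name depends on the setup script.
    print(tagger_command("./image_tagger", hostile))

    # Stand-in program: echo back the first argument it receives.
    echo_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
    print(run_tool(echo_arg))  # the name arrives intact as a single argument
```

Because no shell parses the command, the `;` in the file name is never interpreted as a command separator.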

Capability Assessment

ℹ Purpose & Capability

The name/description (macOS Apple Vision tagging) match the required binaries (swiftc, python3) and included files (Swift Vision code + Python annotator). Minor inconsistency: registry metadata/flags list no OS restriction while the SKILL.md and scripts explicitly require macOS; this is likely a metadata omission rather than malicious.

✓ Instruction Scope

SKILL.md and scripts instruct compiling a Swift program and running it on an image, then optionally annotating with the Python script. The instructions only reference the image file(s) provided by the user and local system tools; they do not read other system config paths or request additional environment variables.

✓ Install Mechanism

There is no remote download/install of arbitrary code; included files are compiled locally (swiftc) and Python dependencies are installed via pip. The setup script triggers xcode-select --install when swiftc is missing, which is a standard way to get Xcode CLI tools.
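The tool check described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription of `setup.sh`: the `missing_tools` helper and `REQUIRED_TOOLS` tuple are hypothetical, and the tool names are the binaries listed in the assessment (swiftc, python3).

```python
import shutil

REQUIRED_TOOLS = ("swiftc", "python3")  # binaries named in the assessment


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the required tools that cannot be found on PATH."""
    return [tool for tool in required if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if "swiftc" in missing:
        # The setup script is described as triggering this command itself.
        print("swiftc not found; Xcode CLI tools come from: xcode-select --install")
    elif missing:
        print("missing tools:", ", ".join(missing))
    else:
        print("all required tools found")
```

The `which` parameter exists only so the lookup can be replaced in a test; by default it is `shutil.which`.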

✓ Credentials

The skill declares no required environment variables or credentials and the code does not attempt to access secrets or external service tokens. The requested permissions (access to local filesystem image paths and ability to run a compiled binary) are proportional to the stated purpose.

✓ Persistence & Privilege

The skill does not request persistent global privileges, does not set always: true, and does not modify other skills or system-wide settings. It compiles a binary into its own scripts directory, which is expected for this kind of skill.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install vision-tagger
After installation, invoke the skill by name or use /vision-tagger
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release: Face, body, hand detection + OCR + scene labels using Apple Vision framework

Metadata

Slug vision-tagger

Version 1.0.0

License —

All-time Installs 9

Active Installs 9

Total Versions 1

Frequently Asked Questions

What is Vision Tagger?

Tag and annotate images using Apple Vision framework (macOS only). Detects faces, bodies, hands, text (OCR), barcodes, objects, scene labels, and saliency re... It is an AI Agent Skill for Claude Code / OpenClaw, with 1368 downloads so far.

How do I install Vision Tagger?

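Run "/install vision-tagger" in the OpenClaw or Claude Code chat to install it. The skill's setup then compiles a Swift binary in the skill folder and installs Pillow via pip, so macOS 12+ and the Xcode CLI tools are required.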

Is Vision Tagger free?

Yes, Vision Tagger can be downloaded, installed, and used at no cost, and its Swift and Python source files are included. The listing does not specify a license.

Which platforms does Vision Tagger support?

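Vision Tagger runs on macOS only (macOS 12+ with Xcode CLI tools), because it relies on Apple's Vision framework. The registry metadata lists no OS restriction, but SKILL.md and the scripts explicitly require macOS.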

Who created Vision Tagger?

It is built and maintained by Sagar Jha (@sagarjhaa); the current version is v1.0.0.
