Ocr Scanner Image

Name: Ocr Scanner Image
Author: kaarl92

by kaarl92 · v1.0.1 · MIT-0

cross-platform ⚠ suspicious

Downloads: 238

Install in OpenClaw

/install ocr-scanner-image

Description

Perform OCR on image files (jpg, png, bmp, gif, tiff) using the system's `tesseract` binary and return extracted plain text.

Usage Guidance

This skill's primary wrapper (scripts/ocr.sh) performs local OCR using tesseract and is consistent with the description; use it if you want offline processing. Before installing or running:

(1) Ensure you have tesseract and pdftoppm (or equivalent) installed. The SKILL metadata does not declare these, but the scripts depend on them.
(2) Inspect scripts/example.py and avoid running it on sensitive images: it uploads files to the external ocr.space API using a public demo key, which transmits your image contents off-host.
(3) If you only want local OCR, delete or ignore example.py and run ocr.sh directly.
(4) Be cautious about adding the optional alias to your shell config. It is safe, but it modifies your shell environment.

If you want more assurance, ask the skill author to (a) declare required binaries in metadata, (b) remove or clearly document the network-upload example, or (c) provide a pure-local example only.

Capability Analysis

Type: OpenClaw Skill

Name: ocr-scanner-image

Version: 1.0.1

The skill provides OCR functionality using either a local Tesseract binary (scripts/ocr.sh) or the external OCR.space API (scripts/example.py). While the Python script sends file data to an external endpoint (api.ocr.space), this behavior is clearly documented in references/api_reference.md and is consistent with the tool's stated purpose. The bash script handles local processing safely with proper quoting, and no evidence of malicious intent, hidden exfiltration, or prompt injection was found.

Capability Assessment

ℹ Purpose & Capability

The stated purpose is local OCR via the system tesseract binary and the provided ocr.sh wrapper implements that (and also PDF→PNG conversion via pdftoppm). However, the package also contains scripts/example.py which uses the external ocr.space API (network call) and a demo API key; that behavior is not described in SKILL.md and is not necessary for the stated local-tesseract purpose.
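To illustrate the local path described above, here is a minimal Python sketch of the kind of command lines a wrapper such as ocr.sh might run. It is not the contents of ocr.sh, which this page does not show. The function names, the default language `eng`, and the 300 dpi resolution are assumptions made for the example. The `tesseract <image> stdout -l <lang>` and `pdftoppm -png -r <dpi> <pdf> <prefix>` forms are standard usage of those tools.

```python
import subprocess


def tesseract_cmd(image, lang="eng"):
    """Build a tesseract command that writes recognized text to stdout."""
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file next to the image.
    return ["tesseract", str(image), "stdout", "-l", lang]


def pdftoppm_cmd(pdf, out_prefix, dpi=300):
    """Build a pdftoppm command that renders each PDF page to a PNG file."""
    # Output files are named from out_prefix plus a page number.
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(out_prefix)]


def ocr_image(image, lang="eng"):
    """Run tesseract locally and return the extracted plain text.

    Raises FileNotFoundError if tesseract is not installed and
    subprocess.CalledProcessError if tesseract exits with an error.
    """
    result = subprocess.run(
        tesseract_cmd(image, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with file names that contain spaces. Nothing in this sketch touches the network.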

⚠ Instruction Scope

SKILL.md instructs the agent/user to run the included bash wrapper (ocr.sh) which operates locally and prints output to stdout. It does not mention uploading files to external services. The presence of example.py that will POST local files to a remote OCR API means there is code in the skill that would transmit image contents off-host—this is out-of-band relative to the SKILL.md guidance and is a potential privacy/exfiltration risk if run without understanding.
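One way to act on this finding before running anything is to list the remote hosts that the shipped scripts mention. The sketch below is an assumption-laden illustration, not part of the skill: the helper names `outbound_hosts` and `scan_scripts` are hypothetical, and a plain URL search will miss hosts that are built dynamically at runtime.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# Matches http(s) URLs up to the first whitespace, quote, or bracket.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+")


def outbound_hosts(source_text):
    """Return the sorted, de-duplicated hostnames of URLs found in source_text."""
    hosts = set()
    for url in URL_PATTERN.findall(source_text):
        host = urlparse(url).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)


def scan_scripts(root):
    """Map each .py/.sh file under root to the hosts it mentions.

    Files that mention no URLs are left out of the result.
    """
    findings = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in {".py", ".sh"}:
            hosts = outbound_hosts(path.read_text(errors="replace"))
            if hosts:
                findings[str(path)] = hosts
    return findings
```

Running `scan_scripts` on the skill's directory would be expected to flag scripts/example.py if it contains the ocr.space endpoint as a literal URL. An empty result is not proof that a script stays offline.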

✓ Install Mechanism

There is no install spec (instruction-only), which minimizes installation risk. The skill ships scripts that will run from disk, but nothing is downloaded or installed automatically.

ℹ Credentials

The skill requests no environment variables or credentials. The example Python script embeds a public demo API key ('helloworld') which is not secret but does cause local files to be uploaded to a third-party service if used. Also, SKILL.md and the scripts implicitly require system binaries (tesseract, pdftoppm, and possibly other PDF-to-PNG conversion tools) even though the registry metadata lists none; this omission is a proportionality/information gap to be aware of.
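Because the registry metadata does not declare these binaries, a quick preflight check can confirm they are present before the skill is used. The following is a minimal sketch. The helper name `missing_binaries` and the `REQUIRED_BINARIES` tuple are hypothetical, and the list of names is taken from the assessment above rather than from the skill's own metadata.

```python
import shutil

# Binaries the assessment says the scripts depend on (assumed list).
REQUIRED_BINARIES = ("tesseract", "pdftoppm")


def missing_binaries(names=REQUIRED_BINARIES):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Missing required binaries: " + ", ".join(missing))
    else:
        print("All required binaries found.")
```

`shutil.which` only checks that an executable with that name is on PATH; it does not verify the version or which language data tesseract has installed.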

✓ Persistence & Privilege

The skill is not always-enabled, does not request elevated or persistent agent privileges, and only suggests an optional shell alias (editing ~/.bashrc) if the user chooses to do so.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install ocr-scanner-image
After installation, invoke the skill by name or use /ocr-scanner-image
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.1

- Implemented a working OCR scanner skill using a local Bash script (`ocr.sh`) that utilizes `tesseract` for text extraction from image files.
- Updated documentation in SKILL.md to reflect actual usage, options, and integration steps with system aliases.
- Removed placeholder and structuring guidance from documentation, providing concrete, ready-to-use instructions.
- The skill now provides immediate OCR capability for JPG, PNG, BMP, GIF, and TIFF files using a local Tesseract installation.

v1.0.0

Initial release of ocr-scanner-image.
- Perform OCR on image files (jpg, png, bmp, gif, tiff) and return extracted text.
- Supports images such as screenshots, documents, receipts, and handwritten notes.
- Accepts image uploads or URLs for processing.
- Offers optional language selection for OCR.

Metadata

Slug ocr-scanner-image

Version 1.0.1

License MIT-0

All-time Installs 1

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Ocr Scanner Image?

Ocr Scanner Image performs OCR on image files (jpg, png, bmp, gif, tiff) using the system's `tesseract` binary and returns the extracted plain text. It is an AI agent skill for Claude Code / OpenClaw, with 238 downloads so far.

How do I install Ocr Scanner Image?

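Run "/install ocr-scanner-image" in the OpenClaw or Claude Code chat to install it in one step. Note that the skill's scripts depend on the system tesseract and pdftoppm binaries, which the skill metadata does not declare and which must be installed separately.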

Is Ocr Scanner Image free?

Yes, Ocr Scanner Image is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ocr Scanner Image support?

Ocr Scanner Image is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ocr Scanner Image?

It is built and maintained by kaarl92 (@kaarl92); the current version is v1.0.1.
