image2text

Name: image2text
Author: caiming0331

by caiming0331 · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install image2text

Description

Extract text from images using tesseract OCR, supporting local files, URLs, and base64 inputs for text-only AI models without vision capability.

README (SKILL.md)

image2text

Extract text from images without needing a vision-capable AI model.

Usage

python3 scripts/ocr.py <image path|URL|base64> [--lang <languages>] [--psm <mode>] [--raw]

Parameters

--lang: Language codes joined with +, default chi_sim+eng
- chi_sim Simplified Chinese | chi_tra Traditional Chinese | eng English | jpn Japanese | kor Korean | and 30+ more
- Combine: chi_sim+eng
--psm: Page segmentation mode, default 6
- 3 Fully automatic | 4 Single column | 6 Single uniform block | 11 Sparse text
--raw: Output plain text only, no markers
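
As a rough illustration of how these parameters might be declared, here is a minimal argparse sketch. The function name build_parser and the help strings are assumptions for illustration; the actual scripts/ocr.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the author's code)."""
    parser = argparse.ArgumentParser(
        prog="ocr.py",
        description="Extract text from an image with tesseract OCR.",
    )
    # Positional input: a local path, a URL, or base64 image data.
    parser.add_argument("image", help="image path, URL, or base64 data")
    # Defaults follow the Parameters section above.
    parser.add_argument("--lang", default="chi_sim+eng",
                        help="language codes joined with +")
    parser.add_argument("--psm", type=int, default=6,
                        help="tesseract page segmentation mode")
    parser.add_argument("--raw", action="store_true",
                        help="output plain text only, no markers")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```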

Auto-Detects Input Type

Local path: /Users/xxx/Downloads/xxx.png
Web URL: https://example.com/image.png — OSS temp links work too
Base64: Pasted image data from clipboard — just paste directly
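
One plausible way to tell the three input types apart is sketched below. The function name detect_input_type and the order of the checks are assumptions; the real script may use different heuristics, and short strings that happen to be valid base64 remain ambiguous under any such scheme.

```python
import base64
import binascii
import os


def detect_input_type(value: str) -> str:
    """Classify input as "url", "path", or "base64" (hypothetical heuristic)."""
    text = value.strip()
    if text.lower().startswith(("http://", "https://")):
        return "url"
    if os.path.exists(os.path.expanduser(text)):
        return "path"
    # Accept an optional data-URI prefix such as "data:image/png;base64,".
    payload = text
    if text.startswith("data:") and "," in text:
        payload = text.split(",", 1)[1]
    if not payload:
        return "path"
    try:
        base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        # Not valid base64: treat it as a path that may simply be missing.
        return "path"
    return "base64"
```

Checking for an existing file before attempting a base64 decode keeps real paths from being misread as encoded data.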

Workflow

Receive image input → auto-detect type (local path / URL / base64)
URL → download to temp file (urllib, with curl as a fallback)
Base64 → decode to temp file
Run tesseract OCR
Output plain text
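
The steps above can be sketched roughly as follows. This is not the author's script: the names run_tesseract, write_temp, and ocr are hypothetical, the download uses urllib only (no curl fallback), and the runner parameter exists just so the flow can be exercised without tesseract installed.

```python
import base64
import os
import subprocess
import tempfile
import urllib.request


def run_tesseract(image_path: str, lang: str = "chi_sim+eng", psm: int = 6) -> str:
    """Call the tesseract CLI with an argument list and return its stdout."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def write_temp(data: bytes, suffix: str = ".png") -> str:
    """Write bytes to a new temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


def ocr(kind: str, value: str, runner=run_tesseract) -> str:
    """kind is "path", "url", or "base64"; temp files are removed in finally."""
    temp_path = None
    try:
        if kind == "url":
            with urllib.request.urlopen(value, timeout=30) as response:
                temp_path = write_temp(response.read())
        elif kind == "base64":
            temp_path = write_temp(base64.b64decode(value))
        image_path = temp_path or os.path.expanduser(value)
        return runner(image_path).strip()
    finally:
        if temp_path and os.path.exists(temp_path):
            os.remove(temp_path)
```

Putting the cleanup in a finally block means the temp file is removed even when the download, decode, or OCR step raises.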

Examples

OCR a Chinese receipt:

python3 scripts/ocr.py ~/Downloads/receipt.png --lang chi_sim

English + Chinese mixed:

python3 scripts/ocr.py https://example.com/doc.jpg --lang chi_sim+eng

Plain text only (no markers):

python3 scripts/ocr.py /path/to/image.png --raw

Requirements

tesseract must be installed: brew install tesseract
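Language packs: Homebrew's tesseract formula bundles English only; for chi_sim and other languages, run brew install tesseract-lang
On Apple Silicon Macs with Homebrew: binary at /opt/homebrew/bin/tesseract (Intel Macs: /usr/local/bin/tesseract)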
Temp files auto-deleted after execution
For best accuracy on receipts/screenshots: try --psm 3
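
Before running the skill, it can help to confirm that tesseract is on PATH and that the requested language data is installed. The sketch below is one way to do that using tesseract's --list-langs option; the helper names are hypothetical and none of this is part of the skill itself.

```python
import shutil
import subprocess


def parse_lang_list(output: str) -> set[str]:
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first line is a header beginning with "List of available languages".
    return {line for line in lines if not line.lower().startswith("list of")}


def missing_languages(requested: str, installed: set[str]) -> list[str]:
    """Return the codes in a +-joined request that are not installed."""
    return [code for code in requested.split("+") if code not in installed]


def installed_languages() -> set[str]:
    """Ask the local tesseract binary which languages it has."""
    binary = shutil.which("tesseract")
    if binary is None:
        raise RuntimeError("tesseract not found on PATH")
    result = subprocess.run(
        [binary, "--list-langs"], capture_output=True, text=True, check=True
    )
    # Some versions print the list to stderr, so read both streams.
    return parse_lang_list(result.stdout + result.stderr)


if __name__ == "__main__":
    print(missing_languages("chi_sim+eng", installed_languages()))
```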

Usage Guidance

This skill appears to do exactly what it says: local OCR via your system tesseract. Before installing/using it: (1) ensure tesseract and any language packs you need are installed locally; (2) do not pass untrusted URLs or pasted base64 from unknown sources (the script will download and process whatever URL you supply); (3) be aware the script calls subprocesses (curl as a fallback and tesseract) and writes temporary files which it deletes; and (4) no credentials are requested, and results are printed locally (no external transmission coded into the skill). If you need automatic fetching from arbitrary web locations in a sensitive environment, consider restricting allowed sources or reviewing network policies first.
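
For sensitive environments, restricting allowed sources could be as simple as checking each URL against a host allowlist before handing it to the skill. The sketch below is an assumption about how such a guard might look; ALLOWED_HOSTS and is_allowed_url are hypothetical names and are not part of image2text.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts you actually trust.
ALLOWED_HOSTS = {"example.com"}


def is_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Accept only https URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

Matching the exact hostname, rather than a prefix or substring, avoids accepting look-alike hosts such as example.com.evil.test.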

Capability Analysis

Type: OpenClaw Skill
Name: image2text
Version: 1.0.0

The image2text skill is a legitimate utility for performing OCR on local files, URLs, or base64-encoded images using Tesseract. The Python script (scripts/ocr.py) handles external inputs safely by using subprocess.run with argument lists instead of shell execution, and it includes proper cleanup of temporary files. No evidence of malicious intent, data exfiltration, or prompt injection was found.
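
To illustrate why argument lists matter, the sketch below passes a value containing shell metacharacters to a child process. Because no shell is involved, the value arrives as a single literal argument. The helper echo_argument is hypothetical and uses the Python interpreter as a stand-in for tesseract.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Run a child process that prints its first argument back unchanged."""
    result = subprocess.run(
        # Argument list, no shell=True: `value` is never parsed by a shell.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    # The "; echo injected" part is treated as plain text, not as a command.
    print(echo_argument("receipt.png; echo injected"))
```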

Capability Assessment

✓ Purpose & Capability

Name, description, SKILL.md, and the included script all describe the same functionality: take a local path/URL/base64 input, download or decode it to a temp file, run local tesseract, and return extracted text. Required capabilities (tesseract binary) are consistent with the purpose; no unrelated env vars or credentials are requested.

ℹ Instruction Scope

Runtime instructions and the script stay within OCR scope: they accept local/URL/base64 inputs, download or decode to temp files, run tesseract, and output text. The script will download arbitrary URLs supplied by the user (urllib or curl) and invokes subprocesses (curl, tesseract). These behaviors are expected for a URL-capable OCR tool but mean the agent will fetch remote data you provide — avoid passing untrusted URLs or base64 content.

✓ Install Mechanism

There is no install specification; the skill is instruction-only and ships a small Python script. The only external dependency is the system tesseract binary (SKILL.md suggests brew install on mac). No downloaded archives or non-standard installers are used.

✓ Credentials

The skill requires no environment variables, credentials, or config paths. It only uses system binaries (curl if urllib fails, and tesseract) and temporary files; requested permissions are proportional to its stated function.

✓ Persistence & Privilege

always is false and the skill does not attempt to modify other skills, global agent config, or persist credentials. It writes temporary files during execution and deletes them in the finally block.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install image2text
After installation, invoke the skill by name or use /image2text
Provide an image path, URL, or base64 data, plus any optional parameters, and get the extracted text back

Version History

v1.0.0

Initial release — extract text from any image using tesseract OCR. Supports local paths, URLs (OSS/http/https), and base64 clipboard input. Works with text-only AI models that lack vision capability. 30+ languages supported.

Metadata

Slug image2text

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is image2text?

image2text extracts text from images using tesseract OCR, supporting local files, URLs, and base64 inputs for text-only AI models without vision capability. It is an AI Agent Skill for Claude Code / OpenClaw, with 86 downloads so far.

How do I install image2text?

Run "/install image2text" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is image2text free?

Yes, image2text is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does image2text support?

image2text is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created image2text?

It is built and maintained by caiming0331 (@caiming0331); the current version is v1.0.0.
