
Ocr Document

by tanis90 · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

339 Downloads

Install in OpenClaw

/install ocr-document

Description

OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n...

README (SKILL.md)

OCR Document - Extract Text from Scanned Documents and Images

Extract text from scanned documents and images using OCR via MinerU Open API. No API key required.

Quick Start

# OCR a scanned PDF
mineru-open-api flash-extract scanned.pdf

# OCR an image of a document
mineru-open-api flash-extract page-photo.jpg

# OCR from URL (no download needed)
mineru-open-api flash-extract https://example.com/scanned.pdf

# Specify language for better accuracy
mineru-open-api flash-extract scanned.pdf --language en

# Save OCR result to file
mineru-open-api flash-extract scanned.pdf -o ./output/
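To OCR several scans in one go, the single-file commands above can be wrapped in a loop. The following is a minimal sketch, not part of the skill: ocr_dir and OCR_CMD are hypothetical names, and passing --language and -o in the same call is an assumption, since the examples above show each flag on its own (check flash-extract --help before relying on it). Each file is still uploaded as a separate document.

```shell
# Hypothetical helper: OCR every PDF in a directory, one upload per file.
# OCR_CMD is an illustrative override so the loop can be dry-run with `echo`.
OCR_CMD=${OCR_CMD:-mineru-open-api}

ocr_dir() {
    src_dir=$1
    out_dir=$2
    lang=${3:-ch}   # the skill documents ch as the default language hint

    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.pdf; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        # Assumption: --language and -o can be combined in one invocation.
        "$OCR_CMD" flash-extract "$f" --language "$lang" -o "$out_dir" \
            || echo "OCR failed: $f" >&2
    done
}
```

Usage: ocr_dir ./scans ./output en. Setting OCR_CMD=echo first prints the commands instead of running them, which is a cheap way to confirm what would be uploaded.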

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

OCR for scanned PDFs, photographed documents, images
Supports PDF, PNG, JPG, WebP, BMP, TIFF
Supports both local files and URLs directly
Language hint with --language (default: ch, use en for English)
No API key, no signup, no authentication
Max 10MB / 20 pages per document
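Because every call uploads the file, it can be worth checking the format and size limits locally first. The sketch below is one way to do that under stated assumptions: ocr_precheck is a hypothetical helper, "10MB" is read as 10 * 1024 * 1024 bytes, and only the extensions listed above are accepted. It does not check the 20-page limit.

```shell
# Hypothetical helper: check a local file against the documented limits
# before uploading it. Assumes "10MB" means 10 * 1024 * 1024 bytes.
ocr_precheck() {
    file=$1
    max_bytes=$((10 * 1024 * 1024))

    if [ ! -f "$file" ]; then
        echo "not a file: $file" >&2
        return 1
    fi

    # Lowercase the extension so "SCAN.PDF" is accepted too.
    ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        pdf|png|jpg|webp|bmp|tiff) ;;
        *) echo "unsupported format: $file" >&2; return 2 ;;
    esac

    # wc -c gives the byte count portably; tr strips padding some systems add.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -gt "$max_bytes" ]; then
        echo "file is $size bytes, over the $max_bytes byte limit" >&2
        return 3
    fi
    return 0
}
```

Usage: ocr_precheck scanned.pdf && mineru-open-api flash-extract scanned.pdf. The distinct return codes (1 missing, 2 format, 3 size) let a caller report why a file was skipped.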

When to Use

User asks to "OCR" a document or image
User has a scanned PDF that needs text extraction
User shares a photo of a page and wants the text
User mentions "scan", "handwriting", or "recognize text"

CLI Reference

Run mineru-open-api flash-extract --help for all available options.

Data Privacy

flash-extract uploads the document to MinerU's cloud API for processing and returns the result. No account or API key is required.
Documents are processed in real-time and are not stored after extraction.
For details, see https://mineru.net

Notes

Best results with clear, high-resolution scans
For higher-precision OCR with full layout preservation, use mineru-open-api extract --ocr (requires authentication via mineru-open-api auth)
If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli

Usage Guidance

This skill appears internally consistent for cloud-based OCR, but it uploads whatever you OCR to MinerU's servers with no authentication. Before installing or using it: (1) avoid sending sensitive or regulated data (PII, financial, health, legal) unless you trust MinerU; (2) review mineru.net, the npm package and the GitHub repo referenced by the go install to confirm authenticity and reputation; (3) if you need local-only processing, prefer an OCR tool that runs entirely locally; (4) when installing, verify package sources and checksums where available. Confidence is medium because this is an instruction-only skill relying on an external binary we cannot inspect here.
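Point (4) can be sketched for a directly downloaded binary as follows. This is an illustration only: verify_sha256 is a hypothetical helper, and the expected digest must come from a source you trust. Whether MinerU publishes checksums for its CLI is not stated here.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with a published value.
verify_sha256() {
    file=$1
    # Normalise the expected digest to lowercase hex before comparing.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')

    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$file" | awk '{print $1}')
    else
        # macOS ships shasum instead of sha256sum.
        actual=$(shasum -a 256 "$file" | awk '{print $1}')
    fi

    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    echo "checksum mismatch for $file" >&2
    return 1
}
```

Usage: verify_sha256 ./downloaded-binary "<published digest>" returns 0 only when the digests match, so it can gate an install step with &&.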

Capability Analysis

Type: OpenClaw Skill
Name: ocr-document
Version: 1.0.0

The skill provides OCR (Optical Character Recognition) capabilities by wrapping the 'mineru-open-api' CLI tool. It transparently documents that files are uploaded to the MinerU cloud API (mineru.net) for processing, which is the intended behavior for this service. No evidence of malicious intent, unauthorized data exfiltration, or prompt injection was found in the SKILL.md or metadata.

Capability Assessment

✓ Purpose & Capability

The name/description (OCR extraction) aligns with the single required binary (mineru-open-api) and the listed install packages (npm/uv/go for mineru-open-api). No unrelated credentials, binaries, or config paths are requested.

ℹ Instruction Scope

SKILL.md instructs the agent to run mineru-open-api flash-extract on local files or URLs and to respond in the user's language. It explicitly states that documents are uploaded to MinerU's cloud for processing. The instructions do not attempt to read unrelated files or environment variables, but they do send user files off-host — which is necessary for the stated cloud OCR capability.

ℹ Install Mechanism

The installers use standard package mechanisms (npm, go install, and a 'uv' entry). These are moderate-risk installs because they fetch third-party code and place a binary on disk. The go package references a GitHub repo that matches the MinerU/Ecosystem naming; the npm package name matches the binary. No raw download-from-IP or shortener URLs are used in SKILL.md.

✓ Credentials

No environment variables, secrets, or unrelated credentials are requested. The lack of required credentials is consistent with the SKILL.md claim that no API key is required.

✓ Persistence & Privilege

The skill does not request always:true and does not modify other skills or system-wide configuration. It requires installing a CLI binary which will live on disk, but this is proportional to its functionality.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install ocr-document
After installation, invoke the skill by name or use /ocr-document
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

- Initial release of ocr_document skill for OCR text extraction from scanned documents, images, and handwritten notes.
- Supports PDF, PNG, JPG, WebP, BMP, and TIFF formats from local files or URLs.
- No API key, signup, or authentication required.
- Language selection available for improved accuracy; replies always match the user's language.
- Maximum file size is 10MB or 20 pages per document.
- Powered by the MinerU Open API CLI; installation guides provided for npm, uv, go, and direct download.

Metadata

Slug ocr-document

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Ocr Document?

OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n... It is an AI Agent Skill for Claude Code / OpenClaw, with 339 downloads so far.

How do I install Ocr Document?

Run "/install ocr-document" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Ocr Document free?

Yes, Ocr Document is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ocr Document support?

Ocr Document is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ocr Document?

It is built and maintained by tanis90 (@tanis90); the current version is v1.0.0.
