
Ocr Document

by tanis90 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ✓ Security Clean · 339 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install ocr-document
Description
OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n...
README (SKILL.md)

OCR Document - Extract Text from Scanned Documents and Images

Extract text from scanned documents and images using OCR via MinerU Open API. No API key required.

Quick Start

# OCR a scanned PDF
mineru-open-api flash-extract scanned.pdf

# OCR an image of a document
mineru-open-api flash-extract page-photo.jpg

# OCR from URL (no download needed)
mineru-open-api flash-extract https://example.com/scanned.pdf

# Specify language for better accuracy
mineru-open-api flash-extract scanned.pdf --language en

# Save OCR result to file
mineru-open-api flash-extract scanned.pdf -o ./output/
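
The single-file commands above can be wrapped in a loop when a whole folder of scans needs processing. The following is a minimal sketch, not part of the skill itself: the function name ocr_batch and the OCR_DRY_RUN variable are hypothetical, and it assumes that --language and -o can be combined in one call (the examples above only show them separately).

```shell
#!/bin/sh
# Hypothetical helper: run flash-extract on every PDF, JPG, and PNG in a folder.
# Set OCR_DRY_RUN=1 to print the commands instead of running them.
ocr_batch() {
    src_dir=$1          # folder containing the scans
    out_dir=$2          # where results are written (passed to -o)
    lang=${3:-en}       # language hint (passed to --language)

    for f in "$src_dir"/*.pdf "$src_dir"/*.jpg "$src_dir"/*.png; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue

        if [ "${OCR_DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "mineru-open-api flash-extract $f --language $lang -o $out_dir"
        else
            mkdir -p "$out_dir"
            # Assumption: --language and -o may be used together.
            mineru-open-api flash-extract "$f" --language "$lang" -o "$out_dir"
        fi
    done
}
```

Running OCR_DRY_RUN=1 ocr_batch ./scans ./output/ en first shows which files would be uploaded before anything is sent.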

Language Rule

You MUST reply to the user in the SAME language they use. This is non-negotiable.

Capabilities

  • OCR for scanned PDFs, photographed documents, images
  • Supports PDF, PNG, JPG, WebP, BMP, TIFF
  • Supports both local files and URLs directly
  • Language hint with --language (default: ch, use en for English)
  • No API key, no signup, no authentication
  • Max 10MB / 20 pages per document
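
The format and size limits listed above can be checked locally before a file is uploaded. This is a rough sketch under stated assumptions: the function name ocr_preflight is hypothetical, "10MB" is read as 10 × 1024 × 1024 bytes, only the extensions named in the list are accepted (variants such as .jpeg or .tif are not), and the 20-page limit is not checked because that would need a PDF parser.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a local file before calling flash-extract.
ocr_preflight() {
    f=$1
    max_bytes=$((10 * 1024 * 1024))   # assumption: "10MB" means 10 MiB

    [ -f "$f" ] || { echo "not a file: $f"; return 1; }

    # Take the extension from the file name only, then lower-case it.
    name=${f##*/}
    ext=$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]')
    case $ext in
        pdf|png|jpg|webp|bmp|tiff) ;;   # formats listed under Capabilities
        *) echo "unsupported format: $ext"; return 1 ;;
    esac

    # wc may pad its output with spaces on some systems, so strip them.
    size=$(wc -c < "$f" | tr -d ' ')
    if [ "$size" -gt "$max_bytes" ]; then
        echo "too large: $size bytes"
        return 1
    fi

    echo "ok"
}
```

A typical call would be ocr_preflight scanned.pdf && mineru-open-api flash-extract scanned.pdf, so that oversized or unsupported files are rejected without a network round trip.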

When to Use

  • User asks to "OCR" a document or image
  • User has a scanned PDF that needs text extraction
  • User shares a photo of a page and wants the text
  • User mentions "scan", "handwriting", or "recognize text"

CLI Reference

Run mineru-open-api flash-extract --help for all available options.

Data Privacy

  • flash-extract uploads the document to MinerU's cloud API for processing and returns the result. No account or API key is required.
  • Documents are processed in real-time and are not stored after extraction.
  • For details, see https://mineru.net
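
Because every flash-extract call sends the document off-host, an explicit confirmation step can help avoid uploading something sensitive by accident. The sketch below is only an illustration: confirm_upload is a hypothetical helper and is not provided by the skill or the CLI.

```shell
#!/bin/sh
# Hypothetical guard: ask before a document is uploaded for OCR.
# Returns 0 only when the user answers y/Y/yes/YES.
confirm_upload() {
    f=$1
    printf 'Upload "%s" to the MinerU cloud API for OCR? [y/N] ' "$f" >&2
    read -r answer || return 1      # no input counts as "no"
    case $answer in
        y|Y|yes|YES) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
#   confirm_upload scanned.pdf && mineru-open-api flash-extract scanned.pdf
```

The default answer is "no", so pressing Enter without typing anything leaves the file on the local machine.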

Notes

  • Best results with clear, high-resolution scans
  • For higher precision OCR with full layout preservation, use mineru-open-api extract --ocr (requires auth via mineru-open-api auth)
  • If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
Usage Guidance
This skill appears internally consistent for cloud-based OCR, but it uploads whatever you OCR to MinerU's servers with no authentication. Before installing or using it: (1) avoid sending sensitive or regulated data (PII, financial, health, legal) unless you trust MinerU; (2) review mineru.net, the npm package and the GitHub repo referenced by the go install to confirm authenticity and reputation; (3) if you need local-only processing, prefer an OCR tool that runs entirely locally; (4) when installing, verify package sources and checksums where available. Confidence is medium because this is an instruction-only skill relying on an external binary we cannot inspect here.
Capability Analysis
Type: OpenClaw Skill
Name: ocr-document
Version: 1.0.0
The skill provides OCR (Optical Character Recognition) capabilities by wrapping the 'mineru-open-api' CLI tool. It transparently documents that files are uploaded to the MinerU cloud API (mineru.net) for processing, which is the intended behavior for this service. No evidence of malicious intent, unauthorized data exfiltration, or prompt injection was found in the SKILL.md or metadata.
Capability Assessment
Purpose & Capability
The name/description (OCR extraction) aligns with the single required binary (mineru-open-api) and the listed install packages (npm/uv/go for mineru-open-api). No unrelated credentials, binaries, or config paths are requested.
Instruction Scope
SKILL.md instructs the agent to run mineru-open-api flash-extract on local files or URLs and to respond in the user's language. It explicitly states that documents are uploaded to MinerU's cloud for processing. The instructions do not attempt to read unrelated files or environment variables, but they do send user files off-host — which is necessary for the stated cloud OCR capability.
Install Mechanism
Installers are standard package mechanisms (npm, go install, 'uv' entry provided). These are moderate-risk installs because they fetch and install third-party code that will create a binary. The go package references a GitHub repo that matches the MinerU/Ecosystem naming; the npm package name matches the binary. No raw download-from-IP or shortener URLs are used in SKILL.md.
Credentials
No environment variables, secrets, or unrelated credentials are requested. The lack of required credentials is consistent with the SKILL.md claim that no API key is required.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system-wide configuration. It requires installing a CLI binary which will live on disk, but this is proportional to its functionality.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ocr-document
  3. After installation, invoke the skill by name or use /ocr-document
  4. Provide the document as a local file path or a URL; the skill returns the extracted text
Version History
v1.0.0
- Initial release of ocr_document skill for OCR text extraction from scanned documents, images, and handwritten notes.
- Supports PDF, PNG, JPG, WebP, BMP, and TIFF formats from local files or URLs.
- No API key, signup, or authentication required.
- Language selection available for improved accuracy; replies always match the user's language.
- Maximum file size is 10MB or 20 pages per document.
- Powered by the MinerU Open API CLI; installation guides provided for npm, uv, go, and direct download.
Metadata
Slug ocr-document
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Ocr Document?

OCR document extraction - extract text from scanned documents, photos, and images using OCR. Use when reading scanned PDFs, photographed pages, handwritten n... It is an AI Agent Skill for Claude Code / OpenClaw, with 339 downloads so far.

How do I install Ocr Document?

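Run "/install ocr-document" in the OpenClaw or Claude Code chat to install the skill. The skill relies on the mineru-open-api CLI, which is installed via npm, uv, or go, or downloaded from https://mineru.net/ecosystem?tab=cli.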

Is Ocr Document free?

Yes, Ocr Document is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ocr Document support?

Ocr Document is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ocr Document?

It is built and maintained by tanis90 (@tanis90); the current version is v1.0.0.
