Smart PDF OCR

by veeicwgy · GitHub ↗ · v0.2.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 120

Install in OpenClaw

/install smart-pdf-ocr

Description

Intelligent PDF OCR powered by MinerU API. Extract text from scanned PDFs, image-based PDFs, and photographed documents using mineru-open-api CLI with advanc...

Usage Guidance

This skill appears to do what it says (run the mineru-open-api CLI to OCR PDFs), but exercise caution before installing. Verify the npm package: check the mineru-open-api package page, author, and repository on the npm registry or GitHub; prefer installing in a sandbox/container rather than globally on a production system; do not run the install as root. Ask the skill author or maintainer how advanced/precision OCR is authenticated (which env var or token the CLI uses) and where credentials are stored; avoid supplying sensitive API keys unless you can confirm the package's source and trustworthiness. If you cannot verify the package origin, consider alternative, well-known OCR tools (Tesseract, Google/Adobe official CLIs) or run the tool in an isolated VM.
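The package check suggested above could be partly scripted. The following is a minimal sketch, assuming `npm view <package> --json` prints registry metadata with `repository`, `maintainers`, and `scripts` fields; the function names are illustrative and not part of the skill. It only surfaces items for manual review and does not establish that a package is trustworthy.

```python
import json
import shutil
import subprocess


def fetch_npm_metadata(package: str) -> dict:
    """Return the metadata printed by `npm view <package> --json`."""
    npm = shutil.which("npm")  # also resolves npm.cmd on Windows
    if npm is None:
        raise FileNotFoundError("npm was not found on PATH")
    result = subprocess.run(
        [npm, "view", package, "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def review_npm_metadata(meta: dict) -> list[str]:
    """Collect warnings that deserve a manual look before installing."""
    warnings = []
    if not meta.get("repository"):
        warnings.append("no source repository declared")
    if not meta.get("maintainers"):
        warnings.append("no maintainers listed")
    # Lifecycle scripts run automatically during `npm install`.
    scripts = meta.get("scripts") or {}
    for hook in ("preinstall", "install", "postinstall"):
        if hook in scripts:
            warnings.append(f"runs a {hook} script: {scripts[hook]}")
    return warnings
```

A typical call would be `review_npm_metadata(fetch_npm_metadata("mineru-open-api"))`. An empty list means only that none of these specific checks fired.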

Capability Analysis

Type: OpenClaw Skill
Name: smart-pdf-ocr
Version: 0.2.0

The skill bundle provides instructions for an AI agent to perform OCR on PDF documents using the 'mineru-open-api' CLI tool. It includes standard installation steps via npm and defines clear workflows for different OCR tasks (flash, precision, and VLM-based extraction) without any signs of malicious intent, data exfiltration, or prompt injection.

Capability Assessment

ℹ Purpose & Capability

The skill claims MinerU-powered OCR and the SKILL.md explicitly uses the mineru-open-api CLI with commands that align with that purpose (flash-extract, extract, --ocr, --model). However, the metadata declares no primary credential or environment requirements while the README implies advanced features use the MinerU API (which typically requires an API token). This mismatch is unexplained.

ℹ Instruction Scope

Instructions are narrowly scoped to installing the mineru-open-api CLI and running it against user PDFs, creating an output directory under the user's home. The SKILL.md does not instruct the agent to read unrelated system files or exfiltrate data. Concern: it omits details on how to supply API credentials for advanced/precision extracts, so the agent (or user) may need to supply secrets or the CLI may prompt — that behavior is not documented here.

⚠ Install Mechanism

The SKILL.md tells users to run `npm install -g mineru-open-api`. Installing an arbitrary global npm package executes third-party code on the host and is a moderate-risk operation unless the package and publisher are verified. The skill has no install spec or verified homepage/source in its metadata to confirm the package origin.
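One way to avoid a global install is to place the package in a dedicated directory instead. The sketch below is an illustration under stated assumptions, not the skill's documented procedure: it uses npm's `--prefix` and `--ignore-scripts` options, refuses to run as root, and uses hypothetical function names and paths. Skipping lifecycle scripts can break packages that rely on them, so treat that flag as a cautious default rather than a guarantee that the CLI will work.

```python
import os
import subprocess
from pathlib import Path


def build_local_install_command(package: str, prefix: Path) -> list[str]:
    """Build an npm command that installs into `prefix` rather than globally."""
    return ["npm", "install", "--prefix", str(prefix), "--ignore-scripts", package]


def install_locally(package: str, prefix: Path) -> Path:
    """Install `package` under `prefix` and return the directory holding its executables."""
    # os.geteuid does not exist on Windows, hence the hasattr guard.
    if hasattr(os, "geteuid") and os.geteuid() == 0:
        raise PermissionError("refusing to run npm install as root")
    prefix.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_local_install_command(package, prefix), check=True)
    # For a non-global install, npm links executables into node_modules/.bin.
    return prefix / "node_modules" / ".bin"
```

For example, `install_locally("mineru-open-api", Path.home() / "ocr-sandbox")` would return the directory expected to hold the CLI, assuming the package exposes an executable. This reduces exposure compared with `-g` but is not a substitute for a container or VM.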

ℹ Credentials

No environment variables or credentials are declared in the metadata, which is reasonable for quick/no-token flash-extract. But the skill advertises advanced OCR powered by MinerU (VLM/pipeline models) which almost certainly requires API keys or tokens; the absence of any guidance or declared env vars for providing those secrets is an unexplained omission.
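Because the skill does not say which variable, if any, carries the token, a cautious caller can at least limit what a child process inherits. The sketch below filters an environment down to an explicit allowlist. `EXAMPLE_OCR_TOKEN` is a hypothetical placeholder, not a variable the CLI is known to read.

```python
from collections.abc import Mapping


def build_child_env(base: Mapping[str, str], allowed: set[str]) -> dict[str, str]:
    """Return only the allowlisted variables from `base`."""
    return {name: value for name, value in base.items() if name in allowed}


# Illustration with a fake environment; EXAMPLE_OCR_TOKEN is hypothetical.
fake_env = {
    "PATH": "/usr/bin",
    "UNRELATED_SECRET": "do-not-leak",
    "EXAMPLE_OCR_TOKEN": "token-value",
}
child_env = build_child_env(fake_env, {"PATH", "EXAMPLE_OCR_TOKEN"})
```

The result can be passed as `env=child_env` to `subprocess.run`, so unrelated secrets in the parent environment are not visible to the third-party CLI.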

✓ Persistence & Privilege

The skill does not request always: true, has no install spec in the registry, and does not claim to modify other skills or system-wide settings. Creating an output directory under the user's home is expected for file output.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install smart-pdf-ocr
After installation, invoke the skill by name or use /smart-pdf-ocr
Provide required inputs per the skill's parameter spec and get structured output

Version History

v0.2.0

Added MinerU API integration via mineru-open-api CLI for PDF OCR

v0.1.0

Initial release

Metadata

Slug: smart-pdf-ocr
Version: 0.2.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 2

Frequently Asked Questions

What is Smart PDF OCR?

Intelligent PDF OCR powered by MinerU API. Extract text from scanned PDFs, image-based PDFs, and photographed documents using mineru-open-api CLI with advanc... It is an AI Agent Skill for Claude Code / OpenClaw, with 120 downloads so far.

How do I install Smart PDF OCR?

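Run "/install smart-pdf-ocr" in the OpenClaw or Claude Code chat to install the skill. The skill's SKILL.md also tells users to run `npm install -g mineru-open-api`, so installing the mineru-open-api CLI is a separate setup step.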

Is Smart PDF OCR free?

Yes, Smart PDF OCR is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Smart PDF OCR support?

Smart PDF OCR is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Smart PDF OCR?

It is built and maintained by veeicwgy (@veeicwgy); the current version is v0.2.0.
