PDF OCR Parse

Name: PDF OCR Parse
Author: rishabhdugar

Author: Rishabh Dugar · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install pdf-ocr-parse

Description

Extract text from scanned PDFs using Tesseract OCR. Supports multiple languages, page selection, DPI control, and word-level bounding boxes.

Usage Instructions (SKILL.md)

PDF OCR Parse

What It Does

Rasterises each selected page of a PDF at the given DPI, then runs Tesseract OCR on each page image. Returns per-page text with confidence scores, and optionally per-word bounding boxes.

When to Use

Extract text from scanned PDF documents
OCR invoices, receipts, or legacy documents in PDF format
Extract digits-only data (invoice amounts) with char_whitelist
Process multi-language documents

Required Inputs

Provide one of:

url — URL to a scanned PDF
base64_pdf — base64-encoded PDF
Multipart upload with file field
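For JSON requests, the choice between the two non-multipart inputs can be sketched as below. The helper name `build_source` is hypothetical. It also assumes that exactly one source should be sent per request and that `base64_pdf` takes plain standard base64 with no data-URI prefix; the text above states neither, so check the API documentation before relying on either.

```python
import base64
from typing import Optional


def build_source(url: Optional[str] = None,
                 pdf_bytes: Optional[bytes] = None) -> dict:
    """Return the JSON field that identifies the PDF to OCR.

    Hypothetical helper: enforces exactly one source, which is an
    assumption layered on top of the "provide one of" wording.
    """
    if (url is None) == (pdf_bytes is None):
        raise ValueError("provide exactly one of url or pdf_bytes")
    if url is not None:
        return {"url": url}
    # Assumption: plain standard base64, without a "data:" prefix.
    return {"base64_pdf": base64.b64encode(pdf_bytes).decode("ascii")}
```

A local file would be passed as `build_source(pdf_bytes=open("scan.pdf", "rb").read())`, and the resulting dict merged with the OCR options before sending.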

Authentication

Send your API key in the CLIENT-API-KEY header.

Get your free API key at https://pdfapihub.com. Full API documentation is available at https://pdfapihub.com/docs.

Use Cases

Scanned Invoice Processing — OCR scanned PDF invoices to extract text for accounting systems
Legacy Document Digitization — Convert old scanned paper documents into searchable text
Insurance Claims — Extract text from scanned claim forms and medical documents
Legal Discovery — OCR scanned legal documents for full-text search and review
Multi-Language Documents — Process documents in Hindi, French, German, etc. with language-specific models
Form Digitization — Extract filled field values from scanned paper forms

Tesseract Configuration

| Param | Default | Description |
| --- | --- | --- |
| `lang` | `eng` | Language code(s), `+` separated |
| `psm` | `3` | Page segmentation mode (0–13) |
| `oem` | `3` | OCR engine mode (0=legacy, 1=LSTM, 3=default) |
| `dpi` | `200` | Rasterisation DPI (72–400) |
| `char_whitelist` | (none) | Restrict to specific characters |
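The ranges in the table can be checked on the client before a request is made. The following is a minimal sketch, not part of the skill: `validate_ocr_options` is a hypothetical name, and the service may apply different or stricter rules on its side. Tesseract itself also defines engine mode 2 (legacy and LSTM combined); the sketch accepts it, but whether this API does is an assumption.

```python
from typing import Optional


def validate_ocr_options(lang: str = "eng",
                         psm: int = 3,
                         oem: int = 3,
                         dpi: int = 200,
                         char_whitelist: Optional[str] = None) -> dict:
    """Check options against the documented ranges and return them as a dict."""
    # lang is one or more language codes joined by "+", e.g. "eng+hin".
    if any(not code for code in lang.split("+")):
        raise ValueError(f"empty language code in {lang!r}")
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    # The table lists 0, 1 and 3; accepting 2 here is an assumption.
    if oem not in (0, 1, 2, 3):
        raise ValueError("oem must be 0, 1, 2 or 3")
    if not 72 <= dpi <= 400:
        raise ValueError("dpi must be between 72 and 400")
    options = {"lang": lang, "psm": psm, "oem": oem, "dpi": dpi}
    if char_whitelist:
        # Only sent when set, e.g. "0123456789." for invoice amounts.
        options["char_whitelist"] = char_whitelist
    return options
```

Calling `validate_ocr_options(lang="eng+hin", dpi=300, char_whitelist="0123456789.")` returns a dict ready to merge into the request body.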

Example Usage

curl -X POST https://pdfapihub.com/api/v1/pdf/ocr/parse \
  -H "CLIENT-API-KEY: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf",
    "pages": "1-3",
    "lang": "eng",
    "dpi": 300,
    "detail": "words"
  }'
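The same request can be assembled from Python with only the standard library. This is a sketch rather than an official client: `build_ocr_request` is a hypothetical helper, and the response format is not documented on this page, so the commented-out send step only assumes that the service replies with JSON.

```python
import json
import urllib.request

ENDPOINT = "https://pdfapihub.com/api/v1/pdf/ocr/parse"


def build_ocr_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "CLIENT-API-KEY": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_ocr_request("your_api_key", {
    "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf",
    "pages": "1-3",
    "lang": "eng",
    "dpi": 300,
    "detail": "words",
})

# To send it (assumes a JSON response body, which is not confirmed here):
# with urllib.request.urlopen(request, timeout=60) as response:
#     result = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the headers and body before any document leaves the machine.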

Security Recommendations

This skill appears coherent and only forwards PDFs to pdfapihub.com for OCR using an API key you provide. Before installing or using it: (1) verify and trust the pdfapihub.com service (privacy, retention, and security policies) because any uploaded PDF — potentially containing sensitive data — will be sent to that third party; (2) avoid using production or highly sensitive documents until you’ve tested with non-sensitive samples; (3) manage the API key carefully (use a dedicated key with least privilege and rotate it if possible); and (4) note that the skill owner and homepage are unknown — consider this when deciding whether to trust the service.

Functional Analysis

Type: OpenClaw Skill
Name: pdf-ocr-parse
Version: 1.0.0

The skill is a standard API wrapper for a cloud-based OCR service (pdfapihub.com). The files (SKILL.md, skill.json) correctly define parameters for Tesseract OCR processing and do not contain any evidence of malicious execution, data exfiltration beyond the stated purpose, or prompt injection attacks.

Capability Tags

requires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

The name/description, SKILL.md, example.json, and skill.json all describe the same behavior: submit a PDF (URL/base64/file) to pdfapihub.com for OCR and return text/bounding boxes. The required capabilities (API key in header) match the stated purpose.

✓ Instruction Scope

Runtime instructions are narrowly scoped to uploading or referencing a PDF and configuring Tesseract params (lang, dpi, psm, etc.). The SKILL.md does not instruct the agent to read unrelated files, environment variables, or system state.

✓ Install Mechanism

No install spec or code is included (instruction-only), so nothing is written to disk or fetched during installation. This is the lowest-risk install model and is proportionate for an API-wrapping skill.

ℹ Credentials

No platform environment variables are required, which matches the registry metadata. The skill does require an API key provided in the CLIENT-API-KEY header (skill.json marks auth as required) — this is expected for an external API but is a credential that will be sent to pdfapihub.com and should be provisioned per your security policies.

✓ Persistence & Privilege

always is false and the skill is user-invocable (normal). It does not request persistent system privileges or modify other skills' configuration. Autonomous invocation is permitted by default but does not introduce extra incoherence here.

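How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install pdf-ocr-parse
After installation, invoke the Skill by name or trigger it with /pdf-ocr-parse
Provide the required inputs described in the Skill's parameter documentation to receive structured output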

Version History

v1.0.0

OCR scanned PDFs using Tesseract. Rasterises pages at configurable DPI, then runs OCR with multi-language support (eng+hin, eng+fra, etc.). Returns per-page text with confidence scores and optional word-level bounding boxes.

Metadata

Slug pdf-ocr-parse

Version 1.0.0

License MIT-0

Total installs 1

Current installs 1

Version count 1

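FAQ

What is PDF OCR Parse?

Extract text from scanned PDFs using Tesseract OCR. Supports multiple languages, page selection, DPI control, and word-level bounding boxes. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 79 cumulative downloads to date.

How do I install PDF OCR Parse?

Run the command "/install pdf-ocr-parse" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, although requests require the API key described under Authentication.

Is PDF OCR Parse free?

Yes. PDF OCR Parse is free and is released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does PDF OCR Parse support?

PDF OCR Parse is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed PDF OCR Parse?

It is developed and maintained by Rishabh Dugar (@rishabhdugar). The current version is v1.0.0.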
