PDF OCR Parse
/install pdf-ocr-parse
What It Does
Rasterises each selected page of a PDF at the given DPI, then runs Tesseract OCR on each page image. Returns per-page text with confidence scores, and optionally per-word bounding boxes.
When to Use
- Extract text from scanned PDF documents
- OCR invoices, receipts, or legacy documents in PDF format
- Extract digits-only data (invoice amounts) with char_whitelist
- Process multi-language documents
Required Inputs
Provide one of:
- url: URL to a scanned PDF
- base64_pdf: base64-encoded PDF
- Multipart upload with file field
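As a rough illustration of the two JSON-based input options, the sketch below builds a request body from either a URL or raw PDF bytes. The helper name `build_payload` is hypothetical, and the "exactly one source" check is an assumption drawn from the "Provide one of" wording, not confirmed API behaviour.

```python
import base64
import json
from typing import Optional


def build_payload(url: Optional[str] = None,
                  pdf_bytes: Optional[bytes] = None,
                  **options) -> dict:
    """Build a JSON-serialisable request body from one PDF source.

    Hypothetical helper: the field names url and base64_pdf come from the
    list above; rejecting "both" or "neither" is an assumption.
    """
    if (url is None) == (pdf_bytes is None):
        raise ValueError("provide exactly one of url or pdf_bytes")
    if url is not None:
        payload = {"url": url}
    else:
        # base64_pdf carries the raw PDF bytes encoded as base64 text.
        payload = {"base64_pdf": base64.b64encode(pdf_bytes).decode("ascii")}
    payload.update(options)  # e.g. pages, lang, dpi, detail
    return payload


if __name__ == "__main__":
    print(json.dumps(build_payload(url="https://example.com/scan.pdf", dpi=300)))
```

The multipart upload path is not covered here because it sends the file as form data rather than JSON.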
Authentication
Send your API key in the CLIENT-API-KEY header.
Get your free API key at https://pdfapihub.com. Full API documentation is available at https://pdfapihub.com/docs.
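One way to keep the key out of source code is to read it from the environment when building headers. This is a minimal sketch: the function name `auth_headers` and the environment variable `PDFAPIHUB_API_KEY` are assumptions made for illustration; only the `CLIENT-API-KEY` header name comes from this page.

```python
import os


def auth_headers(env_var: str = "PDFAPIHUB_API_KEY") -> dict:
    """Return request headers carrying the API key read from the environment.

    The environment variable name is a hypothetical convention, not
    something the API requires.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"set {env_var} to your API key first")
    return {
        "CLIENT-API-KEY": api_key,
        "Content-Type": "application/json",
    }
```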
Use Cases
- Scanned Invoice Processing — OCR scanned PDF invoices to extract text for accounting systems
- Legacy Document Digitization — Convert old scanned paper documents into searchable text
- Insurance Claims — Extract text from scanned claim forms and medical documents
- Legal Discovery — OCR scanned legal documents for full-text search and review
- Multi-Language Documents — Process documents in Hindi, French, German, etc. with language-specific models
- Form Digitization — Extract filled field values from scanned paper forms
Tesseract Configuration
| Param | Default | Description |
|---|---|---|
| lang | eng | Language code(s), + separated |
| psm | 3 | Page segmentation mode (0–13) |
| oem | 3 | OCR engine mode (0=legacy, 1=LSTM, 3=default) |
| dpi | 200 | Rasterisation DPI (72–400) |
| char_whitelist | (none) | Restrict to specific characters |
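The ranges in the table can be checked on the client before a request is sent. The sketch below is one possible approach, with a hypothetical helper name `validate_ocr_options`. The defaults and the psm and dpi ranges are taken from the table; accepting oem values 0 through 3 is an assumption based on Tesseract's own engine modes, and the server may validate differently.

```python
def validate_ocr_options(lang="eng", psm=3, oem=3, dpi=200, char_whitelist=None):
    """Check options against the documented ranges and return them as a dict.

    Hypothetical client-side helper; it does not call the API.
    """
    # Accept a list of language codes and join them with "+", e.g. "eng+hin".
    if isinstance(lang, (list, tuple)):
        lang = "+".join(lang)
    if not lang or any(not code for code in lang.split("+")):
        raise ValueError("lang must be one or more codes joined by '+'")
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    if not 0 <= oem <= 3:  # assumption: Tesseract defines engine modes 0-3
        raise ValueError("oem must be between 0 and 3")
    if not 72 <= dpi <= 400:
        raise ValueError("dpi must be between 72 and 400")
    options = {"lang": lang, "psm": psm, "oem": oem, "dpi": dpi}
    if char_whitelist:
        # Only include the whitelist when the caller actually sets one.
        options["char_whitelist"] = char_whitelist
    return options


if __name__ == "__main__":
    # Digits-only extraction, e.g. for invoice amounts.
    print(validate_ocr_options(dpi=300, char_whitelist="0123456789.,"))
```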
Example Usage
curl -X POST https://pdfapihub.com/api/v1/pdf/ocr/parse \
  -H "CLIENT-API-KEY: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf",
    "pages": "1-3",
    "lang": "eng",
    "dpi": 300,
    "detail": "words"
  }'
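For callers who prefer Python to curl, the same request could be expressed roughly as follows using only the standard library. The function names `build_ocr_request` and `send` are hypothetical, and decoding the response as JSON is an assumption, since the response format is not shown on this page.

```python
import json
import urllib.request

API_URL = "https://pdfapihub.com/api/v1/pdf/ocr/parse"


def build_ocr_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "CLIENT-API-KEY": api_key,
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 60.0):
    """Send the request and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_ocr_request("your_api_key", {
        "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf",
        "pages": "1-3",
        "lang": "eng",
        "dpi": 300,
        "detail": "words",
    })
    print(send(request))
```

Splitting request construction from sending keeps the first half easy to inspect or log without touching the network.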
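Installation
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install pdf-ocr-parse
- Once installation finishes, invoke the Skill by name or trigger it with /pdf-ocr-parse
- Provide the required inputs described in the Skill's parameter documentation to get structured output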
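What is PDF OCR Parse?
PDF OCR Parse extracts text from scanned PDFs using Tesseract OCR. It supports multiple languages, page selection, DPI control, and word-level bounding boxes. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 79 times to date.
How do I install PDF OCR Parse?
Run the command "/install pdf-ocr-parse" in the OpenClaw or Claude Code chat box for a one-step install. No additional configuration is needed.
Is PDF OCR Parse free?
Yes. PDF OCR Parse is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does PDF OCR Parse support?
PDF OCR Parse is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed PDF OCR Parse?
It is developed and maintained by Rishabh Dugar (@rishabhdugar). The current version is v1.0.0.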