← Back to Skills Marketplace

PaddleOCR Document Parsing

Name: PaddleOCR Document Parsing
Author: bobholamovic

by Lin Manhui · GitHub ↗ · v3.0.0 · MIT-0

cross-platform ✓ Security Clean

10267

Downloads

Stars

Active Installs

Versions

Install in OpenClaw

/install paddleocr-doc-parsing

Description

Use this skill to extract structured Markdown/JSON from PDFs and document images—tables with cell-level precision, formulas as LaTeX, figures, seals, charts,...

Usage Guidance

Install only if you trust the PaddleOCR CLI package and are comfortable using a hosted OCR API with a PaddleOCR access token. Do not process confidential, regulated, financial, legal, or customer documents unless you have confirmed the data-sharing, retention, and compliance terms are acceptable.

Capability Assessment

✓ Purpose & Capability

The stated purpose is extracting structured Markdown/JSON from PDFs and document images, and the artifact only contains PaddleOCR CLI instructions for that task.

ℹ Instruction Scope

Commands are user-directed and scoped to explicit URLs, local file paths, page ranges, and output locations; the skill should more plainly warn that local files submitted with `paddleocr api --file_path` may be sent to a remote OCR service.

ℹ Install Mechanism

Installation declares the `paddleocr` package via `uv`, the `paddleocr` binary, and `PADDLEOCR_ACCESS_TOKEN`; no helper scripts or hidden executable artifacts are included.

ℹ Credentials

Use of an API token and cloud OCR is proportionate for hosted document parsing, but the target documents may include invoices, financial reports, and other sensitive content.

✓ Persistence & Privilege

There is no evidence of background persistence, privilege escalation, broad filesystem indexing, credential harvesting, destructive actions, or automatic execution.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install paddleocr-doc-parsing
After installation, invoke the skill by name or use /paddleocr-doc-parsing
Provide required inputs per the skill's parameter spec and get structured output

Version History

v3.0.0

- Significant update: migrated from custom scripts and detailed workflow to usage of the official PaddleOCR CLI. - Removed all helper scripts and schema/reference files. - Updated instructions for document parsing using the new paddleocr api command-line interface. - Simplified configuration: only requires PADDLEOCR_ACCESS_TOKEN and paddleocr CLI. - Added quick-start usage examples with key CLI options and new output format. - Clarified error handling and preprocessing recommendations.

v2.0.16

- Switched installation and dependency management to inline PEP 723 requirements, eliminating the separate requirements.txt files. - Updated usage instructions: scripts are now executed via uv (Python 3.9+ required), replacing previous pip/python guidance. - Clarified compatibility: now requires Python 3.9+, uv, and internet access. - Added SPDX license field (Apache-2.0). - Cleaned up documentation by removing references to installation via requirements files and updating all example commands to use uv.

v2.0.15

- Replaced the main document parsing script: removed scripts/vl_caller.py and added scripts/layout_caller.py. - All usage and documentation instructions now reference layout_caller.py instead of vl_caller.py. - No other workflow or API changes; skill functionality and interface remain the same.

v2.0.14

No file changes detected in this version. - No user-facing or implementation changes. - Documentation, API, and skill behavior remain unchanged.

v2.0.13

No changes detected in this version. - Version bumped to 2.0.13 with no modifications to files or documentation. - Behavior, usage, and installation remain unchanged.

v2.0.12

- Documentation streamlined for clarity and usability—workflow, troubleshooting, and schema details are now easier to find. - Output fields renamed to match actual script results (e.g., layoutParsingResults) and old field names removed. - More concise and focused instructions: Removed redundant warnings, emphasized complete user-facing output, and added timing guidance for large files. - Detailed first-time setup, error handling, and output parsing sections improved for newcomers and troubleshooting. - Usage examples refreshed; file persistence and `--stdout` handling clarified.

v2.0.11

- Dependency installation has been streamlined: requirements.txt and requirements-optimize.txt are now located at the skill's root, not inside scripts/. - Updated installation instructions to reflect new requirements file locations. - Unused scripts/requirements*.txt files have been removed for clarity. - No changes to skill functionality or API usage.

v2.0.10

No user-facing changes in this version. - No updates detected in files or documentation. - Functionality, usage instructions, and requirements remain unchanged.

v2.0.9

No functional changes detected in this version. - Updated the skill description with more concise, keyword-rich, and bilingual (Chinese/English) trigger terms for improved discovery and routing. - Added an explicit note in the documentation directing routing/discovery logic to use the `description` field for trigger keywords. - No code or behavioral changes; skill usage and workflow remain the same.

v2.0.8

No user-facing changes in this version. - Version bump; no file or documentation changes detected.

v2.0.7

- Added installation instructions for required Python dependencies in the usage guide. - Included optional installation step for document optimization and PDF splitting support. - No changes to API, features, or workflow.

v2.0.6

- Removed the configuration script (scripts/configure.py) from the project. - Users are now expected to set environment variables using standard host application methods. - Documentation updated to clarify secure configuration and that configure.py is no longer used or required. - No functional changes to document parsing workflow.

v2.0.5

### paddleocr-doc-parsing 2.0.5 Changelog - Updated the description for improved clarity and conciseness. - The configuration instructions now assume environment variables are typically pre-configured; only notify about configuration issues if an error occurs during parsing. - Standardized error messages and clarified the configuration workflow for better user guidance. - Minor formatting and wording improvements throughout documentation.

v2.0.4

No user-visible changes in this version. - No file changes detected; documentation, functionality, and usage remain the same.

v2.0.3

- Added metadata section to SKILL.md, specifying required environment variables, dependencies, homepage, and emoji. - No changes to functionality or core documentation content. - Version bump for manifest metadata improvement and compatibility.

v2.0.2

No user-visible changes in this version. - Content and workflow remain unchanged. - No modifications or updates were detected in the skill files.

v2.0.1

- Added Openclaw-compatible metadata, including required environment variables and binaries. - Declared environment variables needed: PADDLEOCR_DOC_PARSING_API_URL, PADDLEOCR_ACCESS_TOKEN, PADDLEOCR_DOC_PARSING_TIMEOUT. - Added metadata section with homepage link and primary environment variable. - No changes to any code or documentation content outside of the metadata addition.

v2.0.0

Version 2.0.0 – Major update: Migrated from bash to Python, improved modular parsing, and enforced strict usage rules. - Replaced legacy bash script (paddleocr_parse.sh) with a Python-based orchestration (e.g., vl_caller.py). - Added configurable scripts for document parsing, PDF splitting, configuration, and optimization. - Introduced a clear API usage policy: only use script-based API calls; no direct parsing or fallback methods allowed. - Enhanced documentation with new setup, workflow, usage constraints, and error-handling guidelines. - Provided schema references and structured output for easier downstream processing.

v1.0.3

- No changes detected in this version; all features and documentation remain the same as the previous release.

v1.0.2

- No changes detected in this release.

Metadata

Slug paddleocr-doc-parsing

Version 3.0.0

License MIT-0

All-time Installs 345

Active Installs 78

Total Versions 22

Frequently Asked Questions

What is PaddleOCR Document Parsing?

Use this skill to extract structured Markdown/JSON from PDFs and document images—tables with cell-level precision, formulas as LaTeX, figures, seals, charts,... It is an AI Agent Skill for Claude Code / OpenClaw, with 10267 downloads so far.

How do I install PaddleOCR Document Parsing?

Run "/install paddleocr-doc-parsing" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is PaddleOCR Document Parsing free?

Yes, PaddleOCR Document Parsing is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PaddleOCR Document Parsing support?

PaddleOCR Document Parsing is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created PaddleOCR Document Parsing?

It is built and maintained by Lin Manhui (@bobholamovic); the current version is v3.0.0.

More Skills