rishabhdugar

PDF Parse

by Rishabh Dugar · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
Downloads: 91 · Stars: 0 · Active Installs: 1 · Versions: 1
Install in OpenClaw
/install pdf-parse
Description
Parse a PDF into structured JSON: text, layout-aware blocks with bounding boxes, tables, and image metadata.
README (SKILL.md)

PDF Parse

What It Does

Parses a PDF into structured JSON with text content, layout-aware blocks (with normalized bounding boxes), tables, and image metadata.

When to Use

  • Extract structured data from PDFs (text, tables, images)
  • Get layout-aware content with bounding box coordinates
  • Parse invoices, forms, or reports into machine-readable format

Parsing Modes

Mode     Description
text     Text only
layout   Text + text blocks with bounding boxes
tables   Text + table blocks
full     Text + blocks + tables + images (default)
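
A minimal sketch of validating the mode before building a request body. The field names `url`, `mode`, and `pages` come from the skill's example request; `build_payload` and `VALID_MODES` are illustrative names, not part of the skill or the pdfapihub.com API.

```python
from typing import Optional

# The four modes listed in the table above; "full" is the documented default.
VALID_MODES = ("text", "layout", "tables", "full")


def build_payload(url: str, mode: str = "full", pages: Optional[str] = None) -> dict:
    """Build the JSON body for a URL-based parse request (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    payload = {"url": url, "mode": mode}
    if pages is not None:
        # The skill's example passes a page range as a string such as "1-3".
        payload["pages"] = pages
    return payload
```

Rejecting an unknown mode locally avoids spending an API call on a request that is likely to fail.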

Required Inputs

Provide one of:

  • url — public URL to a PDF
  • Multipart upload with file field
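
For the upload path, the sketch below encodes a PDF as a multipart/form-data body using only the Python standard library. The `file` field name comes from this README; `encode_pdf_upload` is an illustrative name. The README does not say how `mode` or `pages` are passed alongside an upload, so they are left out here.

```python
import uuid


def encode_pdf_upload(filename: str, pdf_bytes: bytes) -> tuple:
    """Encode one PDF as a multipart/form-data body under the "file" field.

    Returns (content_type_header_value, body_bytes). Assumes a plain
    filename without quotes or non-ASCII characters.
    """
    boundary = uuid.uuid4().hex  # random delimiter, very unlikely to occur in the PDF
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + pdf_bytes + tail
```

The returned content type would be sent as the `Content-Type` header and the bytes as the POST body.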

Authentication

Send your API key in the CLIENT-API-KEY header.

Get your free API key at https://pdfapihub.com. Full API documentation is available at https://pdfapihub.com/docs.
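
One way to keep the key out of source code and chat is to read it from the environment. This is a sketch under assumptions: `PDFAPIHUB_API_KEY` is a hypothetical variable name chosen for illustration (the skill does not define one), and `auth_headers` is an illustrative helper.

```python
import os
from typing import Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the CLIENT-API-KEY header, reading the key from the environment."""
    env = os.environ if env is None else env
    # PDFAPIHUB_API_KEY is a hypothetical variable name, not defined by the skill.
    api_key = env.get("PDFAPIHUB_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("PDFAPIHUB_API_KEY is not set")
    return {"CLIENT-API-KEY": api_key}
```

Failing early when the key is missing gives a clearer error than an authentication failure from the remote service.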

Use Cases

  • Invoice Parsing — Extract line items, totals, and vendor info from PDF invoices
  • Resume Parsing — Extract structured data (name, experience, skills) from PDF resumes
  • Contract Analysis — Extract clauses, dates, and parties from legal PDF contracts
  • Form Data Extraction — Pull filled form fields and values from PDF forms
  • Research Paper Analysis — Extract text, tables, and figures from academic PDFs
  • Document Indexing — Parse PDFs into structured JSON for search engine indexing

Example Usage

curl -X POST https://pdfapihub.com/api/v1/pdf/parse \
  -H "CLIENT-API-KEY: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf", "mode": "full", "pages": "1-3" }'
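
The same request can be sketched in Python with the standard library. The endpoint, header names, and JSON fields mirror the curl command above; `build_parse_request` and `parse_pdf` are illustrative names, and the response is only assumed to be a JSON document, since its exact shape is not shown here.

```python
import json
import urllib.request
from typing import Optional

ENDPOINT = "https://pdfapihub.com/api/v1/pdf/parse"  # taken from the curl example


def build_parse_request(api_key: str, pdf_url: str, mode: str = "full",
                        pages: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"url": pdf_url, "mode": mode}
    if pages is not None:
        payload["pages"] = pages
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"CLIENT-API-KEY": api_key, "Content-Type": "application/json"},
    )


def parse_pdf(api_key: str, pdf_url: str, timeout: float = 30.0):
    """Send the request and decode the JSON response (performs a network call)."""
    request = build_parse_request(api_key, pdf_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending makes it possible to inspect the request before any document is transmitted to the third-party service.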
Usage Guidance
This skill forwards PDFs to a third-party service (pdfapihub.com) and requires an API key in the CLIENT-API-KEY header. Before installing:
  1. Confirm you trust pdfapihub.com (review their privacy policy and retention practices).
  2. Avoid sending sensitive or regulated documents unless you have verified security and contractual protections.
  3. Supply the API key via the platform's secure credential storage (do not paste it into chat).
  4. Test with non-sensitive samples first.
  5. If you need stronger assurance, ask the skill author for a homepage, company identity, and privacy/terms links; the owner is currently unknown.
Capability Analysis
Type: OpenClaw Skill
Name: pdf-parse
Version: 1.0.0
The skill bundle is a standard wrapper for an external PDF parsing API (pdfapihub.com). It functions as described, facilitating the extraction of text and layout data from PDFs via a REST API, and contains no evidence of malicious intent, obfuscation, or prompt injection.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
Name/description match the runtime instructions and example: the skill forwards PDFs (by URL or multipart upload) to pdfapihub.com for parsing and expects an API key in the CLIENT-API-KEY header — this is appropriate for a PDF parsing integration.
Instruction Scope
SKILL.md only instructs POSTing to https://pdfapihub.com/api/v1/pdf/parse with either a public URL or multipart file and an API key. That stays within the stated purpose, but the instructions do not warn that PDFs (which may contain sensitive PII or secrets) will be transmitted to a third-party service — a privacy/data-exfiltration risk users should be aware of.
Install Mechanism
Instruction-only skill with no install spec or code files; nothing is written to disk or downloaded by the skill itself, which minimizes installation risk.
Credentials
The skill requires an API key (header-based) according to SKILL.md and skill.json, but no required environment variables are declared in the registry metadata — this is not dangerous but is a minor inconsistency in how credentials are represented. Requesting an API key for the external service is proportionate to the functionality.
Persistence & Privilege
always:false and no install-time modifications or system paths requested. The skill does perform outbound network calls to pdfapihub.com when invoked (expected for its purpose).
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdf-parse
  3. After installation, invoke the skill by name or use /pdf-parse
  4. Provide a PDF (public URL or file upload) and your pdfapihub.com API key; the skill returns the parsed content as structured JSON
Version History
v1.0.0
Parse PDFs into structured JSON with text, layout-aware blocks (with normalized bounding boxes), tables, and image metadata. Modes: text, layout, tables, full. Supports page selection.
Metadata
Slug pdf-parse
Version 1.0.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is PDF Parse?

Parse a PDF into structured JSON: text, layout-aware blocks with bounding boxes, tables, and image metadata. It is an AI Agent Skill for Claude Code / OpenClaw, with 91 downloads so far.

How do I install PDF Parse?

Run "/install pdf-parse" in the OpenClaw or Claude Code chat to install it in one step. To use the skill, you also need a pdfapihub.com API key.

Is PDF Parse free?

Yes, the skill itself is free to download, install, and use under the MIT-0 license. It calls the pdfapihub.com API, which requires an API key; the README describes that key as free.

Which platforms does PDF Parse support?

PDF Parse is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PDF Parse?

It is built and maintained by Rishabh Dugar (@rishabhdugar); the current version is v1.0.0.
