
PDF Parse

Name: PDF Parse
Author: rishabhdugar

by Rishabh Dugar · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install pdf-parse

Description

Parse a PDF into structured JSON: text, layout-aware blocks with bounding boxes, tables, and image metadata.

README (SKILL.md)

PDF Parse

What It Does

Parses a PDF into structured JSON with text content, layout-aware blocks (with normalized bounding boxes), tables, and image metadata.

When to Use

Extract structured data from PDFs (text, tables, images)
Get layout-aware content with bounding box coordinates
Parse invoices, forms, or reports into machine-readable format

Parsing Modes

Mode	Description
`text`	Text only
`layout`	Text + text blocks with bounding boxes
`tables`	Text + table blocks
`full`	Text + blocks + tables + images (default)
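
As a rough illustration of how a mode is selected, the sketch below builds the JSON request body with one of the four modes from the table. The helper name `build_parse_payload` is hypothetical; the field names `url`, `mode`, and `pages` are taken from the curl request in the Example Usage section, and anything beyond those fields is not assumed.

```python
import json

# Mode names come from the Parsing Modes table; "full" is the documented default.
VALID_MODES = ("text", "layout", "tables", "full")


def build_parse_payload(url, mode="full", pages=None):
    """Hypothetical helper: return the JSON body for a parse request as a string."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    payload = {"url": url, "mode": mode}
    if pages is not None:
        # A page range such as "1-3" appears in the Example Usage request.
        payload["pages"] = pages
    return json.dumps(payload)
```

Rejecting unknown modes locally is a design choice for this sketch; how the service itself responds to an unsupported mode is not documented here.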

Required Inputs

Provide one of:

url — public URL to a PDF
Multipart upload with file field
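
For the upload option, a minimal sketch of encoding a local PDF as multipart/form-data using only the standard library might look like the following. The function name `encode_multipart_file` is hypothetical; the form field name `file` comes from the list above. Whether other options (such as the mode) can accompany an upload as form fields is not stated here, so the sketch sends only the file.

```python
import uuid


def encode_multipart_file(filename, content, field="file"):
    """Hypothetical helper: encode one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex  # random boundary that will not occur in the PDF bytes
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"
```

The returned content type would replace `application/json` in the `Content-Type` header when the body is posted.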

Authentication

Send your API key in the CLIENT-API-KEY header.

Get your free API key at https://pdfapihub.com. Full API documentation is available at https://pdfapihub.com/docs.
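
One way to keep the key out of source code and chat is to read it from an environment variable when the request is built. The sketch below assumes a variable named `PDFAPIHUB_API_KEY`, which is an illustrative name and not something the skill declares; the endpoint URL is the one used in the Example Usage section, and `build_request` is a hypothetical helper.

```python
import os
import urllib.request

# Endpoint as it appears in the Example Usage curl command.
PARSE_ENDPOINT = "https://pdfapihub.com/api/v1/pdf/parse"


def build_request(body, env_var="PDFAPIHUB_API_KEY"):
    """Hypothetical helper: build (but do not send) an authenticated POST request."""
    api_key = os.environ.get(env_var)  # env_var name is an assumption
    if not api_key:
        raise RuntimeError(f"set {env_var} before calling the API")
    # urllib normalizes header capitalization; HTTP header names are case-insensitive.
    return urllib.request.Request(
        PARSE_ENDPOINT,
        data=body,
        method="POST",
        headers={"CLIENT-API-KEY": api_key, "Content-Type": "application/json"},
    )
```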

Use Cases

Invoice Parsing — Extract line items, totals, and vendor info from PDF invoices
Resume Parsing — Extract structured data (name, experience, skills) from PDF resumes
Contract Analysis — Extract clauses, dates, and parties from legal PDF contracts
Form Data Extraction — Pull filled form fields and values from PDF forms
Research Paper Analysis — Extract text, tables, and figures from academic PDFs
Document Indexing — Parse PDFs into structured JSON for search engine indexing

Example Usage

curl -X POST https://pdfapihub.com/api/v1/pdf/parse \
  -H "CLIENT-API-KEY: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf", "mode": "full", "pages": "1-3" }'
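
The same request can be sketched in Python with the standard library. The function name `parse_pdf_by_url` and its `opener` parameter are hypothetical and exist only to make the sketch easy to try without network access. The response schema is not documented on this page, so the function simply returns whatever JSON the service sends back.

```python
import json
import urllib.request

PARSE_ENDPOINT = "https://pdfapihub.com/api/v1/pdf/parse"  # from the curl command above


def parse_pdf_by_url(api_key, pdf_url, mode="full", pages=None,
                     opener=urllib.request.urlopen):
    """Hypothetical helper: POST a parse request and return the decoded JSON."""
    payload = {"url": pdf_url, "mode": mode}
    if pages is not None:
        payload["pages"] = pages
    request = urllib.request.Request(
        PARSE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"CLIENT-API-KEY": api_key, "Content-Type": "application/json"},
    )
    # No assumptions are made about the response fields; the JSON is returned as-is.
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `parse_pdf_by_url(key, "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf", pages="1-3")` would mirror the curl example; HTTP errors surface as `urllib.error.HTTPError` exceptions.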

Usage Guidance

This skill forwards PDFs to a third-party service (pdfapihub.com) and requires an API key in the CLIENT-API-KEY header. Before installing:

(1) Confirm you trust pdfapihub.com (review their privacy policy and retention practices).
(2) Avoid sending sensitive or regulated documents unless you've verified security and contractual protections.
(3) Supply the API key via the platform's secure credential storage (do not paste it into chat).
(4) Test with non-sensitive samples first.
(5) If you need stronger assurance, ask the skill author for a homepage, company identity, and privacy/terms links; the owner is currently unknown.

Capability Analysis

Type: OpenClaw Skill
Name: pdf-parse
Version: 1.0.0

The skill bundle is a standard wrapper for an external PDF parsing API (pdfapihub.com). It functions as described, facilitating the extraction of text and layout data from PDFs via a REST API, and contains no evidence of malicious intent, obfuscation, or prompt injection.

Capability Tags

requires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

Name/description match the runtime instructions and example: the skill forwards PDFs (by URL or multipart upload) to pdfapihub.com for parsing and expects an API key in the CLIENT-API-KEY header — this is appropriate for a PDF parsing integration.

ℹ Instruction Scope

SKILL.md only instructs POSTing to https://pdfapihub.com/api/v1/pdf/parse with either a public URL or multipart file and an API key. That stays within the stated purpose, but the instructions do not warn that PDFs (which may contain sensitive PII or secrets) will be transmitted to a third-party service — a privacy/data-exfiltration risk users should be aware of.

✓ Install Mechanism

Instruction-only skill with no install spec or code files; nothing is written to disk or downloaded by the skill itself, which minimizes installation risk.

ℹ Credentials

The skill requires an API key (header-based) according to SKILL.md and skill.json, but no required environment variables are declared in the registry metadata — this is not dangerous but is a minor inconsistency in how credentials are represented. Requesting an API key for the external service is proportionate to the functionality.

✓ Persistence & Privilege

always:false and no install-time modifications or system paths requested. The skill does perform outbound network calls to pdfapihub.com when invoked (expected for its purpose).

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install pdf-parse
After installation, invoke the skill by name or use /pdf-parse
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Parse PDFs into structured JSON with text, layout-aware blocks (with normalized bounding boxes), tables, and image metadata. Modes: text, layout, tables, full. Supports page selection.

Metadata

Slug pdf-parse

Version 1.0.0

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 1

Frequently Asked Questions

What is PDF Parse?

PDF Parse is an AI agent skill for Claude Code / OpenClaw that parses a PDF into structured JSON: text, layout-aware blocks with bounding boxes, tables, and image metadata. It has 91 downloads so far.

How do I install PDF Parse?

Run "/install pdf-parse" in the OpenClaw or Claude Code chat to install it in one step. To call the API, you also need an API key from https://pdfapihub.com, sent in the CLIENT-API-KEY header.

Is PDF Parse free?

Yes, PDF Parse is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PDF Parse support?

PDF Parse is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PDF Parse?

It is built and maintained by Rishabh Dugar (@rishabhdugar); the current version is v1.0.0.
