rishabhdugar

PDF Parse

Author: Rishabh Dugar · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 91
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install pdf-parse
Description
Parse a PDF into structured JSON: text, layout-aware blocks with bounding boxes, tables, and image metadata.
Usage (SKILL.md)

PDF Parse

What It Does

Parses a PDF into structured JSON with text content, layout-aware blocks (with normalized bounding boxes), tables, and image metadata.

When to Use

  • Extract structured data from PDFs (text, tables, images)
  • Get layout-aware content with bounding box coordinates
  • Parse invoices, forms, or reports into machine-readable format

Parsing Modes

Mode     Description
text     Text only
layout   Text + text blocks with bounding boxes
tables   Text + table blocks
full     Text + blocks + tables + images (default)
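
To make the mode table concrete, here is a minimal sketch that picks the narrowest mode covering the outputs you need. The helper name choose_mode and its parameters are hypothetical, not part of the API. The mapping follows only the table above and assumes that no single mode other than full returns both blocks and tables.

```python
def choose_mode(need_blocks=False, need_tables=False, need_images=False):
    """Return the narrowest documented mode that covers the requested outputs."""
    # Images appear only in "full"; blocks plus tables together also need "full".
    if need_images or (need_blocks and need_tables):
        return "full"
    if need_blocks:
        return "layout"
    if need_tables:
        return "tables"
    # Plain text is included in every mode, so "text" is the cheapest choice.
    return "text"
```

For example, choose_mode(need_tables=True) selects tables, while asking for both blocks and tables falls back to full.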

Required Inputs

Provide one of:

  • url — public URL to a PDF
  • Multipart upload with file field
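
The two input styles can be sketched with the Python standard library as follows. This only builds the request and does not send it. The function name build_request, the default filename, and the choice to reject a call that supplies both inputs are assumptions for illustration. The sketch puts nothing but the file field in the multipart body, because the source does not say how other options are passed with an upload.

```python
import json
import uuid
import urllib.request

ENDPOINT = "https://pdfapihub.com/api/v1/pdf/parse"


def build_request(api_key, url=None, pdf_bytes=None, filename="document.pdf"):
    """Build (but do not send) a parse request from exactly one input."""
    # Assumption: supplying both inputs, or neither, is treated as an error.
    if (url is None) == (pdf_bytes is None):
        raise ValueError("Provide exactly one of url or pdf_bytes")

    if url is not None:
        # JSON body with the public URL; mode is omitted, so the default applies.
        body = json.dumps({"url": url}).encode("utf-8")
        content_type = "application/json"
    else:
        # Hand-rolled multipart/form-data body with a single "file" part.
        boundary = uuid.uuid4().hex
        head = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        ).encode("utf-8")
        tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
        body = head + pdf_bytes + tail
        content_type = f"multipart/form-data; boundary={boundary}"

    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={"CLIENT-API-KEY": api_key, "Content-Type": content_type},
    )
```

The returned object can be passed to urllib.request.urlopen when you are ready to send it.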

Authentication

Send your API key in the CLIENT-API-KEY header.

Get your free API key at https://pdfapihub.com. Full API documentation is available at https://pdfapihub.com/docs.
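
One way to attach the header without hard-coding the key is to read it from the environment. The sketch below assumes a variable named PDFAPIHUB_API_KEY; that name is hypothetical and is not defined by the skill or the API. Only the CLIENT-API-KEY header name comes from the instructions above.

```python
import os

# Hypothetical variable name; the skill itself does not declare one.
API_KEY_ENV_VAR = "PDFAPIHUB_API_KEY"


def auth_headers(environ=os.environ):
    """Return the authentication header, failing early if the key is missing."""
    api_key = environ.get(API_KEY_ENV_VAR, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before calling the API")
    return {"CLIENT-API-KEY": api_key}
```

Passing a plain dict as environ makes the helper easy to exercise without touching the real environment.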

Use Cases

  • Invoice Parsing — Extract line items, totals, and vendor info from PDF invoices
  • Resume Parsing — Extract structured data (name, experience, skills) from PDF resumes
  • Contract Analysis — Extract clauses, dates, and parties from legal PDF contracts
  • Form Data Extraction — Pull filled form fields and values from PDF forms
  • Research Paper Analysis — Extract text, tables, and figures from academic PDFs
  • Document Indexing — Parse PDFs into structured JSON for search engine indexing

Example Usage

curl -X POST https://pdfapihub.com/api/v1/pdf/parse \
  -H "CLIENT-API-KEY: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf", "mode": "full", "pages": "1-3" }'
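
A rough Python equivalent of the curl call above, using only the standard library. The function name parse_pdf, the 60-second timeout, and the injectable opener parameter are assumptions added for illustration and testability. The sketch returns the decoded JSON as-is and makes no claim about the field names in the response, since the source does not list them.

```python
import json
import urllib.request

ENDPOINT = "https://pdfapihub.com/api/v1/pdf/parse"


def parse_pdf(api_key, pdf_url, mode="full", pages=None,
              opener=urllib.request.urlopen):
    """POST a parse request for a public PDF URL and return the decoded JSON."""
    payload = {"url": pdf_url, "mode": mode}
    if pages is not None:
        # Same string form as the curl example, e.g. "1-3".
        payload["pages"] = pages

    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "CLIENT-API-KEY": api_key,
            "Content-Type": "application/json",
        },
    )
    # opener defaults to urlopen; a stub can be injected for offline testing.
    with opener(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling parse_pdf(key, "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf", pages="1-3") would send the same body as the curl command.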
Security recommendations
This skill forwards PDFs to a third-party service (pdfapihub.com) and requires an API key in the CLIENT-API-KEY header. Before installing: (1) confirm you trust pdfapihub.com (review their privacy policy and retention practices); (2) avoid sending sensitive or regulated documents unless you’ve verified security/contractual protections; (3) supply the API key via the platform's secure credential storage (do not paste it into chat); (4) test with non-sensitive samples first; and (5) if you need stronger assurance, ask the skill author for a homepage, company identity, and privacy/terms links — the owner is currently unknown.
Functional analysis
Type: OpenClaw Skill
Name: pdf-parse
Version: 1.0.0
The skill bundle is a standard wrapper for an external PDF parsing API (pdfapihub.com). It functions as described, facilitating the extraction of text and layout data from PDFs via a REST API, and contains no evidence of malicious intent, obfuscation, or prompt injection.
Capability tags
requires-sensitive-credentials
Capability assessment
Purpose & Capability
Name/description match the runtime instructions and example: the skill forwards PDFs (by URL or multipart upload) to pdfapihub.com for parsing and expects an API key in the CLIENT-API-KEY header — this is appropriate for a PDF parsing integration.
Instruction Scope
SKILL.md only instructs POSTing to https://pdfapihub.com/api/v1/pdf/parse with either a public URL or multipart file and an API key. That stays within the stated purpose, but the instructions do not warn that PDFs (which may contain sensitive PII or secrets) will be transmitted to a third-party service — a privacy/data-exfiltration risk users should be aware of.
Install Mechanism
Instruction-only skill with no install spec or code files; nothing is written to disk or downloaded by the skill itself, which minimizes installation risk.
Credentials
The skill requires an API key (header-based) according to SKILL.md and skill.json, but no required environment variables are declared in the registry metadata — this is not dangerous but is a minor inconsistency in how credentials are represented. Requesting an API key for the external service is proportionate to the functionality.
Persistence & Privilege
always:false and no install-time modifications or system paths requested. The skill does perform outbound network calls to pdfapihub.com when invoked (expected for its purpose).
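How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install pdf-parse
  3. Once installation finishes, invoke the skill by name or trigger it with /pdf-parse.
  4. Provide the inputs listed in the skill's parameter description to get structured output.
Version history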
v1.0.0
Parse PDFs into structured JSON with text, layout-aware blocks (with normalized bounding boxes), tables, and image metadata. Modes: text, layout, tables, full. Supports page selection.
Metadata
Slug: pdf-parse
Version: 1.0.0
License: MIT-0
Total installs: 1
Current installs: 1
Version count: 1
Frequently asked questions

What is PDF Parse?

Parse a PDF into structured JSON: text, layout-aware blocks with bounding boxes, tables, and image metadata. It is an AI agent skill plugin for Claude Code / OpenClaw, with 91 total downloads to date.

How do I install PDF Parse?

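Run the command "/install pdf-parse" in the OpenClaw or Claude Code chat box for a one-step install; the installation itself needs no additional configuration. Calls to the parsing API still require a pdfapihub.com API key, as described under Authentication.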

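Is PDF Parse free?

Yes. PDF Parse is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does PDF Parse support?

PDF Parse is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops PDF Parse?

It is developed and maintained by Rishabh Dugar (@rishabhdugar); the current version is v1.0.0.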
