
Pdf Invoice Parser

by tktk-ai · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
Downloads: 142 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install pdf-invoice-parser
Description
Extract structured data from PDF invoices and documents. Handles scanned PDFs (OCR) and digital PDFs. Outputs clean CSV/Excel with vendor, invoice number, da...
Usage Guidance
The skill appears coherent and limited to local PDF parsing, but follow best practices before running it on sensitive data: (1) Run in a virtualenv or container to isolate pip-installed packages; (2) review and pin dependency versions if you will install them on production systems; (3) install tesseract from your OS package manager as instructed (verify the source); (4) test on non-sensitive sample invoices to confirm parsing quality; and (5) if you need network or cloud integration later, prefer adding explicit, minimal credentials and review any new code for unexpected network activity. Overall this skill is fit-for-purpose but exercise standard supply-chain caution when installing third-party Python packages.
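Points (1) and (2) of the guidance above can be checked programmatically before installing anything. The following is a minimal sketch, not part of the skill: the helper names `in_virtualenv` and `unpinned` and the version numbers in the sample list are hypothetical and chosen only for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv/virtualenv."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def unpinned(requirement_lines):
    """Return requirement lines that lack an exact '==' version pin."""
    result = []
    for line in requirement_lines:
        requirement = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if requirement and "==" not in requirement:
            result.append(requirement)
    return result


# Hypothetical requirements list; the version numbers are placeholders.
sample_requirements = [
    "PyMuPDF==1.0.0",
    "pytesseract",
    "# OCR image handling",
    "openpyxl>=1.0",
]

print("in virtualenv:", in_virtualenv())
print("unpinned:", unpinned(sample_requirements))
```

If `in_virtualenv()` returns False, creating an environment with `python -m venv` before running pip keeps the third-party packages away from the system interpreter.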
Capability Analysis
Type: OpenClaw Skill
Name: pdf-invoice-parser
Version: 1.0.0

The skill bundle is a legitimate tool for extracting structured data from PDF invoices using standard libraries such as PyMuPDF, PyPDF2, and Tesseract OCR. The scripts (parse-invoice.py and parse-invoices.py) perform local file processing and regex-based data extraction as described, with no evidence of data exfiltration, malicious command execution, or prompt injection attacks.
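To give a sense of what regex-based data extraction on invoice text looks like, here is a minimal sketch using only Python's standard library. The patterns, the field names, and the `extract_fields` helper are assumptions made for illustration; the actual patterns in parse-invoice.py are not shown on this page and may differ.

```python
import re

# Hypothetical patterns; real invoices vary widely and need more cases.
PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    # \b keeps "Total" from matching inside "Subtotal".
    "total": re.compile(r"\bTotal\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(text: str) -> dict:
    """Return the first match for each field, or None when a field is absent."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up invoice text standing in for text extracted from a PDF.
sample = (
    "ACME Corp\n"
    "Invoice No: INV-1001\n"
    "Date: 2024-03-15\n"
    "Subtotal: $90.00\n"
    "Total: $99.00\n"
)

print(extract_fields(sample))
```

Returning None for missing fields, rather than raising, lets a batch run continue and leaves gaps visible in the output for manual review.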
Capability Tags
crypto
can-make-purchases
Capability Assessment
Purpose & Capability
Name/description match the included scripts and declared functionality: parsing searchable PDFs, optional OCR via pytesseract, and writing CSV/JSON/Excel-ready output. Required libraries (PyMuPDF, PyPDF2, Pillow, pytesseract, openpyxl) are appropriate for the stated purpose.
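The CSV and JSON output stage described above can be sketched with the standard library alone. This is an illustrative sketch under assumptions: the column names in `FIELDS` and the helpers `to_csv` and `to_json` are hypothetical and are not taken from the skill's scripts.

```python
import csv
import io
import json

# Hypothetical column order, loosely based on the fields named in the listing.
FIELDS = ["vendor", "invoice_number", "date", "total", "currency"]


def to_csv(records):
    """Serialize a list of invoice dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells instead of raising KeyError.
        writer.writerow({name: record.get(name, "") for name in FIELDS})
    return buffer.getvalue()


def to_json(records):
    """Serialize the same records to indented JSON."""
    return json.dumps(records, indent=2, ensure_ascii=False)


# Made-up record for demonstration.
records = [
    {
        "vendor": "ACME Corp",
        "invoice_number": "INV-1001",
        "date": "2024-03-15",
        "total": "99.00",
        "currency": "USD",
    }
]

print(to_csv(records))
print(to_json(records))
```

Keeping amounts as strings in this sketch avoids floating-point rounding; converting to `decimal.Decimal` would be a natural next step if totals need to be summed.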
Instruction Scope
SKILL.md instructs the agent/user to install dependencies and run the provided scripts on local PDF files or directories. The runtime instructions and the scripts operate only on user-provided PDFs and produce local output files; they do not attempt to read unrelated system files, environment variables, or contact external endpoints.
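Accepting either a single PDF or a directory of PDFs, as described above, can be handled with `pathlib`. The `collect_pdfs` helper below is a hypothetical sketch and does not reproduce the logic of parse-invoices.py; it only touches the path the caller passes in.

```python
import tempfile
from pathlib import Path


def collect_pdfs(target):
    """Return the PDF paths for a single file or for a directory (non-recursive)."""
    path = Path(target)
    if path.is_file():
        return [path] if path.suffix.lower() == ".pdf" else []
    if path.is_dir():
        return sorted(
            p for p in path.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"
        )
    raise FileNotFoundError(f"No such file or directory: {target}")


# Demo on a throwaway directory so no real invoices are touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "b.PDF", "notes.txt"):
        (Path(tmp) / name).write_bytes(b"")
    found_names = [p.name for p in collect_pdfs(tmp)]

print(found_names)  # ['a.pdf', 'b.PDF']
```

Matching on the lowercased suffix picks up files named `.PDF` as well, and sorting makes batch output order reproducible between runs.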
Install Mechanism
This is an instruction-only skill (no automated install spec). SKILL.md asks the user to pip install third-party packages and to install the tesseract system package via apt/brew. Installing packages via pip can execute arbitrary code during installation (normal for Python packages), so use a virtualenv or container and verify package sources. The example uses the pip flag --break-system-packages; it is not harmful in itself, but it overrides pip's protection for externally managed system Python installations and is unnecessary inside a virtualenv.
Credentials
The skill requests no environment variables, no credentials, and no config paths. All data access is limited to PDF files supplied by the user. There are no hidden credential usages in the code.
Persistence & Privilege
always is false and the skill does not modify other skills or system-wide agent settings. It does not persist credentials or enable itself automatically.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdf-invoice-parser
  3. After installation, invoke the skill by name or use /pdf-invoice-parser
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of pdf-invoice-parser.
  - Extracts structured data from digital and scanned PDF invoices.
  - Supports OCR for scanned/image-based PDFs.
  - Outputs invoice data as CSV, JSON, or Excel-ready TSV formats.
  - Captures key fields: vendor, invoice number, dates, line items, totals, and currency.
  - Supports batch processing of invoice directories.
  - Includes CLI usage examples and required dependencies for setup.
Metadata
Slug pdf-invoice-parser
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Pdf Invoice Parser?

Extract structured data from PDF invoices and documents. Handles scanned PDFs (OCR) and digital PDFs. Outputs clean CSV/Excel with vendor, invoice number, da... It is an AI Agent Skill for Claude Code / OpenClaw, with 142 downloads so far.

How do I install Pdf Invoice Parser?

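Run "/install pdf-invoice-parser" in the OpenClaw or Claude Code chat to install the skill. The skill is instruction-only, so SKILL.md then asks you to pip install the Python dependencies and to install the tesseract system package via apt/brew.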

Is Pdf Invoice Parser free?

Yes, Pdf Invoice Parser is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Pdf Invoice Parser support?

Pdf Invoice Parser is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Pdf Invoice Parser?

It is built and maintained by tktk-ai (@tktk-ai); the current version is v1.0.0.
