PyMuPDF PDF Parser Clawdbot Skill

by kesslerio · GitHub ↗ · v1.0.0
cross-platform · ✓ Security Clean
5382 Downloads · 4 Stars · 39 Active Installs · 1 Version
Install in OpenClaw
/install pymupdf-pdf-parser-clawdbot-skill
Description
Fast local PDF parsing with PyMuPDF (fitz) for Markdown/JSON outputs and optional images/tables. Use when speed matters more than robustness, or as a fallback while heavier parsers are unavailable. Default to single-PDF parsing with per-document output folders.
README (SKILL.md)

PyMuPDF PDF

Overview

Parse PDFs locally using PyMuPDF for fast, lightweight extraction into Markdown by default, with optional JSON and image/table outputs in a per-document directory.
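
As a rough illustration of the kind of extraction described above, the sketch below reads plain text page by page with PyMuPDF and joins it into a single Markdown string. It is not the skill's actual scripts/pymupdf_parse.py: the function names and the "## Page N" heading layout are assumptions made for illustration only.

```python
from __future__ import annotations


def extract_page_texts(pdf_path: str) -> list[str]:
    """Return the plain text of each page, in page order."""
    # Imported lazily so the pure helper below works without PyMuPDF installed.
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        return [page.get_text("text") for page in doc]


def pages_to_markdown(page_texts: list[str]) -> str:
    """Join page texts into Markdown, one heading per page (assumed layout)."""
    sections = []
    for number, text in enumerate(page_texts, start=1):
        sections.append(f"## Page {number}\n\n{text.strip()}\n")
    return "\n".join(sections)
```

Calling pages_to_markdown(extract_page_texts("/path/to/file.pdf")) would produce the Markdown text under these assumptions; the real script may structure its output differently.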

Prereqs / when to read references

If you hit import errors (PyMuPDF not installed) or Nix libstdc++ issues, read:

  • references/pymupdf-notes.md

Quick start (single PDF)

# Run from the skill directory
./scripts/pymupdf_parse.py /path/to/file.pdf \
  --format md \
  --outroot ./pymupdf-output

Options

  • --format md|json|both (default: md)
  • --images to extract images
  • --tables to extract a simple line-based table JSON (quick/rough)
  • --outroot DIR to change output root
  • --lang adds a language hint into JSON output metadata
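
The documented options could be modeled with argparse roughly as follows. This is a hypothetical sketch, not the parser used by scripts/pymupdf_parse.py: treating --images and --tables as boolean flags, and defaulting --lang to None, are assumptions; the md default and the ./pymupdf-output root come from the option list and output conventions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Parse a PDF with PyMuPDF (illustrative sketch)."
    )
    parser.add_argument("pdf", help="path to the input PDF")
    parser.add_argument("--format", choices=["md", "json", "both"], default="md",
                        help="output format")
    # Assumed to be on/off switches; the listing does not show their exact form.
    parser.add_argument("--images", action="store_true", help="extract images")
    parser.add_argument("--tables", action="store_true",
                        help="extract rough line-based tables")
    parser.add_argument("--outroot", default="./pymupdf-output",
                        help="output root directory")
    # Default of None is an assumption; the hint is recorded in JSON metadata.
    parser.add_argument("--lang", default=None, help="language hint")
    return parser
```

For example, build_parser().parse_args(["file.pdf", "--format", "both", "--tables"]) yields a namespace with format set to "both" and tables set to True.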

Output conventions

  • Create ./pymupdf-output/<pdf-basename>/ by default.
  • Markdown output: output.md
  • JSON output: output.json (includes lang)
  • Images: images/ subdir
  • Tables: tables.json (rough line-based)
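
The layout above can be sketched with pathlib. The helper name output_paths is hypothetical, and reading "pdf-basename" as the file name without its .pdf extension is an assumption; the real script may derive the folder name differently.

```python
from __future__ import annotations

from pathlib import Path


def output_paths(pdf_path: str, outroot: str = "./pymupdf-output") -> dict[str, Path]:
    """Map each documented output to its location in the per-document folder."""
    # Assumption: the per-document folder is the PDF file name minus its extension.
    doc_dir = Path(outroot) / Path(pdf_path).stem
    return {
        "dir": doc_dir,
        "markdown": doc_dir / "output.md",
        "json": doc_dir / "output.json",
        "images": doc_dir / "images",
        "tables": doc_dir / "tables.json",
    }
```

Under these assumptions, output_paths("/path/to/file.pdf")["markdown"] points at pymupdf-output/file/output.md.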

Notes

  • PyMuPDF is fast but less robust on complex PDFs.
  • For more robust parsing, use a heavy-duty OCR parser (e.g., MinerU) if installed.
Usage Guidance
This skill appears safe for its stated local PDF parsing purpose. Use it with PDFs you intend to process, choose an output directory you are comfortable writing to, and install PyMuPDF from a trusted Python environment.
Capability Analysis
Type: OpenClaw Agent Skill
The skill bundle provides a local PDF parsing utility using PyMuPDF. The `scripts/pymupdf_parse.py` script implements the stated functionality: it reads a PDF and writes extracted content (Markdown, JSON, images, tables) to a local output directory. There is no evidence of network communication, access to sensitive system files or environment variables, external command execution, or obfuscation. The `SKILL.md` and `README.md` files contain only descriptive information and standard usage/installation instructions, with no prompt-injection attempts.
Capability Assessment
Purpose & Capability
The README, SKILL.md, and script all align around locally parsing a user-provided PDF into Markdown, JSON, images, or simple table output.
Instruction Scope
Instructions are limited to user-directed parsing commands and documented options; there are no hidden autonomous workflows, credential requests, or unrelated actions.
Install Mechanism
The skill is listed as instruction-only but the README asks users to install PyMuPDF with an unpinned pip command. This is expected for the purpose, but users should install it from a trusted environment.
Credentials
Local PDF reads and local output writes are proportionate to the stated parsing purpose and are scoped to the provided PDF path and output directory.
Persistence & Privilege
The script creates output files and folders only; it does not request credentials, elevated privileges, background persistence, or ongoing access.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pymupdf-pdf-parser-clawdbot-skill
  3. After installation, invoke the skill by name or use /pymupdf-pdf-parser-clawdbot-skill
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of PyMuPDF PDF parsing skill.

  • Fast local PDF extraction using PyMuPDF (fitz) with Markdown output by default.
  • Supports optional JSON, image extraction, and simple table outputs.
  • Designed for single-PDF runs with per-document output directories.
  • Includes command-line options for output format, images, tables, and language metadata.
  • Recommended as a quick alternative or fallback when heavier PDF parsers are unavailable.
Metadata
Slug pymupdf-pdf-parser-clawdbot-skill
Version 1.0.0
License
All-time Installs 41
Active Installs 39
Total Versions 1
Frequently Asked Questions

What is PyMuPDF PDF Parser Clawdbot Skill?

Fast local PDF parsing with PyMuPDF (fitz) for Markdown/JSON outputs and optional images/tables. Use when speed matters more than robustness, or as a fallback while heavier parsers are unavailable. Default to single-PDF parsing with per-document output folders. It is an AI Agent Skill for Claude Code / OpenClaw, with 5382 downloads so far.

How do I install PyMuPDF PDF Parser Clawdbot Skill?

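Run "/install pymupdf-pdf-parser-clawdbot-skill" in the OpenClaw or Claude Code chat to install the skill in one step. PyMuPDF itself must be available in your Python environment; if you hit import errors, see references/pymupdf-notes.md.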

Is PyMuPDF PDF Parser Clawdbot Skill free?

Yes, PyMuPDF PDF Parser Clawdbot Skill is completely free (open-source). You can download, install and use it at no cost.

Which platforms does PyMuPDF PDF Parser Clawdbot Skill support?

PyMuPDF PDF Parser Clawdbot Skill is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PyMuPDF PDF Parser Clawdbot Skill?

It is built and maintained by kesslerio (@kesslerio); the current version is v1.0.0.
