kesslerio

PyMuPDF PDF Parser Clawdbot Skill

Author: kesslerio · GitHub ↗ · v1.0.0
cross-platform ✓ Security check passed
Total downloads: 5382
Favorites: 4
Current installs: 39
Versions: 1
Install in OpenClaw
/install pymupdf-pdf-parser-clawdbot-skill
Description
Fast local PDF parsing with PyMuPDF (fitz) for Markdown/JSON outputs and optional images/tables. Use when speed matters more than robustness, or as a fallback while heavier parsers are unavailable. Default to single-PDF parsing with per-document output folders.
Usage instructions (SKILL.md)

PyMuPDF PDF

Overview

Parse PDFs locally using PyMuPDF for fast, lightweight extraction into Markdown by default, with optional JSON and image/table outputs in a per-document directory.
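
As a rough illustration of the kind of extraction described above, here is a minimal sketch using PyMuPDF's `fitz.open` and `page.get_text`. It is not the code of `scripts/pymupdf_parse.py`: the helper names and the `## Page N` heading layout are assumptions made for this example only.

```python
from typing import List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the plain text of each page, in page order."""
    # Imported lazily so pages_to_markdown stays usable without PyMuPDF.
    import fitz  # PyMuPDF's classic import name

    with fitz.open(pdf_path) as doc:
        return [page.get_text("text") for page in doc]


def pages_to_markdown(page_texts: List[str]) -> str:
    """Join page texts under '## Page N' headings (a hypothetical layout)."""
    sections = []
    for number, text in enumerate(page_texts, start=1):
        sections.append(f"## Page {number}\n\n{text.strip()}\n")
    return "\n".join(sections)
```

Calling `pages_to_markdown(extract_page_texts("/path/to/file.pdf"))` would give a single Markdown string. Plain-text extraction like this ignores layout, which is one reason the approach is less robust on complex PDFs.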

Prereqs / when to read references

If you hit import errors (PyMuPDF not installed) or Nix libstdc++ issues, read:

  • references/pymupdf-notes.md
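
Before running the parser, a quick check like the following can tell you whether PyMuPDF is importable at all. This is a sketch, not part of the skill: the function name is hypothetical, and it only detects that a module named `pymupdf` or `fitz` exists, so an unrelated package that happens to be called `fitz` would also satisfy it.

```python
import importlib.util


def pymupdf_available() -> bool:
    """Return True if a 'pymupdf' or legacy 'fitz' module can be located."""
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("pymupdf", "fitz")
    )


if __name__ == "__main__":
    if pymupdf_available():
        print("PyMuPDF looks importable.")
    else:
        print("PyMuPDF not found; see references/pymupdf-notes.md.")
```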

Quick start (single PDF)

# Run from the skill directory
./scripts/pymupdf_parse.py /path/to/file.pdf \
  --format md \
  --outroot ./pymupdf-output

Options

  • --format md|json|both (default: md)
  • --images to extract images
  • --tables to extract a simple line-based table JSON (quick/rough)
  • --outroot DIR to change output root
  • --lang adds a language hint into JSON output metadata
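
The options above could be declared with `argparse` roughly as follows. This is a sketch of an equivalent command-line interface, assuming the documented flags and defaults; the actual argument handling in `scripts/pymupdf_parse.py` may differ, and `build_parser` is a name made up for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Parse a PDF with PyMuPDF.")
    parser.add_argument("pdf", help="path to the input PDF")
    parser.add_argument("--format", choices=["md", "json", "both"], default="md",
                        help="output format (default: md)")
    parser.add_argument("--images", action="store_true",
                        help="extract images")
    parser.add_argument("--tables", action="store_true",
                        help="extract a rough line-based table JSON")
    parser.add_argument("--outroot", default="./pymupdf-output",
                        help="output root directory")
    parser.add_argument("--lang", default=None,
                        help="language hint stored in JSON output metadata")
    return parser
```

For instance, `build_parser().parse_args(["file.pdf", "--format", "both", "--images"])` yields a namespace with `format="both"` and `images=True`.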

Output conventions

  • Create ./pymupdf-output/<pdf-basename>/ by default.
  • Markdown output: output.md
  • JSON output: output.json (includes lang)
  • Images: images/ subdir
  • Tables: tables.json (rough line-based)
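
To make the layout concrete, the sketch below computes where each output would land for a given PDF. It assumes that `<pdf-basename>` means the file name without its extension, which the conventions above do not state explicitly; `planned_outputs` is a hypothetical helper and does not touch the filesystem.

```python
from pathlib import Path
from typing import Dict


def planned_outputs(pdf_path: str, outroot: str = "./pymupdf-output",
                    fmt: str = "md", images: bool = False,
                    tables: bool = False) -> Dict[str, Path]:
    """Map each expected output to its path under the per-document folder."""
    # Assumption: the per-document folder is named after the PDF's stem.
    doc_dir = Path(outroot) / Path(pdf_path).stem
    outputs = {"dir": doc_dir}
    if fmt in ("md", "both"):
        outputs["markdown"] = doc_dir / "output.md"
    if fmt in ("json", "both"):
        outputs["json"] = doc_dir / "output.json"
    if images:
        outputs["images"] = doc_dir / "images"
    if tables:
        outputs["tables"] = doc_dir / "tables.json"
    return outputs
```

With the defaults, `planned_outputs("/tmp/report.pdf")` points the Markdown output at `pymupdf-output/report/output.md`.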

Notes

  • PyMuPDF is fast but less robust on complex PDFs.
  • For more robust parsing, use a heavy-duty OCR parser (e.g., MinerU) if installed.
Safe usage recommendations
This skill appears safe for its stated local PDF parsing purpose. Use it with PDFs you intend to process, choose an output directory you are comfortable writing to, and install PyMuPDF from a trusted Python environment.
Functional analysis
Type: OpenClaw Skill. Description: OpenClaw Agent Skill.
The OpenClaw skill bundle provides a local PDF parsing utility using PyMuPDF. The `scripts/pymupdf_parse.py` script correctly implements the stated functionality, reading a PDF and writing extracted content (Markdown, JSON, images, tables) to a local output directory. There is no evidence of network communication, access to sensitive system files or environment variables, external command execution, or obfuscation. The `SKILL.md` and `README.md` files contain only descriptive information and standard usage/installation instructions, with no prompt injection attempts aiming for malicious actions.
Capability assessment
Purpose & Capability
The README, SKILL.md, and script all align around locally parsing a user-provided PDF into Markdown, JSON, images, or simple table output.
Instruction Scope
Instructions are limited to user-directed parsing commands and documented options; there are no hidden autonomous workflows, credential requests, or unrelated actions.
Install Mechanism
The skill is listed as instruction-only but the README asks users to install PyMuPDF with an unpinned pip command. This is expected for the purpose, but users should install it from a trusted environment.
Credentials
Local PDF reads and local output writes are proportionate to the stated parsing purpose and are scoped to the provided PDF path and output directory.
Persistence & Privilege
The script creates output files and folders only; it does not request credentials, elevated privileges, background persistence, or ongoing access.
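How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install pymupdf-pdf-parser-clawdbot-skill
  3. Once installed, call the Skill by name or trigger it with /pymupdf-pdf-parser-clawdbot-skill
  4. Provide the required inputs according to the Skill's parameter description to get structured output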
Version history
v1.0.0
Initial release of PyMuPDF PDF parsing skill.
  • Fast local PDF extraction using PyMuPDF (fitz) with Markdown output by default.
  • Supports optional JSON, image extraction, and simple table outputs.
  • Designed for single-PDF runs with per-document output directories.
  • Includes command-line options for output format, images, tables, and language metadata.
  • Recommended as a quick alternative or fallback when heavier PDF parsers are unavailable.
Metadata
Slug pymupdf-pdf-parser-clawdbot-skill
Version 1.0.0
License
Total installs 41
Current installs 39
Versions 1
FAQ

What is PyMuPDF PDF Parser Clawdbot Skill?

Fast local PDF parsing with PyMuPDF (fitz) for Markdown/JSON outputs and optional images/tables. Use when speed matters more than robustness, or as a fallback while heavier parsers are unavailable. Default to single-PDF parsing with per-document output folders. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 5382 total downloads to date.

How do I install PyMuPDF PDF Parser Clawdbot Skill?

Run the command "/install pymupdf-pdf-parser-clawdbot-skill" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.

Is PyMuPDF PDF Parser Clawdbot Skill free?

Yes. PyMuPDF PDF Parser Clawdbot Skill is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does PyMuPDF PDF Parser Clawdbot Skill support?

PyMuPDF PDF Parser Clawdbot Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed PyMuPDF PDF Parser Clawdbot Skill?

Developed and maintained by kesslerio (@kesslerio); the current version is v1.0.0.
