
Pdfreader

Name: Pdfreader
Author: nantes

by Ivan Cetta · v1.0.3

cross-platform ✓ Security Clean

643 Downloads

Install in OpenClaw

/install pdfreader

Description

Extract text and metadata from PDF files using PyMuPDF, supporting large files and outputting results in JSON format.

Usage Guidance

This skill appears to do what it claims: extract text and metadata from PDFs using PyMuPDF. Before installing or running it, consider:

1) Run pip install pymupdf in an isolated environment (virtualenv or container), because the PyMuPDF package from PyPI includes compiled code.
2) The script enforces 'within current working directory' but allows subdirectories and does not resolve symlinks; avoid placing untrusted symlinks inside the working directory to prevent escapes.
3) Because the source/homepage is unknown, prefer running the script in a sandbox and review the code yourself (or run it on non-sensitive PDFs) before giving it access to important files.

If you need stricter confinement (no subdirectories, or protection against symlinks), request a code change to use os.path.realpath checks and a configurable safe directory.
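The stricter confinement suggested above could look roughly like the following. This is a minimal sketch, not code from the skill: the helper name `is_within_safe_dir` and its `allow_subdirs` parameter are hypothetical, and the published `pdf_reader.py` may be structured differently.

```python
import os


def is_within_safe_dir(path, safe_dir, allow_subdirs=False):
    """Return True if `path` resolves to a location inside `safe_dir`.

    Hypothetical helper for illustration; not part of the published skill.
    """
    # realpath resolves symlinks, so a link that points outside safe_dir
    # is judged by its real target rather than by where the link lives.
    real_base = os.path.realpath(safe_dir)
    real_path = os.path.realpath(path)

    if not allow_subdirs:
        # Strict mode: the file must sit directly in safe_dir.
        return os.path.dirname(real_path) == real_base

    try:
        # commonpath compares whole components, so "/data" does not
        # match "/database" the way a plain string prefix test would.
        common = os.path.commonpath([real_base, real_path])
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
    return common == real_base and real_path != real_base
```

Passing the safe directory as an argument, instead of always reading `os.getcwd()`, is what makes the directory configurable; a caller that wants the current behavior can still pass `os.getcwd()` with `allow_subdirs=True`.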

Capability Analysis

Type: OpenClaw Skill
Name: pdfreader
Version: 1.0.3

The OpenClaw skill bundle is designed to extract text from PDF files using PyMuPDF. The `SKILL.md` documentation provides clear, non-malicious instructions and explicitly states security restrictions. The `pdf_reader.py` script implements robust path validation (`is_safe_input_path`, `is_safe_output_path`) to prevent path traversal and restrict file operations to the current working directory and specific file types (.pdf for input, .json for output). There are no signs of data exfiltration, malicious execution, persistence, or prompt injection attempts against the agent. The code is well-contained and aligns with its stated purpose and security measures.
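To illustrate the file-type half of that validation, here is a small sketch of separate extension checks for input and output paths. The names `has_allowed_extension` and `validate_paths` are assumptions made for this example; the skill's actual `is_safe_input_path` and `is_safe_output_path` are not reproduced here, and the directory check is deliberately left out.

```python
import os

# Allowed extensions as described in the listing: .pdf in, .json out.
INPUT_EXTENSIONS = {".pdf"}
OUTPUT_EXTENSIONS = {".json"}


def has_allowed_extension(path, allowed):
    # splitext("report.PDF") gives ("report", ".PDF"); compare lowercased
    # so that the check is case-insensitive.
    _, ext = os.path.splitext(path)
    return ext.lower() in allowed


def validate_paths(input_path, output_path):
    """Raise ValueError unless input ends in .pdf and output ends in .json."""
    if not has_allowed_extension(input_path, INPUT_EXTENSIONS):
        raise ValueError(f"input must be a .pdf file: {input_path}")
    if not has_allowed_extension(output_path, OUTPUT_EXTENSIONS):
        raise ValueError(f"output must be a .json file: {output_path}")
```

Keeping the two extension sets separate means a `.json` file cannot be passed as input and a `.pdf` path cannot be chosen as the output target.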

Capability Assessment

✓ Purpose & Capability

Name/description match the files and instructions. The code uses PyMuPDF (fitz) to open PDFs, extract text and metadata, and produce JSON — exactly what the description promises. No extraneous binaries, credentials, or services are requested.
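For readers unfamiliar with PyMuPDF, the extraction flow described above can be sketched as follows. This is an illustration under stated assumptions, not the skill's code: the function names and the JSON key names (`metadata`, `page_count`, `pages`) are invented for the example, and the real output schema of `pdf_reader.py` is not shown in this listing.

```python
import json


def build_result(metadata, page_texts):
    """Assemble a JSON-serializable dict; the key names are illustrative."""
    return {
        "metadata": dict(metadata or {}),
        "page_count": len(page_texts),
        "pages": [
            {"page": number, "text": text}
            for number, text in enumerate(page_texts, start=1)
        ],
    }


def extract_pdf(pdf_path):
    # Imported here so build_result and to_json work without PyMuPDF.
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        # get_text() with no arguments returns the page's plain text.
        page_texts = [page.get_text() for page in doc]
        # doc.metadata is a dict with entries such as title and author.
        return build_result(doc.metadata, page_texts)


def to_json(result):
    # ensure_ascii=False keeps non-ASCII characters readable in the output.
    return json.dumps(result, ensure_ascii=False, indent=2)
```

With PyMuPDF installed, `to_json(extract_pdf("example.pdf"))` would produce the JSON string; `example.pdf` is a placeholder file name.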

ℹ Instruction Scope

SKILL.md usage aligns with the script's behavior (pip install pymupdf; run python pdf_reader.py ...). The SKILL.md states files must be 'within the current working directory' and forbids '../' traversal; the script enforces that by checking that each absolute path is inside os.getcwd(). However, the script allows files in subdirectories of the current working directory (contrary to an implication that only the top-level cwd is allowed) and uses os.path.abspath rather than os.path.realpath, so a symlink inside the cwd that points outside could bypass the directory restriction. This is an implementation caveat rather than evidence of malicious behavior.
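The difference between the two calls can be shown with a pair of checks that differ only in how they normalize the path. Both function names are hypothetical, and the prefix comparison is a simplified stand-in for whatever test the script actually performs.

```python
import os


def inside_by_abspath(path, base):
    # abspath only normalizes the string ("..", "."); it never consults
    # the filesystem, so a symlink keeps the location of the link itself.
    base = os.path.abspath(base)
    return os.path.abspath(path).startswith(base + os.sep)


def inside_by_realpath(path, base):
    # realpath follows symlinks first, so the check sees the real target.
    base = os.path.realpath(base)
    return os.path.realpath(path).startswith(base + os.sep)
```

For a symlink that lives in the working directory but points at a file elsewhere, `inside_by_abspath` accepts the path while `inside_by_realpath` rejects it, which is the bypass described in this section.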

✓ Install Mechanism

No install spec is embedded (instruction-only install guidance in SKILL.md recommends 'pip install pymupdf'). That is low-risk from the skill bundle perspective. Note: installing PyMuPDF via pip will run compiled extension code from PyPI — treat pip installs from unknown sources with standard care.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. The functionality does not require additional secrets. The code does not read environment variables or access unrelated system configuration.

✓ Persistence & Privilege

The `always` flag is false, and the skill does not request persistent or auto-included privileges. It does not modify other skills or system-wide settings. Autonomous invocation remains the platform default but is not combined with other concerning privileges here.

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install pdfreader
3) After installation, invoke the skill by name or use /pdfreader
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v1.0.3

Fixed instruction mismatch: Separated input (.pdf) and output (.json) validation. Added security documentation to SKILL.md

v1.0.2

Security fix: Added .pdf extension validation to prevent arbitrary file read (CVE-like vulnerability)

v1.0.1

Security fix: Added path validation to prevent arbitrary file write (CVE-like vulnerability)

v1.0.0

Initial release of PDF Reader Skill for OpenClaw:
- Extracts text from any PDF using PyMuPDF.
- Supports large and multi-page PDF files.
- Outputs extracted content in JSON for AI reading compatibility.
- Handles text encoding issues.
- Displays PDF metadata (title, author, etc.).
- Includes clear installation and usage instructions.

Metadata

Slug pdfreader

Version 1.0.3

License —

All-time Installs 4

Active Installs 4

Total Versions 4

Frequently Asked Questions

What is Pdfreader?

Pdfreader extracts text and metadata from PDF files using PyMuPDF, supports large files, and outputs results in JSON format. It is an AI Agent Skill for Claude Code / OpenClaw, with 643 downloads so far.

How do I install Pdfreader?

Run "/install pdfreader" in the OpenClaw or Claude Code chat to install it in one step. The skill's SKILL.md additionally recommends installing its PyMuPDF dependency with pip install pymupdf.

Is Pdfreader free?

Yes, Pdfreader is free to download, install, and use. The listing does not specify a license.

Which platforms does Pdfreader support?

Pdfreader is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Pdfreader?

It is built and maintained by Ivan Cetta (@nantes); the current version is v1.0.3.
