cap-txt

PR's PDF Agent

by cap-txt · GitHub ↗ · v0.1.0
cross-platform ⚠ suspicious
Downloads 382
Stars 2
Active Installs 0
Versions 1
Install in OpenClaw
/install pdfagent
Description
Self-hosted PDF operations and conversions with metered usage output.
Usage Guidance
This package implements a comprehensive self-hosted PDF CLI (merging, splitting, OCR, conversions, redaction, and an 'agent' mode), and the code mostly matches that purpose. Review these issues before running it:
- Dependency mismatch: The registry lists no required binaries or env vars, but SKILL.md and the code expect many external tools (gs, qpdf, pdftoppm, soffice, ocrmypdf, wkhtmltopdf/Chrome, and optionally ollama). Install those deliberately.
- Network and external execution: html_to_pdf can fetch remote URLs; core.llm can run arbitrary commands or call 'ollama' (a local LLM runner). Running the CLI with remote sources or with LLM provider=command may cause the tool to access the network or execute untrusted commands. Treat any use that passes URLs or enables an external LLM/command as a potential exfiltration path.
- Undeclared env usage: The code reads PDFAGENT_SOFFICE_TIMEOUT, and the subprocess code supports a custom env. Review environment variables and avoid exposing secrets in the runtime environment you use for this tool.
- Run in isolation first: Test the tool in a sandbox or disposable VM with non-sensitive PDFs, and confirm its behavior (the doctor command reports available binaries). Inspect CLI flags, especially anything enabling LLM/agent mode or remote fetching, before using it on private data.
- Origin and trust: The source 'homepage' and origin are unknown. If you need to run this in production or on sensitive documents, consider auditing the remaining omitted files, or prefer a vetted implementation from a known source.
Capability Analysis
Type: OpenClaw Skill
Name: pdfagent
Version: 0.1.0
The skill bundle provides extensive PDF manipulation capabilities but contains high-risk features that could be exploited via prompt injection. Specifically, the 'agent' and 'translate' commands in 'pdfagent/cli.py' allow for arbitrary command execution through the '--llm-cmd' parameter (processed in 'pdfagent/core/llm.py'), which is intended for local LLM integration but lacks sanitization against malicious instructions. Additionally, 'pdfagent/tools/html_to_pdf.py' uses 'urllib.request.urlopen' to fetch content from user-provided URLs, introducing a potential Server-Side Request Forgery (SSRF) risk.
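To illustrate the SSRF concern, here is a minimal sketch of a check a caller could run before handing a URL to any fetcher. It is not taken from pdfagent; the function name `is_public_http_url` is hypothetical, and the policy (http/https only, every resolved address must be globally routable) is an assumption about what a reasonable guard might look like.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_http_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host resolves solely to public addresses."""
    try:
        parts = urlsplit(url)
        port = parts.port or 80  # .port raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False  # rejects file://, ftp://, and URLs without a host
    try:
        infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # unresolvable host
    for info in infos:
        address = info[4][0].split("%", 1)[0]  # drop any IPv6 zone id
        if not ipaddress.ip_address(address).is_global:
            return False  # loopback, private, link-local, and similar ranges
    return bool(infos)
```

A check like this does not cover redirects or a host that resolves differently between the check and the actual fetch, so it narrows the exposure rather than removing it.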
Capability Assessment
Purpose & Capability
Name/description promise self-hosted PDF operations and the repo code implements that. However the skill metadata declares no required binaries or env vars while SKILL.md and the code require/expect uv, Ghostscript (gs), qpdf, poppler (pdftoppm), soffice (LibreOffice), ocrmypdf, wkhtmltopdf/Chrome, and optionally ollama and other Python libs. The registry declarations (no requirements) are inconsistent with the actual capabilities and dependencies.
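Because the registry metadata does not declare these dependencies, it can help to check for them yourself before the first run. The sketch below is independent of pdfagent's own tooling: `missing_binaries` and `EXPECTED_BINARIES` are hypothetical names, and the tuple simply copies the executable names mentioned in this listing (the wkhtmltopdf/Chrome alternative and the optional ollama are left out).

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names as given in this listing; adjust for your platform.
EXPECTED_BINARIES = ("uv", "gs", "qpdf", "pdftoppm", "soffice", "ocrmypdf")


def missing_binaries(
    names: Iterable[str] = EXPECTED_BINARIES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names that cannot be found on PATH, in input order."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    print("missing:", ", ".join(missing) if missing else "none")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default the function uses `shutil.which` from the standard library.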
Instruction Scope
SKILL.md focuses on local disk-based PDF processing, but the code can fetch remote HTML (urllib.request.urlopen in html_to_pdf) and can invoke external commands/LLM providers (core.llm uses arbitrary commands or 'ollama' via subprocess). Those behaviors allow network I/O and arbitrary process execution that go beyond simple file manipulation; the documentation does mention some of these tools but the risk/implications are not made explicit in the SKILL.md.
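The process-execution risk described above can be reduced by never passing a user-supplied command string to a shell. The following is a sketch under stated assumptions, not pdfagent's code: `parse_llm_command`, `run_llm_command`, and the `ALLOWED_PROGRAMS` allowlist are hypothetical, and feeding the prompt on standard input is an assumption about how a local LLM command would be driven.

```python
import shlex
import subprocess
from typing import FrozenSet, List

# Hypothetical allowlist: only bare program names, no paths.
ALLOWED_PROGRAMS: FrozenSet[str] = frozenset({"ollama"})


def parse_llm_command(command: str, allowed: FrozenSet[str] = ALLOWED_PROGRAMS) -> List[str]:
    """Split a command string into argv and reject programs outside the allowlist."""
    argv = shlex.split(command)
    if not argv or argv[0] not in allowed:
        raise ValueError(f"program not allowed: {argv[:1]}")
    return argv


def run_llm_command(command: str, prompt: str, timeout: float = 60.0) -> str:
    """Run the command without a shell, sending the prompt on stdin."""
    argv = parse_llm_command(command)
    result = subprocess.run(
        argv, input=prompt, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout
```

Because the argv list is executed directly, shell metacharacters such as `;` or `|` in the input stay literal arguments instead of starting a second command.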
Install Mechanism
No install spec is provided (instruction-only for running via 'uv run'), so nothing is downloaded or installed automatically by the registry. The presence of source files means code will execute locally when run, but there is no remote installer or archive URL to review.
Credentials
The registry declares no required env vars, but code reads at least one env var (PDFAGENT_SOFFICE_TIMEOUT) and the subprocess execution paths allow passing custom env to commands. The tool also exposes options to call external LLMs or arbitrary commands; those uses can require secrets or expose sensitive data if misconfigured. Overall requested/used environment access is under-declared relative to what the code can leverage.
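One way to limit what a spawned tool can see is to build a reduced environment instead of inheriting everything. This is an illustrative sketch only: `minimal_env`, `soffice_timeout`, the `SAFE_ENV_NAMES` tuple, and the 120-second default are all assumptions, and how pdfagent itself parses PDFAGENT_SOFFICE_TIMEOUT is not shown in this listing.

```python
import os
from typing import Dict, Mapping, Sequence

# Hypothetical set of variables considered safe to pass through.
SAFE_ENV_NAMES = ("PATH", "HOME", "LANG", "TMPDIR")


def minimal_env(source: Mapping[str, str], keep: Sequence[str] = SAFE_ENV_NAMES) -> Dict[str, str]:
    """Copy only the named variables, dropping tokens and other secrets."""
    return {name: source[name] for name in keep if name in source}


def soffice_timeout(source: Mapping[str, str], default: float = 120.0) -> float:
    """Read PDFAGENT_SOFFICE_TIMEOUT as seconds (assumed unit), falling back to a default."""
    raw = source.get("PDFAGENT_SOFFICE_TIMEOUT")
    try:
        value = float(raw) if raw is not None else default
    except ValueError:
        return default  # non-numeric value
    return value if value > 0 else default


if __name__ == "__main__":
    # Show which variable names would be passed on from the current process.
    print(sorted(minimal_env(os.environ)))
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run`, so the child process never receives variables outside the allowlist.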
Persistence & Privilege
The skill is not always-enabled, does not request to modify other skills, and has no install hook. It optionally writes usage logs to the file given by --usage-file, and it creates per-command output files and local LibreOffice profile directories, which is normal for a CLI tool.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdfagent
  3. After installation, invoke the skill by name or use /pdfagent
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
Initial release of pdfagent: self-hosted PDF operations with usage metering.
- Supports PDF merge, split, compression, conversion (including image to PDF), and OCR.
- Designed for local file processing; inputs and outputs remain on disk.
- Provides detailed, machine-readable output with usage statistics via --json.
- Flexible agent mode for multi-step PDF instruction execution.
- Includes dependency and system binary checking for robust setup.
- Runs standalone from source with `uv run`; no PyPI publishing required.
Metadata
Slug pdfagent
Version 0.1.0
License
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is PR's PDF Agent?

Self-hosted PDF operations and conversions with metered usage output. It is an AI Agent Skill for Claude Code / OpenClaw, with 382 downloads so far.

How do I install PR's PDF Agent?

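Run "/install pdfagent" in the OpenClaw or Claude Code chat to install the skill. The registry does not install dependencies automatically, so the external tools listed under Usage Guidance (gs, qpdf, pdftoppm, soffice, ocrmypdf, wkhtmltopdf/Chrome, and optionally ollama) must be installed separately.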

Is PR's PDF Agent free?

Yes, PR's PDF Agent is free to download, install and use. The License field in this listing's metadata is empty, so check the source repository for the actual license terms.

Which platforms does PR's PDF Agent support?

PR's PDF Agent is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PR's PDF Agent?

It is built and maintained by cap-txt (@cap-txt); the current version is v0.1.0.
