droba07

PDF Read/Write Toolkit

by Roman Matyuschenko · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
91 downloads · 0 stars · 0 active installs · 1 version
Install in OpenClaw
/install pdf-rw-toolkit
Description
Read, extract, and generate PDF files. Use when user asks to read PDF content, extract text/tables, merge PDFs, fill forms, or generate PDFs from HTML/Markdown.
README (SKILL.md)

PDF Skill

Read, extract, analyze, and generate PDF documents.

Capabilities

  • Extract text from PDF (full or per-page)
  • Extract tables from PDF as CSV/JSON
  • Get metadata (title, author, pages, etc.)
  • Merge multiple PDFs into one
  • Split PDF by page ranges
  • Generate PDF from HTML or Markdown
  • Fill PDF forms

Scripts

All scripts are in scripts/ relative to this skill directory.

Read / Extract

# Extract all text
python3 scripts/pdf_read.py <file.pdf>

# Extract text from specific pages (1-indexed)
python3 scripts/pdf_read.py <file.pdf> --pages 1,3,5-10

# Extract tables as CSV
python3 scripts/pdf_read.py <file.pdf> --tables --format csv

# Extract tables as JSON
python3 scripts/pdf_read.py <file.pdf> --tables --format json

# Get PDF metadata and page count
python3 scripts/pdf_read.py <file.pdf> --info
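
The --pages values shown in these commands mix single pages with inclusive ranges. A minimal sketch of how such a value might be expanded into 1-indexed page numbers follows. parse_page_spec is a hypothetical helper written for illustration; pdf_read.py may parse the option differently.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1,3,5-10" into sorted, unique 1-indexed pages.

    Hypothetical helper for illustration only; not taken from the skill's scripts.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


print(parse_page_spec("1,3,5-10"))  # [1, 3, 5, 6, 7, 8, 9, 10]
```

Returning a sorted, de-duplicated list is a design choice of this sketch; a script that should preserve the order given on the command line would keep a list instead of a set.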

Merge / Split

# Merge multiple PDFs
python3 scripts/pdf_merge.py output.pdf input1.pdf input2.pdf input3.pdf

# Split: extract specific pages
python3 scripts/pdf_split.py input.pdf output.pdf --pages 1,3,5-10
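
The registry review on this page says the merge and split scripts rely on pypdf. As a rough sketch of what such operations can look like, assuming pypdf 3.x or later, the functions below concatenate whole files and copy selected pages. The function names are hypothetical and the actual pdf_merge.py and pdf_split.py may be organised differently.

```python
def to_zero_based(pages: list[int], page_count: int) -> list[int]:
    """Convert 1-indexed page numbers to 0-based indices, rejecting bad pages."""
    for page in pages:
        if not 1 <= page <= page_count:
            raise ValueError(f"page {page} is outside 1..{page_count}")
    return [page - 1 for page in pages]


def merge_pdfs(output_path: str, input_paths: list[str]) -> None:
    """Concatenate the input PDFs, in order, into a single output file."""
    from pypdf import PdfWriter  # third-party dependency, imported lazily

    writer = PdfWriter()
    for path in input_paths:
        writer.append(path)  # appends every page of this file
    with open(output_path, "wb") as handle:
        writer.write(handle)


def split_pdf(input_path: str, output_path: str, pages: list[int]) -> None:
    """Copy the given 1-indexed pages of input_path into a new PDF."""
    from pypdf import PdfReader, PdfWriter  # third-party dependency

    reader = PdfReader(input_path)
    writer = PdfWriter()
    for index in to_zero_based(pages, len(reader.pages)):
        writer.add_page(reader.pages[index])
    with open(output_path, "wb") as handle:
        writer.write(handle)
```

Validating page numbers before writing anything means an out-of-range request fails without leaving a partial output file behind.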

Generate

# Generate PDF from HTML file
python3 scripts/pdf_generate.py input.html output.pdf

# Generate PDF from HTML string
python3 scripts/pdf_generate.py --html "<h1>Hello</h1><p>World</p>" output.pdf

# Generate PDF from Markdown (converted to HTML first)
python3 scripts/pdf_generate.py input.md output.pdf
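
For the HTML-string case above, a sketch of rendering with WeasyPrint might look like the following. It assumes the weasyprint package is installed; wrap_html_fragment and html_to_pdf are hypothetical names, and pdf_generate.py may work differently. Markdown input would first need converting to HTML, and the source does not say which Markdown library the script uses, so that step is left out.

```python
import html
from typing import Optional


def wrap_html_fragment(fragment: str, title: str = "Document") -> str:
    """Wrap an HTML fragment in a minimal full document (hypothetical helper)."""
    if "<html" in fragment.lower():
        return fragment  # already looks like a complete document
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>{fragment}</body></html>"
    )


def html_to_pdf(source: str, output_path: str, css_path: Optional[str] = None) -> None:
    """Render an HTML string to a PDF file, optionally with an extra stylesheet."""
    from weasyprint import CSS, HTML  # third-party dependency, imported lazily

    # Only pass the stylesheets option when a CSS file was actually given.
    options = {"stylesheets": [CSS(filename=css_path)]} if css_path else {}
    HTML(string=wrap_html_fragment(source)).write_pdf(output_path, **options)
```

A call such as html_to_pdf("<h1>Hello</h1><p>World</p>", "output.pdf") would correspond roughly to the --html command shown above.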

Usage Notes

  • For large PDFs, use --pages to limit extraction scope
  • Table extraction works best with well-structured tables; complex layouts may need manual cleanup
  • PDF generation via WeasyPrint supports CSS styling — pass a --css file for custom styles
  • All paths can be absolute or relative to the workspace
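
To make the note on table cleanup concrete, here is a sketch of serialising one extracted table to CSV or JSON. It assumes a table arrives as a list of rows whose cells are strings or None, which is the shape pdfplumber returns from table extraction. Treating the first row as the header for JSON is an assumption of this sketch, not documented behaviour of pdf_read.py.

```python
import csv
import io
import json
from typing import Optional

Table = list[list[Optional[str]]]


def table_to_csv(table: Table) -> str:
    """Serialise a table to CSV text, writing empty cells for None."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        writer.writerow(["" if cell is None else cell.strip() for cell in row])
    return buffer.getvalue()


def table_to_json(table: Table) -> str:
    """Serialise a table to a JSON array of objects keyed by the first row."""
    if not table:
        return "[]"
    header = [(cell or "").strip() for cell in table[0]]
    records = [
        {key: (cell or "").strip() for key, cell in zip(header, row)}
        for row in table[1:]
    ]
    return json.dumps(records, ensure_ascii=False)


sample = [["Name", "Qty"], ["Widget", "2"], ["Gadget", None]]
print(table_to_csv(sample))
print(table_to_json(sample))
```

Merged cells and multi-line headers are the usual source of None values and misaligned columns, which is where the manual cleanup mentioned above tends to be needed.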
Usage Guidance
This skill appears to do what it says: run its Python scripts to read, split/merge, or generate PDFs. Before installing, ensure the host can install the declared Python packages (weasyprint often needs system libraries like Cairo). Treat PDFs as potentially sensitive—only point the skill at files you want processed, and run in a sandbox or restricted workspace if those PDFs contain secrets. Keep pdf-related libraries up to date because PDF parsers have historically had security vulnerabilities when processing hostile documents. Finally, note the small inconsistency that the registry had no install spec while SKILL.md lists pip deps — make sure those dependencies are available in your environment.
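
One way to follow the advice about a restricted workspace is to resolve every path before handing it to a script and to reject anything that lands outside the workspace directory. The sketch below uses only the standard library (Python 3.9 or later); resolve_in_workspace is a hypothetical helper and is not part of the skill.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, candidate: str) -> Path:
    """Return the absolute path for candidate, or raise if it escapes workspace."""
    root = Path(workspace).resolve()
    # Joining with an absolute candidate yields the candidate itself,
    # so absolute paths outside the workspace are rejected as well.
    target = (root / candidate).resolve()
    if not target.is_relative_to(root):  # available since Python 3.9
        raise ValueError(f"{candidate!r} resolves outside the workspace")
    return target
```

This check guards against accidental path traversal such as ../secrets.pdf; it does not replace an actual sandbox when processing untrusted documents.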
Capability Analysis
Type: OpenClaw Skill
Name: pdf-rw-toolkit
Version: 1.0.0
The pdf-rw-toolkit skill bundle provides standard PDF manipulation capabilities including text/table extraction, merging, splitting, and generation from HTML/Markdown. The Python scripts (pdf_read.py, pdf_merge.py, pdf_split.py, and pdf_generate.py) use well-known libraries like pypdf, pdfplumber, and weasyprint to perform their stated functions without any evidence of malicious intent, data exfiltration, or suspicious execution patterns.
Capability Assessment
Purpose & Capability
Name/description match the included scripts and declared dependencies: pdfplumber and pypdf for reading/merging/splitting, weasyprint for generation. Requested binary (python3) is appropriate; no unrelated binaries or env vars are required.
Instruction Scope
SKILL.md directs the agent to run local scripts on user-supplied files. The scripts only read input files, optionally a CSS file, and write output PDFs or print extracted text/tables to stdout. They do not reference external endpoints, other system configs, or environment variables.
Install Mechanism
The registry lists no formal install spec (instruction-only), but SKILL.md includes an openclaw.requires.pip list (pdfplumber, pypdf, weasyprint). This is coherent (the scripts need those packages) but slightly inconsistent with 'no install spec' in the registry — the environment will need those pip packages and weasyprint has native deps (Cairo/GTK) that may be required on the host.
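
Because the registry has no install spec, it can help to confirm that the three packages are importable before the skill is first used. The following is a small standard-library sketch; missing_packages is a hypothetical helper, and the package list is taken from the pip requirements named above.

```python
import importlib.util

# Packages named in SKILL.md's pip requirements, per the review above.
REQUIRED_PACKAGES = ["pdfplumber", "pypdf", "weasyprint"]


def missing_packages(names: list[str]) -> list[str]:
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the Python modules can be located. It does not verify that weasyprint's native libraries are present, so a short test render is still worthwhile on a new host.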
Credentials
The skill requires no credentials, config paths, or environment variables. It does not attempt to read environment data or secret files.
Persistence & Privilege
The always flag is false, and the skill does not request persistent or system-wide modifications. Autonomous invocation is allowed by default (platform behavior), but the skill's actions remain limited to file I/O.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pdf-rw-toolkit
  3. After installation, invoke the skill by name or use /pdf-rw-toolkit
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: read, extract text/tables, merge, split, generate PDF
Metadata
Slug: pdf-rw-toolkit
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1
Frequently Asked Questions

What is PDF Read/Write Toolkit?

PDF Read/Write Toolkit is an AI agent skill for Claude Code / OpenClaw that reads, extracts, and generates PDF files. It covers reading PDF content, extracting text and tables, merging PDFs, filling forms, and generating PDFs from HTML or Markdown. It has 91 downloads so far.

How do I install PDF Read/Write Toolkit?

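Run "/install pdf-rw-toolkit" in the OpenClaw or Claude Code chat. The registry lists no install spec, so the Python packages named in SKILL.md (pdfplumber, pypdf, weasyprint) must already be available in your environment.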

Is PDF Read/Write Toolkit free?

Yes, PDF Read/Write Toolkit is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does PDF Read/Write Toolkit support?

PDF Read/Write Toolkit is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PDF Read/Write Toolkit?

It is built and maintained by Roman Matyuschenko (@droba07); the current version is v1.0.0.
