
PDF Read/Write Toolkit

Author: droba07

By Roman Matyuschenko · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install pdf-rw-toolkit

Description

Read, extract, and generate PDF files. Use when user asks to read PDF content, extract text/tables, merge PDFs, fill forms, or generate PDFs from HTML/Markdown.

Usage (SKILL.md)

PDF Skill

Read, extract, analyze, and generate PDF documents.

Capabilities

Extract text from PDF (full or per-page)
Extract tables from PDF as CSV/JSON
Get metadata (title, author, pages, etc.)
Merge multiple PDFs into one
Split PDF by page ranges
Generate PDF from HTML or Markdown
Fill PDF forms

Scripts

All scripts are in scripts/ relative to this skill directory.

Read / Extract

# Extract all text
python3 scripts/pdf_read.py <file.pdf>

# Extract text from specific pages (1-indexed)
python3 scripts/pdf_read.py <file.pdf> --pages 1,3,5-10

# Extract tables as CSV
python3 scripts/pdf_read.py <file.pdf> --tables --format csv

# Extract tables as JSON
python3 scripts/pdf_read.py <file.pdf> --tables --format json

# Get PDF metadata and page count
python3 scripts/pdf_read.py <file.pdf> --info
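
The --pages value used above (for example 1,3,5-10) mixes single pages and inclusive ranges, all 1-indexed. How pdf_read.py parses it is not shown in this listing; the following is only a minimal sketch of one way to expand such a spec. The function name parse_pages is hypothetical, and dropping duplicate pages is an assumption rather than documented behavior.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a 1-indexed page spec such as "1,3,5-10" into page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.extend(range(start, end + 1))  # ranges are inclusive
        else:
            number = int(part)
            if number < 1:
                raise ValueError(f"page numbers start at 1: {part!r}")
            pages.append(number)
    # Assumption: repeated pages are dropped, keeping first-seen order.
    return list(dict.fromkeys(pages))


print(parse_pages("1,3,5-10"))
```

A caller working with a zero-based page list would subtract 1 from each number before indexing.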

Merge / Split

# Merge multiple PDFs
python3 scripts/pdf_merge.py output.pdf input1.pdf input2.pdf input3.pdf

# Split: extract specific pages
python3 scripts/pdf_split.py input.pdf output.pdf --pages 1,3,5-10
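
The listing says the scripts rely on pypdf for merging and splitting, but their source is not reproduced here. As a rough sketch of what the two commands above might do internally, the functions below use pypdf's PdfReader and PdfWriter. The names merge_pdfs and split_pdf are hypothetical, and checking that inputs exist before importing pypdf is a choice made for this example only.

```python
from pathlib import Path


def merge_pdfs(output_path, input_paths):
    """Concatenate the input PDFs, in order, into output_path."""
    missing = [str(p) for p in input_paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError("Input PDF(s) not found: " + ", ".join(missing))
    from pypdf import PdfWriter  # imported lazily; requires `pip install pypdf`

    writer = PdfWriter()
    for path in input_paths:
        writer.append(str(path))  # appends every page of the file
    with open(output_path, "wb") as handle:
        writer.write(handle)
    writer.close()


def split_pdf(input_path, output_path, pages):
    """Copy the given 1-indexed pages of input_path into output_path."""
    if not Path(input_path).is_file():
        raise FileNotFoundError(f"Input PDF not found: {input_path}")
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(str(input_path))
    writer = PdfWriter()
    for number in pages:
        if not 1 <= number <= len(reader.pages):
            raise ValueError(f"page {number} is out of range")
        writer.add_page(reader.pages[number - 1])  # pypdf indexes from 0
    with open(output_path, "wb") as handle:
        writer.write(handle)
    writer.close()
```

With pypdf installed, merge_pdfs("output.pdf", ["input1.pdf", "input2.pdf"]) mirrors the merge command, and split_pdf("input.pdf", "output.pdf", [1, 3, 5]) mirrors the split command for those pages.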

Generate

# Generate PDF from HTML file
python3 scripts/pdf_generate.py input.html output.pdf

# Generate PDF from HTML string
python3 scripts/pdf_generate.py --html "<h1>Hello</h1><p>World</p>" output.pdf

# Generate PDF from Markdown (converted to HTML first)
python3 scripts/pdf_generate.py input.md output.pdf
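
For readers curious how an HTML string might become a PDF, here is a minimal sketch built on WeasyPrint's HTML and CSS classes. It is not taken from pdf_generate.py: the helper names wrap_fragment and html_to_pdf are hypothetical, and wrapping a fragment in a full document is an assumption made for illustration. The Markdown-to-HTML step is left out because the listing does not say which converter the script uses.

```python
from html import escape
from typing import Optional


def wrap_fragment(fragment: str, title: str = "Document") -> str:
    """Wrap an HTML fragment in a minimal, UTF-8 declared HTML document."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><meta charset='utf-8'><title>{escape(title)}</title></head>\n"
        f"<body>{fragment}</body></html>\n"
    )


def html_to_pdf(html: str, output_path: str, css_path: Optional[str] = None) -> None:
    """Render an HTML string to a PDF file, optionally with a stylesheet."""
    # Imported lazily so wrap_fragment works without WeasyPrint installed.
    from weasyprint import CSS, HTML

    stylesheets = [CSS(filename=css_path)] if css_path else None
    HTML(string=html).write_pdf(output_path, stylesheets=stylesheets)
```

With WeasyPrint installed, html_to_pdf(wrap_fragment("<h1>Hello</h1><p>World</p>"), "output.pdf") would produce a one-page PDF comparable to the --html example above.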

Usage Notes

For large PDFs, use --pages to limit extraction scope
Table extraction works best with well-structured tables; complex layouts may need manual cleanup
PDF generation via WeasyPrint supports CSS styling; pass a --css file for custom styles
All paths can be absolute or relative to the workspace
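
The note on table cleanup can be made concrete. Extracted tables typically arrive as lists of rows in which cells may be None or contain stray line breaks, which is how pdfplumber reports them. The sketch below, using only the standard library, normalizes such rows and serializes them to CSV or JSON. The function names are hypothetical, and treating the first row as the header is an assumption that will not hold for every table.

```python
import csv
import io
import json


def clean_rows(rows):
    """Collapse whitespace in each cell and drop rows that are entirely empty."""
    cleaned = []
    for row in rows:
        cells = [" ".join((cell or "").split()) for cell in row]
        if any(cells):
            cleaned.append(cells)
    return cleaned


def rows_to_csv(rows):
    """Serialize rows to CSV text with Unix line endings."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def rows_to_json(rows):
    """Serialize rows to a JSON array of objects keyed by the first row."""
    if not rows:
        return "[]"
    header, *body = rows  # assumption: the first row holds column names
    return json.dumps([dict(zip(header, row)) for row in body], ensure_ascii=False)


raw = [["Name", " Qty "], [None, None], ["Widget\nA", "3"]]
table = clean_rows(raw)
print(rows_to_csv(table))
print(rows_to_json(table))
```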

Security Recommendations

This skill appears to do what it says: run its Python scripts to read, split/merge, or generate PDFs. Before installing, ensure the host can install the declared Python packages (weasyprint often needs system libraries like Cairo). Treat PDFs as potentially sensitive—only point the skill at files you want processed, and run in a sandbox or restricted workspace if those PDFs contain secrets. Keep pdf-related libraries up to date because PDF parsers have historically had security vulnerabilities when processing hostile documents. Finally, note the small inconsistency that the registry had no install spec while SKILL.md lists pip deps — make sure those dependencies are available in your environment.

Functional Analysis

Type: OpenClaw Skill
Name: pdf-rw-toolkit
Version: 1.0.0

The pdf-rw-toolkit skill bundle provides standard PDF manipulation capabilities including text/table extraction, merging, splitting, and generation from HTML/Markdown. The Python scripts (pdf_read.py, pdf_merge.py, pdf_split.py, and pdf_generate.py) use well-known libraries like pypdf, pdfplumber, and weasyprint to perform their stated functions without any evidence of malicious intent, data exfiltration, or suspicious execution patterns.

Capability Assessment

✓ Purpose & Capability

Name/description match the included scripts and declared dependencies: pdfplumber and pypdf for reading/merging/splitting, weasyprint for generation. Requested binary (python3) is appropriate; no unrelated binaries or env vars are required.

✓ Instruction Scope

SKILL.md directs the agent to run local scripts on user-supplied files. The scripts only read input files, optionally a CSS file, and write output PDFs or print extracted text/tables to stdout. They do not reference external endpoints, other system configs, or environment variables.

ℹ Install Mechanism

The registry lists no formal install spec (instruction-only), but SKILL.md includes an openclaw.requires.pip list (pdfplumber, pypdf, weasyprint). This is coherent (the scripts need those packages) but slightly inconsistent with 'no install spec' in the registry — the environment will need those pip packages and weasyprint has native deps (Cairo/GTK) that may be required on the host.
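
Because the registry carries no install spec, it may help to confirm the three declared packages are present before running the scripts. The sketch below is one way to do that with the standard library. It assumes each package's import name matches its pip name, and the function name missing_packages is hypothetical.

```python
from importlib.util import find_spec

# Packages named in the openclaw.requires.pip list described above.
REQUIRED = ("pdfplumber", "pypdf", "weasyprint")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: " + ", ".join(missing))
        print("Try: python3 -m pip install " + " ".join(missing))
    else:
        print("All declared Python packages can be located.")
```

This only checks that the packages can be found. It does not import them, so a weasyprint installation whose native libraries are absent would still pass the check and fail later at import time.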

✓ Credentials

The skill requires no credentials, config paths, or environment variables. It does not attempt to read environment data or secret files.

✓ Persistence & Privilege

The always flag is false, and the skill does not request persistent or system-wide modifications. Autonomous invocation is allowed by default (platform behavior), but the skill's actions remain limited to file I/O.

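How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install pdf-rw-toolkit
Once installed, invoke the Skill by name or trigger it with /pdf-rw-toolkit
Provide the required inputs as described in the Skill's parameter documentation to get structured output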

Version History

v1.0.0

Initial release: read, extract text/tables, merge, split, generate PDF

Metadata

Slug pdf-rw-toolkit

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1
