pdf-skill
Author: weaglewang · GitHub · v1.0.0 · MIT-0
Total downloads: 755
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install pdf-skill
Description
Create, read, edit, merge, split PDF files. Supports text extraction, table extraction, form filling, watermarks, OCR, and HTML-to-PDF conversion.
Usage (SKILL.md)
PDF Skill v2.0
Overview
Complete PDF processing using pypdf, pdfplumber, weasyprint, and command-line tools. Supports all common PDF operations.
Installation & Dependencies
Required
pip install pypdf pdfplumber weasyprint
Optional
# For image conversion
brew install poppler
pip install pdf2image
# For OCR
pip install pytesseract
brew install tesseract
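# For form filling: the example below uses pypdf only
# For watermarks (the watermark example imports reportlab)
pip install reportlab
# For exporting extracted tables to Excel
pip install pandas openpyxl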
Quick Start
Read PDF
from pypdf import PdfReader
reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")
# Extract text from first page
page = reader.pages[0]
text = page.extract_text()
print(text)
Merge PDFs
from pypdf import PdfWriter, PdfReader
writer = PdfWriter()
for pdf in ["doc1.pdf", "doc2.pdf"]:
    reader = PdfReader(pdf)
    for page in reader.pages:
        writer.add_page(page)
with open("merged.pdf", "wb") as f:
    writer.write(f)
Create PDF from HTML
from weasyprint import HTML
HTML(string="<h1>Hello PDF</h1>").write_pdf("output.pdf")
Complete API Reference
Reading PDFs
from pypdf import PdfReader
# Basic read
reader = PdfReader("document.pdf")
# Decrypt first if needed; pages of a locked file cannot be read
if reader.is_encrypted:
    reader.decrypt("password")
# Get page count
num_pages = len(reader.pages)
# Extract text from all pages
full_text = ""
for page in reader.pages:
    full_text += page.extract_text()
# Extract text from specific page
page = reader.pages[5] # Page 6 (0-indexed)
text = page.extract_text()
# Get metadata (reader.metadata is None when the PDF carries no metadata)
meta = reader.metadata
if meta is not None:
    print(f"Title: {meta.title}")
    print(f"Author: {meta.author}")
    print(f"Subject: {meta.subject}")
    print(f"Creator: {meta.creator}")
    print(f"Creation Date: {meta.creation_date}")
# Get outline/bookmarks (nested lists hold child entries)
def print_outline(items, depth=0):
    for item in items:
        if isinstance(item, list):
            print_outline(item, depth + 1)
        else:
            print("  " * depth + item.title)
print_outline(reader.outline)
Extracting Tables
import pdfplumber
import pandas as pd
# Open PDF
with pdfplumber.open("document.pdf") as pdf:
    # Get page count
    print(f"Pages: {len(pdf.pages)}")
    # Extract tables from first page
    page = pdf.pages[0]
    tables = page.extract_tables()
    for i, table in enumerate(tables):
        print(f"Table {i+1}:")
        for row in table:
            print(row)
# Convert all tables to Excel (to_excel needs openpyxl installed)
all_tables = []
with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        tables = page.extract_tables()
        for table in tables:
            if table and len(table) > 1:
                # First row is treated as the header
                df = pd.DataFrame(table[1:], columns=table[0])
                all_tables.append(df)
# Combine and export
if all_tables:
    combined = pd.concat(all_tables, ignore_index=True)
    combined.to_excel("extracted_tables.xlsx", index=False)
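Cells returned by extract_tables() can be None or contain embedded line breaks, which tends to produce messy DataFrame columns. The following is a minimal sketch of a cleanup pass that could run on each table before it is handed to pandas. The helper name clean_table is hypothetical (it is not part of pdfplumber), and the sample rows are made up for illustration.

```python
def clean_table(table):
    """Normalize a table given as a list of rows (lists of str or None)."""
    cleaned = []
    for row in table:
        # Replace missing cells with "" and collapse whitespace and newlines.
        cells = [" ".join(cell.split()) if cell is not None else "" for cell in row]
        # Drop rows that are completely empty after cleaning.
        if any(cells):
            cleaned.append(cells)
    return cleaned


# Made-up sample rows in the shape extract_tables() returns
raw = [["Name", "Qty"], ["Widget\nA", None], [None, None], [" Bolt ", "12"]]
cleaned = clean_table(raw)
print(cleaned)
```

In the export loop above, clean_table(table) could be applied before building the DataFrame; whether empty rows should be dropped depends on the document.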
Merging PDFs
from pypdf import PdfWriter, PdfReader
# Merge multiple files
def merge_pdfs(input_files, output_file):
    writer = PdfWriter()
    for pdf_file in input_files:
        reader = PdfReader(pdf_file)
        print(f"Adding {pdf_file} ({len(reader.pages)} pages)")
        for page in reader.pages:
            writer.add_page(page)
    with open(output_file, "wb") as f:
        writer.write(f)
    print(f"✓ Merged {len(input_files)} files into {output_file}")
# Usage
merge_pdfs(["doc1.pdf", "doc2.pdf", "doc3.pdf"], "merged.pdf")
Splitting PDFs
from pypdf import PdfReader, PdfWriter
# Split into individual pages
def split_pdf(input_file, output_prefix):
    reader = PdfReader(input_file)
    for i, page in enumerate(reader.pages):
        writer = PdfWriter()
        writer.add_page(page)
        output_file = f"{output_prefix}_page_{i+1}.pdf"
        with open(output_file, "wb") as f:
            writer.write(f)
    print(f"✓ Split into {len(reader.pages)} files")
# Extract specific pages
def extract_pages(input_file, output_file, page_numbers):
    """Extract specific pages (1-indexed)"""
    reader = PdfReader(input_file)
    writer = PdfWriter()
    for page_num in page_numbers:
        writer.add_page(reader.pages[page_num - 1])
    with open(output_file, "wb") as f:
        writer.write(f)
    print(f"✓ Extracted pages {page_numbers}")
# Usage
split_pdf("document.pdf", "page")
extract_pages("document.pdf", "selected.pdf", [1, 3, 5])
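extract_pages takes a list of 1-indexed page numbers, while users often write selections as a string such as "1-3,5". Below is a small sketch of a parser that turns such a string into a validated list. The function name parse_page_ranges and the string syntax are assumptions made for illustration; nothing in pypdf defines them.

```python
def parse_page_ranges(spec, total_pages):
    """Turn a 1-indexed spec such as "1-3,5" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Reject anything outside the document or written backwards.
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"Invalid page range {part!r} for a {total_pages}-page PDF")
        pages.update(range(start, end + 1))
    return sorted(pages)


selection = parse_page_ranges("1-3, 5", total_pages=6)
print(selection)
```

The result can be passed straight to extract_pages as page_numbers, with total_pages taken from len(reader.pages).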
Rotating Pages
from pypdf import PdfReader, PdfWriter
# Rotate all pages
def rotate_pdf(input_file, output_file, rotation=90):
    reader = PdfReader(input_file)
    writer = PdfWriter()
    for page in reader.pages:
        page.rotate(rotation) # 90, 180, or 270
        writer.add_page(page)
    with open(output_file, "wb") as f:
        writer.write(f)
# Rotate specific page
reader = PdfReader("input.pdf")
writer = PdfWriter()
for i, page in enumerate(reader.pages):
    if i == 0: # Rotate first page only
        page.rotate(90)
    writer.add_page(page)
with open("output.pdf", "wb") as f:
    writer.write(f)
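Since the rotation angle has to be a multiple of 90 degrees, it can help to validate user input before calling page.rotate. The sketch below shows one way to do that; normalize_rotation is a hypothetical helper, not a pypdf function.

```python
def normalize_rotation(angle):
    """Map an angle in degrees to 0, 90, 180, or 270, or raise ValueError."""
    if angle % 90 != 0:
        raise ValueError(f"Rotation must be a multiple of 90 degrees, got {angle}")
    # Python's % always returns a non-negative result for a positive modulus,
    # so counter-clockwise inputs such as -90 become 270.
    return angle % 360


print(normalize_rotation(-90), normalize_rotation(450))
```

The normalized value could then be passed as the rotation argument of rotate_pdf.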
Creating PDFs from HTML
from weasyprint import HTML, CSS
# Basic HTML to PDF
html_content = """
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
@page { size: A4; margin: 2cm; }
body { font-family: Arial, sans-serif; }
h1 { color: #333; }
</style>
</head>
<body>
<h1>Hello PDF</h1>
<p>This is a test document.</p>
</body>
</html>
"""
HTML(string=html_content).write_pdf("output.pdf")
# With external CSS (stylesheets are passed to write_pdf, not to HTML)
HTML(string="<h1>Styled PDF</h1>").write_pdf(
    "styled.pdf",
    stylesheets=[CSS(filename="style.css")]
)
# Custom page size
HTML(string="<h1>Landscape</h1>").write_pdf(
    "landscape.pdf",
    stylesheets=[CSS(string="@page { size: landscape; }")]
)
# Add header/footer
html_with_pagenum = """
<!DOCTYPE html>
<html>
<head>
<style>
@page {
size: A4;
margin: 2cm;
@bottom-center {
content: "Page " counter(page) " of " counter(pages);
}
}
</style>
</head>
<body>
<h1>Document with Page Numbers</h1>
<!-- Content here -->
</body>
</html>
"""
HTML(string=html_with_pagenum).write_pdf("numbered.pdf")
Adding Watermarks
from pypdf import PdfReader, PdfWriter
from io import BytesIO
from reportlab.pdfgen import canvas # requires: pip install reportlab
def create_watermark(text):
    """Create a one-page watermark PDF in memory"""
    packet = BytesIO()
    c = canvas.Canvas(packet)
    # Draw text
    c.saveState()
    c.translate(300, 400)
    c.rotate(45)
    c.setFont("Helvetica-Bold", 50)
    c.setFillColorRGB(0.5, 0.5, 0.5, 0.3) # Gray with transparency
    c.drawCentredString(0, 0, text)
    c.restoreState()
    c.save()
    packet.seek(0)
    return PdfReader(packet)
# Apply watermark
def watermark_pdf(input_file, output_file, watermark_text):
    reader = PdfReader(input_file)
    watermark_page = create_watermark(watermark_text).pages[0]
    writer = PdfWriter()
    for page in reader.pages:
        # Draw the watermark on top of the existing page content
        page.merge_page(watermark_page)
        writer.add_page(page)
    with open(output_file, "wb") as f:
        writer.write(f)
# Usage
watermark_pdf("document.pdf", "watermarked.pdf", "CONFIDENTIAL")
Adding Password Protection
from pypdf import PdfReader, PdfWriter
# Encrypt PDF
def protect_pdf(input_file, output_file, password):
    reader = PdfReader(input_file)
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)
    # Add password
    writer.encrypt(password)
    with open(output_file, "wb") as f:
        writer.write(f)
# Usage
protect_pdf("document.pdf", "protected.pdf", "secretpassword")
# Decrypt PDF
reader = PdfReader("protected.pdf")
if reader.is_encrypted:
    reader.decrypt("secretpassword")
Form Filling
from pypdf import PdfReader, PdfWriter
def fill_form(input_pdf, output_pdf, data):
    """
    data: dict with field names and values
    """
    reader = PdfReader(input_pdf)
    writer = PdfWriter()
    # Get form fields (None or empty when the PDF has no text fields)
    fields = reader.get_form_text_fields() or {}
    print("Available fields:", list(fields.keys()))
    # Copy pages together with the form definition, then fill fields on every page
    writer.append(reader)
    for page in writer.pages:
        writer.update_page_form_field_values(page, data)
    with open(output_pdf, "wb") as f:
        writer.write(f)
# Usage
form_data = {
    'name': 'John Doe',
    'email': '[email protected]',
    'date': '2024-01-15'
}
fill_form("form.pdf", "filled_form.pdf", form_data)
OCR for Scanned PDFs
import pytesseract
from pdf2image import convert_from_path
def ocr_pdf(input_pdf, output_text):
    """Extract text from scanned PDF using OCR"""
    # Convert PDF to images
    images = convert_from_path(input_pdf, dpi=300)
    full_text = ""
    for i, image in enumerate(images):
        print(f"Processing page {i+1}...")
        text = pytesseract.image_to_string(image, lang='eng')
        full_text += f"\n--- Page {i+1} ---\n{text}"
    with open(output_text, 'w', encoding='utf-8') as f:
        f.write(full_text)
    print(f"✓ OCR complete: {output_text}")
# Usage
ocr_pdf("scanned.pdf", "extracted_text.txt")
# Create searchable PDF
def make_searchable(input_pdf, output_pdf):
    """Create searchable PDF from scanned document"""
    from io import BytesIO
    from pypdf import PdfReader, PdfWriter
    images = convert_from_path(input_pdf, dpi=300)
    writer = PdfWriter()
    for image in images:
        # Tesseract returns a one-page PDF: the page image plus an invisible text layer
        page_pdf = pytesseract.image_to_pdf_or_hocr(image, extension='pdf')
        writer.append(PdfReader(BytesIO(page_pdf)))
    with open(output_pdf, "wb") as f:
        writer.write(f)
    print(f"✓ Searchable PDF saved: {output_pdf}")
Complete Examples
Example 1: PDF Report Generator
from weasyprint import HTML
from datetime import datetime
def generate_report(data, output_file):
    """Generate PDF report from data"""
    # Literal CSS braces are doubled because this is an f-string
    html = f"""
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
@page {{
size: A4;
margin: 2cm;
@bottom-center {{
content: "Page " counter(page) " of " counter(pages);
font-size: 10px;
}}
}}
body {{ font-family: Arial, sans-serif; }}
h1 {{ color: #2c3e50; border-bottom: 2px solid #3498db; }}
h2 {{ color: #34495e; }}
table {{ width: 100%; border-collapse: collapse; margin: 20px 0; }}
th, td {{ border: 1px solid #ddd; padding: 10px; text-align: left; }}
th {{ background-color: #3498db; color: white; }}
tr:nth-child(even) {{ background-color: #f2f2f2; }}
.header {{ text-align: center; margin-bottom: 40px; }}
.date {{ color: #7f8c8d; }}
</style>
</head>
<body>
<div class="header">
<h1>Monthly Report</h1>
<p class="date">Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}</p>
</div>
<h2>Summary</h2>
<table>
<tr><th>Metric</th><th>Value</th></tr>
<tr><td>Total Sales</td><td>${data['total_sales']:,.2f}</td></tr>
<tr><td>Orders</td><td>{data['orders']}</td></tr>
<tr><td>Customers</td><td>{data['customers']}</td></tr>
</table>
<h2>Details</h2>
<table>
<tr><th>Product</th><th>Units</th><th>Revenue</th></tr>
"""
    for product in data['products']:
        html += f"""
<tr>
<td>{product['name']}</td>
<td>{product['units']}</td>
<td>${product['revenue']:,.2f}</td>
</tr>
"""
    html += """
</table>
</body>
</html>
"""
    HTML(string=html).write_pdf(output_file)
    print(f"✓ Report generated: {output_file}")
# Usage
data = {
    'total_sales': 125000,
    'orders': 450,
    'customers': 320,
    'products': [
        {'name': 'Product A', 'units': 100, 'revenue': 50000},
        {'name': 'Product B', 'units': 150, 'revenue': 75000}
    ]
}
generate_report(data, "monthly-report.pdf")
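The report generator interpolates values such as product names directly into HTML. If those values can contain characters like & or <, the markup may break or render unintended tags. The sketch below shows one way to escape text fields with the standard library before building a row; product_row is a hypothetical helper and the sample product is made up.

```python
from html import escape


def product_row(product):
    """Build one table row, escaping the free-text name field."""
    name = escape(str(product["name"]))
    units = product["units"]
    revenue = product["revenue"]
    return f"<tr><td>{name}</td><td>{units}</td><td>${revenue:,.2f}</td></tr>"


# Made-up product whose name contains HTML-special characters
row = product_row({"name": "Nuts & <Bolts>", "units": 3, "revenue": 1234.5})
print(row)
```

Numeric fields are formatted rather than escaped here, on the assumption that they really are numbers.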
Example 2: PDF Merger Tool
from pypdf import PdfReader, PdfWriter
import os
class PDFMerger:
    def __init__(self):
        self.writer = PdfWriter()
        self.bookmarks = []
    def add_pdf(self, pdf_path, bookmark=None):
        """Add PDF to merger"""
        if not os.path.exists(pdf_path):
            raise FileNotFoundError(f"PDF not found: {pdf_path}")
        reader = PdfReader(pdf_path)
        start_page = len(self.writer.pages)
        for page in reader.pages:
            self.writer.add_page(page)
        if bookmark:
            self.bookmarks.append({
                'title': bookmark,
                'page': start_page
            })
        print(f"✓ Added {pdf_path} ({len(reader.pages)} pages)")
        return self
    def save(self, output_path):
        """Save merged PDF"""
        # Write the collected bookmarks into the outline (page numbers are 0-indexed)
        for bookmark in self.bookmarks:
            self.writer.add_outline_item(bookmark['title'], bookmark['page'])
        with open(output_path, "wb") as f:
            self.writer.write(f)
        print(f"✓ Merged PDF saved: {output_path}")
    def merge_files(self, input_files, output_file):
        """Merge multiple files at once"""
        for pdf in input_files:
            self.add_pdf(pdf, bookmark=os.path.basename(pdf))
        self.save(output_file)
# Usage
merger = PDFMerger()
merger.merge_files(
    ["doc1.pdf", "doc2.pdf", "doc3.pdf"],
    "merged.pdf"
)
Example 3: PDF Text Extractor
from pypdf import PdfReader
import json
class PDFExtractor:
    def __init__(self, pdf_path):
        self.reader = PdfReader(pdf_path)
        self.metadata = {}
    def get_metadata(self):
        """Extract PDF metadata"""
        # reader.metadata can be None; getattr keeps the lookups safe
        meta = self.reader.metadata
        self.metadata = {
            'title': getattr(meta, 'title', None) or None,
            'author': getattr(meta, 'author', None) or None,
            'subject': getattr(meta, 'subject', None) or None,
            'creator': getattr(meta, 'creator', None) or None,
            'pages': len(self.reader.pages),
            'encrypted': self.reader.is_encrypted
        }
        return self.metadata
    def extract_text(self, pages=None):
        """Extract text from specified pages (0-indexed) or all"""
        if pages is None:
            pages = range(len(self.reader.pages))
        text = {}
        for page_num in pages:
            page = self.reader.pages[page_num]
            text[f"page_{page_num + 1}"] = page.extract_text()
        return text
    def extract_all(self, output_json):
        """Extract everything to JSON"""
        result = {
            'metadata': self.get_metadata(),
            'text': self.extract_text()
        }
        with open(output_json, 'w', encoding='utf-8') as f:
            json.dump(result, f, indent=2, ensure_ascii=False)
        print(f"✓ Extracted to {output_json}")
        return result
# Usage
extractor = PDFExtractor("document.pdf")
extractor.extract_all("extracted.json")
Error Handling
Common Errors
Error: "PDF read failed"
# Solution: Check if file exists and is valid
from pypdf import PdfReader
import os
if os.path.exists("file.pdf"):
    try:
        reader = PdfReader("file.pdf")
    except Exception as e:
        print(f"Error reading PDF: {e}")
else:
    print("File not found: file.pdf")
Error: "Text extraction returned None"
# Solution: PDF may be scanned/image-based
# Use OCR instead
page = reader.pages[0]
text = page.extract_text()
if text is None or text.strip() == "":
    print("PDF may be scanned. Use OCR.")
Error: "WeasyPrint font warning"
# Solution: Install fonts or use web-safe fonts
# Suppress warnings
import logging
logger = logging.getLogger('weasyprint')
logger.setLevel(logging.ERROR)
Best Practices
1. Always Check for Encryption
reader = PdfReader("file.pdf")
if reader.is_encrypted:
    reader.decrypt("password")
2. Handle Large PDFs Efficiently
# Process page by page instead of loading all
reader = PdfReader("large.pdf")
for i, page in enumerate(reader.pages):
    text = page.extract_text()
    # Process and discard
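One way to keep the "process and discard" pattern tidy is a generator that yields one page of text at a time, so the caller never holds the full document text in memory. The sketch below assumes only that each page object has an extract_text() method, as pypdf pages do. FakePage is a stand-in used so the example runs without a PDF file.

```python
def iter_page_text(pages):
    """Yield (page_number, text) pairs one page at a time (1-indexed)."""
    for number, page in enumerate(pages, start=1):
        yield number, page.extract_text() or ""


def count_words(pages):
    """Example consumer: accumulate a word count without storing page text."""
    total = 0
    for _, text in iter_page_text(pages):
        total += len(text.split())
    return total


class FakePage:
    """Stand-in for a pypdf page, for demonstration only."""

    def __init__(self, text):
        self._text = text

    def extract_text(self):
        return self._text


word_total = count_words([FakePage("one two"), FakePage(""), FakePage("three")])
print(word_total)
```

With a real file, count_words(reader.pages) would be the equivalent call.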
3. Use Appropriate DPI for OCR
# 300 DPI is standard for OCR
from pdf2image import convert_from_path
images = convert_from_path("scanned.pdf", dpi=300)
4. Validate Before Distribution
# Check PDF is valid
from pypdf import PdfReader
try:
    reader = PdfReader("output.pdf")
    print(f"✓ Valid PDF with {len(reader.pages)} pages")
except Exception as e:
    print(f"✗ Invalid PDF: {e}")
Testing Your Setup
# test-pdf.py
from pypdf import PdfReader, PdfWriter
from weasyprint import HTML
import os
print("Testing PDF setup...")
# Test 1: Create PDF from HTML
HTML(string="<h1>Test PDF</h1>").write_pdf("test-create.pdf")
assert os.path.exists("test-create.pdf")
print("✓ PDF creation test passed")
# Test 2: Read PDF
reader = PdfReader("test-create.pdf")
assert len(reader.pages) == 1
print("✓ PDF read test passed")
# Test 3: Extract text
text = reader.pages[0].extract_text()
assert "Test PDF" in text
print("✓ Text extraction test passed")
# Cleanup
os.remove("test-create.pdf")
print("✓ All tests passed!")
Run test:
python test-pdf.py
License
MIT License - See LICENSE file for details.
Security Recommendations
This skill appears coherent for PDF work, but it is from an unknown source and depends on third-party packages and system binaries. Before installing or granting agent access: 1) review and pin package versions (avoid running pip install without versions); 2) install packages in a virtual environment or isolated container and do not run as root; 3) be aware the skill needs filesystem access to read/write PDFs (limit it to safe directories); 4) optional OCR requires system tools (tesseract, poppler), which you must install separately; 5) weasyprint may fetch remote assets when converting HTML, so avoid supplying untrusted URLs; and 6) if you need higher assurance, request a skill with a known homepage or source repository and signed releases.
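Recommendation 1 can be checked mechanically. The following is a minimal sketch that flags requirement lines not pinned to an exact version with ==. The function name, the regular expression, and the version numbers in the sample list are illustrative assumptions, not vetted pins.

```python
import re

# A requirement counts as pinned only if it has the form name==version.
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+==[^=\s]+$")


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    result = []
    for line in lines:
        # Strip comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if line and not PINNED.match(line):
            result.append(line)
    return result


# Placeholder requirements; the versions are examples only
requirements = ["pypdf==4.2.0", "pdfplumber", "weasyprint>=60", "# comment", ""]
unpinned = unpinned_requirements(requirements)
print(unpinned)
```

The lines of a requirements.txt file could be passed in directly, for example via open("requirements.txt").read().splitlines().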
Functional Analysis
Type: OpenClaw Skill
Name: pdf-skill
Version: 1.0.0
The PDF skill bundle provides standard documentation and Python code snippets for document processing using well-known libraries like pypdf, pdfplumber, and weasyprint. The content is strictly focused on PDF operations such as merging, splitting, OCR, and HTML conversion, with no evidence of malicious intent, data exfiltration, or prompt injection attacks in SKILL.md.
Capability Assessment
Purpose & Capability
Name/description (create/read/edit/merge/split/OCR/HTML→PDF) match the instructions and libraries used (pypdf, pdfplumber, weasyprint, pytesseract, pdf2image, poppler). Required tools and packages are what you would expect for these tasks.
Instruction Scope
SKILL.md stays on-scope: it shows reading/writing local PDF files, table/text extraction, merges/splits, HTML-to-PDF conversion, and optional OCR/form-filling steps. It does instruct installing packages and system binaries. It can fetch external assets when using weasyprint or arbitrary file paths (thus may read any file the agent has filesystem access to), but it does not instruct exfiltration or access to unrelated system credentials.
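Because the snippets accept arbitrary file paths, one way to limit what the agent can read or write is to resolve every user-supplied path against a fixed working directory and reject anything that escapes it. The sketch below uses pathlib only. resolve_inside and the file names are hypothetical, and this is not a complete sandbox.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Resolve user_path under base_dir, rejecting paths that escape it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, which the check below catches.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"{user_path!r} escapes {base}")
    return candidate


with tempfile.TemporaryDirectory() as workdir:
    safe = resolve_inside(workdir, "reports/out.pdf")
    try:
        resolve_inside(workdir, "../secrets.pdf")
        escaped = True
    except ValueError:
        escaped = False
print(safe.name, escaped)
```

The returned path can be passed to PdfReader or open() in place of the raw user input.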
Install Mechanism
This is an instruction-only skill (no install spec). It tells users to run 'pip install ...' and optionally 'brew install ...' — standard but potentially impactful because pip/Brew will execute third-party code on installation. No downloads from untrusted URLs or archive extraction steps are present.
Credentials
The skill declares no required environment variables, credentials, or config paths. The examples do show decrypt('password') as a usage pattern for encrypted PDFs, but there is no request for unrelated secrets or cloud credentials.
Persistence & Privilege
always is false and the skill does not request persistent or elevated platform privileges. The default ability for the model to call the skill autonomously remains (platform default) but is not accompanied by other concerning privileges.
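How to Use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install pdf-skill
- Once installed, call the Skill by name or trigger it with /pdf-skill.
- Provide the inputs described in the Skill's parameter documentation to get structured output.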
Version History
v1.0.0
Major update: PDF skill upgraded with full-featured PDF processing support.
- Now supports creating, reading, editing, merging, and splitting PDFs.
- Added advanced capabilities: text & table extraction, OCR, HTML-to-PDF, watermarks, and form filling.
- Documentation now covers sample code and usage for all common PDF operations.
- Includes guidance on required and optional dependencies for extended features.
- Reorganized and expanded the API reference for easier use and discovery.
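FAQ
What is pdf-skill?
Create, read, edit, merge, split PDF files. Supports text extraction, table extraction, form filling, watermarks, OCR, and HTML-to-PDF conversion. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 755 downloads to date.
How do I install pdf-skill?
Run the command "/install pdf-skill" in the OpenClaw or Claude Code chat box to install it in one step, with no additional configuration.
Is pdf-skill free?
Yes. pdf-skill is completely free and is released under the MIT-0 license; you can download, install, and use it freely.
Which platforms does pdf-skill support?
pdf-skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed pdf-skill?
It is developed and maintained by weaglewang (@weaglewang); the current version is v1.0.0.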