Total downloads: 40
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install pdf-sanitizer
Description
Detect and redact sensitive information in PDFs — ID numbers, phone numbers, addresses, bank cards.
Usage (SKILL.md)
PDF Sanitizer
Detect and redact sensitive information in PDF documents while preserving original layout.
Workflow
- Ingest PDF — extract text layer and metadata via pdfplumber/PyMuPDF.
- Scan for PII — run regex + AI pattern matching against Chinese and international PII:
  - Chinese ID number (18-digit)
  - Chinese phone numbers
  - Bank card numbers
  - Email addresses
  - Residential addresses (Chinese)
  - Person names (context-based)
- Highlight — annotate every match with bounding boxes and category labels.
- Confirm — present categories to user for selection. Default: all categories enabled.
- Redact — apply chosen mode per category:
  - blackout: solid black rectangle over sensitive text
  - blur: pixel-level Gaussian blur on image-rendered area
  - placeholder: replace with [REDACTED] while keeping surrounding text
- Rebuild PDF — flatten redactions into final output, preserving original fonts, images, and layout.
- Report — output redacted PDF + JSON report listing each redaction:
  - original snippet (truncated), category, page number, bounding box, mode applied.
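The "Scan for PII" step can be sketched with the standard library alone. This is a minimal illustration, not the skill's actual rule set: the patterns, the `bank_card` category name, and the helper names (`is_valid_cn_id`, `passes_luhn`, `scan`) are assumptions made for the example. It pairs each regex with a checksum (the GB 11643 check character for 18-digit ID numbers, Luhn for card numbers) so that arbitrary digit runs are less likely to be flagged.

```python
import re

# Illustrative patterns only; the skill's real rules are not shown in this listing.
ID_RE = re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)")
CARD_RE = re.compile(r"(?<!\d)\d{16,19}(?!\d)")
PHONE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")

ID_WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
ID_CHECK_CHARS = "10X98765432"


def is_valid_cn_id(number: str) -> bool:
    """Verify the check character (position 18) of an 18-character ID number."""
    total = sum(int(d) * w for d, w in zip(number[:17], ID_WEIGHTS))
    return ID_CHECK_CHARS[total % 11] == number[17].upper()


def passes_luhn(number: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan(text: str) -> list[dict]:
    """Return category and character offsets for each validated match."""
    found = []
    for m in ID_RE.finditer(text):
        if is_valid_cn_id(m.group()):
            found.append({"category": "id_card", "start": m.start(), "end": m.end()})
    id_spans = [(f["start"], f["end"]) for f in found]
    for m in CARD_RE.finditer(text):
        # Skip digit runs that are part of an ID number already reported.
        inside_id = any(m.start() < e and s < m.end() for s, e in id_spans)
        if not inside_id and passes_luhn(m.group()):
            found.append({"category": "bank_card", "start": m.start(), "end": m.end()})
    for m in PHONE_RE.finditer(text):
        found.append({"category": "phone", "start": m.start(), "end": m.end()})
    return sorted(found, key=lambda f: f["start"])
```

Addresses and person names are described above as context-based, so they are left out of this regex-only sketch. Mapping the character offsets to page bounding boxes would depend on the PDF library in use.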
Sample Prompt
pdf-sanitizer redact --input contract.pdf --categories id_card,phone,address --mode blackout
pdf-sanitizer redact --input 社保材料.pdf --output clean.pdf --categories all --mode placeholder
pdf-sanitizer scan --input report.pdf
pdf-sanitizer review --input contract.pdf --page 3-7
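The last command passes a page range as `--page 3-7`. How the skill interprets that value is not documented here; the sketch below assumes 1-based, inclusive page numbers and converts them to the zero-based indices most PDF libraries use. The function name `parse_page_range` is hypothetical.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Convert '3-7' or '5' into zero-based page indices.

    Assumes the input is 1-based and inclusive, which is a guess about
    the CLI's convention rather than documented behavior.
    """
    first, sep, last = spec.partition("-")
    start = int(first)
    end = int(last) if sep else start
    if not 1 <= start <= end <= page_count:
        raise ValueError(f"page range {spec!r} is outside 1..{page_count}")
    return list(range(start - 1, end))
```

Under these assumptions, `parse_page_range("3-7", 10)` yields the indices 2 through 6, and a reversed or out-of-bounds range raises `ValueError` instead of silently reviewing the wrong pages.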
Security Recommendations
Before installing, confirm that your workflow can tolerate JSON reports that may include truncated PII and exact page locations. Store reports with the same protections as the original PDFs, and prefer masked or category-only reporting for highly sensitive documents.
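One way to picture masked and category-only reporting is sketched below. The field names, the `detail` parameter, and both helper functions are hypothetical; the skill's real report schema is not shown in this listing. In this sketch, the category-only mode drops both the snippet and the bounding box, while the masked mode keeps the location but stars out most of the snippet.

```python
def mask_snippet(snippet: str, keep: int = 2) -> str:
    """Keep the first and last `keep` characters and star out the rest."""
    if len(snippet) <= 2 * keep:
        return "*" * len(snippet)
    return snippet[:keep] + "*" * (len(snippet) - 2 * keep) + snippet[-keep:]


def report_entry(snippet, category, page, bbox, mode, detail="masked"):
    """Build one report record (hypothetical schema).

    detail="masked"        -> masked snippet plus bounding box
    detail="category_only" -> no snippet and no location beyond the page
    """
    entry = {"category": category, "page": page, "mode": mode}
    if detail == "masked":
        entry["snippet"] = mask_snippet(snippet)
        entry["bbox"] = list(bbox)
    elif detail != "category_only":
        raise ValueError(f"unknown detail level: {detail!r}")
    return entry
```

For example, `mask_snippet("13800138000")` returns `13*******00`, which still lets a reviewer recognize the match without the report reproducing the full number.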
Capability Assessment
Purpose & Capability
The artifacts coherently describe detecting and redacting PII in PDFs; the included Python script performs local regex detection on stdin and emits JSON matches.
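The bundled script itself is not reproduced on this page. As a rough sketch of what "local regex detection on stdin, JSON matches out" could look like, the example below uses two assumed patterns and hypothetical names (`PATTERNS`, `find_matches`, `run`); it makes no network calls and touches no files.

```python
import json
import re
import sys

# Assumed patterns for illustration; the real script's rules may differ.
PATTERNS = {
    "phone": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def find_matches(text: str) -> list[dict]:
    """Collect category and character offsets for every pattern hit."""
    hits = []
    for category, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append({"category": category, "start": m.start(), "end": m.end()})
    return sorted(hits, key=lambda h: h["start"])


def run(in_stream, out_stream) -> None:
    """Read all text from in_stream and write the matches as JSON."""
    json.dump(find_matches(in_stream.read()), out_stream, ensure_ascii=False, indent=2)
    out_stream.write("\n")
```

To use it as a command-line filter, call `run(sys.stdin, sys.stdout)` at the bottom of the file. Emitting offsets instead of the matched text keeps the JSON itself free of PII, which is a design choice of this sketch and not a claim about the real script.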
Instruction Scope
The workflow discloses scanning, user confirmation, redaction modes, and report generation, but the script includes passport and IPv4 detection that are not explicitly listed in the short description or workflow category list.
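The script's passport and IPv4 rules are not shown here, so the following is only a guess at how such detectors are commonly written. The IPv4 finder uses a loose regex for candidates and lets the standard `ipaddress` module reject out-of-range octets. The passport pattern (one of the letters E or G followed by eight digits) is a hypothetical simplification, not a complete description of any real numbering scheme.

```python
import ipaddress
import re

# Four dot-separated groups of 1-3 digits, not embedded in a longer dotted run.
IPV4_CANDIDATE = re.compile(r"(?<!\d)(?<!\d\.)\d{1,3}(?:\.\d{1,3}){3}(?!\d)(?!\.\d)")

# Hypothetical passport shape, chosen for illustration only.
PASSPORT_RE = re.compile(r"(?<![A-Za-z0-9])[EG]\d{8}(?![A-Za-z0-9])")


def find_ipv4(text: str) -> list[str]:
    """Return candidates that the ipaddress module accepts as IPv4."""
    hits = []
    for m in IPV4_CANDIDATE.finditer(text):
        try:
            ipaddress.IPv4Address(m.group())
        except ValueError:
            continue  # e.g. an octet above 255
        hits.append(m.group())
    return hits


def find_passports(text: str) -> list[str]:
    """Return strings matching the assumed passport shape."""
    return [m.group() for m in PASSPORT_RE.finditer(text)]
```

If these two categories are in fact active in the script, listing them alongside the workflow categories would let users deselect them at the Confirm step.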
Install Mechanism
The artifact contains a SKILL.md and one small Python helper script, with no installer, package fetch, network behavior, auto-start hook, or hidden setup mechanism.
Credentials
Reading PDF text and detecting PII is proportional to the stated sanitizer purpose; no evidence shows credential use, network transmission, unrelated file access, or broad local indexing.
Persistence & Privilege
The documented JSON report can retain truncated original snippets and precise bounding boxes, which is privacy-sensitive, but it is disclosed and there is no evidence of background persistence or privilege escalation.
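How to Use
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install pdf-sanitizer
- Once installation finishes, invoke the Skill by name or trigger it with /pdf-sanitizer.
- Supply the inputs described in the Skill's parameter documentation to receive structured output.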
Version History
v1.0.0
Initial release: Detect and redact Chinese PII in PDFs
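FAQ
What is PDF Sanitizer?
PDF Sanitizer detects and redacts sensitive information in PDFs: ID numbers, phone numbers, addresses, and bank cards. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 40 downloads to date.
How do I install PDF Sanitizer?
Run the command "/install pdf-sanitizer" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.
Is PDF Sanitizer free?
Yes. PDF Sanitizer is completely free and released under the MIT-0 license, so you can download, install, and use it freely.
Which platforms does PDF Sanitizer support?
PDF Sanitizer is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops PDF Sanitizer?
It is developed and maintained by haidong (@harrylabsj). The current version is v1.0.0.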