wangwllu

PDF Utils

Author: Lu Wang · GitHub · v1.0.1 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 219
Favorites: 0
Current installs: 0
Versions: 2
Install in OpenClaw
/install pdf-utils
Description
PDF Utils enables OCR of image-based PDFs, extraction of arXiv IDs from text or OCR output, and scriptable PDF tasks like merging, splitting, and rendering.
Safe usage recommendations
This skill appears coherent and implements what it claims. Before installing or using it:
  1. Review the scripts and run them on sample PDFs as an unprivileged user to confirm their behavior.
  2. Be aware that OCR requires installing the tesseract binary and language packs (SKILL.md suggests brew).
  3. The extract_refs download option uses curl to fetch PDFs from arxiv.org; enable downloads only when you want network activity and trust the source.
  4. The scripts write output files (papers/, temporary PNGs, OCR text) to the locations you specify; run them in directories you control.
  5. If you need higher assurance, inspect or run the included tests and review the small subprocess calls (curl, tesseract), which are expected for this functionality.
Functional analysis
Type: OpenClaw Skill
Name: pdf-utils
Version: 1.0.1
The pdf-utils skill is a legitimate utility for PDF processing, OCR, and arXiv reference extraction. It uses standard libraries such as PyMuPDF (fitz) and Tesseract (via pytesseract) for its core functionality. It uses subprocess.run to invoke 'curl' for downloading papers and 'tesseract' for language checks, but does so safely: arguments are passed as a list rather than a shell string, and the inputs (arXiv IDs) are validated with strict regex patterns. No evidence of data exfiltration, persistence, or malicious prompt injection was found.
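The two safeguards named in the analysis, list-form subprocess arguments and regex validation of arXiv IDs, can be sketched roughly as follows. This is an illustration under assumptions, not the skill's actual code: the pattern `ARXIV_ID_RE`, the function `build_download_command`, and the `papers` default directory layout are hypothetical names chosen for this example. Only the curl flags themselves are standard.

```python
import re
from pathlib import Path

# Assumed pattern for new-style arXiv IDs such as 2101.00001 or 2101.00001v2.
ARXIV_ID_RE = re.compile(r"\d{4}\.\d{4,5}(?:v\d+)?")


def build_download_command(arxiv_id: str, out_dir: str = "papers") -> list[str]:
    """Return a curl argument list for one paper; raise ValueError on a bad ID."""
    # fullmatch rejects anything with extra characters around the ID.
    if not ARXIV_ID_RE.fullmatch(arxiv_id):
        raise ValueError(f"not a valid arXiv ID: {arxiv_id!r}")
    target = Path(out_dir) / f"{arxiv_id}.pdf"
    # A list (not a shell string) means no shell ever parses the ID.
    return [
        "curl", "--fail", "--location", "--silent", "--show-error",
        "--output", str(target),
        f"https://arxiv.org/pdf/{arxiv_id}",
    ]
```

The returned list could then be passed to `subprocess.run(cmd, check=True)`. Because the ID is checked with `fullmatch` before it reaches the command, an input like `2101.00001; rm -rf /` is rejected outright instead of being handed to curl.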
Capability assessment
Purpose & Capability
The name/description (OCR, arXiv extraction, merge/split/render) matches the provided scripts and docs. The code only requires PyMuPDF, pytesseract, Pillow and the tesseract binary (all relevant to OCR and PDF processing). No unrelated binaries, env vars, or config paths are requested.
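As a rough picture of how PyMuPDF, Pillow, and pytesseract fit together for image-based PDFs, here is a minimal sketch. The function names, the `min_chars` threshold, and the "OCR only when the text layer is nearly empty" heuristic are assumptions for illustration and may differ from the skill's scripts. The listing mentions temporary PNG files, whereas this sketch keeps the rendered image in memory.

```python
import io


def page_needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic (an assumption): a page with almost no text layer is a scan."""
    return len(extracted_text.strip()) < min_chars


def pdf_to_text(pdf_path: str, lang: str = "eng", dpi: int = 300) -> list[str]:
    """Return one string per page, running OCR only where the text layer is empty."""
    # Imported here so page_needs_ocr works without the OCR stack installed.
    import fitz  # PyMuPDF
    import pytesseract
    from PIL import Image

    pages = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            text = page.get_text()
            if page_needs_ocr(text):
                pix = page.get_pixmap(dpi=dpi)  # rasterize the page
                image = Image.open(io.BytesIO(pix.tobytes("png")))
                text = pytesseract.image_to_string(image, lang=lang)
            pages.append(text)
    return pages
```

Calling `pdf_to_text("scan.pdf", lang="eng")` requires the tesseract binary and the matching language pack to be installed. The file name here is a placeholder.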
Instruction Scope
SKILL.md and the scripts are focused on local PDF processing. The scripts read PDFs, optionally OCR pages, extract arXiv identifiers, and (optionally) download PDFs from arxiv.org. They do not read arbitrary system credentials or other unrelated filesystem locations. Note: some scripts invoke subprocesses (curl for downloads and tesseract --list-langs) and will perform network downloads when the --download flag is used, which is consistent with the documented behavior.
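The identifier-extraction step could look something like the sketch below. The regular expression and the function name `extract_arxiv_ids` are assumptions made for this example. The skill's shared arXiv parser may use a different pattern, for instance one that also covers old-style IDs such as `hep-th/9901001`, which this sketch ignores.

```python
import re

# Assumed pattern: new-style IDs (YYMM.NNNN or YYMM.NNNNN) with an optional
# version suffix; the lookarounds reject digits that are part of longer numbers.
ARXIV_RE = re.compile(r"(?<!\d)(\d{4}\.\d{4,5})(?:v\d+)?(?!\d)")


def extract_arxiv_ids(text: str) -> list[str]:
    """Return unique arXiv IDs (version suffix dropped) in first-seen order."""
    seen: dict[str, None] = {}
    for match in ARXIV_RE.finditer(text):
        seen.setdefault(match.group(1), None)
    return list(seen)
```

The same function works on a PDF's text layer or on OCR output, since both are plain strings. Dropping the version suffix keeps `1706.03762v5` and `1706.03762` from being counted as two papers.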
Install Mechanism
This is an instruction-only skill (no install spec). SKILL.md recommends installing tesseract via brew and Python packages via pip. That is expected for OCR functionality but requires the user to run external installers (brew/pip) and to install tesseract language packs; ensure you run these from trusted package sources. No archive downloads or arbitrary URLs are used by an install step.
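Because nothing is installed automatically, a quick pre-flight check can confirm that the prerequisites are present before running OCR. The sketch below is one hypothetical way to do that; the function names are invented for this example, and the skill's own dependency checks may work differently. It relies only on the `tesseract --list-langs` call that the listing already mentions.

```python
import importlib.util
import shutil
import subprocess


def check_ocr_dependencies() -> dict[str, bool]:
    """Report which OCR prerequisites are importable or on PATH."""
    status = {
        # Import names: PyMuPDF installs as "fitz", Pillow as "PIL".
        name: importlib.util.find_spec(name) is not None
        for name in ("fitz", "pytesseract", "PIL")
    }
    status["tesseract"] = shutil.which("tesseract") is not None
    return status


def installed_languages() -> list[str]:
    """List Tesseract language packs, or [] if the binary is missing."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True, text=True, check=False,
    )
    # Some Tesseract versions print the list to stderr instead of stdout.
    lines = (result.stdout or result.stderr).splitlines()
    # The first line is a header, not a language code; skip it.
    return [line.strip() for line in lines[1:] if line.strip()]
```

A caller might refuse to start OCR if any value in `check_ocr_dependencies()` is `False`, or if the requested language code is absent from `installed_languages()`.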
Credentials
The skill declares no required environment variables or credentials. The code does not attempt to access secrets or unrelated environment variables. Network access is used only to fetch papers from arxiv.org when the download option is selected.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or global agent configuration. It runs as user-invocable code and will only create files/directories where the CLI is instructed to (e.g., output dir for downloads or OCR text).
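How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install pdf-utils
  3. Once installation finishes, invoke the Skill by name or trigger it with /pdf-utils
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output.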
Version history
v1.0.1
1.0.1: clean patch release for metadata and publishing hygiene; remove artifact leakage from release flow; keep stable 1.0 test baseline.
v1.0.0
Stable 1.0.0 release: leaner skill layout, references/usage.md, pdf_ops.py for merge/split/render, shared arXiv parsing, stronger OCR dependency checks, regression tests, and packaging cleanup.
Metadata
Slug: pdf-utils
Version: 1.0.1
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 2
FAQ

What is PDF Utils?

PDF Utils enables OCR of image-based PDFs, extraction of arXiv IDs from text or OCR output, and scriptable PDF tasks like merging, splitting, and rendering. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 219 times so far.

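How do I install PDF Utils?

Run the command "/install pdf-utils" in the OpenClaw or Claude Code chat box to install it in one step. The skill itself needs no extra configuration, but OCR requires the tesseract binary, its language packs, and the Python packages (PyMuPDF, pytesseract, Pillow) to be installed separately.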

Is PDF Utils free?

Yes. PDF Utils is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does PDF Utils support?

PDF Utils is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed PDF Utils?

It is developed and maintained by Lu Wang (@wangwllu). The current version is v1.0.1.
