Pdf Ocr Tool
Author: Xuan-You Lin · v1.3.0
Total downloads: 578
Favorites: 0
Current installs: 4
Versions: 3
Install in OpenClaw
/install pdf-ocr-tool
Description
Intelligent PDF and image to Markdown converter using Ollama GLM-OCR with smart content detection (text/table/figure)
Security recommendations
This skill appears to do what it says: convert PDFs/images to Markdown by calling an Ollama GLM-OCR model. Before installing, review and accept these points:
(1) The tool sends images and prompts to the configured Ollama host (default localhost). Do not point OLLAMA_HOST to an untrusted remote endpoint if your documents contain sensitive data.
(2) Install scripts pull pyproject/uv.lock from the skill's GitHub raw URL if local copies are missing. Only proceed if you trust the upstream repository.
(3) The skill requires pdftoppm (poppler) to convert PDFs; if it is missing, the skill will still run for images only.
(4) If you need stronger assurance, inspect utils/ollama_client.py to confirm network behavior and where data is posted, and run the post-install hooks manually rather than blindly executing remote install scripts.
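As a rough illustration of the first point, the sketch below checks whether the configured host resolves only to loopback addresses before any document is sent. It is not part of the skill: `is_loopback_host` is a hypothetical helper, and the code assumes OLLAMA_HOST holds a bare hostname or IP address with no scheme or port.

```python
import ipaddress
import os
import socket


def is_loopback_host(host: str) -> bool:
    """Return True only if every address `host` resolves to is a loopback address."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        # Unresolvable names are treated as "not provably local".
        return False
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(
        # Strip an IPv6 zone suffix such as "%lo0" before parsing.
        ipaddress.ip_address(addr.split("%")[0]).is_loopback
        for addr in addresses
    )


if __name__ == "__main__":
    # Assumption: OLLAMA_HOST is a bare hostname; "localhost" is the documented default.
    host = os.environ.get("OLLAMA_HOST", "localhost")
    if not is_loopback_host(host):
        print(f"Warning: {host!r} is not a loopback address; documents would leave this machine.")
```

A check like this only warns about where data goes; it says nothing about whether a remote endpoint is trustworthy.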
Analysis
Type: OpenClaw Skill
Name: pdf-ocr-tool
Version: 1.3.0
The skill is classified as suspicious because of two issues. First, the `hooks/install-deps.sh` script fetches `pyproject.toml` and `uv.lock` from a GitHub repository (`https://raw.githubusercontent.com/nala0222/pdf-ocr-tool/refs/heads/master/`) if local copies are not found. This introduces a supply chain risk: a compromise of that repository could lead to the installation of malicious dependencies. Second, the `utils/pdf_utils.py` module uses `subprocess.run` to execute external binaries (`pdftoppm`, `pdfinfo`) with `pdf_path` derived directly from user input (`args.input` in `ocr_tool.py`). If the command is assembled as a string and run through a shell, a PDF path containing shell metacharacters could allow arbitrary command execution; if it is passed as an argument list without a shell, metacharacters are not interpreted, though a path beginning with `-` could still be read as an option.
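One common way to reduce the second risk is to pass the command as an argument list and never through a shell. The sketch below is an assumption about how such a call could look, not the code in `utils/pdf_utils.py`; `build_pdftoppm_command` and `convert_pdf_to_png` are hypothetical names, and the 150 dpi default is arbitrary.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path: str, output_prefix: str, dpi: int = 150) -> list[str]:
    """Build an argument list for pdftoppm without involving a shell."""
    # Resolving to absolute paths means neither argument can start with "-",
    # so a hostile file name cannot be mistaken for a command-line option.
    pdf = Path(pdf_path).resolve()
    prefix = Path(output_prefix).resolve()
    return ["pdftoppm", "-png", "-r", str(int(dpi)), str(pdf), str(prefix)]


def convert_pdf_to_png(pdf_path: str, output_prefix: str, dpi: int = 150) -> None:
    """Render each PDF page to a PNG file named after output_prefix."""
    cmd = build_pdftoppm_command(pdf_path, output_prefix, dpi)
    # No shell=True: each list element reaches pdftoppm as one literal argument,
    # so characters such as ";" or "$(...)" in the path are never interpreted.
    subprocess.run(cmd, check=True)
```

With this shape, a path such as `-evil; rm -rf ~.pdf` is handed to pdftoppm as a single absolute file name rather than being parsed as an option or a second command.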
Capability assessment
Purpose & Capability
Name/description (PDF/image → Markdown using Ollama GLM-OCR) aligns with required binaries (ollama, pdftoppm) and the included code (OCR, page splitting, prompts). uv is used for dependency management and appears justified by the install instructions.
Instruction Scope
SKILL.md and the code limit actions to converting PDFs/images, splitting regions, invoking an Ollama service, and writing Markdown/images. However, the tool transmits image data and prompts to an Ollama host you configure (defaults to localhost). If you set the host to a remote service, document contents (possibly sensitive) will be sent over the network — this is expected for an OCR integration but worth noting.
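To make concrete what "transmits image data and prompts" means, here is a minimal sketch of a request body for Ollama's `/api/generate` endpoint, which accepts base64-encoded images alongside a prompt. Whether the skill uses this endpoint or another one is an assumption; `build_generate_payload` is a hypothetical helper, and the model name and prompt in the usage lines are placeholders.

```python
import base64
import json


def build_generate_payload(image_bytes: bytes, prompt: str, model: str) -> dict:
    """Build a JSON-serializable body for a POST to Ollama's /api/generate."""
    return {
        "model": model,
        "prompt": prompt,
        # The full page image travels in the request body, base64-encoded.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one complete response instead of a token stream.
        "stream": False,
    }


if __name__ == "__main__":
    # Placeholder bytes, prompt, and model name, for illustration only.
    body = build_generate_payload(b"\x89PNG...", "Convert this page to Markdown.", "glm-ocr")
    print(len(json.dumps(body)), "bytes would be posted to the configured host")
```

Everything in that body, including the complete image, reaches whichever host is configured, which is why the host setting matters for sensitive documents.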
Install Mechanism
Install uses uv (local Python package manager) and shell hooks that copy pyproject/uv.lock from the local tree or raw.githubusercontent.com. The scripts do not fetch arbitrary binaries from untrusted personal servers; they reference GitHub raw and instruct the user to run official install scripts for Ollama/uv. This is typical and proportionate to the task.
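For readers who want more assurance than trusting the raw GitHub URL, one option is to compare any fetched `pyproject.toml` or `uv.lock` against a checksum recorded from a copy that was already reviewed. The sketch below shows that comparison only; it is not something the install hooks are known to do, and `verify_sha256` and `load_verified` are hypothetical helpers.

```python
import hashlib
import hmac
from pathlib import Path


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of `data` matches `expected_hex`."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def load_verified(path: str, expected_hex: str) -> bytes:
    """Read a file and refuse to return its contents if the checksum differs."""
    data = Path(path).read_bytes()
    if not verify_sha256(data, expected_hex):
        raise ValueError(f"checksum mismatch for {path}")
    return data
```

The expected digest has to come from a trusted source, for example a value noted when the files were last inspected; fetching it from the same repository as the files would add no protection.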
Credentials
The skill declares no required credentials or secret env vars. It supports OLLAMA_HOST/OLLAMA_PORT/OCR_MODEL configuration (optional), which is appropriate for selecting the target Ollama service and model. There are no unrelated credentials or config paths requested.
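The three optional variables can be read with plain defaults, as in the sketch below. The variable names come from the assessment above and `localhost` is the documented default host; port 11434 is Ollama's standard port, while the default model name `glm-ocr` and the `OllamaConfig` and `load_config` names are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OllamaConfig:
    host: str
    port: int
    model: str

    @property
    def base_url(self) -> str:
        return f"http://{self.host}:{self.port}"


def load_config(env: Optional[Mapping[str, str]] = None) -> OllamaConfig:
    """Read OLLAMA_HOST, OLLAMA_PORT, and OCR_MODEL, falling back to defaults."""
    env = os.environ if env is None else env
    return OllamaConfig(
        host=env.get("OLLAMA_HOST", "localhost"),
        port=int(env.get("OLLAMA_PORT", "11434")),  # Ollama's standard port
        model=env.get("OCR_MODEL", "glm-ocr"),  # assumed default model name
    )
```

Calling `load_config({})` yields a configuration pointing at `http://localhost:11434`, so nothing leaves the machine unless one of the variables is set explicitly.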
Persistence & Privilege
Skill does not request always: true and does not modify other skills or global agent settings. Install hooks operate within the skill directory and virtualenv; no elevated persistent privileges were requested.
How to use
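- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install pdf-ocr-tool
- Once installation finishes, call the Skill by name or trigger it with /pdf-ocr-tool.
- Provide the required inputs as described in the Skill's parameter documentation to get structured output.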
Version history
v1.3.0
Full English documentation, README added, all descriptions in English
v1.2.0
English prompts, install-deps.sh, fixed .gitignore for uv.lock
v1.1.0
**Big update: Adds mixed mode, region-based processing, and pyproject.toml support.**
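- Added mixed mode (mixed) and region-based processing (granularity region), which automatically distinguish and process different content regions
- Supports multiple processing modes: text, table, figure, mixed, auto
- Improved configuration of custom prompts
- Added pyproject.toml; Python dependencies are managed with uv
- More complete installation and usage instructions, with better integration with Ollama GLM-OCR, poppler, and uv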
FAQ
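What is Pdf Ocr Tool?
Intelligent PDF and image to Markdown converter using Ollama GLM-OCR with smart content detection (text/table/figure). It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 578 total downloads to date.
How do I install Pdf Ocr Tool?
Run the command "/install pdf-ocr-tool" in the OpenClaw or Claude Code chat box to install it in one step. The skill itself needs no further configuration, though it relies on Ollama and, for PDF input, pdftoppm (poppler) being available.
Is Pdf Ocr Tool free?
Yes. Pdf Ocr Tool is free and open source, and can be downloaded, installed, and used freely.
Which platforms does Pdf Ocr Tool support?
Pdf Ocr Tool is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Pdf Ocr Tool?
It is developed and maintained by Xuan-You Lin (@tsukisama9292); the current version is v1.3.0.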