Pdf Ocr Tool

Name: Pdf Ocr Tool
Author: tsukisama9292

By Xuan-You Lin · GitHub · v1.3.0

cross-platform ⚠ suspicious

Total downloads: 578

Current installs: 4

Versions: 3

Install in OpenClaw

/install pdf-ocr-tool

Description

Intelligent PDF and image to Markdown converter using Ollama GLM-OCR with smart content detection (text/table/figure)

Security recommendations

This skill appears to do what it says: convert PDFs/images to Markdown by calling an Ollama GLM-OCR model. Before installing, review and accept these points: (1) The tool sends images and prompts to the configured Ollama host (default localhost). Do not point OLLAMA_HOST to an untrusted remote endpoint if your documents contain sensitive data. (2) Install scripts pull pyproject/uv.lock from the skill's GitHub raw URL if local copies are missing — only proceed if you trust the upstream repository. (3) The skill requires pdftoppm (poppler) to convert PDFs; if missing it will still run for images only. (4) If you need stronger assurance, inspect utils/ollama_client.py to confirm network behavior and where data is posted, and run the post-install hooks manually rather than blindly executing remote install scripts.
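Point (1) can be checked mechanically before any document is sent. The following is a minimal sketch, not code from the skill: the function name `is_local_ollama_host` is hypothetical, and it assumes the host setting is either a bare host, a `host:port` pair, or a full URL.

```python
import ipaddress
from urllib.parse import urlsplit


def is_local_ollama_host(host: str) -> bool:
    """Return True if `host` points at the local machine (loopback only).

    Hypothetical helper for illustration; not part of pdf-ocr-tool.
    """
    # Accept bare hosts ("localhost", "127.0.0.1:11434") as well as full URLs.
    parsed = urlsplit(host if "://" in host else f"//{host}")
    hostname = parsed.hostname
    if hostname is None:
        return False
    if hostname == "localhost":
        return True
    try:
        # IP literals: 127.0.0.0/8 and ::1 count as loopback.
        return ipaddress.ip_address(hostname).is_loopback
    except ValueError:
        # Any other DNS name is treated as non-local.
        return False
```

A caller could refuse to process sensitive documents, or ask for confirmation, whenever this check returns False for the configured `OLLAMA_HOST`.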

Analysis

Type: OpenClaw Skill
Name: pdf-ocr-tool
Version: 1.3.0

The skill is classified as suspicious because of two potential vulnerabilities. First, the `hooks/install-deps.sh` script attempts to fetch `pyproject.toml` and `uv.lock` from a GitHub repository (`https://raw.githubusercontent.com/nala0222/pdf-ocr-tool/refs/heads/master/`) if local copies are not found. This introduces a supply chain risk, because a compromise of that repository could lead to the installation of malicious dependencies. Second, the `utils/pdf_utils.py` module uses `subprocess.run` to execute external binaries (`pdftoppm`, `pdfinfo`) with `pdf_path` taken directly from user input (`args.input` in `ocr_tool.py`). This creates a potential shell injection vulnerability: if the command is run through a shell and the input PDF path contains shell metacharacters, arbitrary command execution becomes possible.
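For reference, the `subprocess.run` concern is usually addressed by passing an argument list with no shell and by normalizing the user-supplied path. The sketch below shows that general pattern; it is not the code in `utils/pdf_utils.py`, and the function names `build_pdftoppm_cmd` and `render_pages` and the default resolution of 150 DPI are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_cmd(pdf_path: str, output_prefix: str, dpi: int = 150) -> list[str]:
    """Build an argument list for pdftoppm that never passes through a shell.

    Hypothetical helper for illustration; not part of pdf-ocr-tool.
    """
    # Resolving to an absolute path means the argument cannot begin with "-",
    # so the binary cannot mistake a hostile file name for an option.
    pdf = Path(pdf_path).resolve()
    prefix = Path(output_prefix).resolve()
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(prefix)]


def render_pages(pdf_path: str, output_prefix: str) -> None:
    """Render each PDF page to PNG (requires pdftoppm from poppler)."""
    cmd = build_pdftoppm_cmd(pdf_path, output_prefix)
    # List form with the default shell=False: every element reaches the
    # program as a single argv entry, so ";", "$()", and spaces stay literal.
    subprocess.run(cmd, check=True)
```

With this form, an input such as `report; rm -rf ~.pdf` reaches `pdftoppm` as one literal file name instead of being interpreted by a shell.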

Capability assessment

✓ Purpose & Capability

Name/description (PDF/image → Markdown using Ollama GLM-OCR) aligns with required binaries (ollama, pdftoppm) and the included code (OCR, page splitting, prompts). uv is used for dependency management and appears justified by the install instructions.

ℹ Instruction Scope

SKILL.md and the code limit actions to converting PDFs/images, splitting regions, invoking an Ollama service, and writing Markdown/images. However, the tool transmits image data and prompts to an Ollama host you configure (defaults to localhost). If you set the host to a remote service, document contents (possibly sensitive) will be sent over the network — this is expected for an OCR integration but worth noting.
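To make concrete what leaves the machine, here is a sketch of a request against Ollama's documented `/api/generate` endpoint, which accepts base64-encoded images alongside the prompt. Whether the skill uses this endpoint or a different one is not confirmed by the listing; the function names and the model tag `glm-ocr` are assumptions.

```python
import base64
import json
import urllib.request


def build_generate_payload(image_bytes: bytes, prompt: str, model: str = "glm-ocr") -> dict:
    """Build the JSON body for POST /api/generate (model tag is an assumption)."""
    return {
        "model": model,
        "prompt": prompt,
        # The full page image travels in the request body, base64-encoded.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def run_ocr(payload: dict, base_url: str = "http://localhost:11434") -> str:
    """Send the payload to an Ollama server and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Because the image bytes are embedded in the request body, any host named in `base_url` receives the complete page content.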

✓ Install Mechanism

Install uses uv (local Python package manager) and shell hooks that copy pyproject/uv.lock from the local tree or raw.githubusercontent.com. The scripts do not fetch arbitrary binaries from untrusted personal servers; they reference GitHub raw and instruct the user to run official install scripts for Ollama/uv. This is typical and proportionate to the task.

✓ Credentials

The skill declares no required credentials or secret env vars. It supports OLLAMA_HOST/OLLAMA_PORT/OCR_MODEL configuration (optional), which is appropriate for selecting the target Ollama service and model. There are no unrelated credentials or config paths requested.
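The three optional variables could be read along these lines. This is a sketch under stated assumptions rather than the skill's own loader: the `OcrConfig` class and `load_config` function are hypothetical, `localhost` is the default host named in this listing, 11434 is Ollama's standard port, and the default model tag `glm-ocr` is a guess.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OcrConfig:
    """Hypothetical settings container; not part of pdf-ocr-tool."""
    host: str
    port: int
    model: str

    @property
    def base_url(self) -> str:
        return f"http://{self.host}:{self.port}"


def load_config(env: Optional[Mapping[str, str]] = None) -> OcrConfig:
    """Read OLLAMA_HOST, OLLAMA_PORT and OCR_MODEL, falling back to defaults."""
    env = os.environ if env is None else env
    return OcrConfig(
        host=env.get("OLLAMA_HOST", "localhost"),
        port=int(env.get("OLLAMA_PORT", "11434")),  # Ollama's standard port
        model=env.get("OCR_MODEL", "glm-ocr"),      # assumed default model tag
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests.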

✓ Persistence & Privilege

Skill does not request always: true and does not modify other skills or global agent settings. Install hooks operate within the skill directory and virtualenv; no elevated persistent privileges were requested.

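How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install pdf-ocr-tool
Once installation finishes, invoke the Skill by name or trigger it with /pdf-ocr-tool
Provide the inputs described in the Skill's parameter documentation to get structured output.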

Version history

v1.3.0

Full English documentation, README added, all descriptions in English

v1.2.0

English prompts, install-deps.sh, fixed .gitignore for uv.lock

v1.1.0

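**Big update: Adds mixed mode, region-based processing, and pyproject.toml support.**
- Added mixed mode (mixed) and region-based processing (granularity region), which automatically distinguishes and handles different content regions
- Supports multiple processing modes: text, table, figure, mixed, auto
- Improved configuration of custom prompts
- Added pyproject.toml; Python dependencies are managed with uv
- More complete installation and usage instructions, with tighter integration with Ollama GLM-OCR, poppler, and uv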

Metadata

Slug pdf-ocr-tool

Version 1.3.0

License —

Total installs 4

Current installs 4

Versions 3

FAQ

What is Pdf Ocr Tool?

Intelligent PDF and image to Markdown converter using Ollama GLM-OCR with smart content detection (text/table/figure). It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 578 total downloads to date.

How do I install Pdf Ocr Tool?

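Run the command /install pdf-ocr-tool in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no further configuration, but the skill relies on an Ollama service and, for PDF input, on pdftoppm (poppler) being available.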

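Is Pdf Ocr Tool free?

Yes. Pdf Ocr Tool is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does Pdf Ocr Tool support?

Pdf Ocr Tool is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Pdf Ocr Tool?

It is developed and maintained by Xuan-You Lin (@tsukisama9292). The current version is v1.3.0.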