
pdf-ocr

Name: pdf-ocr
Author: yejinlei

By yejinlei · GitHub · v2.2.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 491

Install in OpenClaw

/install pdf-ocr-skill

Description

A dual-engine PDF OCR skill that extracts text from scanned PDF files and image files.

Safe usage recommendations

This skill appears to implement the advertised OCR functionality, but review the following before installing:

- The registry metadata omits required environment variables, but SKILL.md and .env expect SILICON_FLOW_API_KEY for the cloud engine. Treat the cloud engine as requiring a secret key.
- The code auto-installs Python packages with pip at runtime (subprocess pip install). This can change your environment and pull code from PyPI. Prefer installing dependencies yourself in a virtualenv, or review the required packages and versions first.
- If you enable the cloud engine, the skill uploads full images (base64) to https://api.siliconflow.cn. Do not use the cloud engine for sensitive documents unless you trust the service and the API key handling. Consider running only RapidOCR (local) for private data.
- Verify the vendor and source (the homepage is missing and the source is 'unknown'). If you need to trust this skill long-term, obtain it from a known repository or author, inspect the full code (including the truncated parts), and test it in a sandbox environment.

To proceed safely: run the skill in an isolated environment (virtualenv or container), manually install and pin dependencies from requirements.txt, avoid configuring the cloud API key unless necessary, and audit network calls and logging to ensure no unexpected endpoints receive your data.

Functional analysis

Type: OpenClaw Skill
Name: pdf-ocr-skill
Version: 2.2.0

The skill bundle contains a high-risk behavior in `scripts/pdf_ocr_processor.py`: it defines an `install_dependency` function that uses `subprocess.check_call` to run `pip install` automatically for missing libraries at runtime. The currently hardcoded packages (rapidocr_onnxruntime, pymupdf, pillow) are legitimate, but auto-installing dependencies is a common vector for supply chain risks and unauthorized code execution. The rest of the bundle, including the prompt instructions in `SKILL.md` and the integration with the SiliconFlow API (api.siliconflow.cn), appears consistent with its stated purpose as an OCR utility.
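As a contrast to the auto-install pattern described above, here is a minimal sketch of a fail-fast dependency check. It is not the skill's code: `missing_packages` and `require_dependencies` are hypothetical helpers, and the import names (`fitz` for pymupdf, `PIL` for pillow) are the usual ones for those packages, not something confirmed from the bundle.

```python
import importlib.util

# Import name -> pip name for the three packages the analysis lists.
# The import names are an assumption based on common usage of these packages.
REQUIRED_MODULES = {
    "rapidocr_onnxruntime": "rapidocr_onnxruntime",
    "fitz": "pymupdf",
    "PIL": "pillow",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


def require_dependencies(required=REQUIRED_MODULES):
    """Raise instead of installing anything; the user decides what to install."""
    missing = missing_packages(required)
    if missing:
        raise RuntimeError(
            "Missing packages: " + ", ".join(missing)
            + ". Install them yourself, for example in a virtualenv."
        )
```

Calling `require_dependencies()` at startup reports what is missing and leaves the environment untouched, so nothing is fetched from PyPI without the user's knowledge.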

Capability assessment

⚠ Purpose & Capability

The name/description, SKILL.md, and the included Python code are coherent: they implement a PDF/image OCR processor with a local engine (RapidOCR) and an optional cloud engine (SiliconFlow). However, the registry metadata declares no required environment variables or credentials, while SKILL.md and the code clearly expect an optional SILICON_FLOW_API_KEY for the cloud engine. This metadata omission is an inconsistency that reduces transparency.

ℹ Instruction Scope

SKILL.md and the examples stick to OCR tasks (convert PDF→images, run OCR, save text). They instruct the user to provide an API key when using the cloud engine, and they do not instruct reading unrelated system files. One point to note: when the cloud engine is used, the skill sends full image data (base64) to the external SiliconFlow API. This is expected for cloud OCR, but it is sensitive (images may contain private data), and the docs do not clearly call out the privacy and exfiltration implications.
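To make the privacy point concrete, the sketch below shows that base64 is a reversible encoding, so the receiving service can recover the full image. The function name and the `allow_cloud` opt-in flag are hypothetical and do not come from the skill; the actual request format used by the skill is not shown here.

```python
import base64


def encode_image_for_upload(image_bytes, allow_cloud=False):
    """Return the base64 text a cloud OCR request would carry.

    allow_cloud is a hypothetical explicit opt-in, not part of the skill.
    """
    if not allow_cloud:
        raise PermissionError("Cloud OCR is disabled; the image stays local.")
    return base64.b64encode(image_bytes).decode("ascii")


# Anyone who receives the payload can decode it back to the original bytes.
sample = b"example image bytes"
payload = encode_image_for_upload(sample, allow_cloud=True)
recovered = base64.b64decode(payload)
```

Requiring an explicit opt-in before encoding is one way to keep private documents on the local engine by default.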

⚠ Install Mechanism

There is no install spec in the registry (instruction-only), but the runtime code will attempt to auto-install missing Python packages by invoking pip via subprocess at runtime. Auto-installing packages during execution can modify the runtime environment and pull arbitrary code from PyPI — this increases risk compared with a purely instruction-only skill that requires manual dependency installation.
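For readers who install dependencies manually, the following sketch checks installed versions against pinned ones. It assumes plain `name==version` lines only (no extras, markers, or version ranges), and the helper names `parse_pins` and `check_pins` are hypothetical. The contents of the skill's requirements.txt are not reproduced here.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Map package name to pinned version for lines of the form name==version."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def check_pins(pins, installed_version=metadata.version):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    problems = {}
    for name, pinned in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != pinned:
            problems[name] = (pinned, found)
    return problems
```

An empty result from `check_pins(parse_pins(text))` means every pinned package is installed at exactly the pinned version; any other result lists what to fix by hand.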

⚠ Credentials

In practice the skill needs only one service credential (SILICON_FLOW_API_KEY), for the optional cloud engine, which is proportionate. However, the registry declares no required env vars, while SKILL.md and .env.example explicitly document SILICON_FLOW_API_KEY and OCR_ENGINE. The lack of declared credentials in the metadata reduces transparency. Also, sending base64 image data to api.siliconflow.cn is a sensitive operation that you should enable only if you trust that service and its key usage.
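A small audit helper can make the two documented variables explicit before the skill runs. This is a sketch under assumptions: `load_ocr_settings` is a hypothetical name, and because the accepted values of OCR_ENGINE are not stated in this listing, the value is returned as-is without interpretation.

```python
import os


def load_ocr_settings(env=None):
    """Read the two documented variables without exposing the key itself."""
    env = os.environ if env is None else env
    api_key = env.get("SILICON_FLOW_API_KEY", "").strip()
    engine = env.get("OCR_ENGINE", "").strip()
    return {
        "engine": engine or None,            # raw value, not interpreted here
        "cloud_key_present": bool(api_key),  # report presence, never the secret
    }
```

Logging the returned dictionary shows whether a cloud key is configured without ever printing the secret.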

✓ Persistence & Privilege

Skill flags are the defaults: not always-on, user-invocable, and autonomous invocation allowed (platform default). In the provided files, the package does not request elevated system privileges or attempt to modify other skills or global agent settings.

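How to use

1. Make sure OpenClaw is installed (locally or via Docker).
2. Enter the install command in the dialog: /install pdf-ocr-skill
3. After installation, invoke the Skill by name or trigger it with /pdf-ocr-skill.
4. Provide the required inputs according to the Skill's parameter description to get structured output.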

Version history

v2.2.0

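- Added dual OCR engine support: RapidOCR (local) and the SiliconFlow API (cloud).
- Improved automatic engine switching: if RapidOCR fails to initialize, the skill switches to the SiliconFlow API automatically.
- Added text recognition for multiple image formats: JPG, PNG, BMP, GIF, TIFF, and WEBP.
- Improved the usage documentation, with new command-line and batch-processing examples.
- Added user interaction prompts so the OCR engine can be selected through the assistant.
- Updated the troubleshooting guide to help locate common issues.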
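The automatic engine switching in the changelog can be pictured with a generic fallback pattern. This is a sketch, not the skill's implementation: `run_with_fallback` and the two engine callables are hypothetical, and here any failure of the local engine triggers the fallback, which may be broader than the initialization failure the changelog mentions. The cloud engine is used only when one is passed in explicitly.

```python
def run_with_fallback(image_path, local_engine, cloud_engine=None):
    """Try the local engine first; use the cloud engine only if one was supplied."""
    try:
        return "local", local_engine(image_path)
    except Exception as local_error:
        if cloud_engine is None:
            # No silent upload: without an explicit cloud engine, fail loudly.
            raise RuntimeError(
                "Local OCR failed and no cloud engine is configured"
            ) from local_error
        return "cloud", cloud_engine(image_path)
```

Returning the engine label alongside the text lets the caller see when a document was sent to the cloud, which matters for the privacy concerns raised in the assessment.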

Metadata

Slug pdf-ocr-skill

Version 2.2.0

License MIT-0

Total installs 1

Current installs 1

Versions 1

FAQ

What is pdf-ocr?

A dual-engine PDF OCR skill that extracts text from scanned PDF files and image files. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 491 times to date.

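How do I install pdf-ocr?

Run the command "/install pdf-ocr-skill" in the OpenClaw or Claude Code dialog for a one-step install. No additional configuration is required to install, but note that the optional cloud engine expects SILICON_FLOW_API_KEY to be set.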

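Is pdf-ocr free?

Yes. pdf-ocr is free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does pdf-ocr support?

pdf-ocr is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops pdf-ocr?

It is developed and maintained by yejinlei (@yejinlei). The current version is v2.2.0.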