Total downloads: 491
Favorites: 1
Current installs: 1
Versions: 1
Install in OpenClaw
/install pdf-ocr-skill
Description
A dual-engine PDF OCR skill that extracts text from scanned PDF files and image files.
Security recommendations
This skill appears to implement the advertised OCR functionality, but review these before installing:
- The registry metadata omits required env vars but SKILL.md/.env expect SILICON_FLOW_API_KEY for the cloud engine — treat the cloud engine as requiring a secret key.
- The code will auto-install Python packages with pip at runtime (subprocess pip install). That can change your environment and pull code from PyPI; prefer installing dependencies yourself in a virtualenv or review required packages and versions first.
- If you enable the cloud engine, the skill uploads full images (base64) to https://api.siliconflow.cn — do not use the cloud engine for sensitive documents unless you trust the service and the API key handling. Consider running RapidOCR (local) only for private data.
- Verify the vendor/source (homepage is missing and source is 'unknown'). If you need to trust this skill long-term, obtain it from a known repository or author, inspect the full code (including the truncated parts) and test in a sandbox environment.
If you want to proceed safely: run the skill in an isolated environment (virtualenv/container), manually install and pin dependencies from requirements.txt, avoid configuring the cloud API key unless necessary, and audit network calls/logging to ensure no unexpected endpoints receive your data.
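One way to act on the advice about installing dependencies yourself is a pre-flight check that only reports missing packages and never calls pip. The following is a minimal sketch, not code from the skill; the import names (`fitz` for pymupdf, `PIL` for pillow) are assumptions based on the packages named in this review, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Mapping of import name -> pip package name.
# Assumption: these are the import names for the packages named in the review.
REQUIRED = {
    "rapidocr_onnxruntime": "rapidocr_onnxruntime",
    "fitz": "pymupdf",
    "PIL": "pillow",
}

def missing_packages(required=REQUIRED):
    """Return pip package names whose import module cannot be located.

    Nothing is installed or imported; the check only looks up module specs.
    """
    return [
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]

# Report what is missing so it can be installed manually in a virtualenv.
print("Missing packages:", missing_packages())
```

If the list is non-empty, the packages can be installed by hand with pinned versions before the skill is run, so the environment changes only when you decide it should.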
Functional analysis
Type: OpenClaw Skill
Name: pdf-ocr-skill
Version: 2.2.0
The skill bundle contains a high-risk behavior in `scripts/pdf_ocr_processor.py`, where it defines an `install_dependency` function that uses `subprocess.check_call` to automatically execute `pip install` for missing libraries at runtime. While the currently hardcoded packages (rapidocr_onnxruntime, pymupdf, pillow) are legitimate, this pattern of auto-installing dependencies is a common vector for supply chain risks and unauthorized code execution. The rest of the bundle, including the prompt instructions in `SKILL.md` and the integration with the SiliconFlow API (api.siliconflow.cn), appears consistent with its stated purpose as an OCR utility.
Capability assessment
Purpose & Capability
Name/description, SKILL.md and the included Python code are coherent: they implement a PDF/image OCR processor with a local engine (RapidOCR) and an optional cloud engine (SiliconFlow). However the registry metadata declares no required environment variables or credentials while the SKILL.md and code clearly expect an optional SILICON_FLOW_API_KEY for the cloud engine — this metadata omission is an inconsistency that reduces transparency.
Instruction Scope
SKILL.md and examples stick to OCR tasks (convert PDF→images, run OCR, save text). They instruct providing an API key when using the cloud engine. They do not instruct reading unrelated system files. One area to note: the skill will send full image data (base64) to the external siliconflow API when that engine is used — this is expected for cloud OCR but is sensitive (images may contain private data) and the docs do not strongly call out privacy/exfiltration implications.
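To make the privacy point concrete: base64 is a text encoding, not encryption, so whoever receives the payload can recover the original image exactly. The sketch below uses placeholder bytes rather than a real image and does not reproduce the skill's actual request format; `encode_image_bytes` is a hypothetical helper.

```python
import base64

def encode_image_bytes(image_bytes: bytes) -> str:
    """Base64-encode raw bytes as ASCII text, suitable for embedding in JSON."""
    return base64.b64encode(image_bytes).decode("ascii")

# 768 placeholder bytes standing in for image data.
raw = bytes(range(256)) * 3
encoded = encode_image_bytes(raw)

# Decoding needs no key: the receiver gets the exact original bytes back.
assert base64.b64decode(encoded) == raw

# The encoded form is about 4/3 the size of the input.
print(len(raw), len(encoded))  # 768 1024
```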
Install Mechanism
There is no install spec in the registry (instruction-only), but the runtime code will attempt to auto-install missing Python packages by invoking pip via subprocess at runtime. Auto-installing packages during execution can modify the runtime environment and pull arbitrary code from PyPI — this increases risk compared with a purely instruction-only skill that requires manual dependency installation.
Credentials
The skill only needs one service credential in practice (SILICON_FLOW_API_KEY) for the optional cloud engine, which is proportionate. However the registry declared no required env vars while the SKILL.md and .env.example explicitly document SILICON_FLOW_API_KEY and OCR_ENGINE. The lack of declared credentials in metadata reduces transparency. Also sending base64 image data to api.siliconflow.cn is a sensitive operation that you should only enable if you trust that service and key usage.
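A defensive way to handle these two settings is to default to the local engine and refuse the cloud engine unless a key is actually present. This is a sketch under stated assumptions: the variable names `OCR_ENGINE` and `SILICON_FLOW_API_KEY` come from the review above, but the accepted values (`"rapidocr"`, `"siliconflow"`) and the `choose_engine` helper are hypothetical and may not match the skill's own code.

```python
import os

LOCAL_ENGINE = "rapidocr"      # hypothetical value for OCR_ENGINE
CLOUD_ENGINE = "siliconflow"   # hypothetical value for OCR_ENGINE

def choose_engine(env=None):
    """Pick an OCR engine from environment-style settings.

    Defaults to the local engine; any value other than the cloud engine
    name is also treated as local. The cloud engine requires an API key.
    """
    env = os.environ if env is None else env
    engine = env.get("OCR_ENGINE", LOCAL_ENGINE).strip().lower()
    if engine == CLOUD_ENGINE:
        if not env.get("SILICON_FLOW_API_KEY"):
            raise RuntimeError(
                "Cloud engine requested but SILICON_FLOW_API_KEY is not set"
            )
        return CLOUD_ENGINE
    return LOCAL_ENGINE

# With no settings at all, the local engine is selected.
print(choose_engine({}))  # rapidocr
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.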
Persistence & Privilege
Skill flags are the defaults: not always-on, user-invocable, and autonomous invocation allowed (platform default). The package does not request elevated system privileges or attempt to modify other skills or global agent settings in the provided files.
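How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install pdf-ocr-skill
- After installation, invoke the Skill by name or trigger it with /pdf-ocr-skill
- Provide the required inputs according to the Skill's parameter description to get structured output.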
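Version history
v2.2.0
- Added dual OCR engine support: RapidOCR (local) and the SiliconFlow API (cloud).
- Improved automatic engine switching: falls back to the SiliconFlow API when RapidOCR fails to initialize.
- Added text recognition for multiple image formats: JPG, PNG, BMP, GIF, TIFF, and WEBP.
- Improved the usage documentation with command-line and batch-processing examples.
- Added user-interaction prompts so the OCR engine can be specified through the assistant.
- Updated the troubleshooting guide to help locate common problems.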
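The automatic engine switching in this release means a local initialization failure can route documents to the cloud service. The sketch below illustrates the general fallback pattern with an explicit opt-in flag; it is not the skill's implementation, and `init_engine`, `broken_local`, and `fake_cloud` are hypothetical stand-ins.

```python
def init_engine(init_local, init_cloud, allow_cloud_fallback=False):
    """Try the local engine first; use the cloud engine only if allowed.

    Returns a (name, engine) pair. Raises RuntimeError when the local
    engine fails and cloud fallback has not been explicitly enabled.
    """
    try:
        return "local", init_local()
    except Exception as exc:
        if not allow_cloud_fallback:
            raise RuntimeError(
                "Local OCR engine failed and cloud fallback is disabled"
            ) from exc
        return "cloud", init_cloud()

def broken_local():
    # Simulated failure, e.g. the local OCR package is not installed.
    raise ImportError("local OCR engine is not available")

def fake_cloud():
    # Stand-in for constructing a cloud API client.
    return object()

name, _ = init_engine(broken_local, fake_cloud, allow_cloud_fallback=True)
print(name)  # cloud
```

Making the fallback opt-in keeps private documents on the local engine unless the user has knowingly accepted the upload.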
FAQ
What is pdf-ocr?
A dual-engine PDF OCR skill that extracts text from scanned PDF files and image files. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 491 times to date.
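How do I install pdf-ocr?
Run the command "/install pdf-ocr-skill" in the OpenClaw or Claude Code chat box to install it in one step. No further configuration is needed for the local RapidOCR engine; the optional cloud engine expects SILICON_FLOW_API_KEY.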
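Is pdf-ocr free?
Yes. pdf-ocr is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.
Which platforms does pdf-ocr support?
pdf-ocr is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops pdf-ocr?
It is developed and maintained by yejinlei (@yejinlei); the current version is v2.2.0.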