完美排版ocr

Name: 完美排版ocr
Author: biabia-55

By gamhtoi · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install pdf-ocr-layout-free

Description

Full OCR pipeline for scanned PDFs with layout preservation. Use this skill whenever the user wants to OCR a PDF, convert a scanned document to searchable te...

Security Recommendations

This skill uploads entire PDF chunks to a third-party OCR service (paddleocr.aistudio-app.com) and fetches remote images referenced by the OCR results. The code expects an environment variable, PADDLEOCR_TOKEN, that the skill metadata does not declare; you must set it yourself or the script will attempt to use a placeholder token. Before installing or running:

(1) Do not process sensitive documents unless you trust the remote service and token.
(2) Verify the OCR endpoint and its privacy/security policy.
(3) Prefer a self-hosted/local OCR alternative if you need confidentiality.
(4) Run the script in an isolated environment (sandbox or VM) if you must test it.
(5) Consider asking the publisher to update the registry metadata to declare PADDLEOCR_TOKEN and to explicitly disclose that PDFs are uploaded externally.

Functional Analysis

Type: OpenClaw Skill
Name: pdf-ocr-layout-free
Version: 1.0.0

The skill provides a legitimate PDF OCR pipeline using the PaddleOCR API (aistudio-app.com). It includes logic for splitting PDFs, submitting chunks to the API, and reconstructing the document with layout preservation using the reportlab library. All actions, including external API communication and image fetching from the API's response, are consistent with the stated purpose. No malicious patterns, obfuscation, or prompt injection attempts were identified in the code or instructions.

Capability Assessment

⚠ Purpose & Capability

The code and SKILL.md implement a remote-OCR pipeline (split → submit → poll → render), which fits the stated purpose. However, the script requires an API token (PADDLEOCR_TOKEN) and uses a remote endpoint (paddleocr.aistudio-app.com), while the skill's declared requirements list no environment variables or credentials. The missing declaration is an inconsistency between the metadata and the code.
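The split and poll stages of such a pipeline can be sketched as follows. This is a minimal illustration, not the skill's actual code: the function names, the chunk size, and the status strings "done" and "failed" are all assumptions, and the status fetcher is injected so that no real API call is implied.

```python
import time
from typing import Callable, List, Tuple


def chunk_page_ranges(total_pages: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return inclusive, 0-based (first, last) page ranges covering the document."""
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_pages) - 1)
        for start in range(0, total_pages, chunk_size)
    ]


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it reports a terminal state or attempts run out.

    "done" and "failed" are hypothetical status names used for illustration.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status in ("done", "failed"):
            return status
        sleep(interval_s)
    raise TimeoutError(f"job still not finished after {max_attempts} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real delays.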

⚠ Instruction Scope

Runtime instructions tell the agent to pip-install dependencies and run the included pipeline script. The SKILL.md does not tell the user to set the API token, does not warn that full PDF contents will be uploaded to a remote service, and does not surface the exact remote endpoint — the agent will therefore transmit potentially sensitive documents without an explicit consent/notice step.
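One way to add the missing consent step is a small gate that runs before any upload. The sketch below is hypothetical: `confirm_upload` and its prompt wording are not part of the skill, and the endpoint string is passed in by the caller rather than taken from the script.

```python
from typing import Callable


def confirm_upload(
    pdf_name: str,
    endpoint: str,
    ask: Callable[[str], str] = input,
) -> bool:
    """Ask the user to approve sending a document to a remote service.

    Returns True only on an explicit "y" or "yes"; anything else,
    including an empty answer, counts as a refusal.
    """
    answer = ask(
        f"The full contents of '{pdf_name}' will be uploaded to {endpoint}. "
        "Continue? [y/N] "
    )
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to refusal means an unattended run cannot upload a document by accident.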

✓ Install Mechanism

This is an instruction-only skill with an included script; there is no installer that downloads arbitrary code from unknown URLs. Dependencies are installed via pip at runtime per SKILL.md. No high-risk install URLs or archive extraction are present.

⚠ Credentials

The Python code reads PADDLEOCR_TOKEN from the environment (and falls back to a placeholder), but the skill metadata declares no required env vars or primary credential. Requesting a single OCR API token would be proportional to the task, but failing to declare it in the registry is a transparency issue and increases risk of accidental data leaks.
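A safer alternative to a placeholder fallback is to fail fast when the token is absent. The following is a sketch under that assumption, not the skill's own loader: `load_ocr_token` and `MissingTokenError` are hypothetical names, and only the variable name PADDLEOCR_TOKEN comes from the analysis above.

```python
import os
from typing import Mapping, Optional


class MissingTokenError(RuntimeError):
    """Raised when PADDLEOCR_TOKEN is unset or blank."""


def load_ocr_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OCR API token, refusing to continue without a real value."""
    env = os.environ if env is None else env
    token = env.get("PADDLEOCR_TOKEN", "").strip()
    if not token:
        # No placeholder fallback: stop before any document is sent.
        raise MissingTokenError(
            "PADDLEOCR_TOKEN is not set; export it before running the pipeline"
        )
    return token
```

Accepting an optional mapping makes the function easy to exercise in tests without touching the real environment.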

✓ Persistence & Privilege

The skill is not always-enabled and does not request special agent privileges. It writes resumable state and intermediate files to a work directory (jobs.json, chunk_* files) which is normal for a pipeline; nothing in the package attempts to alter other skills or agent-wide settings.
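Resumable state of this kind is commonly a small JSON file mapping each chunk to its remote job ID. The sketch below assumes that shape; the helper names and the flat `{chunk_name: job_id}` layout are illustrative guesses, not the actual format of the skill's jobs.json.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Dict


def load_jobs(state_path: Path) -> Dict[str, str]:
    """Read the chunk-to-job-ID mapping, or return an empty one on first run."""
    if not state_path.exists():
        return {}
    with state_path.open("r", encoding="utf-8") as fh:
        return json.load(fh)


def save_jobs(state_path: Path, jobs: Dict[str, str]) -> None:
    """Write the mapping atomically so an interrupted run cannot corrupt it."""
    state_path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(state_path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(jobs, fh, indent=2)
    os.replace(tmp_name, state_path)


def get_or_submit(
    chunk_name: str, state_path: Path, submit: Callable[[str], str]
) -> str:
    """Reuse a cached job ID if present; otherwise submit and record it."""
    jobs = load_jobs(state_path)
    if chunk_name not in jobs:
        jobs[chunk_name] = submit(chunk_name)
        save_jobs(state_path, jobs)
    return jobs[chunk_name]
```

Writing to a temporary file and renaming it with `os.replace` avoids leaving a half-written state file if the process is killed mid-save.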

How to Use

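Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install pdf-ocr-layout-free
Once installed, invoke the Skill by name or trigger it with /pdf-ocr-layout-free
Provide the required inputs as described in the Skill's parameter notes to get structured output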

Version History

v1.0.0

- Full OCR pipeline: split → PaddleOCR-VL-1.5 API → layout PDF → merge
- Bbox-based text placement with auto font-size calibration
- Dynamic source image size detection (handles any scan resolution)
- Image embedding via API CDN URLs (bbox-based matching for JSONL format)
- Multi-line block font cap (fixes reference list / footnote line spacing)
- Overflow protection: shrinks font if text exceeds bbox
- Resume-safe: caches job IDs, JSONL results, and chunk PDFs
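The overflow protection listed above can be illustrated with a simple shrink-to-fit loop. This is a sketch of the general technique, not the skill's calibration code: `fit_font_size`, its step size, and its minimum size are assumptions, and the width measurement is injected as a callable.

```python
from typing import Callable


def fit_font_size(
    text: str,
    box_width: float,
    start_size: float,
    measure: Callable[[str, float], float],
    min_size: float = 4.0,
    step: float = 0.5,
) -> float:
    """Shrink the font size until the text fits the box width.

    measure(text, size) must return the rendered width of text at that size.
    The result never drops below min_size, even if the text still overflows.
    """
    size = start_size
    while size > min_size and measure(text, size) > box_width:
        size -= step
    return max(size, min_size)


def approx_width(text: str, size: float) -> float:
    """Crude stand-in metric: every glyph is half the font size wide."""
    return len(text) * size * 0.5
```

With reportlab, `measure` could plausibly wrap `reportlab.pdfbase.pdfmetrics.stringWidth` with a fixed font name, for example `lambda t, s: stringWidth(t, "Helvetica", s)`.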

Metadata

Slug pdf-ocr-layout-free

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

FAQ

What is 完美排版ocr?

Full OCR pipeline for scanned PDFs with layout preservation. Use this skill whenever the user wants to OCR a PDF, convert a scanned document to searchable te... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 99 total downloads to date.

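How do I install 完美排版ocr?

Run the command "/install pdf-ocr-layout-free" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but the script expects the PADDLEOCR_TOKEN environment variable to be set before it runs.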

Is 完美排版ocr free?

Yes. 完美排版ocr is completely free, released under the MIT-0 license, and can be downloaded, installed, and used freely.

Which platforms does 完美排版ocr support?

完美排版ocr is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops 完美排版ocr?

It is developed and maintained by gamhtoi (@biabia-55); the current version is v1.0.0.
