
paddleocr-vl-locally

Name: paddleocr-vl-locally
Author: sfresurgam

By sfresurgam · v1.0.2 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 288

Install in OpenClaw

/install paddleocr-vl-locally

Description

Complex document parsing with PaddleOCR. Intelligently converts complex PDFs and document images into Markdown and JSON files that preserve the original stru...

Security recommendations

Things to check before installing or running this skill:

- Confirm environment variables: the registry lists only PADDLEOCR_DOC_PARSING_API_URL, but the code can also read PADDLEOCR_ACCESS_TOKEN, PADDLEOCR_BASIC_AUTH_USER, PADDLEOCR_BASIC_AUTH_PASSWORD, and PADDLEOCR_DOC_PARSING_TIMEOUT. If you will provide tokens/passwords, treat them as sensitive and verify the skill truly needs them.
- Understand data exposure: the SKILL.md mandates showing the COMPLETE extracted content (all text, tables, formulas). If you plan to parse sensitive documents, this behavior can leak secrets or private information. Consider whether you want the agent to automatically reveal full outputs or prefer truncation/summarization/approval steps.
- File persistence: results are saved by default under the system temp directory. Decide if that is acceptable; if not, use --stdout or a secure output path and remove temp files after processing.
- Inspect and test locally: because the skill is script-based (no automatic install), review the included scripts (vl_caller.py, lib.py) and run smoke_test.py (or --skip-api-test) in a controlled environment. The socket/URL you configure for PADDLEOCR_DOC_PARSING_API_URL should be trusted (local or internal endpoint preferred).
- Operational advice: restrict the API URL to an internal host if possible, rotate tokens used by the skill, and avoid enabling this skill for autonomous runs against sensitive data until you are comfortable with its behavior.

If you want higher assurance, ask the author to: (1) list all environment variables in the skill metadata, (2) make the 'display full content' behavior opt-in, and (3) add an option to avoid writing results to disk by default.
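The advice to keep PADDLEOCR_DOC_PARSING_API_URL pointed at a local or internal endpoint can be checked before the skill runs. The following is a minimal sketch, not part of the skill: the function name `is_internal_endpoint` is hypothetical, and it assumes that "internal" means `localhost` or a loopback/private IP literal. It does not resolve DNS names, so any other hostname is conservatively reported as not internal.

```python
import ipaddress
from urllib.parse import urlparse


def is_internal_endpoint(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback/private IP literal.

    Hypothetical helper for illustration; hostnames other than "localhost"
    are not resolved and are therefore reported as not internal.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name; resolving it is out of scope for this sketch.
        return False
    return address.is_loopback or address.is_private
```

A wrapper script could call this on the configured URL and refuse to start the skill when it returns False.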

Functional analysis

Type: OpenClaw Skill
Name: paddleocr-vl-locally
Version: 1.0.2

The skill bundle is a legitimate tool for document parsing via a PaddleOCR Triton Inference Server. The Python scripts (vl_caller.py, lib.py) implement standard API interaction using httpx, while utility scripts (optimize_file.py, split_pdf.py) provide helper functions for image compression and PDF page extraction. No evidence of malicious behavior, data exfiltration, or harmful prompt injection was found; the SKILL.md instructions correctly guide the agent on tool usage, error handling, and environment configuration.

Capability assessment

ℹ Purpose & Capability

Name/description align with the code: the scripts call a document-parsing API (Triton/PaddleOCR-style) and provide helpers to optimize/split files and save JSON results. Required binary (python) and the primary env var (PADDLEOCR_DOC_PARSING_API_URL) are appropriate. The presence of helper scripts (optimize_file.py, split_pdf.py) is consistent with supporting large/complex documents.

⚠ Instruction Scope

SKILL.md instructs the agent to ALWAYS use the external PaddleOCR Document Parsing API and NEVER parse locally (which is consistent with the code that sends files/URLs to the API). However, the SKILL.md also mandates displaying COMPLETE extracted content to the user and instructs the agent to read saved JSON files from the system temp directory before responding. These instructions broaden the agent's data exposure (showing full document text/tables/formulas without truncation) and require file I/O. The 'MANDATORY RESTRICTIONS' language is unusually prescriptive for an agent and could lead to indiscriminate disclosure of sensitive content.
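One way to narrow the exposure described here is to show a bounded preview and release the rest only after approval. The sketch below is an illustration of that idea, not behavior the skill provides: `preview` and its `limit` parameter are hypothetical names, and the 500-character default is an arbitrary assumption.

```python
def preview(text: str, limit: int = 500) -> str:
    """Return at most `limit` characters of `text`, noting how much was withheld.

    Hypothetical helper: the skill itself is instructed to display complete content.
    """
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    return f"{text[:limit]}\n[... {omitted} more characters withheld pending approval ...]"
```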

ℹ Install Mechanism

There is no automated install spec (lower risk), but SKILL.md tells users to pip install dependencies from scripts/requirements*.txt. That is expected for a Python CLI skill. The requirements are minimal (httpx, optional Pillow/pypdfium2) and come from PyPI; no external/untrusted download URLs are used.
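Because dependencies are installed by hand, it can help to confirm they are importable before running the scripts. A minimal sketch, assuming the import names `httpx`, `PIL` (the module that Pillow installs), and `pypdfium2`; the helper name `check_modules` is hypothetical and not part of the skill:

```python
from importlib.util import find_spec

# Import names assumed for the packages named above; PIL is Pillow's module name.
MODULES = ["httpx", "PIL", "pypdfium2"]


def check_modules(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in check_modules(MODULES).items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

Only `httpx` is described as required; a missing `PIL` or `pypdfium2` would presumably affect only the optional helper scripts.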

⚠ Credentials

Registry metadata declares only PADDLEOCR_DOC_PARSING_API_URL as a required env var, but the code actually reads additional environment variables (PADDLEOCR_ACCESS_TOKEN, PADDLEOCR_BASIC_AUTH_USER, PADDLEOCR_BASIC_AUTH_PASSWORD, PADDLEOCR_DOC_PARSING_TIMEOUT). Those optional credentials are plausible for authenticating to a proxied Triton server, but the omission from the declared requires.env is an inconsistency that should be clarified before trusting the skill with secrets.
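To see which of these variables are present in a given environment without printing their values, a small audit such as the following could be run first. This is a sketch under stated assumptions: the variable names come from the assessment above, while `audit_env` and the two list names are hypothetical.

```python
import os

# Declared in the registry metadata, per the assessment above.
DECLARED = ["PADDLEOCR_DOC_PARSING_API_URL"]
# Read by the code but not declared, per the assessment above.
UNDECLARED = [
    "PADDLEOCR_ACCESS_TOKEN",
    "PADDLEOCR_BASIC_AUTH_USER",
    "PADDLEOCR_BASIC_AUTH_PASSWORD",
    "PADDLEOCR_DOC_PARSING_TIMEOUT",
]


def audit_env(environ=None):
    """Report which variables are set to a non-empty value, never the values themselves."""
    environ = os.environ if environ is None else environ
    return {name: bool(environ.get(name)) for name in DECLARED + UNDECLARED}


if __name__ == "__main__":
    for name, is_set in audit_env().items():
        print(f"{name}: {'set' if is_set else 'not set'}")
```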

ℹ Persistence & Privilege

The skill writes results to the system temp directory by default and prints the saved absolute path to stderr (and SKILL.md instructs the agent to read the saved JSON before responding). Writing parsed full-document JSON to disk is expected for this tool, but it leaves persistent artifacts containing potentially sensitive data. The skill does not request elevated system privileges and always=false.
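If leftover result files are a concern, the saved JSON can be read once and deleted immediately afterwards. The sketch below assumes only that the result is a UTF-8 JSON file at a known path (the path the skill prints to stderr); `read_and_remove` is a hypothetical helper, not part of the skill.

```python
import json
from pathlib import Path


def read_and_remove(path):
    """Load a JSON result file, then delete it even if parsing fails."""
    result_path = Path(path)
    try:
        return json.loads(result_path.read_text(encoding="utf-8"))
    finally:
        # missing_ok requires Python 3.8 or newer.
        result_path.unlink(missing_ok=True)
```

Deleting in a `finally` block means a malformed file is also removed, which suits sensitive documents but discards evidence useful for debugging.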

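How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install paddleocr-vl-locally
Once installed, invoke the skill by name or trigger it with /paddleocr-vl-locally
Provide the required inputs as described in the skill's parameter documentation to receive structured output.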

Version history

v1.0.2

No user-facing changes were detected in this release.

- Internal or metadata updates may have been made without affecting usage or documentation.

v1.0.1

- Skill now renamed to **paddleocr-vl-locally**.
- No longer requires `PADDLEOCR_ACCESS_TOKEN`; only `PADDLEOCR_DOC_PARSING_API_URL` is needed.
- Instructions updated for local deployment: configure the API URL to your local Triton inference endpoint.
- Simplified configuration guidance and clarified that access tokens are not required for local use.

v1.0.0

Initial release of PaddleOCR Document Parsing Skill.

- Enables advanced document parsing using the PaddleOCR Document Parsing API.
- Converts complex PDFs and document images into structured Markdown and JSON, preserving original layout (tables, formulas, charts, multi-column, etc.).
- Provides clear usage instructions: only interacts via the official API/script and never performs parsing directly.
- Returns complete, unabridged document content as requested (text, tables, formulas, etc.); does not summarize or truncate unless output is extremely long.
- Handles errors transparently and guides users on secure API and token configuration.
- Supports both URL and local file input, with customizable output modes (file, stdout).
- Emphasizes extraction completeness, structured metadata, and consistent output behavior.
- PaddleOCR-VL service adapted for localized deployment.

Metadata

Slug: paddleocr-vl-locally

Version: 1.0.2

License: MIT-0

Total installs: 0

Current installs: 0

Versions released: 3

Frequently asked questions

What is paddleocr-vl-locally?

Complex document parsing with PaddleOCR. Intelligently converts complex PDFs and document images into Markdown and JSON files that preserve the original stru... It is an AI agent skill plugin for Claude Code / OpenClaw, with 288 total downloads to date.

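How do I install paddleocr-vl-locally?

Run the command /install paddleocr-vl-locally in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no extra configuration, but the skill reads PADDLEOCR_DOC_PARSING_API_URL at run time, and SKILL.md tells users to pip install the dependencies from scripts/requirements*.txt.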

Is paddleocr-vl-locally free?

Yes. paddleocr-vl-locally is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.

Which platforms does paddleocr-vl-locally support?

paddleocr-vl-locally is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops paddleocr-vl-locally?

It is developed and maintained by sfresurgam (@sfresurgam); the current version is v1.0.2.