sfresurgam

paddleocr-vl-locally

Author: sfresurgam · GitHub ↗ · v1.0.2 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 288
Favorites: 0
Current installs: 0
Versions: 3
Install in OpenClaw
/install paddleocr-vl-locally
Description
Complex document parsing with PaddleOCR. Intelligently converts complex PDFs and document images into Markdown and JSON files that preserve the original stru...
Security recommendations
Things to check before installing or running this skill:
- Confirm environment variables: the registry lists only PADDLEOCR_DOC_PARSING_API_URL, but the code can also read PADDLEOCR_ACCESS_TOKEN, PADDLEOCR_BASIC_AUTH_USER, PADDLEOCR_BASIC_AUTH_PASSWORD, and PADDLEOCR_DOC_PARSING_TIMEOUT. If you provide tokens or passwords, treat them as sensitive and verify that the skill truly needs them.
- Understand data exposure: the SKILL.md mandates showing the COMPLETE extracted content (all text, tables, formulas). If you plan to parse sensitive documents, this behavior can leak secrets or private information. Consider whether you want the agent to reveal full outputs automatically or would prefer truncation, summarization, or approval steps.
- File persistence: results are saved by default under the system temp directory. Decide whether that is acceptable; if not, use --stdout or a secure output path, and remove temp files after processing.
- Inspect and test locally: because the skill is script-based (no automatic install), review the included scripts (vl_caller.py, lib.py) and run smoke_test.py (or --skip-api-test) in a controlled environment. The endpoint you configure for PADDLEOCR_DOC_PARSING_API_URL should be trusted (a local or internal endpoint is preferred).
- Operational advice: restrict the API URL to an internal host if possible, rotate tokens used by the skill, and avoid enabling this skill for autonomous runs against sensitive data until you are comfortable with its behavior.
If you want higher assurance, ask the author to: (1) list all environment variables in the skill metadata, (2) make the 'display full content' behavior opt-in, and (3) add an option to avoid writing results to disk by default.
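The advice to restrict the API URL to an internal host can be checked mechanically before the skill runs. The sketch below is an illustration and not part of the skill: `is_internal_url` is a hypothetical helper, and it assumes that "internal" means every address the host resolves to is loopback or private. The example URL in the comments is a placeholder.

```python
import ipaddress
import os
import socket
from urllib.parse import urlparse


def is_internal_url(url: str) -> bool:
    """Return True if every address the URL's host resolves to is loopback or private.

    Hypothetical helper for illustration; the skill itself does not ship it.
    """
    host = urlparse(url).hostname
    if not host:
        return False
    try:
        # A literal IP address (e.g. http://127.0.0.1:8000/) needs no DNS lookup.
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            return False
        # Drop any IPv6 scope suffix ("%eth0") before parsing the address.
        addresses = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    return bool(addresses) and all(a.is_loopback or a.is_private for a in addresses)


if __name__ == "__main__":
    configured = os.environ.get("PADDLEOCR_DOC_PARSING_API_URL", "")
    print("internal" if is_internal_url(configured) else "not internal, or unset")
```

A check like this only inspects the address at the time it runs; it does not replace network-level controls such as firewall rules.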
Functional analysis
Type: OpenClaw Skill
Name: paddleocr-vl-locally
Version: 1.0.2
The skill bundle is a legitimate tool for document parsing via a PaddleOCR Triton Inference Server. The Python scripts (vl_caller.py, lib.py) implement standard API interaction using httpx, while utility scripts (optimize_file.py, split_pdf.py) provide helper functions for image compression and PDF page extraction. No evidence of malicious behavior, data exfiltration, or harmful prompt injection was found; the SKILL.md instructions correctly guide the agent on tool usage, error handling, and environment configuration.
Capability assessment
Purpose & Capability
Name/description align with the code: the scripts call a document-parsing API (Triton/PaddleOCR-style) and provide helpers to optimize/split files and save JSON results. Required binary (python) and the primary env var (PADDLEOCR_DOC_PARSING_API_URL) are appropriate. The presence of helper scripts (optimize_file.py, split_pdf.py) is consistent with supporting large/complex documents.
Instruction Scope
SKILL.md instructs the agent to ALWAYS use the external PaddleOCR Document Parsing API and NEVER parse locally (which is consistent with the code that sends files/URLs to the API). However, the SKILL.md also mandates displaying COMPLETE extracted content to the user and instructs the agent to read saved JSON files from the system temp directory before responding. These instructions broaden the agent's data exposure (showing full document text/tables/formulas without truncation) and require file I/O. The 'MANDATORY RESTRICTIONS' language is unusually prescriptive for an agent and could lead to indiscriminate disclosure of sensitive content.
Install Mechanism
There is no automated install spec (lower risk), but SKILL.md tells users to pip install dependencies from scripts/requirements*.txt. That is expected for a Python CLI skill. The requirements are minimal (httpx, optional Pillow/pypdfium2) and come from PyPI; no external/untrusted download URLs are used.
Credentials
Registry metadata declares only PADDLEOCR_DOC_PARSING_API_URL as a required env var, but the code actually reads additional environment variables (PADDLEOCR_ACCESS_TOKEN, PADDLEOCR_BASIC_AUTH_USER, PADDLEOCR_BASIC_AUTH_PASSWORD, PADDLEOCR_DOC_PARSING_TIMEOUT). Those optional credentials are plausible for authenticating to a proxied Triton server, but the omission from the declared requires.env is an inconsistency that should be clarified before trusting the skill with secrets.
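The gap between the declared and the actually read environment variables can be made visible before any secrets are handed to the skill. The following is a minimal sketch, assuming the variable names listed in this review are complete; `audit_env` and the two name sets are hypothetical and not part of the skill. It reports only which variables are set and never their values.

```python
import os
from typing import Dict, Mapping

# Declared in the registry metadata, per this review.
DECLARED = {"PADDLEOCR_DOC_PARSING_API_URL"}
# Read by the code but not declared, per this review.
UNDECLARED = {
    "PADDLEOCR_ACCESS_TOKEN",
    "PADDLEOCR_BASIC_AUTH_USER",
    "PADDLEOCR_BASIC_AUTH_PASSWORD",
    "PADDLEOCR_DOC_PARSING_TIMEOUT",
}


def audit_env(environ: Mapping[str, str]) -> Dict[str, str]:
    """Map each PaddleOCR variable that is set to 'declared' or 'undeclared'.

    Values are deliberately left out of the report so secrets are not echoed.
    """
    report: Dict[str, str] = {}
    for name in sorted(DECLARED | UNDECLARED):
        if name in environ:
            report[name] = "declared" if name in DECLARED else "undeclared"
    return report


if __name__ == "__main__":
    for name, status in audit_env(os.environ).items():
        print(f"{name}: set ({status} in registry metadata)")
```

Any variable reported as undeclared is one the skill would pick up even though the registry entry does not mention it.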
Persistence & Privilege
The skill writes results to the system temp directory by default and prints the saved absolute path to stderr (and SKILL.md instructs the agent to read the saved JSON before responding). Writing parsed full-document JSON to disk is expected for this tool, but it leaves persistent artifacts containing potentially sensitive data. The skill does not request elevated system privileges and always=false.
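One way to limit the persistent artifacts described above is to keep results in a private directory that is deleted as soon as they have been read. The sketch below shows only that directory lifecycle using the standard library; it does not call the skill, the `roundtrip_and_cleanup` name and the `result.json` file name are hypothetical, and how an output path is passed to the skill's scripts is not covered here.

```python
import json
import tempfile
from pathlib import Path
from typing import Tuple


def roundtrip_and_cleanup(result: dict) -> Tuple[dict, Path]:
    """Write a result to a private temp directory, read it back, then delete it.

    Stands in for "the tool writes JSON, the caller reads it"; the returned
    path no longer exists once the function returns.
    """
    # TemporaryDirectory creates a directory accessible only to the current
    # user on POSIX systems and removes it when the with-block exits.
    with tempfile.TemporaryDirectory(prefix="ocr-result-") as tmp:
        out_path = Path(tmp) / "result.json"
        out_path.write_text(json.dumps(result), encoding="utf-8")
        loaded = json.loads(out_path.read_text(encoding="utf-8"))
    return loaded, out_path


if __name__ == "__main__":
    data, path = roundtrip_and_cleanup({"text": "example"})
    print(data, "still on disk:", path.exists())
```

Deleting the directory removes the files from the file system but does not securely erase them from the underlying storage.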
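How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install paddleocr-vl-locally
  3. Once installation finishes, call the skill by name or trigger it with /paddleocr-vl-locally
  4. Provide the required inputs according to the skill's parameter documentation to get structured output.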
Version history
v1.0.2
No user-facing changes were detected in this release.
- Internal or metadata updates may have been made without affecting usage or documentation.
v1.0.1
- Skill now renamed to **paddleocr-vl-locally**.
- No longer requires `PADDLEOCR_ACCESS_TOKEN`; only `PADDLEOCR_DOC_PARSING_API_URL` is needed.
- Instructions updated for local deployment: configure the API URL to your local Triton inference endpoint.
- Simplified configuration guidance and clarified that access tokens are not required for local use.
v1.0.0
Initial release of PaddleOCR Document Parsing Skill.
- Enables advanced document parsing using the PaddleOCR Document Parsing API.
- Converts complex PDFs and document images into structured Markdown and JSON, preserving original layout (tables, formulas, charts, multi-column, etc.).
- Provides clear usage instructions: only interacts via the official API/script and never performs parsing directly.
- Returns complete, unabridged document content as requested (text, tables, formulas, etc.); does not summarize or truncate unless output is extremely long.
- Handles errors transparently and guides users on secure API and token configuration.
- Supports both URL and local file input, with customizable output modes (file, stdout).
- Emphasizes extraction completeness, structured metadata, and consistent output behavior.
- PaddleOCR-VL service adapted for localized deployment.
Metadata
Slug: paddleocr-vl-locally
Version: 1.0.2
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 3
FAQ

What is paddleocr-vl-locally?

Complex document parsing with PaddleOCR. Intelligently converts complex PDFs and document images into Markdown and JSON files that preserve the original stru... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 288 times so far.

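How do I install paddleocr-vl-locally?

Run the command "/install paddleocr-vl-locally" in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no extra configuration, but before first use the skill requires PADDLEOCR_DOC_PARSING_API_URL to be set and the Python dependencies from scripts/requirements*.txt to be installed with pip.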

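Is paddleocr-vl-locally free?

Yes. paddleocr-vl-locally is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does paddleocr-vl-locally support?

paddleocr-vl-locally is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops paddleocr-vl-locally?

It is developed and maintained by sfresurgam (@sfresurgam); the current version is v1.0.2.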
