
PaddleOCR-VL

Author: Jessy-Huang (@jessy-huang) · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install paddle-ocr-vl

Description

GPU-accelerated document parsing and OCR via PaddleOCR-VL. Detects layout, recognizes Chinese/English text, tables, charts, and seals in images. Use when the...

Usage (SKILL.md)

PaddleOCR-VL Skill

GPU-accelerated document OCR using the PaddleOCR-VL model running inside an ephemeral Docker container. Auto-detects NVIDIA GPU architecture (Blackwell SM120 vs. standard) and selects the correct official image.
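The image selection described above could look roughly like the following. This is a minimal sketch, not the skill's actual server.py: the function names `select_image` and `detect_compute_capability` are hypothetical, the 12.0 threshold and image tags are taken from the Setup section, and the `compute_cap` query field assumes a reasonably recent NVIDIA driver.

```python
import subprocess

REGISTRY = "ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl"


def select_image(compute_cap: str) -> str:
    """Map a compute capability string such as '12.0' to an image reference."""
    major, _, minor = compute_cap.strip().partition(".")
    capability = (int(major), int(minor or 0))
    # Per the Setup notes, compute capability >= 12.0 needs the SM120 image.
    if capability >= (12, 0):
        return f"{REGISTRY}:latest-nvidia-gpu-sm120"
    return f"{REGISTRY}:latest-nvidia-gpu"


def detect_compute_capability() -> str:
    """Ask nvidia-smi for the first GPU's compute capability (assumed query field)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0].strip()
```

On a GPU host, `select_image(detect_compute_capability())` would return the image reference to pull. Keeping the mapping separate from the nvidia-smi call makes it testable without a GPU.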

When to Use

User provides an image path and asks to "read", "OCR", or "extract text" from it
User wants to parse a document screenshot, newspaper page, or classical text
User asks about the content of an image file

Architecture

This skill includes an MCP server (server.py) that exposes three tools:

Tool	Purpose
`run_ocr`	OCR any image — provide an absolute path
`check_environment`	Verify Docker, GPU drivers, and image are ready
`run_demo`	Run OCR on bundled demo images to test the setup

Setup

1. Install the MCP Server

Add to ~/.config/Claude/claude_desktop_config.json (Claude Desktop) or ~/.claude/settings.json (Claude Code):

{
  "mcpServers": {
    "paddle-ocr-vl": {
      "command": "python3",
      "args": ["<INSTALL_DIR>/server.py"]
    }
  }
}
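If the config file already lists other MCP servers, pasting the snippet by hand risks overwriting them. The sketch below merges the entry instead. The helper name `register_server` is hypothetical, and it assumes the file is plain JSON with a top-level `mcpServers` object as shown above.

```python
import json
from pathlib import Path


def register_server(config_path: Path, server_script: str) -> dict:
    """Add a paddle-ocr-vl entry to an MCP config file, keeping other entries."""
    config = {}
    if config_path.exists():
        # Treat an empty file the same as an empty JSON object.
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    servers = config.setdefault("mcpServers", {})
    servers["paddle-ocr-vl"] = {"command": "python3", "args": [server_script]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `register_server(Path.home() / ".claude" / "settings.json", "<INSTALL_DIR>/server.py")` would update the Claude Code settings file, with `<INSTALL_DIR>` replaced by the actual install location.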

2. Pull the Docker Image (one-time)

# Blackwell GPU with compute capability >= 12.0 (SM120, e.g. RTX 50xx):
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl:latest-nvidia-gpu-sm120

# Other NVIDIA GPU:
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl:latest-nvidia-gpu

3. Verify

Call check_environment to verify everything is set up, then run_demo to test on the bundled sample images.

Bundled Demo Images

File	Content
`demo/newspaper.png`	People's Daily article about China-Eritrea relations
`demo/classical_text.png`	Records of the Three Kingdoms, vertical classical Chinese

Requirements

Docker with nvidia-container-toolkit
NVIDIA GPU with drivers installed
Python 3.10+ with mcp SDK (pip install mcp)
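A quick pre-flight check of the requirements listed above might look like this. It is a rough sketch and not the skill's `check_environment` tool. The function name `check_requirements` is hypothetical, and it only checks that the executables and the `mcp` package can be found. It does not confirm that nvidia-container-toolkit is configured or that the Docker image has been pulled.

```python
import importlib.util
import shutil
import sys


def check_requirements() -> dict:
    """Report which of the listed requirements appear to be satisfied."""
    return {
        # The docker CLI is on PATH (does not prove the daemon is running).
        "docker_cli": shutil.which("docker") is not None,
        # nvidia-smi ships with the NVIDIA driver, so its presence hints at a driver install.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
        "python_3_10_plus": sys.version_info >= (3, 10),
        # The mcp SDK is importable in the current interpreter.
        "mcp_sdk": importlib.util.find_spec("mcp") is not None,
    }
```

Any `False` value points at the requirement to fix before calling `check_environment` or `run_demo`.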

Security & Privacy

Images are processed inside an ephemeral Docker container (--rm flag)
The container has no network access beyond --network host (needed for GPU)
No data leaves the host machine
The container is destroyed immediately after each OCR run

External Endpoints

None. All processing is local.

Official References

PaddleOCR-VL docs: https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL.html
Blackwell-specific: https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL-NVIDIA-Blackwell.html

Security Recommendations

Review before installing. Only use this with trusted images and trusted file paths, and avoid running it on files with unusual or attacker-controlled filenames. Prefer a revised version that disables host networking, avoids root, mounts only the target file read-only, and safely passes the path into the container.
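The hardening suggested above can be sketched as a function that builds the `docker run` argument list. This is an illustration under assumptions, not the skill's code. The name `build_docker_command`, the `/input/image` mount target, and the `INNER_SCRIPT` placeholder are all hypothetical, and it is untested whether the official image runs correctly without host networking or root. The sketch assumes a POSIX host, since it uses `os.getuid()`.

```python
import os
from pathlib import Path

# Placeholder for the real OCR entry point; it only echoes the path it receives.
INNER_SCRIPT = "import sys; print(sys.argv[1])"


def build_docker_command(image: str, image_path: str) -> list[str]:
    """Build an argv list for a restricted, single-use container run."""
    source = Path(image_path).resolve(strict=True)
    if not source.is_file():
        raise ValueError(f"not a regular file: {source}")
    if any(ch in str(source) for ch in ',"'):
        # --mount takes comma-separated fields, so these characters would corrupt the spec.
        raise ValueError("paths containing commas or double quotes are not supported here")
    target = "/input/image"  # fixed name, so the host filename never reaches the container
    return [
        "docker", "run", "--rm",
        "--network", "none",                         # no host networking
        "--user", f"{os.getuid()}:{os.getgid()}",    # current user instead of root
        "--gpus", "all",
        "--mount", f"type=bind,source={source},target={target},readonly",
        image,
        "python3", "-c", INNER_SCRIPT, target,       # path passed as an argument, not spliced into code
    ]
```

Because the result is a list, it can be passed to `subprocess.run` without a shell, and the path travels as a separate argument instead of being interpolated into the inline script.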

Capability Assessment

ℹ Purpose & Capability

Running PaddleOCR-VL in Docker matches the stated OCR purpose, and the exposed tools are limited to OCR, environment checking, and demos.

⚠ Instruction Scope

The security notes claim local/no-external behavior while the runtime uses host networking and an external Docker image, so users are not clearly informed about the actual network exposure.

ℹ Install Mechanism

Installation is manual MCP configuration plus one-time Docker image pull; no hidden installer or persistence mechanism was found.

⚠ Credentials

The container is launched with --network host, --user root, GPU access, and a read-write bind mount of the input file's directory, which is broader than ordinary OCR needs and materially expands blast radius.

⚠ Persistence & Privilege

The container is ephemeral, but it runs as root and mounts the host directory read-write; the inline Python command also embeds the image filename without escaping, creating a code-injection path for crafted filenames.
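To see why an unescaped filename in inline Python is an injection path, the sketch below parses the generated source without executing it and counts the call expressions. It is an illustration only. `process` is a hypothetical stand-in for whatever function the inline command calls, and the crafted filename is an invented example.

```python
import ast


def unsafe_source(filename: str) -> str:
    """Interpolate the filename directly into Python source (the risky pattern)."""
    return f"process('{filename}')"


def safer_source(filename: str) -> str:
    """Let repr() produce a correctly escaped Python string literal."""
    return f"process({filename!r})"


def count_calls(source: str) -> int:
    """Count call expressions in the parsed source without executing it."""
    return sum(isinstance(node, ast.Call) for node in ast.walk(ast.parse(source)))


crafted = "a.png'); __import__('os').system('id') #"
print(count_calls(unsafe_source("page.png")))  # a single call to process(...)
print(count_calls(unsafe_source(crafted)))     # more calls: the filename added code
print(count_calls(safer_source(crafted)))      # a single call again
```

Escaping with `repr()` closes this particular hole, but passing the path as a separate command-line argument and reading it from `sys.argv` inside the container avoids building source code from untrusted input at all.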

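How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install paddle-ocr-vl
Once installed, invoke the Skill by name or trigger it with /paddle-ocr-vl
Provide the required inputs as described in the Skill's parameter documentation to get structured output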

Version History

v1.0.0

Initial release of PaddleOCR-VL skill:
- GPU-accelerated document OCR using PaddleOCR-VL inside Docker.
- Detects layout and recognizes Chinese/English text, tables, charts, and seals.
- Supports vertical classical Chinese text, modern newspaper layouts, and mixed-content documents.
- Provides tools to run OCR, check environment, and demo on sample images.
- No external endpoints; all processing is local and containerized for privacy and security.

Metadata

Slug paddle-ocr-vl

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is PaddleOCR-VL?

GPU-accelerated document parsing and OCR via PaddleOCR-VL. Detects layout, recognizes Chinese/English text, tables, charts, and seals in images. Use when the... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 0 total downloads to date.

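How do I install PaddleOCR-VL?

Run the command "/install paddle-ocr-vl" in the OpenClaw or Claude Code chat box to install it. The Setup steps in SKILL.md (MCP server configuration and a one-time Docker image pull) are still required before first use.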

Is PaddleOCR-VL free?

Yes. PaddleOCR-VL is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

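Which platforms does PaddleOCR-VL support?

PaddleOCR-VL is listed as cross-platform and can be used in any environment running OpenClaw / Claude Code, provided the host meets the Requirements in SKILL.md (Docker with nvidia-container-toolkit and an NVIDIA GPU with drivers installed).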

Who develops PaddleOCR-VL?

It is developed and maintained by Jessy-Huang (@jessy-huang); the current version is v1.0.0.