PaddleOCR-VL
/install paddle-ocr-vl
PaddleOCR-VL Skill
GPU-accelerated document OCR using the PaddleOCR-VL model running inside an ephemeral Docker container. Auto-detects NVIDIA GPU architecture (Blackwell SM120 vs. standard) and selects the correct official image.
When to Use
- User provides an image path and asks to "read", "OCR", or "extract text" from it
- User wants to parse a document screenshot, newspaper page, or classical text
- User asks about the content of an image file
Architecture
This skill includes an MCP server (server.py) that exposes three tools:
| Tool | Purpose |
|---|---|
| run_ocr | OCR any image; provide an absolute path |
| check_environment | Verify Docker, GPU drivers, and image are ready |
| run_demo | Run OCR on bundled demo images to test the setup |
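As a rough illustration of how a client might call run_ocr, the sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP uses for tool invocation. The argument key `image_path` and the helper name `build_run_ocr_request` are assumptions for illustration; the real parameter name is defined in server.py and is not shown here.

```python
import json
import os


def build_run_ocr_request(image_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the run_ocr tool."""
    # The skill asks for an absolute path, so reject relative ones early.
    if not os.path.isabs(image_path):
        raise ValueError(f"run_ocr expects an absolute path, got: {image_path!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "run_ocr",
            # ASSUMPTION: "image_path" is a hypothetical argument key;
            # check server.py for the name the tool actually expects.
            "arguments": {"image_path": image_path},
        },
    }


if __name__ == "__main__":
    # "/tmp/scan.png" is a placeholder path, not a bundled file.
    print(json.dumps(build_run_ocr_request("/tmp/scan.png"), indent=2))
```

In practice an MCP client such as Claude Desktop or Claude Code constructs this request itself; the sketch only shows the shape of the call and why a relative path is rejected up front.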
Setup
1. Install the MCP Server
Add to ~/.config/Claude/claude_desktop_config.json (Claude Desktop)
or ~/.claude/settings.json (Claude Code):
{
  "mcpServers": {
    "paddle-ocr-vl": {
      "command": "python3",
      "args": ["<INSTALL_DIR>/server.py"]
    }
  }
}
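If the config file already lists other MCP servers, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the file is plain JSON; the function names and the example install directory are hypothetical and not part of the skill:

```python
import json
from pathlib import Path


def add_server_entry(config: dict, install_dir: str) -> dict:
    """Return a copy of config with the paddle-ocr-vl MCP server entry added."""
    merged = dict(config)
    # Copy the existing server map so other entries are preserved.
    servers = dict(merged.get("mcpServers", {}))
    servers["paddle-ocr-vl"] = {
        "command": "python3",
        "args": [str(Path(install_dir) / "server.py")],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(config_path: str, install_dir: str) -> None:
    """Read config_path (if it exists), add the entry, and write it back."""
    path = Path(config_path).expanduser()
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    updated = add_server_entry(existing, install_dir)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For example, `update_config_file("~/.claude/settings.json", "/opt/paddle-ocr-vl")` would add the entry for Claude Code, where `/opt/paddle-ocr-vl` stands in for wherever the skill was actually installed.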
2. Pull the Docker Image (one-time)
# Blackwell GPU with compute capability >= 12.0 (e.g. RTX 50xx):
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl:latest-nvidia-gpu-sm120
# Other NVIDIA GPU:
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl:latest-nvidia-gpu
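The choice between the two images can be sketched as a small function keyed on compute capability, using the 12.0 threshold stated above. This is not the skill's actual detection code, which is not shown here; the helper names are hypothetical, and `detect_compute_cap` assumes a driver recent enough for `nvidia-smi` to support the `compute_cap` query field.

```python
import subprocess

REGISTRY = "ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl"


def select_image(compute_cap: str) -> str:
    """Map a compute capability string such as '12.0' or '8.9' to an image."""
    major, _, minor = compute_cap.strip().partition(".")
    capability = (int(major), int(minor or 0))
    # Threshold taken from the pull instructions: >= 12.0 uses the sm120 tag.
    if capability >= (12, 0):
        tag = "latest-nvidia-gpu-sm120"
    else:
        tag = "latest-nvidia-gpu"
    return f"{REGISTRY}:{tag}"


def detect_compute_cap() -> str:
    """Ask nvidia-smi for the compute capability of the first GPU."""
    # ASSUMPTION: the installed driver supports the compute_cap query field.
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0].strip()


if __name__ == "__main__":
    print(select_image(detect_compute_cap()))
```

Keeping the mapping separate from the `nvidia-smi` call means the selection logic can be checked on a machine without a GPU.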
3. Verify
Call check_environment to verify everything is set up, then run_demo to
test on the bundled sample images.
Bundled Demo Images
| File | Content |
|---|---|
| demo/newspaper.png | People's Daily article about China-Eritrea relations |
| demo/classical_text.png | Records of the Three Kingdoms, vertical classical Chinese |
Requirements
- Docker with nvidia-container-toolkit
- NVIDIA GPU with drivers installed
- Python 3.10+ with the mcp SDK (pip install mcp)
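The requirements above can be spot-checked by hand with a few standard-library calls. The sketch below is not the skill's check_environment tool, whose implementation is not shown here; `check_requirements` is a hypothetical helper. It only confirms that the executables are on PATH and that the `mcp` package is importable, and does not verify that nvidia-container-toolkit is configured.

```python
import importlib.util
import shutil
import sys


def check_requirements() -> dict:
    """Return a coarse pass/fail map for the listed requirements."""
    return {
        # Docker CLI present; says nothing about the daemon or GPU runtime.
        "docker_on_path": shutil.which("docker") is not None,
        # nvidia-smi ships with the NVIDIA driver, so it is a cheap proxy.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "python_3_10_plus": sys.version_info >= (3, 10),
        # True if `pip install mcp` has been run in this environment.
        "mcp_sdk_importable": importlib.util.find_spec("mcp") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```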
Security & Privacy
- Images are processed inside an ephemeral Docker container (--rm flag)
- The container has no network access beyond --network host (needed for GPU)
- No data leaves the host machine
- The container is destroyed immediately after each OCR run
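To make the flags above concrete, here is a sketch of how an ephemeral `docker run` command line could be assembled. Only `--rm` and `--network host` come from the description above; `--gpus all`, the read-only bind mount, the `/input` mount point, and the `container_cmd` parameter are assumptions for illustration. The command the skill actually runs inside the container is not shown here, so it is left as a caller-supplied argument.

```python
import os


def build_docker_argv(image: str, host_image_path: str, container_cmd: list[str]) -> list[str]:
    """Assemble an argv list for a one-shot, self-removing GPU container."""
    if not os.path.isabs(host_image_path):
        raise ValueError("host_image_path must be absolute")
    # ASSUMPTION: /input is a hypothetical mount point inside the container.
    container_path = "/input/" + os.path.basename(host_image_path)
    return [
        "docker", "run",
        "--rm",               # remove the container as soon as it exits
        "--gpus", "all",      # ASSUMPTION: expose all GPUs to the container
        "--network", "host",  # as stated in the list above
        # ASSUMPTION: mount the input image read-only.
        "-v", f"{host_image_path}:{container_path}:ro",
        image,
        *container_cmd,       # hypothetical; the real entrypoint is not shown
    ]
```

Passing the result to `subprocess.run(argv, check=True)` as a list avoids shell quoting issues with paths that contain spaces.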
External Endpoints
None. All processing is local.
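- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install paddle-ocr-vl
- Once installed, invoke the Skill by name or trigger it with /paddle-ocr-vl
- Provide the required inputs described in the Skill's parameter documentation to get structured output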
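What is PaddleOCR-VL?
GPU-accelerated document parsing and OCR via PaddleOCR-VL. Detects layout, recognizes Chinese/English text, tables, charts, and seals in images. Use when the... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 0 downloads to date.
How do I install PaddleOCR-VL?
Run the command "/install paddle-ocr-vl" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.
Is PaddleOCR-VL free?
Yes. PaddleOCR-VL is completely free and released under the MIT-0 license; you can download, install, and use it freely.
Which platforms does PaddleOCR-VL support?
PaddleOCR-VL is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed PaddleOCR-VL?
It is developed and maintained by Jessy-Huang (@jessy-huang). The current version is v1.0.0.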