
PaddleOCR Text Recognition

by Bobholamovic · GitHub ↗ · v1.0.21 · MIT-0
cross-platform · ✓ Security Clean
Downloads: 2355 · Stars: 12 · Active Installs: 18 · Versions: 22
Install in OpenClaw
/install paddleocr-text-recognition
Description
Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with l...
Usage Guidance
This skill appears to do what it says: a small Python CLI wrapper that calls a remote PaddleOCR endpoint. Before installing, confirm the PADDLEOCR_OCR_API_URL points to a trusted service and only provide a token you intend to use for OCR. Be aware that by default the tool saves raw JSON results (recognized text and provider response) to the system temp directory — do not run on sensitive images unless you are comfortable that the remote OCR service and the local temp file storage are acceptable. Use --stdout to avoid persisting results to disk. If you need offline/local-only OCR, this remote-API approach is not suitable.
Capability Analysis
Type: OpenClaw Skill
Name: paddleocr-text-recognition
Version: 1.0.21
The skill is a legitimate OCR tool wrapper for the PaddleOCR API, allowing users to extract text from images and PDFs via a user-provided API endpoint and token. The implementation in lib.py and ocr_caller.py follows standard practices, including enforcing HTTPS for API communication, robust error handling, and transparent result saving to the system's temporary directory. No indicators of malicious intent, unauthorized data exfiltration, or harmful prompt injection were found; the instructions in SKILL.md appropriately guide the agent on tool usage and result presentation.
Capability Assessment
Purpose & Capability
Name/description match the included Python CLI wrapper and library. The required binary (uv) and environment variables (PADDLEOCR_OCR_API_URL, PADDLEOCR_ACCESS_TOKEN) are appropriate for a remote OCR client that posts images/base64 to an OCR endpoint.
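To make the configuration requirement concrete, here is a minimal sketch of how a client could read the two environment variables and refuse a non-HTTPS endpoint. The helper name load_ocr_config and its return shape are hypothetical and are not taken from the skill's lib.py or ocr_caller.py; only the two variable names come from the listing.

```python
import os
from urllib.parse import urlparse


def load_ocr_config(env=None):
    """Read the two required settings and reject anything that is not HTTPS.

    Hypothetical helper for illustration; not the skill's actual code.
    """
    env = os.environ if env is None else env
    api_url = env.get("PADDLEOCR_OCR_API_URL", "").strip()
    token = env.get("PADDLEOCR_ACCESS_TOKEN", "").strip()

    if not api_url:
        raise ValueError("PADDLEOCR_OCR_API_URL is not set")
    if not token:
        raise ValueError("PADDLEOCR_ACCESS_TOKEN is not set")

    # Only accept URLs with an https scheme and a host part.
    parsed = urlparse(api_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("PADDLEOCR_OCR_API_URL must be an https:// URL")

    return {"api_url": api_url, "token": token}
```

Failing early with a specific message makes a misconfigured endpoint visible before any image data leaves the machine.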
Instruction Scope
SKILL.md stays within OCR scope and documents running the included scripts, file/URL inputs, and how to parse saved JSON. Note: the default behavior is to save raw JSON results to a system temp directory and print the saved path; this persists recognized text (and potentially raw images or metadata) to disk unless --stdout is used.
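The save-by-default behavior described above can be sketched as follows. This is an illustration under assumptions, not the skill's implementation: the function name emit_result, the ocr_result_ file prefix, and the printed message are all hypothetical. Only the --stdout idea and the system temp directory come from the listing.

```python
import json
import sys
import tempfile
from pathlib import Path


def emit_result(result, to_stdout=False, out_dir=None):
    """Write an OCR result as JSON, either to stdout or to a file.

    Hypothetical helper: names and file prefix are illustrative only.
    """
    text = json.dumps(result, ensure_ascii=False, indent=2)
    if to_stdout:
        # Nothing is persisted; the caller reads the JSON from stdout.
        sys.stdout.write(text + "\n")
        return None

    directory = str(out_dir) if out_dir else tempfile.gettempdir()
    # mkstemp creates a uniquely named file; on POSIX it is readable
    # and writable only by the current user.
    fd, name = tempfile.mkstemp(prefix="ocr_result_", suffix=".json", dir=directory)
    with open(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    print(f"Result saved to: {name}")
    return Path(name)
```

With to_stdout=True the recognized text never touches the disk, which is the reason to prefer --stdout for sensitive inputs.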
Install Mechanism
Instruction-only skill with no install spec; scripts declare Python dependencies to be resolved by uv. No external download URL or archive extraction is used.
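For readers unfamiliar with inline dependency declarations, the sketch below shows what a PEP 723 header looks like at the top of a script. The dependency name requests and the helper dependency_available are assumptions made for illustration; the skill's real dependency list is not shown in this listing.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = [
#     "requests",  # assumed HTTP client; the skill's real list may differ
# ]
# ///
import importlib.util


def dependency_available(name: str) -> bool:
    """Return True if the named top-level package can be imported here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # When launched with `uv run <script>`, uv reads the comment block above
    # and installs the declared dependencies before this line executes.
    print("requests available:", dependency_available("requests"))
```

Because the metadata lives in a comment block, the file remains an ordinary Python script, and no separate requirements.txt is needed.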
Credentials
Only two env vars are required and they directly map to the remote OCR API (API URL and access token). This is proportionate. User should be aware that providing these credentials gives the skill the ability to call the remote OCR service and send user files (including base64-encoded local files) to that endpoint.
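The point about local files being sent to the endpoint can be illustrated with a short sketch of how a request body might be built. The function name build_payload and the "file" field are hypothetical; the real request schema belongs to the OCR service and is not documented in this listing.

```python
import base64
from pathlib import Path


def build_payload(source: str) -> dict:
    """Build a request body from a URL or a local file path.

    The "file" key is a hypothetical field name used for illustration.
    """
    if source.startswith(("http://", "https://")):
        # Assumption: remote inputs are passed through as a URL string.
        return {"file": source}
    # Local inputs are read in full and embedded in the request itself,
    # so the entire file content is transmitted to the remote service.
    data = Path(source).read_bytes()
    return {"file": base64.b64encode(data).decode("ascii")}
```

Base64 is an encoding, not encryption: anyone who receives the payload can recover the original bytes, so the trust decision rests on the endpoint and the HTTPS transport.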
Persistence & Privilege
always is false and the skill does not request elevated or permanent system privileges. It only writes result files under the system temp directory (by default) or to a user-specified path.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install paddleocr-text-recognition
  3. After installation, invoke the skill by name or use /paddleocr-text-recognition
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.21
- License information has been removed from the YAML header.
- No code or functionality changes; documentation and usage remain consistent with previous versions.
v1.0.20
No changes detected in this version.
- Version 1.0.20 was published with no modifications to files or documentation.
v1.0.19
- Switched to inline Python dependency declaration (PEP 723) for uv compatibility; no requirements.txt file needed.
- Updated installation instructions to use uv for zero-config dependency management.
- Minor metadata improvements: added license, compatibility, and clarified usage notes.
- No functional changes to core OCR features or command-line interface.
v1.0.18
- Documentation streamlined: redundant information, especially duplicate and overly detailed first-time configuration steps, has been trimmed for clarity.
- API configuration instructions updated to clarify language/model selection steps.
- Core usage steps are unchanged; skill usage and command-line examples remain the same.
- No code or functional changes; this is a documentation-only update for improved usability.
v1.0.17
No file changes detected for version 1.0.17.
- No changes or updates in this release; all documentation and code remain the same.
v1.0.16
No code or documentation changes detected in this version.
- No file or documentation updates found between versions.
- Functionality and usage remain unchanged.
v1.0.15
- Documentation streamlined for clarity: removed duplication, reorganized sections, and improved formatting for easier reference.
- Added performance notes about expected OCR processing times for images and large PDFs.
- Clarified criteria for when to use/not use the skill and when to use "Document Parsing" instead.
- Made output requirements explicit: display the entire recognized text, avoid truncation, and outlined correct/incorrect examples.
- Updated configuration and error handling instructions to be briefer and more direct.
- Provided usage examples with file-type explanation and concise guidance on result handling.
v1.0.14
- requirements.txt moved from scripts/ to the top-level skill directory.
- Environment variable PADDLEOCR_OCR_TIMEOUT removed from required list; now only API URL and access token are mandatory.
- Documentation updated to install requirements from the new path.
- Minor metadata and env var requirements cleanup.
v1.0.13
No code changes detected in this release. Skill documentation updated only.
- Updated the skill description for improved clarity, detail, and accuracy of supported use cases.
- Enhanced trigger terms and usage explanation in the YAML metadata and introductory sections.
- No changes to scripts, logic, or runtime behavior.
- Safe to use as a documentation/metadata update only.
v1.0.12
No user-facing changes detected in this version.
- No changes found between the previous and current version files.
- Behavior, guidance, and usage remain the same as in the last release.
v1.0.11
- Added bilingual trigger keywords and routing instructions to the description field for improved discovery (now includes both Chinese and English terms such as OCR, 文字识别, plain text extraction, bbox, etc.).
- Updated usage instructions and "When to Use This Skill" section to clarify routing and trigger logic.
- No changes to code or functional behavior; only SKILL.md metadata and documentation updated.
v1.0.10
- Improved first-time configuration instructions: users are now guided to get the API URL and token from the official PaddleOCR website (with model and configuration details).
- Added specific note that supported model is PP-OCRv5, and clarified environment variable setup guidance (including example for OpenClaw).
- Enhanced credential handling: warns users about sharing sensitive data in chat, and recommends secure environment variable configuration.
- No functional or code changes; documentation only.
v1.0.9
- Added explicit installation instructions for required Python dependencies using `pip install -r scripts/requirements.txt`.
- Clarified the need to install dependencies before using the skill.
- No changes to functionality or script usage; documentation improvements only.
v1.0.8
- Removed the no-longer-needed `scripts/configure.py` script.
- Updated documentation in SKILL.md for improved clarity and security guidance.
- Enhanced error handling instructions and user security warnings in configuration sections.
- Refactored script code and requirements for maintainability.
- Added clarifications on parsing, saving, and presenting OCR results.
v1.0.7
- No code or documentation changes detected in this release.
- No user-facing updates or features introduced.
v1.0.6
- Simplified and clarified the skill description for easier understanding.
- Updated configuration instructions to assume environment variables are already set, unless an OCR task fails due to configuration issues.
- Improved guidance for handling credentials, including stronger recommendations against providing secrets in chat or creating local files.
- Removed redundant and overly detailed workflow text for easier use and maintenance.
- Error handling and output display guidance remain strict: always show complete OCR results and exact error messages.
v1.0.5
- Updated configuration instructions to recommend secure credential setup via the host application, rather than pasting credentials in chat.
- Added explicit security warning if credentials are provided in chat, highlighting that such information may be stored in conversation history.
- Clarified environment variable setup steps and emphasized secure configuration.
- No functional changes to the skill’s OCR handling or results workflow.
v1.0.4
- Updated environment variable naming: now requires PADDLEOCR_OCR_TIMEOUT instead of PADDLEOCR_TIMEOUT.
- Minor documentation improvements in reference links and metadata.
- No functional or code changes detected.
v1.0.3
- Added metadata block specifying required environment variables, dependencies, emoji, and homepage URL for improved integration and discoverability.
- No functional or workflow changes to the skill itself.
- Documentation now reflects explicit environment variable and binary dependencies for clarity.
v1.0.2
- Updated first-time configuration instructions: users are now directed to set required environment variables (API URL and token) in the host application or runtime environment, instead of configuring them in-band via script.
- Emphasized NOT to run the skill's configure script or create local `.env` files by default, especially in host-managed environments.
- Clarified the workflow for parsing credentials, validation, and when to retry OCR after configuration.
- Removed internal metadata from the documentation.
- General documentation clean-up and improvements for clarity.
Metadata
Slug paddleocr-text-recognition
Version 1.0.21
License MIT-0
All-time Installs 19
Active Installs 18
Total Versions 22
Frequently Asked Questions

What is PaddleOCR Text Recognition?

PaddleOCR Text Recognition is an AI Agent Skill for Claude Code / OpenClaw that extracts text from images, photos, scans, screenshots, or scanned PDFs by calling a remote PaddleOCR endpoint. It has 2355 downloads so far.

How do I install PaddleOCR Text Recognition?

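Run "/install paddleocr-text-recognition" in the OpenClaw or Claude Code chat to install it in one step. Before first use, the skill also needs the uv binary and the PADDLEOCR_OCR_API_URL and PADDLEOCR_ACCESS_TOKEN environment variables.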

Is PaddleOCR Text Recognition free?

Yes. PaddleOCR Text Recognition is licensed under MIT-0, so you can download, install and use it at no cost.

Which platforms does PaddleOCR Text Recognition support?

PaddleOCR Text Recognition is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created PaddleOCR Text Recognition?

It is built and maintained by Bobholamovic (@bobholamovic); the current version is v1.0.21.
