smart_ocr
by duykhangdangzn1 · v1.0.0
1304 Downloads · 0 Stars · 11 Active Installs · 1 Version
Install in OpenClaw
/install smar
Description
Extract text from images and scanned documents using PaddleOCR - supports 100+ languages
Usage Guidance
This skill appears to be a genuine OCR helper built on PaddleOCR, but it is instruction-only and does not declare how to install the large native Python packages and model files it needs. Before installing or enabling it:
1) Confirm whether your agent/runtime already has paddlepaddle, paddleocr, pdf2image, and poppler (and GPU drivers if you plan to use GPU); otherwise the agent may attempt ad-hoc installs or downloads at runtime.
2) Be aware that the SKILL.md allows the agent to fetch URLs and read/write temporary files. Avoid processing sensitive documents unless you trust the runtime and its network controls.
3) Ask the skill author for an explicit install spec (pinned package versions, model download sources), or prefer a vetted skill that provides a safe, reproducible install.
4) If you must use it, run it in a sandboxed environment with restricted network and filesystem access, and monitor any package downloads or subprocess executions.
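The first check above (whether the runtime already has the dependencies) can be done without installing anything. The following is a minimal sketch, not part of the skill: the function name `missing_dependencies` is hypothetical, `paddle` is the import name of the paddlepaddle package, and `pdftoppm` is used as a stand-in for "poppler is installed" because it is one of the poppler binaries that pdf2image calls.

```python
import importlib.util
import shutil

# Import names to probe. "paddle" is the module installed by the paddlepaddle package.
PYTHON_MODULES = ["paddle", "paddleocr", "pdf2image"]
# pdftoppm ships with poppler; its presence on PATH is used here as a proxy for poppler.
BINARIES = ["pdftoppm"]


def missing_dependencies(modules=PYTHON_MODULES, binaries=BINARIES):
    """Return the names of modules and binaries that cannot be found.

    Nothing is imported or installed: find_spec only locates top-level
    modules, and shutil.which only searches PATH.
    """
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    missing += [b for b in binaries if shutil.which(b) is None]
    return missing


if __name__ == "__main__":
    gaps = missing_dependencies()
    if gaps:
        print("Not available in this runtime:", ", ".join(gaps))
    else:
        print("All listed dependencies were found.")
```

A non-empty result suggests the agent would have to download packages at runtime, which is the situation the guidance warns about. This check does not cover model files or GPU drivers.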
Capability Analysis
Type: OpenClaw Skill
Name: smar
Version: 1.0.0
The skill bundle is classified as benign. It provides functionality for Optical Character Recognition (OCR) using the PaddleOCR library. While the skill utilizes powerful tools like `code_execution`, `file_operations`, and network access via `requests` (for fetching images from URLs), these capabilities are explicitly declared, necessary for the skill's stated purpose, and demonstrated through standard, non-malicious code snippets. There is no evidence of intentional harmful behavior such as data exfiltration, malicious execution, persistence mechanisms, obfuscation, or prompt injection instructions designed to subvert the agent's behavior beyond its intended function.
Capability Assessment
Purpose & Capability
Name and description match the SKILL.md content: the instructions show how to run PaddleOCR on images and PDFs and return structured text. However, PaddleOCR and its supporting pieces (the paddlepaddle runtime, pdf2image/poppler, model files) are non-trivial dependencies that are not declared in registry metadata or an install spec. That omission makes the capability incomplete: a legitimate smart_ocr implementation would normally declare installation steps or required binaries.
Instruction Scope
The SKILL.md stays on-topic: it describes initializing PaddleOCR, converting PDFs to images, taking input from paths, bytes, or URLs, and returning bounding boxes/confidence. It does instruct the agent to fetch URLs (requests.get) and to write/delete temporary image files, which are appropriate for OCR and clearly described.
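The write-then-delete handling of temporary image files described above can be sketched as follows. This is an illustration under assumptions, not the code from the SKILL.md: the helper name `ocr_from_bytes` is hypothetical, and `run_ocr` stands for whatever callable performs OCR on a file path.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def ocr_from_bytes(data: bytes, run_ocr: Callable[[str], T], suffix: str = ".png") -> T:
    """Write image bytes to a temp file, run OCR on its path, then delete it.

    run_ocr is a placeholder for the real OCR call; it receives the file path.
    The finally block removes the file even if run_ocr raises.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Close the file before OCR so the OCR code can reopen it on any platform.
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        return run_ocr(path)
    finally:
        os.remove(path)
```

For URL input, the bytes would come from something like `requests.get(url, timeout=10).content`. Keeping the network fetch separate from the file handling makes it easier to restrict or audit outbound requests in a sandbox.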
Install Mechanism
There is no install spec despite clear dependence on Python packages and native tools (paddlepaddle, paddleocr, pdf2image, poppler, possibly GPU drivers). Because the skill is instruction-only, the agent or environment would need to supply these at runtime, which can lead to ad-hoc installs or network downloads that the skill neither controls nor declares. A legitimate skill should either declare required binaries/packages or provide an install block.
Credentials
The skill requests no environment variables or credentials and doesn't reference unrelated config paths. It does request use of tools like code_execution and file_operations (declared in the SKILL.md header), which is proportionate to OCR tasks. Still, code_execution gives an agent broad powers, so the lack of an install spec combined with execution/file access increases the operational risk.
Persistence & Privilege
The skill has no always:true flag and does not request persistence or modifications to other skills or system-wide config. It appears to be user-invocable only and does not demand elevated platform privileges.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install smar
- After installation, invoke the skill by name or use /smar
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Smart OCR Skill v1.0.0
- Initial release of smart-ocr skill for extracting text from images and scanned documents using PaddleOCR.
- Supports over 100 languages with high accuracy, including detailed language selection options.
- Includes best practices for image preprocessing, language choice, batch processing, and filtering low-confidence results.
- Provides Python code snippets for processing images, scanned PDFs, URLs, and bytes.
- Offers examples of advanced usage such as layout reconstruction and batch OCR with progress feedback.
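Two of the items above, filtering low-confidence results and layout reconstruction, can be illustrated with a short sketch. It is not the skill's own code. It assumes each detection has the shape `(box, (text, confidence))`, where `box` is four `[x, y]` corner points, and the function names and the `y_tolerance` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Sequence[Sequence[float]]        # four [x, y] corner points (assumed shape)
Item = Tuple[Box, Tuple[str, float]]   # (box, (text, confidence)) (assumed shape)


def filter_by_confidence(items: List[Item], min_conf: float = 0.5) -> List[Item]:
    """Keep only detections whose confidence is at least min_conf."""
    return [it for it in items if it[1][1] >= min_conf]


def to_lines(items: List[Item], y_tolerance: float = 10.0) -> List[str]:
    """Rebuild reading order: group boxes into rows by top edge, then sort left to right."""
    ordered = sorted(items, key=lambda it: min(p[1] for p in it[0]))
    rows = []  # each entry: [top_y of the row's first box, items in that row]
    for it in ordered:
        top = min(p[1] for p in it[0])
        if rows and abs(top - rows[-1][0]) <= y_tolerance:
            rows[-1][1].append(it)
        else:
            rows.append([top, [it]])
    lines = []
    for _, row in rows:
        row.sort(key=lambda it: min(p[0] for p in it[0]))
        lines.append(" ".join(it[1][0] for it in row))
    return lines


sample = [
    ([[120, 10], [200, 10], [200, 30], [120, 30]], ("world", 0.95)),
    ([[10, 12], [100, 12], [100, 32], [10, 32]], ("Hello", 0.98)),
    ([[10, 60], [90, 60], [90, 80], [10, 80]], ("noise", 0.20)),
    ([[10, 100], [150, 100], [150, 120], [10, 120]], ("Second line", 0.91)),
]
print(to_lines(filter_by_confidence(sample)))  # ['Hello world', 'Second line']
```

The row grouping is deliberately simple: a fixed pixel tolerance works for level, single-column scans but would need tuning for skewed pages or multi-column layouts.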
Frequently Asked Questions
What is smart_ocr?
Extract text from images and scanned documents using PaddleOCR - supports 100+ languages. It is an AI Agent Skill for Claude Code / OpenClaw, with 1304 downloads so far.
How do I install smart_ocr?
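Run "/install smar" in the OpenClaw or Claude Code chat. The skill itself installs in one step, but it declares no install spec, so the runtime must already provide paddlepaddle, paddleocr, pdf2image, and poppler.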
Is smart_ocr free?
Yes, smart_ocr is completely free (open-source). You can download, install and use it at no cost.
Which platforms does smart_ocr support?
smart_ocr is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created smart_ocr?
It is built and maintained by duykhangdangzn1 (@duykhangdangzn1); the current version is v1.0.0.