Downloads: 91 · Stars: 0 · Active Installs: 0 · Versions: 2
Install in OpenClaw
/install ccy-ocr-local
Description
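A local, offline OCR skill. It recognizes text in images on the local machine; by default it uploads no files and needs no external API key. It suits extracting Chinese and English text from screenshots, photographed documents, scans, and camera images.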
README (SKILL.md)
ccy-ocr-local
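Use this skill to run OCR on images locally and offline. Primary uses:
- Extracting text from screenshots
- Extracting text from photographed or scanned documents
- Extracting Chinese and English text from camera images
- OCR scenarios where files must not be uploaded to an external service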
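When to use
When the user needs to:
- Recognize text in local images
- Avoid depending on cloud OCR or an API key
- Run OCR directly on a Jetson or Linux machine
- Get usable text quickly with lightweight dependencies
For complex layouts, receipts, table structure, vertical text, or stronger Chinese recognition, prefer a separate local PaddleOCR / RapidOCR skill, to be built later.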
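Entry point
Script entry point: scripts/local_ocr.py
Companion scripts:
- scripts/benchmark.py: compares the elapsed time and output length of balanced / fast / accurate on sample images
- scripts/regression.py: generates baseline output for sample images, for regression checks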
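New in this release
- Automatic orientation detection: --autorotate tries several orientations and keeps the better result
- Cheaper autorotate strategy: the default, --autorotate-strategy smart, first makes a light guess from the aspect ratio and falls back to checking every orientation only when the result is too weak
- Collision-free batch output: batch mode preserves the relative directory structure in its output, so same-named images in different subdirectories do not overwrite each other
- JSON output: --json emits the OCR content and metadata, suitable for automation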
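The smart strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in scripts/local_ocr.py: the function names, the aspect-ratio rule, the scoring function, and the `min_score` threshold are all hypothetical, and `ocr_at` stands in for whatever runs Tesseract on the image rotated by a given angle.

```python
from typing import Callable, Tuple

ALL_ANGLES = (0, 90, 180, 270)


def score_text(text: str) -> int:
    """Hypothetical quality score: count of letters, digits, and CJK characters."""
    return sum(1 for ch in text if ch.isalnum())


def smart_autorotate(
    width: int,
    height: int,
    ocr_at: Callable[[int], str],
    min_score: int = 20,
) -> Tuple[int, str]:
    """Return (angle, text): try a cheap guess first, then a full sweep if it is weak."""
    # Light check (an assumption): treat a landscape image as probably upright
    # or upside down, and a portrait image as probably turned sideways.
    first_pass = (0, 180) if width >= height else (90, 270)
    results = {angle: ocr_at(angle) for angle in first_pass}
    best = max(results, key=lambda a: score_text(results[a]))
    if score_text(results[best]) >= min_score:
        return best, results[best]
    # The cheap guess was too weak: check the orientations not tried yet.
    for angle in ALL_ANGLES:
        if angle not in results:
            results[angle] = ocr_at(angle)
    best = max(results, key=lambda a: score_text(results[a]))
    return best, results[best]
```

In this sketch a clear first-pass result costs two OCR runs instead of four, which is the kind of saving the smart strategy is described as aiming for.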
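Requirements
Minimum dependencies:
- python3
- tesseract
- Python packages: Pillow, pytesseract
Optional enhancements:
- opencv-python
  - If installed, OpenCV preprocessing is used, which usually gives more stable accuracy
  - If not installed, the script still runs and automatically falls back to the Pillow pipeline
- Tesseract language packs: eng, chi_sim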
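One way an optional dependency like this can be detected is shown below. It is a minimal sketch using only the standard library; the function names are hypothetical, and the actual script may simply wrap `import cv2` in a try/except instead.

```python
import importlib.util
from typing import Callable


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_preprocess_backend(available: Callable[[str], bool] = module_available) -> str:
    """Choose "opencv" when cv2 is installed, otherwise fall back to "pillow"."""
    return "opencv" if available("cv2") else "pillow"


if __name__ == "__main__":
    print("preprocessing backend:", pick_preprocess_backend())
```

Passing the availability check as a parameter keeps the fallback decision easy to test on machines with or without OpenCV.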
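Windows compatibility:
- If the current terminal or agent did not inherit the system PATH, the script tries to auto-detect common install locations, for example: C:/Program Files/Tesseract-OCR/tesseract.exe
- You can also specify it manually: --tesseract-cmd "C:/Program Files/Tesseract-OCR/tesseract.exe"
- Or set the environment variable TESSERACT_CMD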
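The lookup described above might be organized as in the following sketch. The precedence order (explicit flag, then TESSERACT_CMD, then PATH, then common locations) is an assumption, as are the function name and the candidate list, which contains only the one location the README mentions.

```python
import os
import shutil
from typing import Mapping, Optional

# Hypothetical candidate list; the README names only this location.
COMMON_WINDOWS_PATHS = (
    "C:/Program Files/Tesseract-OCR/tesseract.exe",
)


def resolve_tesseract(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return a path to the tesseract binary, or None if nothing was found."""
    env = os.environ if env is None else env
    if explicit:  # value of --tesseract-cmd, if given
        return explicit
    from_env = env.get("TESSERACT_CMD")
    if from_env:
        return from_env
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for candidate in COMMON_WINDOWS_PATHS:
        if os.path.isfile(candidate):
            return candidate
    return None
```

When pytesseract is used, the resolved path is typically applied by assigning it to `pytesseract.pytesseract.tesseract_cmd` before running recognition.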
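Default design trade-offs
By default this skill balances four concerns:
- Accuracy: automatic EXIF orientation correction, grayscale conversion, automatic contrast enhancement, upscaling of small images, binarization
- Speed: the default balanced mode runs recognition only once; multiple PSMs are tried only when needed
- Resource usage: preprocessing stays lightweight and no heavier OCR framework is pulled in
- Fewer dependencies: OpenCV is optional, not a hard dependency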
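The speed trade-off between a single pass and trying several page segmentation modes (PSMs) can be illustrated as below. This is a sketch only: the PSM lists per mode and the scoring rule are hypothetical, since the README does not state which PSMs each mode tries or how results are compared, and `ocr_with_psm` stands in for a real Tesseract call.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical candidates per mode. The numbers are real Tesseract PSM values
# (3 = automatic, 4 = single column, 6 = single block, 11 = sparse text),
# but which ones the script actually tries is not documented.
PSM_CANDIDATES: Dict[str, Sequence[int]] = {
    "balanced": (6,),
    "fast": (6, 11),
    "accurate": (6, 3, 4, 11),
}


def run_mode(
    mode: str,
    ocr_with_psm: Callable[[int], str],
    psm: Optional[int] = None,
) -> Tuple[int, str]:
    """Return (psm, text) for the best-scoring attempt in the given mode."""
    # An explicit PSM disables the automatic search, as --psm is described to do.
    candidates = (psm,) if psm is not None else PSM_CANDIDATES[mode]
    best_psm, best_text, best_score = candidates[0], "", -1
    for candidate in candidates:
        text = ocr_with_psm(candidate)
        score = sum(1 for ch in text if ch.isalnum())  # hypothetical score
        if score > best_score:
            best_psm, best_text, best_score = candidate, text, score
    return best_psm, best_text
```

Each extra candidate costs one more Tesseract run, which is why a single-pass mode is the cheapest option in this model.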
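Parameters
- image_path: path to a local image; with --batch it may also be a directory
- --lang: OCR language, default eng
  - Common values: eng, chi_sim, chi_sim+eng
- --psm: explicitly sets the Tesseract PSM; when set, multiple modes are no longer tried automatically
- --mode: balanced / fast / accurate
  - balanced: default, single recognition pass, lowest resource use
  - fast: automatically tries a few common PSMs while keeping speed in mind
  - accurate: automatically tries more PSMs, prioritizing hit rate
- --format: text or tsv
- --tesseract-cmd: explicit path to the Tesseract executable, for Windows or when PATH is not inherited
- --min-conf: in TSV mode, filters out low-confidence text
- --dpi: logical DPI passed to Tesseract, default 300
- --min-edge: target long edge when upscaling small images, default 1800
- --sharpen: enables light sharpening, for slightly blurry images
- --no-preprocess: disables the basic preprocessing
- --out: in single-image mode, writes the result to a file
- --batch: batch processing mode
- --recursive: in batch mode, scans subdirectories recursively
- --out-dir: output directory in batch mode; also generates manifest.json
- --autorotate: tries orientations automatically, for photos with unreliable orientation
- --autorotate-strategy: smart or full, default smart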
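To make the --min-edge parameter concrete, the sketch below computes the size a small image would be resized to so that its long edge reaches the target. The function name is hypothetical, and the rule that larger images are left untouched is an assumption about how the script behaves.

```python
from typing import Tuple


def upscaled_size(width: int, height: int, min_edge: int = 1800) -> Tuple[int, int]:
    """Return the (width, height) to resize to so the long edge reaches min_edge."""
    long_edge = max(width, height)
    if long_edge <= 0:
        raise ValueError("image dimensions must be positive")
    if long_edge >= min_edge:
        # Already large enough; this sketch never downscales (an assumption).
        return width, height
    scale = min_edge / long_edge
    # Keep the aspect ratio and make sure no side collapses to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(upscaled_size(900, 450))
```

With Pillow, the result could be passed to `image.resize(upscaled_size(*image.size))`, since `Image.size` is a `(width, height)` tuple.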
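Output
Default output:
- text: plain text
Optional output:
- tsv: structured text with positions and confidence, suitable for post-processing
- json: text plus metadata (elapsed time, PSM, rotation angle, mode, and so on), suitable for automated pipelines
- In batch mode, a text/TSV/JSON file can be written for each image, along with an overall manifest.json
- manifest.json records each image's elapsed time, PSM, rotation angle, and output path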
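Batch mode is described as preserving the relative directory structure so that same-named images do not overwrite each other. A minimal sketch of such a mapping follows; the function name and the choice to append the output extension to the full file name are assumptions, not the script's confirmed naming scheme.

```python
from pathlib import Path


def batch_output_path(
    image: Path,
    input_root: Path,
    out_dir: Path,
    suffix: str = ".txt",
) -> Path:
    """Map an input image to an output file that mirrors its relative directory."""
    relative = image.relative_to(input_root)
    # Append the suffix instead of replacing the extension, so that a.png and
    # a.jpg in the same folder also get distinct output files.
    return out_dir / relative.with_name(relative.name + suffix)


if __name__ == "__main__":
    root = Path("/data/in")
    for img in (root / "x" / "scan.png", root / "y" / "scan.png"):
        print(batch_output_path(img, root, Path("/tmp/ocr-batch")))
```

The parent directory of each result would still need to be created, for example with `path.parent.mkdir(parents=True, exist_ok=True)`, before writing.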
Error output:
- The image does not exist or cannot be opened
- Tesseract is not installed
- The requested language data is missing
Common examples
Minimal example
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/image.png
Mixed Chinese and English recognition
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/image.png --lang chi_sim+eng
Higher accuracy
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/image.png --lang chi_sim+eng --mode accurate --sharpen --autorotate
A bit faster
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/image.png --mode fast
Explicit Tesseract path on Windows
python skills/ccy-ocr-local/scripts/local_ocr.py C:/path/to/image.png --tesseract-cmd "C:/Program Files/Tesseract-OCR/tesseract.exe"
Export structured TSV
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/image.png --format tsv --min-conf 40 --out /tmp/result.tsv
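For post-processing a TSV result, a small standard-library reader may be enough. The sketch below assumes the file uses Tesseract's standard TSV columns (ending in `conf` and `text`); whether the script's --format tsv output keeps exactly those columns is an assumption, and the function name is hypothetical.

```python
import csv
import io
from typing import List


def filter_tsv_words(tsv_text: str, min_conf: float = 40.0) -> List[str]:
    """Return the words whose confidence is at least min_conf."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        try:
            conf = float(row.get("conf") or -1)
        except ValueError:
            continue  # skip rows with a malformed confidence value
        # Structural rows (page, block, line) carry conf -1 and empty text.
        if text and conf >= min_conf:
            words.append(text)
    return words
```

A file written with --out can be read with `Path("/tmp/result.tsv").read_text(encoding="utf-8")` and passed straight to this function.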
JSON output
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/image.png --lang chi_sim+eng --autorotate --json
Batch-process a directory
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/images --batch --recursive --lang chi_sim+eng --out-dir /tmp/ocr-batch
Batch JSON output
python3 skills/ccy-ocr-local/scripts/local_ocr.py /path/to/images --batch --recursive --lang chi_sim+eng --autorotate --json --out-dir /tmp/ocr-batch-json
Run the benchmark
python3 skills/ccy-ocr-local/scripts/benchmark.py
Run the regression samples
python3 skills/ccy-ocr-local/scripts/regression.py
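Recommendations
- Ordinary screenshots / UI text: start with the default parameters
- Mixed Chinese and English material: --lang chi_sim+eng
- Slightly blurry photos: add --sharpen
- Photos with unreliable orientation: add --autorotate
- Batch automation: add --json, or combine with --out-dir
- To save resources: stay on balanced and do not enable accurate
- For post-processing: use --format tsv
- Multi-image pipelines: use --batch --out-dir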
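Known limitations
- Plain Tesseract handles complex layouts, formulas, table structure, and exam-paper layouts only moderately well
- If chi_sim.traineddata is not installed on the system, Chinese recognition fails or performs poorly
- For images with severe perspective distortion, shadows, or strong glare, rectify or crop the document first
- This skill is lightweight local OCR and does not aim for the best results on complex Chinese material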
Usage Guidance
This skill appears to be a straightforward local OCR wrapper around Tesseract and optional OpenCV. Before installing/use: 1) Ensure you have Tesseract (and needed language data like chi_sim) installed locally; the script invokes the tesseract binary via subprocess. 2) Be aware outputs and manifest files will record file paths and the resolved tesseract executable path (useful for debugging but may reveal local paths). 3) The code runs external binaries (tesseract) and processes files you point it at—avoid pointing it to sensitive system directories. 4) I could not see the very end of scripts/local_ocr.py in the provided bundle (truncated); if you want maximum assurance, review the remainder of that file to confirm there are no unexpected network calls or data-sending behaviors. 5) For added safety when trying it out, run in a sandbox or on non-sensitive images.
Capability Analysis
Type: OpenClaw Skill
Name: ccy-ocr-local
Version: 1.0.1
The skill bundle provides a legitimate local OCR utility leveraging Tesseract. The core logic in `scripts/local_ocr.py` handles image preprocessing via PIL or OpenCV and executes OCR through the `pytesseract` wrapper. While it uses `subprocess` to interact with the Tesseract binary and the Python interpreter (in `benchmark.py` and `regression.py`), it does so safely using argument lists rather than shell strings. There is no evidence of data exfiltration, network activity, or malicious prompt injection in the documentation.
Capability Assessment
Purpose & Capability
The name/description (local offline OCR) matches the included scripts. The code calls Tesseract via pytesseract, optionally uses OpenCV, processes local image files, and provides JSON/TSV/text outputs; none of the required env vars, binaries, or files are unrelated to OCR.
Instruction Scope
SKILL.md instructs running scripts in-place and passing local image paths. The scripts operate on local files, detect/verify Tesseract, preprocess images, run OCR, and write local outputs and manifests. There are no instructions to read unrelated system state, collect extra secrets, or transmit data to remote endpoints.
Install Mechanism
There is no install spec (instruction-only deployment), and the code relies on system-installed Python packages and the Tesseract binary. No external download/extract steps are present in the repository.
Credentials
The skill does not request credentials or secret environment variables. It optionally reads TESSERACT_CMD and standard ProgramFiles env vars to locate tesseract, which is appropriate for locating the OCR binary. Output manifests include input paths and the resolved tesseract path (expected for diagnostics).
Persistence & Privilege
The skill is not always-enabled, does not request persistent privileges, and does not modify other skills or global agent configuration. It runs as a normal local script invoked by the agent/user.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install ccy-ocr-local
- After installation, invoke the skill by name or use /ccy-ocr-local
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
Improve local OCR: autorotate, smart autorotate strategy, JSON output, safer batch output paths, benchmark/regression scripts, docs refresh
v1.0.0
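- Initial release of the 厦门超安 local OCR skill, which recognizes text in images offline on the local machine.
- Completely free; no API key and no file upload required, which protects user privacy.
- Supports Chinese and English text extraction for screenshots, photos, scans, and other image types.
- Compatible with Jetson and Linux environments; suited to embedded and local deployments.
- Minimum dependencies are only Python3, Tesseract, Pillow, and pytesseract, with optional cv2 enhancement.
- Outputs plain text by default, for easy downstream processing and integration.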
Frequently Asked Questions
What is CCY OCR Local?
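A local, offline OCR skill. It recognizes text in images on the local machine; by default it uploads no files and needs no external API key. It suits extracting Chinese and English text from screenshots, photographed documents, scans, and camera images. It is an AI Agent Skill for Claude Code / OpenClaw, with 91 downloads so far.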
How do I install CCY OCR Local?
Run "/install ccy-ocr-local" in the OpenClaw or Claude Code chat. The skill itself has no install step, but Tesseract and the Python packages listed in the README must already be present on the machine.
Is CCY OCR Local free?
Yes, CCY OCR Local is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does CCY OCR Local support?
CCY OCR Local runs wherever OpenClaw / Claude Code is available; the README covers Jetson, Linux, and Windows usage.
Who created CCY OCR Local?
It is built and maintained by 超控中国 (@chenchongyong); the current version is v1.0.1.