9penny

General OCR Struct

by JY · GitHub ↗ · v0.1.0 · MIT-0
cross-platform ✓ Security Clean
Downloads 384
Stars 0
Active Installs 0
Versions 1
Install in OpenClaw
/install general-ocr-struct
Description
Offline OCR extracting and structuring Chinese/English screenshot text into raw or cleaned rows and fields for receipts, tables, and statements.
README (SKILL.md)

General OCR Struct

Use this skill to separate OCR recognition from downstream content cleanup.

Workflow

  1. Run the local OCR script on the image first.
  2. Return the raw OCR text before making business interpretations when accuracy matters.
  3. If the image is a transaction-detail screenshot, run structuring mode to group rows into fields.
  4. Mark uncertain fields explicitly as 待确认 ("to be confirmed"); do not guess missing content.
  5. Only after the user confirms recognition quality, use the result for tables, summaries, or documents.
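
The first three workflow steps can be sketched as a small driver around the script. This is only an illustration: the helper names `raw_command`, `transactions_command`, and `run` are hypothetical, and only the script path and the `raw`, `transactions`, and `--json` arguments come from the Commands section of this README.

```python
import subprocess

# Script path as given in the Commands section of this README.
SCRIPT = "scripts/general_ocr.py"


def raw_command(image_path):
    """Build the argv list for a raw OCR run (hypothetical helper)."""
    return ["python3", SCRIPT, "raw", str(image_path)]


def transactions_command(image_path, as_json=False):
    """Build the argv list for structuring mode (hypothetical helper).

    The README shows --json only together with `transactions`, so the
    flag is offered only here.
    """
    cmd = ["python3", SCRIPT, "transactions", str(image_path)]
    if as_json:
        cmd.append("--json")
    return cmd


def run(cmd):
    """Run a command and return its stdout as text; raises on a non-zero exit."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    image = "/path/to/image.jpg"
    # Step 1-2: show the raw recognition result first.
    print(run(raw_command(image)))
    # Step 3: only then group rows into fields.
    print(run(transactions_command(image, as_json=True)))
```

Keeping the raw run separate from the structuring run makes it easy to show the user the recognition result before any interpretation is applied.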

Commands

Raw OCR

python3 scripts/general_ocr.py raw /path/to/image.jpg

Structured transaction extraction

python3 scripts/general_ocr.py transactions /path/to/image.jpg

JSON output

python3 scripts/general_ocr.py transactions /path/to/image.jpg --json

Output rules

  • Prefer showing the recognition result first, then the cleaned structure.
  • Preserve source wording where possible.
  • For uncertain content, use 待确认 instead of inferring.
  • Adapt the structure to the source image type. For statement-like screenshots, common fields are: card_last4, date, time, currency, merchant, amount.
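
The rule about uncertain content can be illustrated with a short sketch. It assumes each structured row is a dictionary keyed by the statement fields listed above; the actual output shape of `general_ocr.py` is not documented here, so `mark_uncertain` and the sample row are hypothetical.

```python
# Marker this skill uses for uncertain content ("to be confirmed").
UNCERTAIN = "待确认"

# Common fields for statement-like screenshots, as listed in the output rules.
STATEMENT_FIELDS = ("card_last4", "date", "time", "currency", "merchant", "amount")


def mark_uncertain(row, fields=STATEMENT_FIELDS):
    """Return a copy of `row` limited to `fields`, with every missing or
    blank field set to the uncertainty marker instead of a guessed value."""
    cleaned = {}
    for name in fields:
        value = row.get(name)
        if value is None or str(value).strip() == "":
            cleaned[name] = UNCERTAIN
        else:
            cleaned[name] = value  # preserve the source wording unchanged
    return cleaned


if __name__ == "__main__":
    # Made-up sample row, for illustration only.
    sample = {"date": "2024-01-05", "merchant": "ACME", "amount": ""}
    print(mark_uncertain(sample))
```

Values that are present pass through untouched, which matches the rule to preserve source wording where possible.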

Notes

  • This skill uses RapidOCR locally.
  • First install may need Python packages; after setup it runs offline.
  • If OCR quality is weak, request a higher-resolution original screenshot before doing deeper cleanup.
Usage Guidance
This skill appears coherent and runs OCR locally, but you should: (1) confirm rapidocr_onnxruntime is installed from a trusted source (pip/official release) because the script will import and execute that package locally; (2) verify your host's RapidOCR runtime does not auto-download models or call the network if you require strictly offline operation; (3) only run the script on images you are comfortable processing (it will read the image file you pass); and (4) test it in a controlled environment before using with sensitive financial or personal documents to validate the heuristics and ensure no unexpected behavior.
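
As a small aid for point (1), the following sketch checks whether `rapidocr_onnxruntime` can be located in the current Python environment without importing it. The helper name `is_installed` is hypothetical. The check only confirms that the package is present; it says nothing about where the package came from or whether the runtime downloads models.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "rapidocr_onnxruntime"
    if is_installed(name):
        print(f"{name} is available in this environment")
    else:
        print(f"{name} is missing; install it from a trusted source first")
```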
Capability Analysis
Type: OpenClaw Skill
Name: general-ocr-struct
Version: 0.1.0
The skill provides local OCR and transaction data structuring using the RapidOCR library. Analysis of scripts/general_ocr.py shows safe path handling via pathlib and purely heuristic-based data processing (regex) without any network activity, credential harvesting, or suspicious execution patterns. The instructions in SKILL.md and references.md are strictly aligned with the stated purpose of image-to-text conversion and do not contain any malicious prompt injection or exfiltration commands.
Capability Assessment
Purpose & Capability
Name/description (offline OCR + structuring) match the included script and SKILL.md. The script runs RapidOCR locally, extracts lines, and heuristically structures transaction-like rows—behavior aligns with the stated purpose.
Instruction Scope
SKILL.md instructs only local usage of the provided Python script on user-supplied image paths, returning OCR text or structured transactions. The instructions do not ask the agent to read unrelated files, environment variables, or send data externally.
Install Mechanism
This is an instruction-only skill (no install spec). The Python script depends on the third‑party package rapidocr_onnxruntime; SKILL.md mentions installing Python packages but the registry entry does not provide an automated install step. This is low risk but requires the host to have a trusted RapidOCR runtime installed.
Credentials
No environment variables, credentials, or config paths are requested. The script only reads the image path provided by the caller. No disproportionate access is requested.
Persistence & Privilege
The skill is not always-enabled and does not modify other skills or system settings. It operates only when invoked and has no elevated persistence requirements.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install general-ocr-struct
  3. After installation, invoke the skill by name or use /general-ocr-struct
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.0
Initial release: local RapidOCR-based OCR with raw extraction and transaction-style structuring workflow.
Metadata
Slug general-ocr-struct
Version 0.1.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is General OCR Struct?

Offline OCR extracting and structuring Chinese/English screenshot text into raw or cleaned rows and fields for receipts, tables, and statements. It is an AI Agent Skill for Claude Code / OpenClaw, with 384 downloads so far.

How do I install General OCR Struct?

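Run "/install general-ocr-struct" in the OpenClaw or Claude Code chat to install the skill. The OCR script depends on the rapidocr_onnxruntime Python package, which the registry entry does not install automatically, so the first run may require installing Python packages.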

Is General OCR Struct free?

Yes, General OCR Struct is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does General OCR Struct support?

General OCR Struct is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created General OCR Struct?

It is built and maintained by JY (@9penny); the current version is v0.1.0.
