Description

Scan photographed documents into searchable PDFs with OCR and stable file naming. Use when the user sends document photos and asks to scan, save, archive, OC...

README (SKILL.md)

Daily Scan

Name: daily-scan
Author: kimjoohyeon-wq

Overview

Turn phone photos of documents into searchable PDFs with OCR and stable filenames based on capture date and headline text. Preserve the original photo, generate a readable scan-like PDF, and support later retrieval of saved scan files.

Runtime Requirements

Default OCR path: local Tesseract CLI
Required local dependencies for the stable path:
- tesseract
- Python packages used by bundled scripts: opencv-python (imported as cv2), Pillow, and either reportlab or ocrmypdf depending on the active PDF path
Optional/experimental OCR path:
- PaddleOCR-based script exists but is not the default stable engine
No cloud upload is required for core operation
The skill assumes bundled helper scripts under scripts/ are present and callable by the host agent
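A host can verify these requirements before the first scan. The following is a minimal sketch, not part of the bundled scripts: `check_dependencies` is a hypothetical helper, and the default module names (`cv2` for opencv-python, `PIL` for Pillow) are the usual import names for those packages.

```python
import importlib.util
import shutil


def check_dependencies(binaries=("tesseract",), modules=("cv2", "PIL")):
    """Return the names of required binaries and modules that are missing."""
    missing = []
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(name) is None:
            missing.append(name)
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing
```

An empty list means the stable path's dependencies were found. Add `reportlab` or `ocrmypdf` to `modules` depending on which PDF path is active.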

Workflow

1. Confirm the trigger.
   - 스캔 / scan: one-page processing
   - 스캔연속 / scan multi: combine multiple photos into one searchable PDF
   - 스캔찾아 / scan find: search previously saved scan files
2. Collect attached image files or search keywords.
3. Apply document-style cleanup when possible.
   - straighten or rotate when needed
   - improve contrast for readability
   - keep output practical rather than over-processed
4. Run OCR in Korean and English.
5. Build the filename as:
   - YYYY-MM-DD + headline text
   - derive headline text from the top 2 to 3 OCR lines
6. Create a searchable PDF.
7. Save output to the local storage destination.
8. Keep the original image with the processed result.
9. For retrieval requests, search by date, headline text, or OCR keyword in the configured scan storage path.
10. Return:
   - filename
   - save location
   - OCR title line
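The filename step of the workflow can be sketched as follows. This is an illustration under assumptions, not the bundled implementation: `build_scan_filename` is a hypothetical name, and the space separator, the character filter, and the 60-character cap are choices made here because the source only specifies "YYYY-MM-DD + headline text".

```python
import re
from datetime import date


def build_scan_filename(capture_date, ocr_lines, max_lines=3, max_length=60):
    """Build 'YYYY-MM-DD headline.pdf' from the first non-empty OCR lines."""
    # Keep only the first few non-blank lines as the headline candidate.
    top = [line.strip() for line in ocr_lines if line.strip()][:max_lines]
    headline = " ".join(top)
    # Replace characters that are unsafe in file names on common file systems.
    headline = re.sub(r'[\\/:*?"<>|\x00-\x1f]', " ", headline)
    # Collapse whitespace runs and trim to a manageable length.
    headline = re.sub(r"\s+", " ", headline).strip()[:max_length].strip()
    if not headline:
        # The skill asks the user for a title in this case; signal the caller.
        raise ValueError("no headline text could be extracted")
    return f"{capture_date.isoformat()} {headline}.pdf"
```

For example, `build_scan_filename(date(2024, 3, 5), ["Invoice: March", "ACME/Corp"])` yields `2024-03-05 Invoice March ACME Corp.pdf`. Korean headline text passes through unchanged because only path-unsafe characters are replaced.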

Storage Rules

Default local staging/search path: daily-scan-storage/YYYY-MM
This skill is designed for local scan creation and retrieval only
Use year/month folder structure
Do not auto-classify document types
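As a sketch of the year/month layout, the helpers below derive the `daily-scan-storage/YYYY-MM` folder from a capture date. The function names are hypothetical, and treating the root as relative to the working directory is an assumption.

```python
from datetime import date
from pathlib import Path


def month_folder(capture_date, root="daily-scan-storage"):
    """Return root/YYYY-MM for the given capture date."""
    return Path(root) / f"{capture_date.year:04d}-{capture_date.month:02d}"


def ensure_month_folder(capture_date, root="daily-scan-storage"):
    """Create the month folder if needed and return it."""
    folder = month_folder(capture_date, root)
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

Keeping path computation separate from directory creation lets retrieval code reuse `month_folder` without touching the disk.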

Operating Rules

For multi-page capture, combine pages into one PDF only when the trigger is 스캔연속 or scan multi
OCR language defaults to Korean plus English
Retrieval requests should search existing saved scan outputs before asking follow-up questions
Keep replies concise
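Retrieval by date, headline, or OCR keyword might look like the sketch below. It is not the bundled search_scans.py: the function names are illustrative, and the `.txt` sidecar holding OCR text next to each PDF is a hypothetical convention introduced so that keyword search has something to read.

```python
from pathlib import Path


def matches(name, text, date_prefix=None, keyword=None):
    """Return True when a scan's file name or OCR text satisfies the query."""
    if date_prefix and not name.startswith(date_prefix):
        return False
    if keyword:
        needle = keyword.casefold()
        return needle in name.casefold() or needle in text.casefold()
    return True


def search_scans(root="daily-scan-storage", date_prefix=None, keyword=None):
    """List saved PDFs under root that match the date prefix and/or keyword."""
    root = Path(root)
    if not root.is_dir():
        return []
    hits = []
    for pdf in sorted(root.rglob("*.pdf")):
        # Hypothetical sidecar: OCR text stored next to the PDF as <name>.txt.
        sidecar = pdf.with_suffix(".txt")
        text = sidecar.read_text(encoding="utf-8") if sidecar.is_file() else ""
        if matches(pdf.name, text, date_prefix, keyword):
            hits.append(pdf)
    return hits
```

Because filenames start with the capture date, a prefix such as `2024-03` narrows the search to one month without parsing dates.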

Failure Handling

If OCR fails, still save the PDF when possible and explicitly report that OCR failed
If headline extraction fails, ask the user what title to use
Preserve the original image unless the user later asks otherwise
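These rules can be expressed as a small control-flow sketch. Everything here is hypothetical: `run_ocr`, `build_pdf`, and `ask_title` stand in for whatever the host agent and bundled scripts actually provide, and are passed in as callables so the failure logic is visible on its own.

```python
def process_scan(image_path, run_ocr, build_pdf, ask_title):
    """Apply the failure rules using caller-supplied (hypothetical) helpers."""
    ocr_failed = False
    try:
        lines = run_ocr(image_path)
    except Exception:
        # OCR failure must not block saving; continue with an image-only PDF.
        lines, ocr_failed = [], True
    headline = " ".join(line.strip() for line in lines[:3] if line.strip())
    if not headline:
        # No usable headline: ask the user which title to use.
        headline = ask_title()
    pdf_path = build_pdf(image_path, headline, searchable=not ocr_failed)
    # The original image path is returned untouched so it can be preserved.
    return {
        "pdf": pdf_path,
        "title": headline,
        "ocr_failed": ocr_failed,
        "original": image_path,
    }
```

The `ocr_failed` flag gives the caller what it needs to report the failure explicitly while still returning a saved PDF.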

Current Limits

Korean searchable PDF quality depends on OCR engine quality and PDF text-layer handling
The Tesseract path is the current stable default
The PaddleOCR path is experimental and should not be treated as the default engine
This skill does not require external upload tools or cloud credentials

Output Contract

Return only the practical result:

saved filename
save location
extracted title line when available
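One way to keep replies within this contract is a small result type that formats only these three fields. `ScanResult` and its field labels are assumptions for illustration; the source does not define a reply format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanResult:
    filename: str
    location: str
    title: Optional[str] = None

    def to_reply(self):
        """Format the contract fields, omitting the title when it is absent."""
        lines = [f"filename: {self.filename}", f"location: {self.location}"]
        if self.title:
            lines.append(f"title: {self.title}")
        return "\n".join(lines)
```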

Resources

scripts/

Bundled scripts are used for:

image cleanup
OCR execution
searchable PDF generation
saved scan retrieval

references/

Store implementation notes for OCR engine choice and filename normalization if the skill grows more complex.

Usage Guidance

This skill appears to do what it claims: local image preprocessing, OCR, PDF creation, and local search. Before installing or using it:
1) Ensure tesseract and either ocrmypdf or reportlab, plus their dependencies, are installed on the host (ocrmypdf may require ghostscript).
2) Be aware that the experimental PaddleOCR path can automatically download model weights (outbound network activity). If you require strict offline operation, leave that path disabled, sandbox it, or pre-install its models offline.
3) Validate the storage location and file permissions (daily-scan-storage) to avoid accidentally exposing sensitive images.
4) Test in a safe environment first to confirm dependencies and behavior.

Capability Analysis

Type: OpenClaw Skill
Name: daily-scan
Version: 1.0.4

The skill is designed to convert document photos into searchable PDFs using local OCR engines (Tesseract or PaddleOCR). The bundled Python scripts (build_searchable_pdf.py, build_searchable_pdf_paddle.py, and search_scans.py) perform image enhancement, OCR execution via subprocess calls, and local file management. No data exfiltration was found, and the default Tesseract path involves no network activity; the experimental PaddleOCR path may download model weights on first use. The instructions in SKILL.md are consistent with the code's functionality and do not contain any prompt injection attempts or malicious directives.

Capability Assessment

✓ Purpose & Capability

Name/description describe turning photos into searchable PDFs. Bundled scripts perform image cleanup, OCR (Tesseract or PaddleOCR), PDF assembly, and local search; the requested local binaries and libraries (tesseract, ocrmypdf or reportlab, OpenCV, Pillow) are consistent with that purpose.

ℹ Instruction Scope

SKILL.md limits actions to local processing, saving, and searching. The scripts follow that: they read local image files, write PDFs into daily-scan-storage, and run local OCR. Notes: build_searchable_pdf.py calls external CLIs (tesseract, ocrmypdf) via subprocess (arguments passed as lists, avoiding shell interpolation). The PaddleOCR path may implicitly download model weights at runtime (not explicitly declared in SKILL.md) — this could cause outbound network activity on first use.
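To illustrate why list-style arguments avoid shell interpolation, here is a hedged sketch of a Tesseract call for Korean plus English. It is not taken from build_searchable_pdf.py; the function names are hypothetical, and it assumes the `kor` and `eng` language data are installed.

```python
import subprocess


def tesseract_command(image_path, output_base, languages=("kor", "eng")):
    """Build a Tesseract CLI call writing <output_base>.pdf and <output_base>.txt."""
    # 'pdf' and 'txt' are standard Tesseract config names selecting output formats.
    return [
        "tesseract", str(image_path), str(output_base),
        "-l", "+".join(languages),
        "pdf", "txt",
    ]


def run_tesseract(image_path, output_base):
    """Run Tesseract without a shell; raises CalledProcessError on failure."""
    cmd = tesseract_command(image_path, output_base)
    # A list argument with the default shell=False passes the file name to the
    # program verbatim, so spaces, quotes, or semicolons are not interpreted.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

A file named `my scan; rm -rf.jpg` stays a single argument in this form, whereas a string passed with `shell=True` would be split and interpreted by the shell.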

✓ Install Mechanism

No install spec (instruction-only) — lower risk. The skill bundles Python scripts but does not itself download or install remote code. Host must ensure required binaries/packages are present; installing those may pull external packages (pip, system packages), but that is outside the skill bundle.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. All file I/O is local and limited to the configured storage path. No unrelated secrets or services are requested.

✓ Persistence & Privilege

always:false and no code modifies other skills or global agent settings. The skill only writes its own output under daily-scan-storage and does not request persistent elevated privileges.

Version History

v1.0.4

- Removed the optional Google Drive upload script (scripts/upload_to_drive.py).
- Updated documentation to clarify that cloud upload is not needed or supported.
- Emphasized local-only scan creation, storage, and retrieval in SKILL.md.
- Minor cleanup of language and requirements for clarity.

v1.0.3

Version 1.0.3 of "daily-scan"
- No code or documentation changes detected.
- Functionality, workflow, and usage remain unchanged from the previous version.

v1.0.2

- Clarified and expanded runtime requirements and dependencies, including required CLI tools and Python packages.
- Added explicit details about local vs. optional cloud storage and integration.
- Documented the assumption of bundled helper scripts for all main operations.
- Improved organization of storage, operating, and failure handling rules.
- Aligned trigger descriptions and workflow steps for clarity and precision.
- Updated limits and output contract sections for clearer expectations.

v1.0.1

- SKILL.md updated to add a new "Current Limits" section clarifying OCR engine defaults and PDF font handling.
- Mentioned Tesseract as the stable default OCR engine and PaddleOCR as experimental.
- No code or logic changes; documentation only.
- File was renamed from skill.md to SKILL.md for consistent naming.

v1.0.0

Initial release of daily-scan skill.
- Convert photographed documents into searchable PDFs with OCR in Korean and English.
- Support single-page and multi-page scanning, with stable filenames based on date and headline text.
- Enhance images for readability and keep original photos with scan outputs.
- Enable search and retrieval of previous scans by date, headline, or keywords.
- Save scans locally in a year/month folder structure.
- Report back with saved filename, location, and extracted title line.
- Korean and English trigger words supported: `스캔`/`scan`, `스캔연속`/`scan multi`, `스캔찾아`/`scan find`.
- The scanning engine still needs significant improvement. Contributions and help are welcome, especially around OCR accuracy, image preprocessing, and multi-page handling.

Metadata

Slug daily-scan

Version 1.0.4

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 5

Frequently Asked Questions

What is daily-scan?

Scan photographed documents into searchable PDFs with OCR and stable file naming. Use when the user sends document photos and asks to scan, save, archive, OC... It is an AI Agent Skill for Claude Code / OpenClaw, with 122 downloads so far.

How do I install daily-scan?

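Run "/install daily-scan" in the OpenClaw or Claude Code chat to install the skill in one step. The skill itself needs no further setup, but the host must already have the local dependencies listed under Runtime Requirements (tesseract and the Python packages).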

Is daily-scan free?

Yes, daily-scan is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does daily-scan support?

daily-scan is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created daily-scan?

It is built and maintained by kimjoohyeon-wq (@kimjoohyeon-wq); the current version is v1.0.4.
