purvik6062

Ca File Processor

by purvik6062 · GitHub ↗ · v1.0.3 · MIT-0
cross-platform ✓ Security Clean
141 Downloads · 0 Stars · 0 Active Installs · 4 Versions
Install in OpenClaw
/install ca-file-processor
Description
Process financial documents for Indian CA firms. Use when any PDF, Excel (.xlsx/.xls), CSV, JPG, or PNG file is received or uploaded — including GST returns,...
README (SKILL.md)

CA File Processor

This skill processes the four most common file formats used by Indian CA firms and extracts structured information from them for analysis, summarisation, and answering queries.

Supported formats

  • PDF — GST returns, ITR acknowledgements, audit reports, scanned invoices (text-layer and scanned via OCR)
  • Excel (.xlsx / .xls) — Trial balance, P&L, balance sheets, payroll registers, GST workings
  • CSV — Bank statement exports (HDFC, ICICI, SBI), GSTR-2B downloads, Tally exports
  • Images (.jpg / .png) — WhatsApp invoice photos, scanned Form 16, cheque images

How to use

When a file is attached or uploaded, run the router script on it:

python3 scripts/skill_router.py <file_path>

The router auto-detects the file type and calls the correct processor. It returns a structured JSON dict.
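
The routing step described above can be pictured with a minimal sketch. It assumes detection by file extension only; the mapping `PROCESSORS` and the functions `detect_processor` and `route` are hypothetical names for illustration, and the actual `skill_router.py` may detect types differently and return a richer result.

```python
from pathlib import Path

# Hypothetical mapping from file extension to processor name. The real
# skill_router.py may use a different mechanism (for example file signatures).
PROCESSORS = {
    ".pdf": "pdf",
    ".xlsx": "excel",
    ".xls": "excel",
    ".csv": "csv",
    ".jpg": "image",
    ".png": "image",
}


def detect_processor(file_path: str) -> str:
    """Return the processor name for a path, based on its extension only."""
    suffix = Path(file_path).suffix.lower()
    try:
        return PROCESSORS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


def route(file_path: str) -> dict:
    """Build a small JSON-serialisable dict describing the routing decision."""
    return {"file": Path(file_path).name, "processor": detect_processor(file_path)}
```

Calling `route("uploads/GSTR3B.PDF")` would select the `pdf` processor regardless of the extension's case, while an unsupported extension raises `ValueError` so the caller can report it to the user.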

What to do with the output

Once the script returns output, use it to:

  1. Answer the user's question about the document
  2. Extract specific fields they asked for (GSTIN, totals, dates)
  3. Summarise the document in plain language
  4. Flag anomalies or missing information
  5. Compare figures across multiple documents

Field extraction — what gets detected automatically

For invoices and PDFs:

  • GSTIN (supplier and recipient)
  • Invoice number and date
  • Total amount / grand total
  • PAN number
  • Email and phone
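
As an illustration of how GSTIN and PAN detection could work, here is a minimal regex-based sketch. It relies on the published formats (a GSTIN is 15 characters: a 2-digit state code, the 10-character PAN, an entity code, the letter Z, and a check character). The names `GSTIN_RE`, `PAN_RE` and `extract_ids` are hypothetical, the sample identifiers are made up, and the skill's own scripts may use different patterns or validate the check character, which this sketch does not.

```python
import re

# 2-digit state code + 10-character PAN + entity code + 'Z' + check character.
GSTIN_RE = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b")
# PAN: five letters, four digits, one letter.
PAN_RE = re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b")


def extract_ids(text: str) -> dict:
    """Find GSTIN-shaped and standalone PAN-shaped tokens in extracted text.

    The word boundaries stop PAN_RE from matching the PAN embedded inside a
    GSTIN, so only PANs that appear on their own are reported.
    """
    return {
        "gstin": GSTIN_RE.findall(text),
        "pan": PAN_RE.findall(text),
    }


# Made-up sample values, shaped like real identifiers.
sample = "Supplier GSTIN: 27ABCDE1234F1Z5, PAN: ABCDE1234F"
result = extract_ids(sample)
```

A shape match is not proof that an identifier is valid, so OCR-derived values in particular are worth confirming with the user.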

For bank statements (CSV):

  • Total debits and credits
  • Date range of transactions
  • Detected bank format
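
The bank-statement summary fields can be sketched with the standard library alone. This example assumes a hypothetical export with `Date`, `Debit` and `Credit` columns and `dd/mm/yyyy` dates; real HDFC, ICICI and SBI exports use their own column names and layouts, which the skill's CSV processor presumably handles per bank. The function names are illustrative.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal


def _amount(cell) -> Decimal:
    """Parse an amount such as '1,500.00'; empty or missing cells count as zero."""
    cleaned = (cell or "").replace(",", "").strip()
    return Decimal(cleaned) if cleaned else Decimal("0")


def summarise_statement(csv_text: str) -> dict:
    """Total the debit and credit columns and report the transaction date range."""
    total_debit = Decimal("0")
    total_credit = Decimal("0")
    dates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        dates.append(datetime.strptime(row["Date"].strip(), "%d/%m/%Y").date())
        total_debit += _amount(row.get("Debit"))
        total_credit += _amount(row.get("Credit"))
    return {
        "total_debit": str(total_debit),
        "total_credit": str(total_credit),
        "date_from": min(dates).isoformat() if dates else None,
        "date_to": max(dates).isoformat() if dates else None,
    }


# Made-up sample rows in the assumed layout.
SAMPLE = (
    "Date,Narration,Debit,Credit\n"
    '01/04/2024,UPI payment,"1,500.00",\n'
    '03/04/2024,NEFT receipt,,"25,000.50"\n'
    "05/04/2024,Cheque,250.25,\n"
)
summary = summarise_statement(SAMPLE)
```

`Decimal` is used instead of `float` so that rupee amounts are summed without binary rounding error.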

For Excel files:

  • Document type (trial balance / P&L / balance sheet / payroll / GST workings / ledger)
  • Sheet names and row counts
  • Preview of header rows

OCR notes

  • Text-layer PDFs are read directly (fast, accurate)
  • Scanned PDFs and images go through Tesseract OCR (English + Hindi)
  • Confidence is rated high / medium / low in the output
  • Always flag low-confidence results to the user and ask for confirmation on numeric fields
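
One way to arrive at a high / medium / low rating is to average per-word OCR confidences and bucket the mean. The sketch below assumes scores on a 0 to 100 scale with negative values marking non-text blocks (as in the data pytesseract can return); the thresholds 85 and 60 and the function names are hypothetical and not taken from the skill's code.

```python
# Hypothetical thresholds; the skill's actual cut-offs are not documented here.
HIGH_THRESHOLD = 85
MEDIUM_THRESHOLD = 60


def rate_confidence(word_confidences) -> str:
    """Bucket the mean per-word OCR confidence into high / medium / low."""
    # Negative scores mark non-text blocks, so they are ignored.
    scores = [c for c in word_confidences if c >= 0]
    if not scores:
        return "low"  # nothing recognised: treat as unreliable
    mean = sum(scores) / len(scores)
    if mean >= HIGH_THRESHOLD:
        return "high"
    if mean >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


def needs_confirmation(rating: str) -> bool:
    """Low-confidence results should be confirmed with the user."""
    return rating == "low"
```

With these assumed thresholds, a page whose words average 91 is rated high, while an empty or mostly unreadable scan is rated low and triggers a confirmation request for numeric fields.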

Trust statement

This skill runs entirely locally on your server. No data is sent to any external service. All processing happens via open-source Python libraries (PyMuPDF, pytesseract, openpyxl, pandas).

Usage Guidance
This skill appears coherent and operates locally, but take standard precautions before installing or using it:
  1. Install the system dependencies (tesseract, poppler) and pip packages in an isolated environment (virtualenv or container).
  2. Review the pinned dependencies for known vulnerabilities and upgrade where needed.
  3. Test on non-sensitive sample files first to confirm behavior.
  4. Because the skill processes sensitive financial documents, run it on a trusted machine or inside a restricted environment to avoid accidental data exposure.
  5. The skill returns extracted text and fields, so make sure downstream handling (LLM, logs) is secure and that sensitive data is not inadvertently forwarded to external services.
Capability Assessment
Purpose & Capability
Name, description, and included scripts (router, pdf, image, excel, csv) align with a local CA document processing skill. Required binaries (python3, tesseract) and Python libraries match the declared functionality (OCR, PDF/excel/csv parsing).
Instruction Scope
SKILL.md and the scripts only reference local file processing, reading the provided file path and returning structured JSON. There are no instructions to read unrelated system files, environment secrets, or to send data to external endpoints.
Install Mechanism
No automated install spec is provided (instruction-only), but a requirements.txt and system dependency notes are included. This is reasonable for a local Python skill; user must manually install pip deps and system packages (tesseract, poppler). Pinning of specific package versions is normal but should be reviewed for known CVEs before deployment.
Credentials
The skill requests no environment variables or credentials. It only needs local binaries (tesseract) and reads files provided to it. There are no unexpected secret access patterns.
Persistence & Privilege
The skill sets always:false and uses default invocation settings. It does not attempt to modify other skills or system-wide configs, and it runs on demand against supplied files.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ca-file-processor
  3. After installation, invoke the skill by name or use /ca-file-processor
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.3
Update skill.md
v1.0.2
Change format
v1.0.1
Change version
v1.0.0
Initial release of CA File Processor.
  • Supports automated processing of PDF, Excel (.xlsx/.xls), CSV, JPG, and PNG files commonly used by Indian CA firms.
  • Extracts key fields (GSTIN, invoice number, totals, dates, etc.) and tables from documents.
  • Auto-detects file type and routes to the correct extraction script.
  • Includes OCR support for scanned PDFs and images (English + Hindi).
  • Outputs structured JSON for easy analysis, summarisation, and answering user queries.
  • All processing is done locally for privacy; no data is sent externally.
Metadata
Slug ca-file-processor
Version 1.0.3
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 4
Frequently Asked Questions

What is Ca File Processor?

Process financial documents for Indian CA firms. Use when any PDF, Excel (.xlsx/.xls), CSV, JPG, or PNG file is received or uploaded — including GST returns,... It is an AI Agent Skill for Claude Code / OpenClaw, with 141 downloads so far.

How do I install Ca File Processor?

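Run "/install ca-file-processor" in the OpenClaw or Claude Code chat to install the skill. The Python packages listed in requirements.txt and the system packages (tesseract, poppler) are not installed automatically and must be set up manually.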

Is Ca File Processor free?

Yes, Ca File Processor is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Ca File Processor support?

Ca File Processor is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ca File Processor?

It is built and maintained by purvik6062 (@purvik6062); the current version is v1.0.3.
