purvik6062

Ca File Processor

Author: purvik6062 · GitHub · v1.0.3 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 141
Favorites: 0
Current installs: 0
Versions: 4
Install in OpenClaw
/install ca-file-processor
Description
Process financial documents for Indian CA firms. Use when any PDF, Excel (.xlsx/.xls), CSV, JPG, or PNG file is received or uploaded — including GST returns,...
Usage instructions (SKILL.md)

CA File Processor

This skill processes the four most common file formats used by Indian CA firms and extracts structured information from them for analysis, summarisation, and answering queries.

Supported formats

  • PDF — GST returns, ITR acknowledgements, audit reports, scanned invoices (text-layer and scanned via OCR)
  • Excel (.xlsx / .xls) — Trial balance, P&L, balance sheets, payroll registers, GST workings
  • CSV — Bank statement exports (HDFC, ICICI, SBI), GSTR-2B downloads, Tally exports
  • Images (.jpg / .png) — WhatsApp invoice photos, scanned Form 16, cheque images

How to use

When a file is attached or uploaded, run the appropriate script:

python3 scripts/skill_router.py <file_path>

The router auto-detects the file type and calls the correct processor. It returns a structured JSON dict.
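
The routing step described above can be sketched roughly as follows. This is a minimal illustration that assumes detection is done by file extension; the `PROCESSORS` table and the `detect_processor` and `route` names are hypothetical and are not taken from the actual `scripts/skill_router.py`.

```python
from pathlib import Path

# Hypothetical mapping from file extension to processor name; the real
# scripts/skill_router.py may organise this differently.
PROCESSORS = {
    ".pdf": "pdf",
    ".xlsx": "excel",
    ".xls": "excel",
    ".csv": "csv",
    ".jpg": "image",
    ".png": "image",
}


def detect_processor(file_path: str) -> str:
    """Return the processor name for a file, based on its (case-insensitive) extension."""
    suffix = Path(file_path).suffix.lower()
    if suffix not in PROCESSORS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return PROCESSORS[suffix]


def route(file_path: str) -> dict:
    """Build a JSON-serialisable stub naming the file and the processor chosen for it."""
    return {"file": Path(file_path).name, "processor": detect_processor(file_path)}
```

Rejecting unknown extensions with an explicit error, rather than guessing a processor, keeps unsupported uploads from silently producing empty output.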

What to do with the output

Once the script returns output, use it to:

  1. Answer the user's question about the document
  2. Extract specific fields they asked for (GSTIN, totals, dates)
  3. Summarise the document in plain language
  4. Flag anomalies or missing information
  5. Compare figures across multiple documents

Field extraction — what gets detected automatically

For invoices and PDFs:

  • GSTIN (supplier and recipient)
  • Invoice number and date
  • Total amount / grand total
  • PAN number
  • Email and phone
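
As an illustration of how identifiers such as GSTIN and PAN might be picked out of extracted text, here is a small regex-based sketch. It checks only the published format of each identifier (a GSTIN is a 2-digit state code, the 10-character PAN, an entity code, the letter Z, and a check character); it does not verify the check character, and the `extract_ids` function is a hypothetical name, not the skill's own code.

```python
import re

# GSTIN: 2-digit state code + 10-character PAN + entity code + "Z" + check character.
GSTIN_RE = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b")
# PAN: 5 letters, 4 digits, 1 letter.
PAN_RE = re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b")


def extract_ids(text: str) -> dict:
    """Return the unique GSTIN-shaped and PAN-shaped tokens found in the text."""
    return {
        "gstins": sorted(set(GSTIN_RE.findall(text))),
        "pans": sorted(set(PAN_RE.findall(text))),
    }
```

Because of the word boundaries, the PAN embedded inside a GSTIN is not reported a second time as a standalone PAN.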

For bank statements (CSV):

  • Total debits and credits
  • Date range of transactions
  • Detected bank format
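
The bank-statement summary could be computed along these lines. This sketch uses only the standard library so that it runs anywhere, although the skill itself lists pandas among its libraries. The column names `Date`, `Debit`, and `Credit` and the `dd/mm/yyyy` date format are assumptions; real HDFC, ICICI, and SBI exports each use their own headers.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal


def _amount(cell: str) -> Decimal:
    """Parse an amount cell, treating blanks as zero and dropping thousands separators."""
    cleaned = (cell or "").replace(",", "").strip()
    return Decimal(cleaned) if cleaned else Decimal("0")


def summarise_statement(csv_text: str, date_format: str = "%d/%m/%Y") -> dict:
    """Total the debit and credit columns and report the transaction date range."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    dates = [datetime.strptime(r["Date"].strip(), date_format).date() for r in rows]
    return {
        # Amounts are returned as strings so the dict stays JSON-serialisable.
        "total_debits": str(sum((_amount(r["Debit"]) for r in rows), Decimal("0"))),
        "total_credits": str(sum((_amount(r["Credit"]) for r in rows), Decimal("0"))),
        "date_from": min(dates).isoformat() if dates else None,
        "date_to": max(dates).isoformat() if dates else None,
        "transactions": len(rows),
    }
```

`Decimal` is used instead of `float` so that rupee-and-paise totals do not pick up binary rounding errors.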

For Excel files:

  • Document type (trial balance / P&L / balance sheet / payroll / GST workings / ledger)
  • Sheet names and row counts
  • Preview of header rows
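
One plausible way to guess the document type from the header-row preview is keyword scoring, sketched below. The source does not show the skill's actual detection rules, so the keyword lists and the `classify_sheet` function are purely illustrative assumptions.

```python
# Hypothetical keyword lists; the skill's actual detection rules are not shown in the source.
DOC_TYPE_KEYWORDS = {
    "trial balance": ["trial balance", "opening balance", "closing balance"],
    "P&L": ["profit and loss", "profit & loss", "gross profit", "net profit"],
    "balance sheet": ["balance sheet", "liabilities", "assets"],
    "payroll": ["basic salary", "net pay", "employee", "provident fund"],
    "GST workings": ["cgst", "sgst", "igst", "taxable value"],
    "ledger": ["ledger", "voucher", "particulars"],
}


def classify_sheet(header_rows: list[list[str]]) -> str:
    """Return the document type whose keywords appear most often in the header cells."""
    text = " ".join(
        str(cell).lower() for row in header_rows for cell in row if cell is not None
    )
    scores = {
        doc_type: sum(keyword in text for keyword in keywords)
        for doc_type, keywords in DOC_TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

Returning `"unknown"` when nothing matches makes it clear to the caller that the type could not be determined and should be confirmed with the user.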

OCR notes

  • Text-layer PDFs are read directly (fast, accurate)
  • Scanned PDFs and images go through Tesseract OCR (English + Hindi)
  • Confidence is rated high / medium / low in the output
  • Always flag low-confidence results to the user and ask for confirmation on numeric fields
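
The confidence rating and the low-confidence check could look something like the sketch below. It assumes per-word confidences on a 0 to 100 scale, as Tesseract reports them; the thresholds (85 and 60) and both function names are hypothetical, since the source does not say how the skill derives its rating.

```python
import re


def rate_confidence(word_confidences: list[float]) -> str:
    """Bucket the mean per-word OCR confidence (0-100) into high / medium / low."""
    # Tesseract reports -1 for boxes that contain no recognised word; skip those.
    valid = [c for c in word_confidences if c >= 0]
    if not valid:
        return "low"
    mean = sum(valid) / len(valid)
    if mean >= 85:  # assumed threshold
        return "high"
    if mean >= 60:  # assumed threshold
        return "medium"
    return "low"


def fields_to_confirm(fields: dict, confidence: str) -> list[str]:
    """List the fields containing digits that the user should confirm when confidence is low."""
    if confidence != "low":
        return []
    return [name for name, value in fields.items() if re.search(r"\d", str(value))]
```

For example, a low-confidence invoice photo with a total and an invoice date would yield both field names, which can then be read back to the user for confirmation.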

Trust statement

This skill runs entirely locally on your server. No data is sent to any external service. All processing happens via open-source Python libraries (PyMuPDF, pytesseract, openpyxl, pandas).

Security recommendations
This skill appears coherent and operates locally, but take standard precautions before installing or using it:

  1. Install system dependencies (tesseract, poppler) and pip packages in an isolated environment (virtualenv or container).
  2. Review and upgrade pinned dependencies for known vulnerabilities.
  3. Test on non-sensitive sample files first to confirm behavior.
  4. Because it processes sensitive financial documents, run it on a trusted machine or inside a restricted environment to avoid accidental data exposure.
  5. The skill returns extracted text and fields, so ensure downstream handling (LLM, logs) is secure and that you do not inadvertently forward sensitive data to external services.
Capability assessment
Purpose & Capability
Name, description, and included scripts (router, pdf, image, excel, csv) align with a local CA document processing skill. Required binaries (python3, tesseract) and Python libraries match the declared functionality (OCR and PDF/Excel/CSV parsing).
Instruction Scope
SKILL.md and the scripts only reference local file processing, reading the provided file path and returning structured JSON. There are no instructions to read unrelated system files, environment secrets, or to send data to external endpoints.
Install Mechanism
No automated install spec is provided (instruction-only), but a requirements.txt and system dependency notes are included. This is reasonable for a local Python skill; user must manually install pip deps and system packages (tesseract, poppler). Pinning of specific package versions is normal but should be reviewed for known CVEs before deployment.
Credentials
The skill requests no environment variables or credentials. It only needs local binaries (tesseract) and reads files provided to it. There are no unexpected secret access patterns.
Persistence & Privilege
The skill sets always:false and uses default invocation settings. It does not attempt to modify other skills or system-wide configs. It runs on demand against supplied files.
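How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install ca-file-processor
  3. Once installed, invoke the skill by name or trigger it with /ca-file-processor
  4. Provide the inputs required by the skill's parameter documentation to get structured output
Version history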
v1.0.3
Update skill.md
v1.0.2
Change format
v1.0.1
Change version
v1.0.0
Initial release of CA File Processor.

  • Supports automated processing of PDF, Excel (.xlsx/.xls), CSV, JPG, and PNG files commonly used by Indian CA firms.
  • Extracts key fields (GSTIN, invoice number, totals, dates, etc.) and tables from documents.
  • Auto-detects file type and routes to the correct extraction script.
  • Includes OCR support for scanned PDFs and images (English + Hindi).
  • Outputs structured JSON for easy analysis, summarisation, and answering user queries.
  • All processing is done locally for privacy; no data is sent externally.
Metadata
Slug ca-file-processor
Version 1.0.3
License MIT-0
Total installs 0
Current installs 0
Version count 4
FAQ

What is Ca File Processor?

It is an AI Agent Skill plugin for Claude Code / OpenClaw that processes financial documents for Indian CA firms. It has 141 total downloads to date.

How do I install Ca File Processor?

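In the OpenClaw or Claude Code chat box, run the command "/install ca-file-processor" to install the skill. The Python packages in requirements.txt and the system packages (tesseract, poppler) still have to be installed manually.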

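Is Ca File Processor free?

Yes. Ca File Processor is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.

Which platforms does Ca File Processor support?

Ca File Processor is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Ca File Processor?

It is developed and maintained by purvik6062 (@purvik6062). The current version is v1.0.3.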
