meetpaladiya44

Tally Extractor

by Meet Paladiya · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ pending
48 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install tally-extractor
Description
Instance A skill for TallyPrime (Extractor). Parses invoices/bills from PDF/image via Telegram/WhatsApp, extracts structured data (party, GSTIN, items, taxes...
README (SKILL.md)

# Tally Extractor Skill

Extract invoice/bill data from PDFs and images received via Telegram or WhatsApp, convert to a canonical JSON payload, and POST to the bridge service on Instance B for Tally entry.

- **No direct Tally access**: This skill does NOT connect to TallyPrime. Voucher posting is handled by `tally-skill` on Instance B.
- **Telegram/WhatsApp interface**: User sends PDF/image → this skill extracts → bridge → Tally.
- **Structured output**: Always emit the canonical JSON schema (see `reference/extraction-schema.md`).

## Hero Use Case: Telegram invoice → Tally entry

Goal: zero manual entry for CAs handling many clients.

1. User sends invoice PDF/image via Telegram.
2. This skill extracts: company, party, GSTIN, date, invoice no, items (description, HSN, qty, rate, tax), total.
3. Validate extraction: tax math check, GSTIN format, required fields present.
4. POST canonical JSON to bridge (`/v1/post-voucher`).
5. Receive result from bridge and reply to user in accountant-friendly language.

## When to use this skill

Scope note: This skill is for Instance A (Extractor). It does not post to Tally; that responsibility belongs to `tally-skill` on Instance B.

Use when:

- User sends a PDF or image of an invoice/bill via Telegram/WhatsApp
- User requests to extract data from a document
- User asks to "add this invoice to Tally" (extract + forward to bridge)
- User provides invoice details as a text message (parse and forward)

Do NOT use this skill for:

- Direct Tally operations (reports, ledger queries): those go to Instance B
- PDF generation (`tallyca` runs on Instance B)
- Master management (ledgers, groups): that's on Instance B

## Critical rules (must follow)

1. **Never hallucinate data**: If a field is unclear or missing from the document, mark it as low-confidence or ask the user. Do not invent GSTINs, HSN codes, or amounts.
2. **Validate GSTIN format**: 15 characters, pattern `^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$`. If invalid, flag it.
3. **Validate HSN format**: 4-8 digits. If invalid, flag it.
4. **Tax math check**: `taxable_amount × tax_rate / 100 ≈ tax_amount` (within ±0.01). If mismatch, flag it.
5. **Total check**: Sum of item amounts + taxes = total (within ±1 for rounding). If mismatch, flag it.
6. **Place of Supply**: Derive from the first 2 digits of the party GSTIN (state code). Map to state name using the state code table in `reference/extraction-schema.md`.
7. **Inter-state vs Intra-state**: If company GSTIN state ≠ party GSTIN state → IGST. If same state → CGST + SGST.
8. **Dates**: Extract as `YYYY-MM-DD`. Common formats on invoices: `DD/MM/YYYY`, `DD-MM-YYYY`, `DD.MM.YYYY`, `MMM DD, YYYY`.
9. **Idempotency key**: Generate a deterministic key using the pattern `{companyShort}-{voucherType}-{partyShort}-{invoiceNumber}-{date}`, with the date written as `YYYYMMDD` (for example `abc-purchase-xyz-186-20260518`).
10. **Confidence scores**: Assign a confidence (0.0-1.0) to each extracted field. Flag fields with confidence < 0.7 for user confirmation.
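
The two format rules above (GSTIN and HSN) can be sketched as plain regex checks. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, and the GSTIN check is format-only (it does not verify the check character or that the number is registered).

```python
import re

# Pattern copied from the GSTIN rule; HSN is 4-8 digits per the HSN rule.
GSTIN_RE = re.compile(r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$")
HSN_RE = re.compile(r"^[0-9]{4,8}$")


def is_valid_gstin(gstin: str) -> bool:
    """Format check only (hypothetical helper)."""
    return bool(GSTIN_RE.fullmatch(gstin.strip()))


def is_valid_hsn(hsn: str) -> bool:
    """True when the code is 4 to 8 digits (hypothetical helper)."""
    return bool(HSN_RE.fullmatch(hsn.strip()))
```

A value that fails either check would be flagged for the user rather than corrected automatically, in line with the "never hallucinate" rule.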

## Extraction workflow

### Step 1: Receive document

User sends PDF or image via Telegram. The document is available for OCR/vision processing.

### Step 2: Check bridge health

Before extraction, verify Instance B is reachable:

```bash
curl -s -X GET \
  -H "Authorization: Bearer $BRIDGE_BEARER" \
  "$BRIDGE_URL/v1/health"
```

Expected response:

```json
{
  "tally": "ok",
  "company_default": "ABC Company",
  "version": "1.0.0"
}
```

If `tally: "down"`, inform user: "Tally is not running on the client machine. Please ask them to open TallyPrime."

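
As a rough sketch of how the health response might be turned into a go/no-go decision: the function name and the wording of the "healthy" and "unreadable" messages are assumptions, while the `tally` and `company_default` fields and the "not running" message come from the text above.

```python
import json
from typing import Tuple

TALLY_DOWN_MESSAGE = (
    "Tally is not running on the client machine. "
    "Please ask them to open TallyPrime."
)


def health_message(response_body: str) -> Tuple[bool, str]:
    """Return (ok, user_message) for a /v1/health response body."""
    try:
        health = json.loads(response_body)
    except json.JSONDecodeError:
        # Assumed fallback wording; not specified by the bridge contract.
        return False, "The bridge returned an unreadable health response."
    if isinstance(health, dict) and health.get("tally") == "ok":
        company = health.get("company_default", "unknown")
        return True, f"Bridge is healthy (default company: {company})."
    return False, TALLY_DOWN_MESSAGE
```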
### Step 3: Extract data

Parse the document and extract:

| Field | Source | Validation |
|---|---|---|
| Party name | Invoice header | Required |
| Party GSTIN | Near party name or GST section | 15-char format |
| Company GSTIN | Near company name or letterhead | 15-char format |
| Invoice number | Invoice header | Required |
| Invoice date | Invoice header | Parse to `YYYY-MM-DD` |
| Items | Line items table | At least one |
| Item description | Line item | Required per item |
| HSN/SAC | Line item | 4-8 digits |
| Quantity | Line item | Numeric |
| Unit | Line item | Optional |
| Rate | Line item | Numeric |
| Tax rate | Line item or GST section | Percentage |
| CGST/SGST/IGST amounts | GST section | Numeric |
| Total | Invoice footer | Required, must balance |
| Narration | Notes section | Optional |

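
The "Parse to `YYYY-MM-DD`" requirement can be illustrated with the standard library. This sketch tries the four invoice date formats named in the critical rules; the helper name is hypothetical, and `%b` month names depend on the process locale (English is assumed here).

```python
from datetime import datetime
from typing import Optional

# DD/MM/YYYY, DD-MM-YYYY, DD.MM.YYYY, MMM DD, YYYY
INVOICE_DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y", "%b %d, %Y")


def parse_invoice_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    text = raw.strip()
    for fmt in INVOICE_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None
```

A `None` result corresponds to the "Date parseable" check failing, at which point the user would be asked for the correct date.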
### Step 4: Determine voucher type

| Document type | Voucher type |
|---|---|
| Purchase invoice (we are buyer) | `Purchase` |
| Sales invoice (we are seller) | `Sales` |
| Payment receipt | `Receipt` |
| Payment voucher | `Payment` |
| Credit note | `CreditNote` |
| Debit note | `DebitNote` |

Clues:
- "Tax Invoice" with company as seller → Sales
- "Tax Invoice" with company as buyer → Purchase
- "Receipt" or "Payment Received" → Receipt
- "Credit Note" in header → CreditNote

If ambiguous, ask user: "Is this a purchase (you bought) or sales (you sold)?"

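
The clues above could be applied roughly as follows. This is only a sketch: `guess_voucher_type` and its `company_role` argument are hypothetical, the "Debit Note" match is inferred from the table rather than the clue list, and a `None` result means the user should be asked.

```python
from typing import Optional


def guess_voucher_type(header: str, company_role: Optional[str] = None) -> Optional[str]:
    """Map a document header to a voucher type, or None when ambiguous.

    company_role is "seller" or "buyer" when known (assumed input).
    """
    title = header.lower()
    if "credit note" in title:
        return "CreditNote"
    if "debit note" in title:
        return "DebitNote"
    if "payment received" in title or "receipt" in title:
        return "Receipt"
    if "tax invoice" in title:
        if company_role == "seller":
            return "Sales"
        if company_role == "buyer":
            return "Purchase"
    return None  # ambiguous: ask the user
```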
### Step 5: Validate extraction

Run these checks before sending to bridge:

| Check | Rule | Action if fails |
|---|---|---|
| Required fields | All required fields present | List missing fields, ask user |
| GSTIN format | 15-char regex match | Flag invalid, ask user |
| HSN format | 4-8 digits | Flag invalid, ask user |
| Tax math | taxable × rate / 100 ≈ tax | Flag mismatch, ask user |
| Total balance | items + taxes = total (±1) | Flag mismatch, ask user |
| Date parseable | Valid date | Ask user for correct date |

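
The two arithmetic checks in the table can be sketched with `decimal.Decimal`, which avoids binary floating-point surprises on currency values. Function names are hypothetical; the tolerances (±0.01 for tax, ±1 for the total) are the ones stated in this document.

```python
from decimal import Decimal
from typing import Iterable


def tax_matches(taxable: str, rate_percent: str, tax_amount: str,
                tolerance: str = "0.01") -> bool:
    """taxable × rate / 100 ≈ tax, within the given tolerance."""
    expected = Decimal(taxable) * Decimal(rate_percent) / Decimal(100)
    return abs(expected - Decimal(tax_amount)) <= Decimal(tolerance)


def total_balances(item_amounts: Iterable[str], tax_amounts: Iterable[str],
                   total: str, tolerance: str = "1") -> bool:
    """Sum of item amounts + taxes = total, within the rounding tolerance."""
    computed = (sum(map(Decimal, item_amounts), Decimal(0))
                + sum(map(Decimal, tax_amounts), Decimal(0)))
    return abs(computed - Decimal(total)) <= Decimal(tolerance)
```

With the figures used in the sample reply later in this document (taxable 39,152.40 at 18%, tax 7,047.43, total 46,199.83), both checks pass.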
### Step 6: Build canonical JSON

Construct the JSON payload per `reference/extraction-schema.md`:

```json
{
  "schema_version": "1.0",
  "request_id": "uuid-v4",
  "idempotency_key": "abc-purchase-xyz-186-20260518",
  "company": "ABC Company",
  "voucher": {
    "type": "Purchase",
    "date": "2026-05-18",
    "number": "186",
    "is_invoice_mode": true,
    "voucher_class": null,
    "narration": "Against Invoice 186",
    "party": {
      "name": "XYZ Party",
      "gstin": "27AABCU9603R1ZM",
      "place_of_supply": "Maharashtra",
      "registration_type": "Regular"
    },
    "company_gstin": "27AABCU9603R1ZN",
    "items": [...],
    "taxes": {...},
    "total": 46199.83,
    "bill_allocations": [...]
  },
  "source": {
    "kind": "pdf",
    "filename": "invoice_186.pdf",
    "extracted_at": "2026-05-18T10:30:00Z"
  },
  "confidence": {
    "overall": 0.93,
    "fields": {...}
  }
}
```

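
One way the `idempotency_key` in the payload might be produced is sketched below. The sample key `abc-purchase-xyz-186-20260518` appears to carry a short party segment and a `YYYYMMDD` date, so this sketch includes both; that reading is inferred from the example, and how the "short" names are derived from full names is not specified here, so they are taken as inputs.

```python
from datetime import date


def build_idempotency_key(company_short: str, voucher_type: str,
                          party_short: str, invoice_number: str,
                          voucher_date: date) -> str:
    """Join the parts into a lowercase, hyphen-separated deterministic key."""
    parts = [
        company_short,
        voucher_type,
        party_short,
        invoice_number,
        voucher_date.strftime("%Y%m%d"),
    ]
    return "-".join(p.strip().lower().replace(" ", "") for p in parts)
```

Because the key depends only on the invoice's own attributes, resending the same invoice yields the same key, which is what lets the bridge detect duplicates.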
### Step 7: POST to bridge

Send the JSON to Instance B:

```bash
# Compute HMAC
BODY='{"schema_version":"1.0",...}'
SIGNATURE=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$BRIDGE_HMAC_SECRET" | cut -d' ' -f2)

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $BRIDGE_BEARER" \
  -H "X-Signature: hmac-sha256=$SIGNATURE" \
  -H "Idempotency-Key: abc-purchase-xyz-186-20260518" \
  -d "$BODY" \
  "$BRIDGE_URL/v1/post-voucher"
```

Full HTTP contract in `reference/bridge.md`.

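
For callers that are not shell scripts, the same signing step can be sketched with Python's standard `hmac` module. The header names mirror the curl command above; the function names and the compact JSON serialization are assumptions, and `reference/bridge.md` remains the authority on the exact contract. The key point is that the signature must be computed over exactly the bytes that are sent.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple


def sign_body(body: bytes, secret: str) -> str:
    """Return the X-Signature header value for the given request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"hmac-sha256={digest}"


def build_request(payload: dict, bearer: str, secret: str) -> Tuple[bytes, Dict[str, str]]:
    """Serialize once, then sign those exact bytes (hypothetical helper)."""
    body = json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {bearer}",
        "X-Signature": sign_body(body, secret),
        "Idempotency-Key": payload["idempotency_key"],
    }
    return body, headers
```

The returned `body` and `headers` can be handed to any HTTP client; re-serializing the payload after signing would risk a signature mismatch.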
### Step 8: Handle response

#### Success

```json
{
  "status": "posted",
  "guid": "abc-purchase-xyz-186-20260518",
  "voucher_number": "186",
  "company": "ABC Company",
  "summary": "Purchase voucher posted: XYZ Party, ₹46,199.83",
  "masters_created": ["XYZ Party"]
}
```

Reply to user (see `reference/prompts.md`):

> Entry posted to Tally.
>
> **Company:** ABC Company  
> **Type:** Purchase  
> **Party:** XYZ Party  
> **Invoice No:** 186  
> **Date:** 18 May 2026  
> **Amount:** ₹46,199.83 (Taxable: ₹39,152.40 + IGST: ₹7,047.43)
>
> New ledger created: XYZ Party

#### Needs Clarification

```json
{
  "status": "needs_clarification",
  "missing_fields": ["voucher.voucher_class"],
  "message": "Please confirm the voucher class name (e.g., 'Purchase @ 18 %')."
}
```

Forward the question to the user.

#### Error

```json
{
  "status": "error",
  "error_code": "TALLY_UNREACHABLE",
  "message": "Could not connect to Tally."
}
```

Reply to user: "Tally is not responding. Please check that TallyPrime is open and try again."

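
The three response shapes above could be dispatched along these lines. This is a simplified sketch: `reply_for` is a hypothetical name, the success reply is shortened to the bridge's own `summary` rather than the full template in `reference/prompts.md`, and the fallback wording for unknown errors is an assumption.

```python
def reply_for(response: dict) -> str:
    """Turn a bridge response into a user-facing message."""
    status = response.get("status")
    if status == "posted":
        lines = ["Entry posted to Tally.", response.get("summary", "")]
        created = response.get("masters_created") or []
        if created:
            lines.append("New ledger created: " + ", ".join(created))
        return "\n".join(line for line in lines if line)
    if status == "needs_clarification":
        # Forward the bridge's question unchanged.
        return response.get("message", "The bridge needs more information.")
    if response.get("error_code") == "TALLY_UNREACHABLE":
        return "Tally is not responding. Please check that TallyPrime is open and try again."
    return "Could not post the entry: " + response.get("message", "unknown error")
```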
## Company name handling

The company name must match exactly in TallyPrime. Strategies:

1. **Explicit in document**: Extract from invoice letterhead
2. **User specified**: User may say "Add to ABC Company"
3. **Default from bridge**: `/v1/health` returns `company_default`
4. **Ask user**: If unclear, ask "Which Tally company should this entry go to?"

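
If the four strategies are read as a priority order (an assumption; the list does not say so explicitly), the fallback chain might look like this. `resolve_company` is a hypothetical helper, and a `None` result means the user should be asked.

```python
from typing import Optional


def resolve_company(from_document: Optional[str] = None,
                    from_user: Optional[str] = None,
                    bridge_default: Optional[str] = None) -> Optional[str]:
    """Return the first non-empty candidate, or None to signal 'ask the user'."""
    for candidate in (from_document, from_user, bridge_default):
        if candidate and candidate.strip():
            return candidate.strip()
    return None
```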
## Voucher class handling

Some Tally companies use voucher classes for automatic GST splitting. This skill does NOT know which class to use; that's Instance B's job.

- If user specifies class in message (e.g., "use Sales @ 18 %"), include in JSON
- If not specified, leave `voucher_class: null` and let Instance B handle it
- If Instance B returns `needs_clarification` for class, forward to user

## Confidence scoring

Assign confidence scores based on:

| Extraction quality | Confidence |
|---|---|
| Clear text, high contrast, exact match | 0.95 - 1.0 |
| Readable but some ambiguity | 0.7 - 0.94 |
| Blurry, low contrast, guessed | 0.4 - 0.69 |
| Highly uncertain | 0.0 - 0.39 |

Fields with confidence < 0.7 should be flagged for user confirmation before posting.

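
The 0.7 threshold can be applied to the per-field scores with a one-line filter. The field names in the usage example are hypothetical, since the schema's `confidence.fields` contents are not shown in this document.

```python
from typing import Dict, List


def fields_needing_confirmation(field_confidence: Dict[str, float],
                                threshold: float = 0.7) -> List[str]:
    """Return the names of fields scored strictly below the threshold, sorted."""
    return sorted(name for name, score in field_confidence.items() if score < threshold)
```

For example, `fields_needing_confirmation({"party.gstin": 0.65, "voucher.number": 0.98})` returns `["party.gstin"]`, so only the GSTIN would be sent back to the user for confirmation.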
## State code to state name mapping

First 2 digits of GSTIN → Place of Supply:

| Code | State |
|---|---|
| 01 | Jammu and Kashmir |
| 02 | Himachal Pradesh |
| 03 | Punjab |
| 04 | Chandigarh |
| 05 | Uttarakhand |
| 06 | Haryana |
| 07 | Delhi |
| 08 | Rajasthan |
| 09 | Uttar Pradesh |
| 10 | Bihar |
| 11 | Sikkim |
| 12 | Arunachal Pradesh |
| 13 | Nagaland |
| 14 | Manipur |
| 15 | Mizoram |
| 16 | Tripura |
| 17 | Meghalaya |
| 18 | Assam |
| 19 | West Bengal |
| 20 | Jharkhand |
| 21 | Odisha |
| 22 | Chhattisgarh |
| 23 | Madhya Pradesh |
| 24 | Gujarat |
| 25 | Daman and Diu (Old) |
| 26 | Dadra and Nagar Haveli and Daman and Diu |
| 27 | Maharashtra |
| 28 | Andhra Pradesh (Old) |
| 29 | Karnataka |
| 30 | Goa |
| 31 | Lakshadweep |
| 32 | Kerala |
| 33 | Tamil Nadu |
| 34 | Puducherry |
| 35 | Andaman and Nicobar Islands |
| 36 | Telangana |
| 37 | Andhra Pradesh |
| 38 | Ladakh |

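
A lookup over this table, combined with the inter-state rule (different state codes → IGST, same → CGST + SGST), might be sketched as below. Only three rows of the table are reproduced to keep the example short; the function names are hypothetical.

```python
from typing import List, Optional

# Excerpt of the state code table above; a real lookup would include every row.
STATE_CODES = {"24": "Gujarat", "27": "Maharashtra", "29": "Karnataka"}


def place_of_supply(party_gstin: str) -> Optional[str]:
    """Map the first two GSTIN digits to a state name, or None if unknown."""
    return STATE_CODES.get(party_gstin[:2])


def tax_components(company_gstin: str, party_gstin: str) -> List[str]:
    """IGST when the state codes differ, otherwise CGST + SGST."""
    if company_gstin[:2] != party_gstin[:2]:
        return ["IGST"]
    return ["CGST", "SGST"]
```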
## Deployment topology

There is exactly **one chat interface** (Telegram on Instance A). Instance B has no Telegram/WhatsApp, only the bridge HTTP endpoint.

```mermaid
flowchart LR
  User["CA / client on Telegram"] -->|"PDF / image"| TgBot["Telegram bot (Instance A)"]
  subgraph A["Instance A on dev EC2"]
    TgBot --> OpenClawA["OpenClaw\nLLM: Codex Plus session"]
    OpenClawA --> ExtractorSkill["tally-extractor-skill"]
  end
  OpenClawA -->|"HTTPS POST /v1/post-voucher"| Bridge["bridge-service\n(client box)"]
  subgraph B["Instance B on client CPU"]
    Bridge --> OpenClawB["OpenClaw\nLLM: OpenAI API key"]
    OpenClawB --> TallySkill["tally-skill"]
    TallySkill --> Tally["TallyPrime\nlocalhost:9000"]
  end
  Bridge --> OpenClawA
  OpenClawA --> User
```

| Environment | Instance A | Instance B | Tally access |
|---|---|---|---|
| **Production** | Your EC2 (Telegram, Codex Plus) | Client mini-PC beside Tally | B → `http://localhost:9000` (no ngrok for Tally) |
| **Dev / testing** | Same Ubuntu EC2 | Second OpenClaw on same EC2 | B uses ngrok URL to remote Tally |

Instance A reaches B via an inbound tunnel on the **client box** (`ngrok http 8787` or Cloudflare Tunnel). Share `BRIDGE_URL`, `BRIDGE_BEARER`, and `BRIDGE_HMAC_SECRET` with the team running A.

## Instance A configuration checklist

| Step | Setting | Value |
|---|---|---|
| 1 | Host | Ubuntu EC2 (or dev server) |
| 2 | Install | Node.js, OpenClaw |
| 3 | LLM | OpenAI **ChatGPT Codex Plus** session (subscription) |
| 4 | Skill loaded | `tally-extractor-skill/` only; **do not** load `tally-skill/` |
| 5 | Channel | Telegram bot token on A only |
| 6 | `BRIDGE_URL` | `https://<client-tunnel>.ngrok.app` (no trailing slash) |
| 7 | `BRIDGE_BEARER` | Shared secret from client team |
| 8 | `BRIDGE_HMAC_SECRET` | Shared HMAC secret from client team |
| 9 | **Do not set** | `TALLY_URL` on Instance A |
| 10 | Preflight | `curl -H "Authorization: Bearer $BRIDGE_BEARER" $BRIDGE_URL/v1/health` → `tally: ok` |

## Reference files

- **Extraction schema**: `reference/extraction-schema.md` (full JSON schema, examples)
- **Bridge HTTP contract**: `reference/bridge.md` (endpoints, auth, errors)
- **Reply templates**: `reference/prompts.md` (user-facing messages)

Capability Tags
crypto · can-make-purchases · requires-sensitive-credentials
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install tally-extractor
  3. After installation, invoke the skill by name or use /tally-extractor
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release
Metadata
Slug tally-extractor
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Tally Extractor?

Instance A skill for TallyPrime (Extractor). Parses invoices/bills from PDF/image via Telegram/WhatsApp, extracts structured data (party, GSTIN, items, taxes... It is an AI Agent Skill for Claude Code / OpenClaw, with 48 downloads so far.

How do I install Tally Extractor?

Run "/install tally-extractor" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Tally Extractor free?

Yes, Tally Extractor is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Tally Extractor support?

Tally Extractor is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Tally Extractor?

It is built and maintained by Meet Paladiya (@meetpaladiya44); the current version is v1.0.0.
