
GI Excel PDF Process

Name: GI Excel PDF Process
Author: laimiaohua

by laimiaohua · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Downloads: 293

Install in OpenClaw

/install gi-excel-pdf-process

Description

Process Excel and PDF files - extract data, parse tables, generate reports. Use when working with .xlsx, .xls, .csv, .pdf files, or when the user mentions sp...

README (SKILL.md)

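Excel / PDF Processing

Process Excel and PDF files: extract data, parse tables, and generate reports. Suited to data import/export, report generation, document parsing, and similar tasks.

When to Use

The user provides or asks to process .xlsx, .xls, .csv, or .pdf files
The user mentions "spreadsheet", "Excel", "report", "PDF extraction", or "form"
Data needs to be read from a file, or a downloadable file needs to be generated

Executable scripts: scripts/excel_extract.py (Excel→CSV) and scripts/pdf_extract.py (PDF text/table extraction); dependencies are listed in scripts/requirements.txt.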

Excel Processing

Reading Excel

import pandas as pd

# Read the first sheet of the workbook
df = pd.read_excel("file.xlsx", sheet_name=0)  # 0 = first sheet

# Read a sheet by name
df = pd.read_excel("file.xlsx", sheet_name="Sheet1")

# Read a CSV file
df = pd.read_csv("file.csv", encoding="utf-8")
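When a workbook has several sheets, passing sheet_name=None to pd.read_excel returns a dict that maps each sheet name to its DataFrame. The sketch below shows one possible way to stack such a dict into a single table. The helper combine_sheets and the source_sheet column are illustrative assumptions, not part of the skill's bundled scripts, and two in-memory frames stand in for real sheets.

```python
import pandas as pd


def combine_sheets(sheets: dict[str, pd.DataFrame], label: str = "source_sheet") -> pd.DataFrame:
    """Stack per-sheet DataFrames and record which sheet each row came from."""
    frames = []
    for name, frame in sheets.items():
        tagged = frame.copy()
        tagged[label] = name  # remember the originating sheet
        frames.append(tagged)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# In real use the dict would come from:
#   sheets = pd.read_excel("file.xlsx", sheet_name=None)
# Here two small hypothetical frames stand in for two sheets.
sheets = {
    "Q1": pd.DataFrame({"item": ["a", "b"], "qty": [1, 2]}),
    "Q2": pd.DataFrame({"item": ["c"], "qty": [3]}),
}
combined = combine_sheets(sheets)
print(combined)
```

Tagging rows with their sheet name keeps the origin recoverable after stacking, which helps when sheets represent periods or categories.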

Writing Excel

# Single sheet
df.to_excel("output.xlsx", index=False)

# Multiple sheets (df1 and df2 are existing DataFrames)
with pd.ExcelWriter("output.xlsx") as writer:
    df1.to_excel(writer, sheet_name="Summary", index=False)
    df2.to_excel(writer, sheet_name="Details", index=False)

Common Operations

Filter: df[df['column_name'] > 0]
De-duplicate: df.drop_duplicates(subset=['column_name'])
Combine: pd.concat([df1, df2]) to stack rows, or pd.merge(df1, df2, on='key') to join on a key
Pivot: df.pivot_table(values='val', index='row', columns='col', aggfunc='sum')
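The four operations above can be tried together on a small table. This is a minimal sketch with made-up data; the column names (order_id, region, product, amount, manager) are hypothetical and chosen only for illustration.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "order_id": [1, 2, 2, 3, 4],
        "region": ["east", "west", "west", "east", "west"],
        "product": ["pen", "pen", "pen", "ink", "ink"],
        "amount": [10, 0, 0, 25, 5],
    }
)
regions = pd.DataFrame({"region": ["east", "west"], "manager": ["Ann", "Bo"]})

# Filter: keep rows with a positive amount
positive = orders[orders["amount"] > 0]

# De-duplicate: keep the first row for each order_id
unique_orders = orders.drop_duplicates(subset=["order_id"])

# Merge: attach the manager for each region (SQL-style join on a key)
with_manager = pd.merge(unique_orders, regions, on="region")

# Pivot: total amount per region (rows) and product (columns)
pivot = unique_orders.pivot_table(
    values="amount", index="region", columns="product", aggfunc="sum", fill_value=0
)
print(pivot)
```

Passing fill_value=0 replaces missing region/product combinations with zero instead of NaN, which is usually what a report expects.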

Dependencies

pip install pandas openpyxl  # openpyxl is required for .xlsx (reading legacy .xls needs xlrd)

PDF Processing

Extracting Text

import pdfplumber

with pdfplumber.open("file.pdf") as pdf:
    for page in pdf.pages:
        text = page.extract_text()
        if text:
            print(text)

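Extracting Tables

import pdfplumber

with pdfplumber.open("file.pdf") as pdf:
    page = pdf.pages[0]  # first page only
    tables = page.extract_tables()
    for table in tables:
        # each table is a 2-D list: a list of rows, each a list of cell values
        for row in table:
            print(row)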
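Because each extracted table is a plain 2-D list, it usually needs a little shaping before analysis. The following sketch assumes the first row is the header and that empty cells arrive as None; table_to_records is a hypothetical helper, and the sample table is hand-written rather than taken from a real PDF.

```python
from typing import Optional

Table = list[list[Optional[str]]]


def table_to_records(table: Table) -> list[dict[str, str]]:
    """Turn a 2-D list (header row first) into a list of dicts."""
    if not table:
        return []
    # Assumption: the first row holds the column names; blank names get a placeholder.
    header = [(cell or "").strip() or f"col_{i}" for i, cell in enumerate(table[0])]
    records = []
    for row in table[1:]:
        cells = [(cell or "").strip() for cell in row]
        if not any(cells):
            continue  # skip rows that are entirely empty
        records.append(dict(zip(header, cells)))
    return records


# A hand-written table in the shape described above (None marks an empty cell).
sample = [
    ["Name", "Qty", None],
    ["Widget", " 3 ", "ok"],
    [None, None, None],
    ["Gadget", None, "late"],
]
print(table_to_records(sample))
```

The resulting list of dicts can be passed to pd.DataFrame(...) for further cleaning with pandas.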

Dependencies

pip install pdfplumber

For OCR (scanned PDFs): pip install pdf2image pytesseract, and install the Tesseract binary separately (pdf2image additionally needs Poppler).
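One way to decide when OCR is worth running is to check which pages returned no text from extract_text(). The sketch below is a heuristic under that assumption, not the skill's own logic: pages_needing_ocr and ocr_page are hypothetical helpers, and the dpi, lang, and min_chars defaults are illustrative guesses. The OCR packages are imported lazily so the first helper works without them.

```python
from typing import Optional, Sequence


def pages_needing_ocr(page_texts: Sequence[Optional[str]], min_chars: int = 1) -> list[int]:
    """Return 0-based indexes of pages whose extracted text is missing or too short."""
    return [
        index
        for index, text in enumerate(page_texts)
        if len((text or "").strip()) < min_chars
    ]


def ocr_page(pdf_path: str, page_index: int, dpi: int = 300, lang: str = "eng") -> str:
    """Rasterize one page and run Tesseract on it (requires pdf2image and pytesseract)."""
    # Imported lazily so the helper above works without the OCR packages.
    from pdf2image import convert_from_path
    import pytesseract

    # pdf2image numbers pages from 1, so shift the 0-based index.
    images = convert_from_path(
        pdf_path, dpi=dpi, first_page=page_index + 1, last_page=page_index + 1
    )
    return pytesseract.image_to_string(images[0], lang=lang)


# Example with made-up extract_text() results: the last two pages have no text layer.
texts = ["Invoice 2024-001", None, "   "]
print(pages_needing_ocr(texts))
```

Running OCR only on the flagged pages avoids rasterizing pages that already have a usable text layer.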

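Report Generation Workflow

Prepare data: fetch data from an API/DB or Excel and clean it with pandas
Compute/aggregate: build summary tables according to the business logic
Output:
- Excel: df.to_excel()
- PDF: use reportlab, or generate Excel first and convert it to PDF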
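The three steps of this workflow can be sketched as three small functions. This is an illustration under assumptions: the column names region and amount, the function names, and the sample rows are all hypothetical, and the Excel output step is defined but left uncalled so the example runs without writing a file.

```python
import pandas as pd


def prepare(raw: pd.DataFrame) -> pd.DataFrame:
    """Step 1: clean the raw rows (drop incomplete rows, normalize types)."""
    cleaned = raw.dropna(subset=["region", "amount"]).copy()
    cleaned["amount"] = cleaned["amount"].astype(float)
    return cleaned


def summarize(detail: pd.DataFrame) -> pd.DataFrame:
    """Step 2: aggregate to one row per region."""
    return detail.groupby("region", as_index=False).agg(
        total=("amount", "sum"), orders=("amount", "count")
    )


def write_report(summary: pd.DataFrame, detail: pd.DataFrame, path: str) -> None:
    """Step 3: write both tables to one workbook (needs openpyxl)."""
    with pd.ExcelWriter(path) as writer:
        summary.to_excel(writer, sheet_name="Summary", index=False)
        detail.to_excel(writer, sheet_name="Details", index=False)


# Made-up input rows; one row has no region and is dropped during cleaning.
raw = pd.DataFrame(
    {
        "region": ["east", "west", None, "east"],
        "amount": [10, 5, 7, 2.5],
    }
)
detail = prepare(raw)
summary = summarize(detail)
print(summary)
# write_report(summary, detail, "report.xlsx")  # uncomment to produce the file
```

Keeping the cleaning and aggregation steps separate from the output step makes it easy to swap the Excel writer for a PDF generator later.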

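Notes

Large files: read in chunks or limit the number of rows to avoid running out of memory
Encoding: CSV files are commonly utf-8 or gbk; try utf-8 first
Missing values: handle with df.fillna(0) or df.dropna() as needed
Dates: normalize with pd.to_datetime(df['date_col'])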
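These notes can be combined into a couple of small helpers. The sketch below is one possible approach rather than the skill's bundled code: read_csv_with_fallback and sum_column_in_chunks are hypothetical names, the encoding order follows the "utf-8 first, then gbk" advice above, and the demo file is a temporary one created just for the example.

```python
import os
import tempfile

import pandas as pd


def read_csv_with_fallback(path: str, encodings=("utf-8", "gbk"), **kwargs) -> pd.DataFrame:
    """Try each encoding in order and return the first successful parse."""
    last_error = None
    for encoding in encodings:
        try:
            return pd.read_csv(path, encoding=encoding, **kwargs)
        except UnicodeDecodeError as error:
            last_error = error  # wrong encoding, try the next one
    raise ValueError(f"could not decode {path} with {encodings}") from last_error


def sum_column_in_chunks(path: str, column: str, encoding: str = "utf-8",
                         chunksize: int = 100_000) -> float:
    """Total a numeric column without loading the whole file into memory."""
    total = 0.0
    with pd.read_csv(path, usecols=[column], encoding=encoding, chunksize=chunksize) as reader:
        for chunk in reader:
            total += chunk[column].fillna(0).sum()
    return float(total)


# Demo on a small temporary GBK-encoded file with one missing quantity.
with tempfile.TemporaryDirectory() as tmp:
    gbk_path = os.path.join(tmp, "gbk.csv")
    with open(gbk_path, "w", encoding="gbk", newline="") as handle:
        handle.write("name,qty,date\n苹果,3,2024-01-05\n香蕉,,2024-02-10\n")

    df = read_csv_with_fallback(gbk_path)          # utf-8 fails, gbk succeeds
    df["qty"] = df["qty"].fillna(0)                # missing values -> 0
    df["date"] = pd.to_datetime(df["date"])        # strings -> datetime64
    total = sum_column_in_chunks(gbk_path, "qty", encoding="gbk", chunksize=1)

print(df)
print(total)
```

A chunksize of 1 is used here only to exercise the loop on a two-row file; real files would use a much larger value.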

Usage Guidance

This skill appears coherent and implements what it describes. Before installing or running: (1) ensure you install the Python requirements in a controlled environment (virtualenv) — pip packages are from PyPI; (2) if you need OCR, install the Tesseract binary separately as noted; (3) only run the scripts on files you trust or have isolated — while pandas/pdfplumber don't execute Excel macros, maliciously crafted files can still cause issues; (4) review and run the scripts locally to confirm behavior before granting the agent access to your files.

Capability Analysis

Type: OpenClaw Skill
Name: gi-excel-pdf-process
Version: 1.0.0

The skill bundle provides standard utilities for processing Excel and PDF files using well-known libraries like pandas and pdfplumber. The Python scripts (excel_extract.py and pdf_extract.py) and the SKILL.md instructions are consistent with the stated purpose of data extraction and report generation, showing no signs of malicious intent, data exfiltration, or prompt injection.

Capability Assessment

✓ Purpose & Capability

Name/description match the included scripts, SKILL.md, and requirements.txt. The scripts perform Excel→CSV and PDF text/table extraction as advertised; required Python packages are appropriate.

ℹ Instruction Scope

SKILL.md stays within the stated purpose (reading/writing spreadsheets and extracting PDFs). It mentions optional OCR (pdf2image/pytesseract) which requires installing Tesseract (an external binary) — this is expected but the skill does not provide that binary or an install step.

✓ Install Mechanism

No install spec; this is an instruction-only skill with a requirements.txt listing common Python packages (pandas, openpyxl, pdfplumber). These are standard for the task and come from PyPI.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. The scope of access (local files provided by the user) is proportional to the functionality.

✓ Persistence & Privilege

Does not request always:true or modify other skills or system settings. It runs as-invoked and has no privileged persistence.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install gi-excel-pdf-process
After installation, invoke the skill by name or use /gi-excel-pdf-process
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release. Gravitech Innovations.

Metadata

Slug gi-excel-pdf-process

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is GI Excel PDF Process?

Process Excel and PDF files - extract data, parse tables, generate reports. Use when working with .xlsx, .xls, .csv, .pdf files, or when the user mentions sp... It is an AI Agent Skill for Claude Code / OpenClaw, with 293 downloads so far.

How do I install GI Excel PDF Process?

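Run "/install gi-excel-pdf-process" in the OpenClaw or Claude Code chat to install it in one step. The Python packages listed in scripts/requirements.txt must still be installed in your environment before the scripts will run.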

Is GI Excel PDF Process free?

Yes, GI Excel PDF Process is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does GI Excel PDF Process support?

GI Excel PDF Process is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created GI Excel PDF Process?

It is built and maintained by laimiaohua (@laimiaohua); the current version is v1.0.0.
