
GI Excel PDF Process

Name: GI Excel PDF Process
Author: laimiaohua

Author laimiaohua · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 293

Install in OpenClaw

/install gi-excel-pdf-process

Description

Process Excel and PDF files - extract data, parse tables, generate reports. Use when working with .xlsx, .xls, .csv, .pdf files, or when the user mentions sp...

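Usage (SKILL.md)

Excel / PDF Processing

Process Excel and PDF files: extract data, parse tables, and generate reports. Suitable for data import/export, report generation, document parsing, and similar tasks.

When to use

The user provides or asks to process .xlsx, .xls, .csv, or .pdf files
The user mentions "table", "Excel", "report", "PDF extraction", or "form"
Data must be read from a file, or a downloadable file must be generated

Executable scripts: scripts/excel_extract.py (Excel→CSV) and scripts/pdf_extract.py (PDF text/table extraction); dependencies are listed in scripts/requirements.txt.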

Excel Processing

Reading Excel

import pandas as pd

# Read a workbook; sheet_name=0 selects the first sheet
df = pd.read_excel("file.xlsx", sheet_name=0)

# Select a sheet by name
df = pd.read_excel("file.xlsx", sheet_name="Sheet1")

# Read a CSV file
df = pd.read_csv("file.csv", encoding="utf-8")

Writing Excel

# Single sheet
df.to_excel("output.xlsx", index=False)

# Multiple sheets (df1 and df2 are DataFrames prepared beforehand)
with pd.ExcelWriter("output.xlsx") as writer:
    df1.to_excel(writer, sheet_name="Summary", index=False)
    df2.to_excel(writer, sheet_name="Detail", index=False)
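
As a quick check that a multi-sheet write behaves as expected, the sketch below writes two small frames to an in-memory workbook and reads every sheet back. The frames, column names, and sheet names are made up for illustration, and the example assumes openpyxl is installed.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for df1 / df2.
summary = pd.DataFrame({"region": ["North", "South"], "total": [120, 80]})
detail = pd.DataFrame({"region": ["North", "North", "South"], "amount": [70, 50, 80]})

# Write both frames into one workbook held in memory (openpyxl engine).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    summary.to_excel(writer, sheet_name="Summary", index=False)
    detail.to_excel(writer, sheet_name="Detail", index=False)

# sheet_name=None returns a dict mapping each sheet name to a DataFrame.
buffer.seek(0)
sheets = pd.read_excel(buffer, sheet_name=None, engine="openpyxl")
```

Reading with sheet_name=None is convenient when the sheet names are not known in advance, since the dict keys list them in workbook order.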

Common operations

Filter: df[df['column_name'] > 0]
Deduplicate: df.drop_duplicates(subset=['column_name'])
Combine: pd.concat([df1, df2]) or pd.merge(df1, df2, on='key')
Pivot: df.pivot_table(values='val', index='row', columns='col', aggfunc='sum')
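
The four operations above can be chained on a small dataset to see how they interact. This is a minimal sketch; the order and customer tables and all column names are hypothetical.

```python
import pandas as pd

# Hypothetical order data; column names are illustrative only.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "customer_id": [10, 11, 11, 10, 12],
    "month": ["Jan", "Jan", "Jan", "Feb", "Feb"],
    "amount": [100, 0, 0, 50, 75],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["North", "South", "South"],
})

deduped = orders.drop_duplicates(subset=["order_id"])   # drop the repeated order 2
paid = deduped[deduped["amount"] > 0]                   # keep positive amounts only
joined = pd.merge(paid, customers, on="customer_id")    # inner join adds the region
pivot = joined.pivot_table(values="amount", index="region",
                           columns="month", aggfunc="sum")
```

Region/month combinations with no rows come back as NaN in the pivot, so a follow-up fillna(0) may be needed before exporting.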

Dependencies

pip install pandas openpyxl  # openpyxl is required for .xlsx
# Reading legacy .xls files with pandas additionally requires xlrd

PDF Processing

Extracting text

import pdfplumber

with pdfplumber.open("file.pdf") as pdf:
    for page in pdf.pages:
        text = page.extract_text()
        # Pages without a text layer yield no text; skip them
        if text:
            print(text)

Extracting tables

import pdfplumber

with pdfplumber.open("file.pdf") as pdf:
    page = pdf.pages[0]
    tables = page.extract_tables()
    for table in tables:
        # Each table is a two-dimensional list (rows of cells)
        for row in table:
            print(row)
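
Because each extracted table is a plain two-dimensional list, it can be handed to pandas for further processing. The helper below is a sketch, not part of the skill's scripts: table_to_dataframe is a hypothetical name, and it assumes the first row holds the column headers, which is common but not guaranteed for every PDF.

```python
import pandas as pd


def table_to_dataframe(table):
    """Turn one pdfplumber-style table (a list of rows) into a DataFrame.

    Assumption: the first row contains the column headers.
    """
    if not table:
        return pd.DataFrame()
    header, *rows = table
    # Header cells may be None or empty; give those columns a placeholder name.
    columns = [cell if cell else f"col_{i}" for i, cell in enumerate(header)]
    return pd.DataFrame(rows, columns=columns)


# Stand-in for page.extract_tables()[0]; the values are made up.
sample_table = [
    ["Item", "Qty", None],
    ["Widget", "2", "ok"],
    ["Gadget", None, "backorder"],
]
df = table_to_dataframe(sample_table)
df["Qty"] = pd.to_numeric(df["Qty"])  # cells arrive as strings or None
```

Converting columns explicitly, as with Qty here, matters because extracted cells are text even when they look numeric.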

Dependencies

pip install pdfplumber

If OCR is needed (scanned PDFs): pip install pdf2image pytesseract, and install Tesseract.
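
One way to combine text extraction with OCR is to run OCR only on pages where no text was found. The sketch below is an assumption about how that could be wired up, not the skill's own script: ocr_page and fill_missing_text are hypothetical helpers, the 300 dpi setting and the "eng" language are arbitrary choices, and pdf2image also needs Poppler installed in addition to Tesseract.

```python
def ocr_page(pdf_path, page_number, lang="eng"):
    """OCR a single 1-based page using pdf2image and pytesseract.

    Imports are local so the rest of the module works without the OCR stack.
    """
    from pdf2image import convert_from_path
    import pytesseract

    # Render only the requested page to an image.
    images = convert_from_path(
        pdf_path, dpi=300, first_page=page_number, last_page=page_number
    )
    return pytesseract.image_to_string(images[0], lang=lang)


def fill_missing_text(page_texts, ocr):
    """Keep extracted text where present; otherwise call ocr(page_number)."""
    result = []
    for page_number, text in enumerate(page_texts, start=1):
        if text and text.strip():
            result.append(text)
        else:
            result.append(ocr(page_number))
    return result
```

With pdfplumber, page_texts would be [page.extract_text() for page in pdf.pages], and the OCR callable could be lambda n: ocr_page("file.pdf", n). Passing the OCR step in as a callable keeps the fallback logic testable without Tesseract.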

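Report generation workflow

Data preparation: fetch data from an API/DB or Excel and clean it with pandas
Calculation/aggregation: build summary tables according to the business logic
Output:
- Excel: df.to_excel()
- PDF: use reportlab, or generate Excel first and convert it to PDF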
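
The workflow above can be sketched end to end for the Excel output path. Everything here is illustrative: build_summary and write_report are hypothetical helpers, the region/amount columns are made-up sample data, and the per-region sum stands in for whatever the business logic requires.

```python
import pandas as pd


def build_summary(raw):
    """Clean raw rows and aggregate amount per region (illustrative logic)."""
    cleaned = raw.dropna(subset=["region"]).copy()
    # Non-numeric amounts become NaN, then 0.
    cleaned["amount"] = pd.to_numeric(cleaned["amount"], errors="coerce").fillna(0)
    return (
        cleaned.groupby("region", as_index=False)["amount"]
        .sum()
        .sort_values("amount", ascending=False, ignore_index=True)
    )


def write_report(raw, summary, path):
    """Write the summary and the raw detail to one workbook (needs openpyxl)."""
    with pd.ExcelWriter(path) as writer:
        summary.to_excel(writer, sheet_name="Summary", index=False)
        raw.to_excel(writer, sheet_name="Detail", index=False)


# Hypothetical input; in practice this would come from an API, a database, or Excel.
raw = pd.DataFrame({
    "region": ["North", "South", None, "North"],
    "amount": ["100", "40", "7", "bad"],
})
summary = build_summary(raw)
```

Calling write_report(raw, summary, "report.xlsx") would then produce the downloadable file. Keeping aggregation separate from writing makes the numbers easy to check before anything is saved.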

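Notes

Large files: read in chunks or limit the number of rows to avoid running out of memory
Encoding: CSV files are commonly utf-8 or gbk; try utf-8 first
Missing values: handle with df.fillna(0) or df.dropna() as needed
Dates: normalize with pd.to_datetime(df['date_col'])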
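
The notes above can be folded into two small helpers. This is a sketch under stated assumptions: read_csv_with_fallback and tidy are hypothetical names, the encoding order simply follows the "utf-8 first, then gbk" advice, and filling missing values with 0 is only one of the options mentioned.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "gbk"), **kwargs):
    """Try each encoding in order; return the first DataFrame that decodes."""
    last_error = None
    for encoding in encodings:
        try:
            return pd.read_csv(path, encoding=encoding, **kwargs)
        except UnicodeDecodeError as error:
            last_error = error
    raise ValueError(f"could not decode {path} with {encodings}") from last_error


def tidy(df, date_col, value_col):
    """Parse one date column and fill missing values in one column with 0."""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    out[value_col] = out[value_col].fillna(0)
    return out
```

For large files, extra keyword arguments are forwarded to pd.read_csv, so read_csv_with_fallback("big.csv", nrows=100_000) caps the rows loaded. Passing chunksize instead returns an iterator, in which case decoding errors may only surface while iterating rather than inside this helper.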

Security recommendations

This skill appears coherent and implements what it describes. Before installing or running: (1) ensure you install the Python requirements in a controlled environment (virtualenv) — pip packages are from PyPI; (2) if you need OCR, install the Tesseract binary separately as noted; (3) only run the scripts on files you trust or have isolated — while pandas/pdfplumber don't execute Excel macros, maliciously crafted files can still cause issues; (4) review and run the scripts locally to confirm behavior before granting the agent access to your files.

Functional analysis

Type: OpenClaw Skill
Name: gi-excel-pdf-process
Version: 1.0.0

The skill bundle provides standard utilities for processing Excel and PDF files using well-known libraries like pandas and pdfplumber. The Python scripts (excel_extract.py and pdf_extract.py) and the SKILL.md instructions are consistent with the stated purpose of data extraction and report generation, showing no signs of malicious intent, data exfiltration, or prompt injection.

Capability assessment

✓ Purpose & Capability

Name/description match the included scripts, SKILL.md, and requirements.txt. The scripts perform Excel→CSV and PDF text/table extraction as advertised; required Python packages are appropriate.

ℹ Instruction Scope

SKILL.md stays within the stated purpose (reading/writing spreadsheets and extracting PDFs). It mentions optional OCR (pdf2image/pytesseract) which requires installing Tesseract (an external binary) — this is expected but the skill does not provide that binary or an install step.

✓ Install Mechanism

No install spec; this is an instruction-only skill with a requirements.txt listing common Python packages (pandas, openpyxl, pdfplumber). These are standard for the task and come from PyPI.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. The scope of access (local files provided by the user) is proportional to the functionality.

✓ Persistence & Privilege

Does not request always:true or modify other skills or system settings. It runs as-invoked and has no privileged persistence.

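How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install gi-excel-pdf-process
Once installed, invoke the Skill by name or trigger it with /gi-excel-pdf-process
Provide the required inputs according to the Skill's parameter description to get structured output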

Version history

v1.0.0

Initial release. Gravitech Innovations.

Metadata

Slug gi-excel-pdf-process

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Version count 1

FAQ