HUDC Bidding Information Capture

by JKTLLSQAQ · GitHub ↗ · v6.0.0 · MIT-0
cross-platform ✓ Security Clean
137 downloads · 1 star · 0 active installs · 1 version
Install in OpenClaw
/install hudc-bidding-information-capture
Description
Intelligently analyzes SGCC bidding documents in Word/PDF/Excel formats, extracting 23 key fields with automated qualification backfilling and deadline highl...
README (SKILL.md)

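# hbdc · SGCC Bidding Document Analysis Skill (v6)

## Directory conventions

```
~/Desktop/sgcc_files/
├── 项目A/
│   ├── xxx-招标公告.docx          ← main announcement (docx or pdf, detected automatically)
│   ├── 公告附件_资质要求.xlsx      ← optional, used to backfill qualification placeholders
│   └── 重要提醒.docx               ← ignored automatically
├── 项目B/
│   └── xxx-采购公告.pdf
├── 散文件_招标公告.docx            ← loose file in the root directory, treated as its own project
└── ...
```
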
Output report: `~/Desktop/sgcc_result.xlsx`

---

## How to run

Run the fixed script directly; **do not write your own code as a substitute**:

```bash
python3 ~/.openclaw/workspace/skills/hbdc/scripts/analyze.py
```

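---

## v6 core strategy

### 1. Word/PDF primary extraction → Excel gap-filling → paragraph fallback

| Priority | Source | Description |
|----------|--------|-------------|
| 1 | Word/PDF tables | Package-level detail tables in the main announcement |
| 2 | xlsx attachments | Supplementary data from requirement tables and qualification attachments |
| 3 | Word/PDF body text | Paragraph scan when no table exists (flagged for manual gap-filling) |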
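
The priority order in the table can be pictured as a per-field merge. The following is a minimal sketch, not the code in `analyze.py`: the function name `merge_sources`, the dict-per-record shape, and the source labels are assumptions made for illustration.

```python
def merge_sources(table_row, xlsx_row, paragraph_row):
    """Fill each field from the highest-priority source that has a value.

    Each argument is a dict of field name -> extracted text, or None when
    that source produced nothing (hypothetical record shape).
    """
    sources = [
        ("Word/PDF table", table_row),      # priority 1
        ("Excel attachment", xlsx_row),     # priority 2
        ("Body text", paragraph_row),       # priority 3
    ]
    merged, origin = {}, {}
    for label, row in sources:
        for field, value in (row or {}).items():
            # Only fill fields that are still empty, so earlier sources win.
            if value and not merged.get(field):
                merged[field] = value
                origin[field] = label
    return merged, origin
```

Keeping a separate `origin` map is one way to produce a per-record source label like the one in column 23; whether the script tracks the source per field or per record is not stated here.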
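
### 2. PDF-specific title extraction

The script uses `pdfplumber` character-level font sizes to locate the largest-font text on the cover page and takes it as the project name,
which is more accurate than scanning paragraphs with regular expressions. If reading a PDF fails, the script retries once automatically with different parameters,
and if that also fails it prints an explicit message in the terminal.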
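
A minimal sketch of the font-size idea, assuming character dicts shaped like the entries of `pdfplumber`'s `page.chars` (each with `text`, `size`, `top`, and `x0` keys). The function name `largest_font_text` and the `tolerance` parameter are hypothetical; the real script may select and order characters differently.

```python
def largest_font_text(chars, tolerance=0.5):
    """Return the text formed by the characters set in the largest font size.

    `chars` is a list of dicts with "text", "size", "top" and "x0" keys,
    the same shape as pdfplumber's `page.chars` entries.
    """
    if not chars:
        return ""
    max_size = max(c["size"] for c in chars)
    # Keep characters whose size is within `tolerance` points of the maximum.
    title_chars = [c for c in chars if c["size"] >= max_size - tolerance]
    # Reading order: top to bottom, then left to right.
    title_chars.sort(key=lambda c: (round(c["top"]), c["x0"]))
    return "".join(c["text"] for c in title_chars).strip()
```

With `pdfplumber` installed, the cover page characters could be passed in as `largest_font_text(pdf.pages[0].chars)` inside a `with pdfplumber.open(path) as pdf:` block.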
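
### 3. Recognizing and backfilling "详见附件X" ("see Attachment X") placeholders

When the qualification column contains only a placeholder such as `详见附件1` / `详见附件二`:
1. Qualification tables in the xlsx files in the same directory are scanned automatically
2. Values are backfilled using `(分标编号, 包号)` (lot number, package number) as the key
3. If no match is found, the script falls back to the project-level general qualifications (prefixed with `【通用】`, "general")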
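
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual logic: the regular expression, the function names, and the `attachment_index` dict (keyed by lot number and package number) are all hypothetical.

```python
import re

# Assumed pattern: the whole cell is "详见附件" plus an optional Arabic or
# Chinese numeral, so cells with real content next to the phrase do not match.
PLACEHOLDER_RE = re.compile(r"^\s*详见附件\s*[0-9一二三四五六七八九十]*\s*[。.]?\s*$")

def is_placeholder(text):
    """True when the cell holds nothing but a 'see attachment' reference."""
    return bool(text) and PLACEHOLDER_RE.match(text) is not None

def backfill_qualification(record, attachment_index, general_text):
    """Resolve the qualification cell for one package-level record."""
    qual = record.get("资质/资格要求", "")
    if qual and not is_placeholder(qual):
        return qual                                  # real content, keep it
    key = (record.get("分标编号"), record.get("包号"))
    if key in attachment_index:
        return attachment_index[key]                 # step 2: keyed backfill
    # Step 3: fall back to the project-level text, marked as general.
    return "【通用】" + general_text if general_text else qual
```

Anchoring the pattern to the whole cell is a deliberate choice in this sketch: a cell such as "具备甲级资质, 详见附件1" carries real content and is left untouched.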
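
### 4. Keyword de-duplication

When a longer keyword matches, any shorter keyword that is a substring of it is dropped automatically:
- `咨询服务` matches → `咨询` is dropped
- `储能系统` matches → `储能` is dropped
- `宣传服务` matches → `宣传` is dropped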
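
A minimal sketch of the substring rule, assuming the matched keywords arrive as a plain list of strings; the function name `drop_substring_keywords` is hypothetical.

```python
def drop_substring_keywords(hits):
    """Remove every matched keyword that is a substring of a longer match."""
    unique = list(dict.fromkeys(hits))  # de-duplicate while keeping order
    return [
        kw for kw in unique
        if not any(kw != other and kw in other for other in unique)
    ]
```

Note that this sketch drops the short keyword whenever a longer one containing it matched, even if the short keyword also occurs on its own elsewhere in the text; whether `analyze.py` makes that distinction is not stated.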
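
### 5. Qualification column merge strategy

```
Package-level qualification conditions + performance requirements + key personnel
  ↓ (empty or placeholder)
Backfill from the xlsx attachment qualification table
  ↓ (still empty)
Project-level qualification requirements section (boilerplate filtered out, prefixed with 【通用】)
```

Boilerplate phrases that are filtered out: 依法注册 (lawfully registered), 失信被执行人 (judgment defaulter), 信用中国 (Credit China), 联合体 (consortium), 破产 (bankruptcy), 黑名单 (blacklist), and similar.

Substantive clauses that are kept: 甲/乙/丙级 (Grade A/B/C), 建造师 (registered constructor), ISO, 安全生产许可证 (work safety license), 总承包 (general contracting), 业绩 (past performance), and similar.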
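
The project-level fallback filter could look roughly like this. It is a sketch under assumptions: the term lists contain only the examples named above (with 甲/乙/丙级 expanded into three entries), clauses are split on sentence punctuation, and a clause containing both kinds of term is kept. None of these choices is confirmed by the source.

```python
import re

# Example terms taken from the two lists above; the real lists are longer ("等").
BOILERPLATE_TERMS = ["依法注册", "失信被执行人", "信用中国", "联合体", "破产", "黑名单"]
SUBSTANTIVE_TERMS = ["甲级", "乙级", "丙级", "建造师", "ISO", "安全生产许可证", "总承包", "业绩"]

def filter_general_qualifications(section_text):
    """Drop boilerplate clauses and prefix what remains with 【通用】."""
    clauses = [c.strip() for c in re.split(r"[。;；\n]", section_text) if c.strip()]
    kept = []
    for clause in clauses:
        substantive = any(term in clause for term in SUBSTANTIVE_TERMS)
        boilerplate = any(term in clause for term in BOILERPLATE_TERMS)
        # Assumption: a substantive term outweighs a boilerplate term.
        if substantive or not boilerplate:
            kept.append(clause)
    return "【通用】" + "；".join(kept) if kept else ""
```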
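
---

## Output: 23 columns

| # | Column | Source |
|---|--------|--------|
| 1 | Serial number (序号) | Automatic |
| 2 | Project name (项目名称) | PDF font-size method → paragraph regex |
| 3 | Tender number (招标编号) | Regex |
| 4 | Lot number (分标编号) | Word table → xlsx |
| 5 | Lot name (分标名称) | Word table → xlsx |
| 6 | Package number (包号) | Word table → xlsx |
| 7 | Requesting department / contracting entity (需求部门/签订单位) | Word table → xlsx |
| 8 | Sub-project name (子项目名称) | Word table → xlsx |
| 9 | Project overview and tender scope (项目概况与招标范围) | Word table → xlsx |
| 10 | **Qualification requirements (资质/资格要求)** | Package-level aggregation → attachment backfill → project-level fallback |
| 11 | Contract template number (合同文本编号) | Word table → xlsx |
| 12 | Implementation location (实施地点) | Word table → xlsx |
| 13 | Construction / service period (工期/服务期) | Word table → xlsx |
| 14 | Pricing method (报价方式) | Word table → xlsx |
| 15 | Budget amount in 万元, i.e. units of 10,000 yuan (预算金额(万元)) | Word table → xlsx (yuan converted to 万元 automatically) |
| 16 | Maximum price limit (最高限价) | Word table → xlsx |
| 17 | Tender start and end time (招标起止时间) | Regex (deadline color-coded) |
| 18 | **Bid opening time and place (开标时间地点)** ✨ | Regex |
| 19 | **Bid bond (投标保证金)** ✨ | Regex (including detection of "不收取", not charged) |
| 20 | **Evaluation method (评标办法)** ✨ | Regex (comprehensive evaluation / reasonable low price / comprehensive scoring, etc.) |
| 21 | **Contact person and phone (联系人及电话)** ✨ | Regex + xlsx contact field |
| 22 | Matched keywords (匹配关键词) | Keyword engine (de-duplicated) |
| 23 | Match source (匹配来源) | Word table / PDF table / Excel attachment / body text |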
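
Column 15 converts amounts given in yuan to 万元 (units of 10,000 yuan). A minimal sketch of such a conversion follows; the function name `to_wan_yuan`, the accepted input formats, and the treatment of a bare number as already being in 万元 are assumptions, not documented behavior.

```python
import re
from decimal import Decimal

def to_wan_yuan(raw):
    """Parse an amount string and return its value in 万元, or None."""
    match = re.search(r"(\d[\d,]*(?:\.\d+)?)\s*(万元|元)?", raw)
    if not match:
        return None
    amount = Decimal(match.group(1).replace(",", ""))
    unit = match.group(2) or "万元"  # assumption: a bare number is already 万元
    return amount / Decimal(10000) if unit == "元" else amount
```

`Decimal` is used here so that amounts such as 85.5 are not distorted by binary floating point before they are written to the report.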
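
### Deadline color-coding rules

| Color | Condition |
|-------|-----------|
| 🔴 Red | ≤ 3 days until the deadline |
| 🟡 Yellow | ≤ 7 days until the deadline |
| ⬜ Gray | Deadline has already passed |
| No color | More than 7 days remaining, normal |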
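
The rules in the table can be expressed as a small classifier. This is a sketch, assuming the passed-deadline check runs first (otherwise an expired deadline would also satisfy "≤ 3 days") and that remaining time is measured in fractional days; the function name and the returned color labels are hypothetical.

```python
from datetime import datetime

def deadline_color(deadline, now):
    """Map a bid deadline to the highlight color described in the table."""
    if deadline < now:
        return "gray"                     # already past the deadline
    days_left = (deadline - now).total_seconds() / 86400
    if days_left <= 3:
        return "red"
    if days_left <= 7:
        return "yellow"
    return None                           # more than 7 days left: no fill
```

For example, with `now = datetime(2025, 1, 10, 9, 0)`, a deadline of `datetime(2025, 1, 15, 9, 0)` is five days away and maps to `"yellow"`.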

---

## Execution flow

### Step 1: Run the script

```bash
python3 ~/.openclaw/workspace/skills/hbdc/scripts/analyze.py
```
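
### Step 2: Report the results

Report to the user:
1. How many projects were scanned, plus the metadata identified for each project (name / number / dates / bid opening / bid bond / evaluation method)
2. For each project, "how many package-level records were extracted → how many matched keywords"
3. The source labels of the matched records
4. The report path: `~/Desktop/sgcc_result.xlsx`
5. **Special reminder** for projects whose source is "Word/PDF body text": these need manual gap-filling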
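
### Step 3: AI-assisted gap-filling

Take over proactively in the following cases:
- PDF table extraction failed → open the PDF directly, read the page content, and fill in the fields manually
- Source label is "Word/PDF body text" → locate the key sections manually and fill in the missing fields
- Project name misidentified → find the correct title in the original document
- `资质/资格要求` (qualification requirements) still contains "详见附件" ("see attachment") → read the corresponding attachment and complete the field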
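
### Step 4: Answer follow-up questions

When the user asks about the details of a project, read the word/pdf/xlsx files in the corresponding folder directly and present them.
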
---

## Custom keywords

Edit the `config/keywords.json` file; the script itself does not need to change:

```json
{
  "keywords": {
    "My custom category": ["Keyword A", "Keyword B"]
  },
  "short_keywords": {
    "My custom category": ["Short word"]
  }
}
```
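
To check an edited config before running the script, something like the following can load the file and list which keywords occur in a piece of text. It is a minimal sketch: the function names are hypothetical, and because the source does not describe how `short_keywords` are treated differently from `keywords`, the sketch only loads both groups without applying any special rule.

```python
import json
from pathlib import Path

def load_keywords(path):
    """Return {group: {category: [keywords]}} for the two documented groups."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {group: data.get(group, {}) for group in ("keywords", "short_keywords")}

def find_hits(text, categories):
    """Return the keywords from `categories` that occur in `text`, in config order."""
    return [kw for words in categories.values() for kw in words if kw in text]
```

For instance, `find_hits(sample_text, load_keywords("config/keywords.json")["keywords"])` returns the configured keywords found in `sample_text`, where `sample_text` is any string you want to check.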
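
---

## Prohibited actions

1. **Do not write your own analysis script**; you must run `analyze.py`
2. **Do not match keywords against file names**; you must read the content inside the files
3. **Do not open a browser**; this skill only operates on local files
4. **Do not edit keywords directly in SKILL.md**; use `config/keywords.json`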
Usage Guidance
This skill appears to do what it says: parse local bidding (招标) documents and produce a structured Excel report. Before installing or running it, consider the following: (1) review scripts/analyze.py yourself (it will run pip to install python-docx, openpyxl, pdfplumber), (2) run it in a controlled environment or virtualenv to avoid altering your global Python environment, (3) be aware it will read files under the configured input directory (which may contain PII or confidential vendor/customer data), (4) confirm you are comfortable with automatic pip installs (the packages are common but pip will download from PyPI), and (5) do not run it as root. If you want extra assurance, inspect the remaining truncated portions of analyze.py (the attachment-loading and Excel-writing logic) to confirm there are no unexpected network calls or shell executions before first run.
Capability Analysis
Type: OpenClaw Skill
Name: hudc-bidding-information-capture
Version: 6.0.0

The skill bundle is a specialized automation tool designed to parse and analyze Chinese bidding documents (SGCC) from the local filesystem. The core logic in `scripts/analyze.py` uses legitimate libraries like `pdfplumber` and `python-docx` to extract structured data from Word and PDF files, specifically looking for project names, bidding numbers, and contact information. While the script uses `os.system` to automatically install its own dependencies, the package names are hardcoded and the behavior is transparently documented. No evidence of data exfiltration, unauthorized network access, or malicious prompt injection was found; the instructions in `SKILL.md` are focused on guiding the agent through a structured data-processing workflow.
Capability Assessment
Purpose & Capability
Name/description promise (extract 23 fields from Word/PDF/Excel SGCC bidding docs) aligns with the provided script and config: the script reads .docx/.pdf/.xlsx, extracts tables and paragraph text, applies regex and keyword engines, and writes a structured Excel report.
Instruction Scope
SKILL.md directs the agent to run the shipped analyze.py and operate on local ~/Desktop/sgcc_files/ project folders; the script's behavior (parsing docs, extracting fields, prompting for manual gap-filling when needed, and returning results) stays within the stated purpose. Note: the skill will read file contents in the specified directories (including contact names/phones and any PII in documents).
Install Mechanism
There is no registry install spec, but analyze.py will attempt to auto-install python-docx, openpyxl and pdfplumber via os.system('pip3 install ...'). This is expected for the functionality but does modify the Python environment (uses --break-system-packages and runs pip silently). It's a moderate-risk behavior (network download of third‑party packages) but the packages are standard and coherent with purpose.
Credentials
The skill declares no environment variables, no credentials, and the code does not read env vars or request unrelated secrets. It only reads local files and writes an output Excel on the user's Desktop which is proportional to its task.
Persistence & Privilege
Skill is not marked always:true and does not request persistent or elevated system privileges. It writes its results to ~/Desktop/sgcc_result.xlsx and uses local config/keywords.json only; it does not modify other skills or system-wide agent settings.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install hudc-bidding-information-capture
  3. After installation, invoke the skill by name or use /hudc-bidding-information-capture
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v6.0.0
hbdc v6.0.0 is a major update enhancing State Grid tender document analysis.
- Adds prioritized multi-source extraction: Word/PDF tables, Excel supplement, and paragraph fallback.
- Introduces PDF title extraction by font size for higher project name accuracy.
- Automates placeholder ("详见附件") recognition and qualification refill from attachments.
- Implements keyword deduplication for concise tagging.
- Refines qualification merging strategy, separating essential clauses from templates.
- Upgrades output to 23 columns with improved deadline coloring and new fields like evaluation method and bid bond.
- Requires only fixed script execution; no manual code modifications needed.
- Custom keywords can be managed via a JSON config, not in the SKILL file.

Bidding and tendering, information retrieval, data analysis, tools, efficiency
Metadata
Slug hudc-bidding-information-capture
Version 6.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is HUDC Bidding Information Capture?

Intelligently analyzes SGCC bidding documents in Word/PDF/Excel formats, extracting 23 key fields with automated qualification backfilling and deadline highl... It is an AI Agent Skill for Claude Code / OpenClaw, with 137 downloads so far.

How do I install HUDC Bidding Information Capture?

Run "/install hudc-bidding-information-capture" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is HUDC Bidding Information Capture free?

Yes, HUDC Bidding Information Capture is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does HUDC Bidding Information Capture support?

HUDC Bidding Information Capture is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created HUDC Bidding Information Capture?

It is built and maintained by JKTLLSQAQ (@jktllsqaq); the current version is v6.0.0.
