Category Link Collector
Author: QirongZhang · v1.0.0 · MIT-0
Total downloads: 219 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install category-link-collector
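Description
Collects category link information from e-commerce sites, extracts the category hierarchy, and saves it as a CSV file. Use this skill when you need structured data from e-commerce category links.
Usage (SKILL.md)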
Category Link Collector Skill
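Features
- Extracts category information from the given category link URLs
- Parses the category path to extract first-level and second-level categories
- Generates a structured CSV file
- Supports a custom output directory and file name
Usage
Basic usage
Collect the following category links: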
https://lulumonclick-eu.shop/collections/women-women-clothes-tank-tops
https://lulumonclick-eu.shop/collections/women-women-clothes-bras-underwear
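Parameters
- Domain variable: extracted automatically from the domain part of the link
- Output directory: defaults to `/Users/zhangqirong/工作/caiji`; can be customized
- File name: the domain is used automatically as the file name (e.g. `lulumonclick-eu.shop.csv`)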
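The domain and file-name rules above can be sketched with the standard library. This is a minimal illustration, not the skill's actual code: the body of `extract_domain` and the helper `default_output_path` are assumptions, and the sketch follows the documented naming (`<domain>.csv`), which the security notes further down report may differ from what the implementation writes.

```python
from pathlib import Path
from urllib.parse import urlparse


def extract_domain(url: str) -> str:
    """Return the host part of a URL (assumed behaviour, not the skill's code)."""
    return urlparse(url).netloc


def default_output_path(url: str, output_dir: str = ".") -> Path:
    """Hypothetical helper: build '<output_dir>/<domain>.csv' as documented."""
    return Path(output_dir) / f"{extract_domain(url)}.csv"


url = "https://lulumonclick-eu.shop/collections/women-women-clothes-tank-tops"
print(extract_domain(url))
print(default_output_path(url, "out"))
```

Passing an explicit `output_dir` (here the relative directory `out`) avoids relying on the user-specific default path.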
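Data structure
The generated CSV file contains the following columns:
- `完整链接` (full link): the original category link
- `分类路径` (category path): the category path extracted from the URL (e.g. `women-women-clothes-tank-tops`)
- `域名` (domain): the site's domain
- `1级分类` (level-1 category): the extracted first-level category name (e.g. `Women`)
- `2级分类` (level-2 category): the extracted second-level category name (e.g. `Tank Tops`)
- `3级分类` (level-3 category): the extracted third-level category name (if present)
- `4级分类` (level-4 category): the extracted fourth-level category name (if present)
- ...: further levels, generated dynamically according to the actual depth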
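One way to produce level columns that grow with the deepest row is sketched below. It is an illustration under stated assumptions: the record layout (`url`, `path`, `domain`, `levels`) and the function `build_table` are hypothetical, the second record is invented sample data, and the standard `csv` module is used instead of pandas only to keep the sketch dependency-free.

```python
import csv
import io

FIXED_COLUMNS = ["完整链接", "分类路径", "域名"]


def build_table(records):
    """Return (header, rows); level columns are sized to the deepest record."""
    max_depth = max((len(r["levels"]) for r in records), default=0)
    header = FIXED_COLUMNS + [f"{i}级分类" for i in range(1, max_depth + 1)]
    rows = []
    for r in records:
        # Pad shallower rows with empty cells so every row matches the header.
        padded = r["levels"] + [""] * (max_depth - len(r["levels"]))
        rows.append([r["url"], r["path"], r["domain"], *padded])
    return header, rows


records = [
    {"url": "https://lulumonclick-eu.shop/collections/women-women-clothes-tank-tops",
     "path": "women-women-clothes-tank-tops",
     "domain": "lulumonclick-eu.shop",
     "levels": ["Women", "Tank Tops"]},
    # Hypothetical three-level record, for illustration only.
    {"url": "https://example.com/collections/kids-baby-0-18-months",
     "path": "kids-baby-0-18-months",
     "domain": "example.com",
     "levels": ["Kids", "Baby", "0-18 Months"]},
]

header, rows = build_table(records)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
print(buffer.getvalue())
```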
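Multi-level category support
The skill now extracts categories of arbitrary depth:
- Detects the depth of the category hierarchy automatically
- Generates CSV columns dynamically (`1级分类`, `2级分类`, `3级分类`, ...)
- Merges special hyphenated phrases (T-shirts, Co-ord, etc.)
- Handles numeric ranges correctly (0-18 months, etc.)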
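Keeping hyphenated phrases and numeric ranges intact while splitting on hyphens could look roughly like this. The phrase list, the regular expression, and the name `split_path` are assumptions made for illustration; the skill's real splitting rules are not shown in this document. The sketch only keeps the numeric part (`0-18`) together and leaves the following word as a separate token.

```python
import re

# Hypothetical list of phrases whose internal hyphen must be preserved.
SPECIAL_PHRASES = ("t-shirts", "co-ord")

_phrases = "|".join(re.escape(p) for p in SPECIAL_PHRASES)
# Try special phrases first, then digit ranges such as 0-18, then plain words.
# The lookahead requires each protected token to end at a hyphen or at the end.
TOKEN_RE = re.compile(rf"(?:{_phrases})(?=-|$)|\d+-\d+(?=-|$)|[^-]+")


def split_path(path: str) -> list[str]:
    """Split a category path on hyphens without breaking protected tokens."""
    return TOKEN_RE.findall(path.lower())


print(split_path("women-t-shirts"))
print(split_path("kids-0-18-months"))
```

Ordering the alternatives from most to least specific is what lets the regular expression prefer `t-shirts` over the single letter `t`.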
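Processing logic
- Extract the domain from the URL
- Extract the category path that follows `/collections/`
- Parse the category path:
  - Split the category path using the skill's splitting heuristics
  - Identify the first-level category (Women, Men, Kids, Beauty, etc.)
  - Extract the lower-level categories at every level
  - Merge special phrases and numeric ranges
- Generate CSV columns dynamically according to the maximum category depth
- Generate the CSV file and save it to the specified directory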
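The first steps of this pipeline can be sketched as follows. The function bodies are assumptions (only the name `extract_category_path` appears in the review further down), `first_level` is a hypothetical helper, and `TOP_LEVEL` contains just the four names listed above. How the remaining tokens are grouped into lower-level names is not specified here, so the sketch stops at the first level.

```python
from urllib.parse import urlparse

# Top-level names taken from the list above; the real set may be larger.
TOP_LEVEL = {"women", "men", "kids", "beauty"}
MARKER = "/collections/"


def extract_category_path(url: str):
    """Return the path segment after '/collections/', or None if it is absent."""
    path = urlparse(url).path
    if MARKER not in path:
        return None
    return path.split(MARKER, 1)[1].strip("/")


def first_level(category_path: str):
    """Hypothetical helper: return the leading token if it is a known top level."""
    head = category_path.split("-", 1)[0].lower()
    return head.capitalize() if head in TOP_LEVEL else None


url = "https://lulumonclick-eu.shop/collections/women-women-clothes-tank-tops"
category_path = extract_category_path(url)
print(category_path)
print(first_level(category_path))
```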
Example
Input link:
https://lulumonclick-eu.shop/collections/women-women-clothes-tank-tops
Output CSV row:
| 完整链接 | 分类路径 | 一级分类 | 二级分类 | 域名 |
|---|---|---|---|---|
| https://lulumonclick-eu.shop/collections/women-women-clothes-tank-tops | women-women-clothes-tank-tops | Women | Tank Tops | lulumonclick-eu.shop |
File locations
- Main skill file: `SKILL.md`
- Script file: `scripts/collect_categories.py`
- Configuration file: `config/settings.json` (optional)
Dependencies
- Python 3.x
- pandas (used for CSV handling)
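Possible extensions
Features that could be added later:
- Batch processing of multiple links
- Support for more category levels (third level, fourth level, etc.)
- Automatic deduplication and validation
- Support for other URL formats
- Timestamps and collection status
- Integration into automated workflows
Security recommendations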
This package appears to do what it says (parse /collections/... URLs into hierarchical CSV rows), but there are several red flags you should consider before installing or running it:
- Default output directory: The code and docs hardcode /Users/zhangqirong/工作/caiji as the default output path. Override output_dir on first use or edit config/settings.json to avoid writing files into an unexpected location.
- Inconsistencies in packaging: Tests and README/SKILL.md expect different CSV column names and filenames than the implementation actually produces (e.g., tests expect '一级分类' and filenames like example_com.csv, while code produces '1级分类' and filenames like example_com_multilevel.csv). This indicates sloppy packaging and means bundled tests may fail — review/adjust the code or tests before trusting results.
- No network calls found: The scripts parse given URLs but do not fetch pages. If you planned to fetch remote pages, the code does not do that; check for additional 'fetch' logic if needed.
- Dependencies: Ensure Python 3.x and pandas are installed in a controlled environment before running.
- Domains in examples: Example links reference domains like zaraoutlet.top and lulumonclick-eu.shop. Those are only example inputs; the code won't contact them, but double-check any example data you reuse.
Recommended actions: run the unit tests locally after fixing the column/filename mismatches or update the test expectations; change the hardcoded default output_dir to a sensible relative or configurable default; inspect and run the scripts in an isolated environment (temporary directory) the first time to confirm behavior. If you plan to integrate this into an agent, ensure the agent won't expose these CSV files to external endpoints (the skill itself does not transmit data externally).
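A first run in a temporary directory, followed by a header check, might look like the sketch below. It does not call the skill: the sample file is a stand-in for whatever the script writes when `output_dir` points at the temporary directory, and `header_mismatch` is a hypothetical helper. The two sets of column names are the ones quoted in the packaging note above.

```python
import csv
import tempfile
from pathlib import Path


def header_mismatch(csv_path, expected):
    """Return (missing, unexpected) column names for a CSV file's header row."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        actual = next(csv.reader(fh), [])
    missing = sorted(set(expected) - set(actual))
    unexpected = sorted(set(actual) - set(expected))
    return missing, unexpected


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    # Stand-in for a file produced by the skill; header as the code reportedly writes it.
    sample.write_text("完整链接,分类路径,域名,1级分类,2级分类\n", encoding="utf-8")
    # Column names as the tests and docs reportedly expect them.
    expected = ["完整链接", "分类路径", "域名", "一级分类", "二级分类"]
    missing, unexpected = header_mismatch(sample, expected)

print("missing:", missing)
print("unexpected:", unexpected)
```

The temporary directory is deleted when the `with` block exits, so nothing is left behind on disk.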
Functional analysis
Type: OpenClaw Skill
Name: category-link-collector
Version: 1.0.0
The skill bundle is a specialized tool for parsing e-commerce category hierarchies from URLs and saving the structured data into CSV files. While it contains hardcoded local file paths specific to the developer's environment (e.g., `/Users/zhangqirong/工作/caiji` in `scripts/collect_categories.py` and `config/settings.json`), which is a functional flaw regarding portability, the code logic is transparent and strictly follows its stated purpose. There are no indicators of data exfiltration, malicious network activity, or unauthorized system modifications.
Capability assessment
Purpose & Capability
The name/description (collect category links and produce CSVs) matches the actual scripts: functions extract_domain, extract_category_path, parse_category_hierarchy and collect_category_links implement that. However the package hardcodes a user-specific default output directory (/Users/zhangqirong/工作/caiji) in multiple places (SKILL.md, config/settings.json, collect_categories.collect_category_links default). That absolute path is unrelated to the skill's purpose and is surprising for a generic skill.
Instruction Scope
SKILL.md and README describe only local parsing and CSV generation; the runtime instructions do not request any credentials or network access. The code likewise performs purely local parsing and file writes. The only scope concern is the hardcoded default output directory (will write files to /Users/zhangqirong/工作/caiji unless overridden), which is a surprising side-effect but not external exfiltration.
Install Mechanism
There is no install spec (instruction-only from the platform's perspective). Provided code uses standard Python libraries and pandas; nothing is downloaded from arbitrary URLs or installed automatically by the skill bundle.
Credentials
The skill requests no environment variables or credentials (good). But it writes files by default to a fixed absolute path in a particular user's home; this implicit filesystem access is disproportionate to an innocuous parser unless the user explicitly overrides output_dir. The bundle also depends on pandas (declared in SKILL.md).
Persistence & Privilege
always is false and the skill does not request any platform-level persistent privileges. It writes CSV files to disk (its own data), which is normal for this utility. There is no evidence it modifies other skills or system settings.
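How to use
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: `/install category-link-collector`
- Once installed, call the skill by name or trigger it with `/category-link-collector`
- Provide the inputs described in the skill's parameter list to get structured output
Version history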
v1.0.0
Initial release: collects e-commerce category links, extracts multi-level categories automatically, and saves them as a CSV file
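FAQ
What is Category Link Collector?
It collects category link information from e-commerce sites, extracts the category hierarchy, and saves it as a CSV file. Use it when you need structured data from e-commerce category links. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 219 times so far.
How do I install Category Link Collector?
Run the command `/install category-link-collector` in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.
Is Category Link Collector free?
Yes. Category Link Collector is completely free and released under the MIT-0 license; it can be downloaded, installed, and used freely.
Which platforms does Category Link Collector support?
Category Link Collector is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Category Link Collector?
It is developed and maintained by QirongZhang (@qirongzhang); the current version is v1.0.0.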