Downloads: 95 | Stars: 0 | Active Installs: 0 | Versions: 1
Install in OpenClaw
/install ecommerce-scraper-pro
Description
Smart data scraping tool: extracts structured data from web pages and APIs, with batch processing support
README (SKILL.md)
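Data Scraper - Smart Data Scraping Tool
Automatically extracts structured data from web pages and APIs, with batch processing and multiple output formats.
Features
- 🕷️ Web scraping - automatically identifies and extracts target data
- 📊 Structured output - JSON, CSV, and Excel formats
- 🔄 Batch processing - scrapes multiple pages or URLs in one run
- 🛡️ Anti-blocking - smart request rate control
- 🔌 API integration - supports REST and GraphQL APIs
- 📝 Data cleaning - automatic de-duplication and formatting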
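The feature list does not say how de-duplication and formatting are done. As a rough sketch only, assuming records are flat dictionaries with hashable values and that two records count as duplicates when all their whitespace-normalized fields match, cleaning could look like the following. The helper name `clean_records` is hypothetical and is not taken from `scripts/data-scraper.py`.

```python
def clean_records(records):
    """Hypothetical helper: tidy string fields and drop exact duplicates.

    Assumes flat dicts with hashable values; first-seen order is kept.
    """
    seen = set()
    cleaned = []
    for record in records:
        # Collapse runs of whitespace in string values; leave other types as-is.
        normalized = {
            key: " ".join(value.split()) if isinstance(value, str) else value
            for key, value in record.items()
        }
        # A sorted tuple of items is a hashable, key-order-independent fingerprint.
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(normalized)
    return cleaned
```

Matching on the whole record is the most conservative choice; a real tool might instead key on a single field such as the item URL.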
Usage
Basic usage
# Scrape a single page
uv run scripts/data-scraper.py scrape --url "https://example.com/products" --selector ".product"
# Scrape multiple pages
uv run scripts/data-scraper.py scrape --urls-file urls.txt --output data.json
# Fetch data from an API
uv run scripts/data-scraper.py api --endpoint "https://api.example.com/data" --auth "Bearer TOKEN"
Advanced options
# Choose the output format
uv run scripts/data-scraper.py scrape --url "https://example.com" --format csv --output products.csv
# Set a delay between requests (to avoid being blocked)
uv run scripts/data-scraper.py scrape --url "https://example.com" --delay 2
# Use a proxy
uv run scripts/data-scraper.py scrape --url "https://example.com" --proxy "http://proxy:port"
# Scheduled scraping
uv run scripts/data-scraper.py scrape --url "https://example.com" --schedule "0 */6 * * *"
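The `--delay` option is documented only as a way to avoid being blocked. The sketch below shows one way a fixed gap between requests could be enforced, assuming the value is in seconds. The `Throttle` class is hypothetical and does not come from `scripts/data-scraper.py`; the clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Hypothetical helper: enforce a minimum gap (seconds) between requests."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least `delay` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` immediately before each request keeps the first request instant and spaces out only the later ones.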
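Supported data types
| Type | Description | Example fields |
|---|---|---|
| product | E-commerce products | price, name, rating, stock |
| article | News/blog posts | title, author, date, content |
| job | Job listings | position, company, salary, requirements |
| real_estate | Real estate listings | price, area, location, floor plan |
| social | Social media | posts, comments, like counts |
| custom | Custom | defined via CSS/XPath selectors |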
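The table above pairs each `--type` value with typical fields. A minimal sketch of how a type could map to default fields, with an explicit `--fields` string taking precedence, is shown here. The field identifiers and the `resolve_fields` helper are illustrative assumptions; the keys the real script uses may differ.

```python
# Illustrative field names per --type value; the real script's keys may differ.
DEFAULT_FIELDS = {
    "product": ["price", "name", "rating", "stock"],
    "article": ["title", "author", "date", "content"],
    "job": ["position", "company", "salary", "requirements"],
    "real_estate": ["price", "area", "location", "floor_plan"],
    "social": ["post", "comments", "likes"],
}


def resolve_fields(data_type, fields_arg=None):
    """Hypothetical helper: use a --fields string if given, else the type's defaults."""
    if fields_arg:
        # Split "title, price" into ["title", "price"], ignoring empty entries.
        return [name.strip() for name in fields_arg.split(",") if name.strip()]
    if data_type not in DEFAULT_FIELDS:
        # "custom" has no defaults: it is defined through CSS/XPath selectors.
        raise ValueError(f"no default fields for type {data_type!r}")
    return list(DEFAULT_FIELDS[data_type])
```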
Output formats
JSON (default)
{
  "url": "https://example.com",
  "scrapedAt": "2026-02-28T01:13:00Z",
  "data": [
    {
      "title": "Product title",
      "price": "$99.99",
      "rating": 4.5
    }
  ]
}
CSV
title,price,rating,url
Product title,$99.99,4.5,https://...
Excel
- Multiple worksheet support
- Automatic formatting
- Pivot tables
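The JSON and CSV samples above differ in shape: the JSON keeps `url` once at the top level, while the CSV repeats it on every row. The following sketch shows one way to flatten the JSON form into the CSV form using only the standard library. The function name `json_result_to_csv` is hypothetical, and the script's own conversion may work differently.

```python
import csv
import io
import json


def json_result_to_csv(result_json):
    """Flatten one scrape result (JSON text) into CSV text, one row per item."""
    result = json.loads(result_json)
    # Copy the top-level URL onto each item so every row is self-describing.
    rows = [{**item, "url": result["url"]} for item in result["data"]]
    # Column order follows first appearance across all rows.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Items that lack a column are written with an empty cell, which is the default behavior of `csv.DictWriter`.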
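Suggested pricing
| Tier | Features | Price |
|---|---|---|
| Basic | One-off scraping, 100 pages/month | $49 |
| Professional | Batch scraping, 1,000 pages/month, scheduled jobs | $149 |
| Enterprise | Unlimited scraping, API access, custom support | $499 |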
Examples
E-commerce product price monitoring
Input:
uv run scripts/data-scraper.py scrape \
  --url "https://amazon.com/s?k=wireless+headphones" \
  --type product \
  --fields "title,price,rating,reviews" \
  --output headphones.json
Output:
{
  "scrapedAt": "2026-02-28T01:13:00Z",
  "count": 50,
  "data": [
    {
      "title": "Sony WH-1000XM5",
      "price": "$349.99",
      "rating": 4.7,
      "reviews": 12453
    }
  ]
}
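In the sample output, `price` is a string rather than a number, so price monitoring needs a parsing step before any comparison. A minimal sketch, assuming US-dollar strings shaped like the one above (both helper names are hypothetical):

```python
from decimal import Decimal


def parse_usd(price_text):
    """Convert a string such as "$349.99" or "$1,299.00" into a Decimal amount."""
    cleaned = price_text.strip().lstrip("$").replace(",", "")
    return Decimal(cleaned)


def items_below(items, threshold):
    """Return the titles of items whose price is strictly below `threshold`."""
    limit = Decimal(str(threshold))
    return [item["title"] for item in items if parse_usd(item["price"]) < limit]
```

`Decimal` is used instead of `float` so that comparisons against a price threshold are not affected by binary rounding.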
Job listing scraping
Input:
uv run scripts/data-scraper.py scrape \
  --url "https://linkedin.com/jobs/search?keywords=python+developer" \
  --type job \
  --fields "title,company,location,salary" \
  --output jobs.csv
Technical implementation
- Uses Playwright/BeautifulSoup for page parsing
- Supports JavaScript-rendered pages
- Smart retries and error handling
- Can be integrated into OpenClaw workflows
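The README mentions retries but does not describe the policy. Exponential backoff is a common choice and is assumed here purely for illustration; the `retry` helper and its parameters are hypothetical.

```python
import time


def retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Waits are base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

A production version would normally catch only transient errors such as timeouts rather than every `Exception`.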
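Notes
⚠️ Use legally and in compliance
- Follow the target site's robots.txt
- Do not overload servers with excessive requests
- Respect data copyright and privacy
- Scrape public data only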
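Checking robots.txt before scraping can be automated with Python's standard `urllib.robotparser` module. The sketch below takes the robots.txt text as input so it works offline; the wrapper name `is_allowed` is hypothetical, and nothing in the listing confirms that the script performs this check itself.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if `robots_txt` (the file's text) lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, with the rules `User-agent: *` and `Disallow: /private/`, a URL under `/private/` is rejected while `/products` is permitted.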
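Changelog
v0.1.0 (2026-02-28)
- Initial release
- Basic web scraping
- JSON/CSV output
- Batch processing
Planned features
- Graphical configuration interface
- Data visualization
- Automatic field detection
- Cloud storage integration
- Real-time monitoring and alerts
Developer: VIC ai-company
License: MIT
Support: contact the main agent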
Usage Guidance
This skill is internally consistent with its stated purpose, but take these precautions before installing or running it:
1) Review the scripts/data-scraper.py source yourself (it is included) to confirm it matches your expectations.
2) Install dependencies from official package sources (pip) in a virtual environment.
3) Do not pass sensitive credentials to the tool unless necessary; when using --auth prefer scoped/temporary tokens.
4) Respect target sites' robots.txt, terms of service, and privacy laws; the examples in the docs (Amazon, LinkedIn) may violate those sites' terms.
5) Use rate limiting (--delay) and proxies responsibly to avoid IP blocking or causing service issues.
6) Run the tool in a sandbox or with limited filesystem/network permissions if you do not fully trust the publisher.
7) Verify what your platform's "uv run" runner does (it executes the bundled script) so you understand the execution context.
Capability Analysis
Type: OpenClaw Skill
Name: ecommerce-scraper-pro
Version: 1.0.1
The skill bundle provides a legitimate data scraping tool for extracting structured information from websites and APIs. The Python script (scripts/data-scraper.py) uses standard libraries like requests and BeautifulSoup, implements polite scraping with delays, and lacks any indicators of data exfiltration, unauthorized execution, or malicious prompt injection.
Capability Assessment
Purpose & Capability
The name/description (web/API data scraper) match the included files and code. Required libraries (requests, beautifulsoup4) are appropriate and the code implements scraping, batch processing, and output formatting as described.
Instruction Scope
SKILL.md and README instruct the agent/user to run the local Python script (uv run / python3 scripts/data-scraper.py) and pass URLs, optional auth, proxy, delay, and output paths. The instructions do not ask the agent to read unrelated system files or secrets. Note: examples show scraping high-profile sites (Amazon, LinkedIn); users should consider legal/terms-of-service and privacy implications before using the tool on such targets.
Install Mechanism
No install spec is embedded in the skill bundle (instruction-only from platform perspective). Dependencies are listed in requirements.txt and README recommends pip install -r requirements.txt — a standard, low-risk approach. There are no downloads from untrusted URLs or archive extraction steps in the skill metadata.
Credentials
The skill requests no environment variables or credentials in metadata. The CLI accepts an --auth argument and --proxy, which is appropriate for interacting with authenticated APIs or proxies. Exercise caution about supplying sensitive credentials to any third-party tool; this skill does not appear to transmit credentials to unexpected endpoints, but it will send provided auth to target endpoints.
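One way to act on the caution above is to keep the token out of command lines and source files by reading it from the environment. This is a sketch under stated assumptions: the variable name `SCRAPER_API_TOKEN` and the helper `bearer_header` are hypothetical, and the skill itself defines no environment variables.

```python
import os


def bearer_header(env_var="SCRAPER_API_TOKEN", environ=os.environ):
    """Build an Authorization header from an environment variable (name is hypothetical)."""
    token = environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return {"Authorization": f"Bearer {token}"}
```

The resulting value has the same `Bearer TOKEN` shape that the `--auth` example in the README uses.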
Persistence & Privilege
The skill does not request always:true and does not modify other skills or global agent settings. It runs as a local script and writes outputs to user-specified paths only.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install ecommerce-scraper-pro
- After installation, invoke the skill by name or use /ecommerce-scraper-pro
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
- Updated skill name and description to "data-scraper" with broader data extraction capabilities.
- Expanded supported features: now includes API integration, structured outputs (JSON, CSV, Excel), bulk processing, anti-blocking measures, automatic data cleaning, and scheduling options.
- Added usage instructions for web and API data scraping, output formatting, proxy support, and scheduled tasks.
- Included detailed data type and output format support, new pricing tiers, and real-world usage examples for multiple scenarios.
- Outlined technical implementation notes and compliance guidelines.
- Published initial release notes and outlined planned future features.
Frequently Asked Questions
What is E-commerce Data Scraper Pro?
A smart data scraping tool that extracts structured data from web pages and APIs, with batch processing support. It is an AI Agent Skill for Claude Code / OpenClaw, with 95 downloads so far.
How do I install E-commerce Data Scraper Pro?
Run "/install ecommerce-scraper-pro" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is E-commerce Data Scraper Pro free?
Yes, E-commerce Data Scraper Pro is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does E-commerce Data Scraper Pro support?
E-commerce Data Scraper Pro is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created E-commerce Data Scraper Pro?
It is built and maintained by chungvic (@chungvic); the current version is v1.0.1.