pepper-oil-scraper

Name: pepper-oil-scraper
Author: majorlau

Description

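A dedicated skill for scraping data on the Sichuan pepper oil (花椒油) and green Sichuan pepper oil (藤椒油) industry chain. It covers multi-dimensional data sources, including market size, raw material prices, corporate financial reports, imports and exports, industry reports, and the competitive landscape, and ships with crawler adapters for 20+ key websites. The skill is triggered when the user needs to collect industry data related to Sichuan pepper, green Sichuan pepper, seasoning oil, or Sichuan Pepper Oil. Even if the user only says "scrape data", "fetch reports", or "collect prices", as long as the ...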

Safety recommendations

This package appears to be a legitimate multi-site Python web scraper for pepper/pepper-oil industry data, but review the following points before installing:
- Install step mismatch: the skill metadata lists an install of kind "node" while the project is Python. Don't rely on the metadata installer. Create a Python virtual environment and run the SKILL.md pip install command yourself (pip install requests beautifulsoup4 lxml pandas openpyxl aiohttp fake-useragent). Avoid using --break-system-packages unless you understand its effects.
- Inspect config/targets.json before running: it can contain the site list, proxy settings, company lists, or HS codes. Make sure no private or unexpected endpoints or proxies are configured.
- Run in an isolated environment (container or VM) and as a non-root user to limit the blast radius in case of mistakes.
- Legal/ethical caution: the tool is a crawler. Verify target sites' robots.txt and terms of service, and avoid scraping pages that require authentication or are behind paywalls. The code includes anti-anti-crawl techniques (fake-useragent, proxy support, delay/backoff); use them carefully and lawfully.
- Confirm optional dependencies: the adapters for JS-heavy sites mention playwright, which requires separate installation and can be large. Install it only if needed.
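The config inspection recommended above can be partly automated. The following is a minimal sketch, assuming only that config/targets.json is valid JSON; its actual schema is not documented here, so the walker is schema-agnostic and the names `find_endpoints` and `audit_config` are hypothetical helpers, not part of the skill.

```python
import json
from urllib.parse import urlparse

# Schemes worth flagging; this set is an assumption, extend it as needed.
URL_SCHEMES = {"http", "https", "socks4", "socks5"}


def find_endpoints(node, path=""):
    """Yield (json_path, value) for every string that looks like a URL
    or that sits under a key whose path mentions 'proxy'."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield from find_endpoints(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_endpoints(value, f"{path}[{index}]")
    elif isinstance(node, str):
        parsed = urlparse(node)
        looks_like_url = parsed.scheme in URL_SCHEMES and bool(parsed.netloc)
        if looks_like_url or "proxy" in path.lower():
            yield path, node


def audit_config(text):
    """Parse JSON text and return all URL-like or proxy-related entries."""
    return list(find_endpoints(json.loads(text)))
```

To use it, read the file with `open("config/targets.json", encoding="utf-8")`, pass the text to `audit_config`, and review each reported path by hand. The sketch only lists candidates; deciding whether an endpoint is unexpected remains a manual judgment.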

Functional analysis

Type: OpenClaw Skill
Name: pepper-oil-scraper
Version: 1.0.0

The pepper-oil-scraper bundle is a comprehensive and well-structured toolset for gathering industry data related to Sichuan pepper oil. It contains 26 specialized adapters in the 'scripts/adapters/' directory for scraping market prices, corporate filings, and government reports from legitimate sources like cninfo.com.cn and forestry.gov.cn. The code follows standard scraping practices, including rate limiting, user-agent rotation, and data standardization, with no evidence of malicious intent, data exfiltration, or prompt injection.
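For readers unfamiliar with the practices named above, here is a minimal sketch of randomized delays, user-agent rotation, and exponential backoff using only the standard library. It is not the skill's own code: `polite_call`, `backoff_delay`, the `USER_AGENTS` list, and all default values are illustrative assumptions, and the `fetch` callable stands in for whatever HTTP client an adapter uses.

```python
import random
import time

# Illustrative placeholder strings, not real browser user agents.
USER_AGENTS = [
    "example-agent/1.0",
    "example-agent/2.0",
    "example-agent/3.0",
]


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff: base * 2**attempt seconds, capped at `cap`."""
    return min(cap, base * 2 ** attempt)


def polite_call(fetch, url, retries=3, min_delay=1.0, max_delay=3.0,
                sleep=time.sleep, rng=random):
    """Call fetch(url, headers) with a random pre-request delay and a
    rotated User-Agent, backing off after each OSError."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        # Rate limiting: wait a random interval before every request.
        sleep(rng.uniform(min_delay, max_delay))
        headers = {"User-Agent": rng.choice(USER_AGENTS)}
        try:
            return fetch(url, headers)
        except OSError as exc:
            # urllib.error.URLError and requests' RequestException
            # both derive from OSError, so they are caught here.
            last_error = exc
            if attempt < retries - 1:
                sleep(backoff_delay(attempt))
    raise last_error
```

Passing `sleep` and `rng` as parameters keeps the function deterministic under test; in normal use the defaults apply.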

Capability assessment

✓ Purpose & Capability

The name and description claim a Python-based scraper for pepper/pepper-oil industry data from ~20 sites. The package contains many Python scraper adapters, a main scheduler, data-cleaning and export tools, and a config/targets.json file listing the sites, all consistent with the stated purpose.

✓ Instruction Scope

SKILL.md gives concrete pip install commands and examples to run the Python scripts. The runtime instructions only describe scraping public websites and producing local JSON/XLSX outputs; they do not instruct reading unrelated system files or sending data to hidden endpoints. They do suggest optional proxy and playwright usage for JS sites.

⚠ Install Mechanism

The metadata.install entry uses kind: "node" (id: pip-deps) despite the project being pure Python and the SKILL.md instructing pip installs. This mismatch looks like a packaging/metadata error and may mean the declared install step won't run as intended. The SKILL.md itself asks the user to run pip install (no external archive downloads or obscure URLs), so risk is low but the metadata inconsistency should be fixed or the user should manually install Python deps in a venv.

✓ Credentials

The skill requests no environment variables or credentials. The code reads only its included config/targets.json and writes outputs to the configured output directory. No credentials or unrelated secrets are requested or referenced in the provided files.

✓ Persistence & Privilege

Flags are default (always:false, user-invocable:true). The skill does not request persistent platform privileges or modify other skills/configs. It runs local Python scripts and saves results to the filesystem only.

Version history

v1.0.0

pepper-oil-scraper 1.0.0
- Initial release of a specialized scraper for pepper oil and Sichuan pepper industry chain data.
- Covers multi-dimensional sources including market size, raw material prices, company reports, imports/exports, industry analysis, and competition.
- Contains adapters for 20+ major Chinese and global data sites, supporting both category-based and site-specific scraping.
- Supports robust anti-crawling strategies: randomized delays, fake user agents, referer header, proxy pool support, and JS rendering with playwright.
- Built-in tools for standardized data output and Excel report exporting.
- Data outputs include source_url, crawl_time, and original_text fields, with unified data units.
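The output convention in the last item can be illustrated with a small sketch. Only the field names source_url, crawl_time, and original_text come from the release notes; the `value` and `unit` fields, the `make_record` helper, and the choice of CNY/kg as the unified unit are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Assumed conversion factors into CNY/kg (1 jin = 0.5 kg, 1 t = 1000 kg).
TO_CNY_PER_KG = {"CNY/kg": 1.0, "CNY/jin": 2.0, "CNY/t": 0.001}


@dataclass
class Record:
    source_url: str     # page the value was scraped from
    crawl_time: str     # ISO 8601 timestamp of the crawl
    original_text: str  # raw text, kept for later verification
    value: float        # hypothetical field: price in the unified unit
    unit: str           # hypothetical field: always "CNY/kg" here


def make_record(source_url, original_text, value, unit, now=None):
    """Build a record with the price normalized to CNY/kg."""
    if unit not in TO_CNY_PER_KG:
        raise ValueError(f"unknown unit: {unit}")
    now = now or datetime.now(timezone.utc)
    return Record(
        source_url=source_url,
        crawl_time=now.isoformat(),
        original_text=original_text,
        value=value * TO_CNY_PER_KG[unit],
        unit="CNY/kg",
    )
```

Calling `asdict(record)` yields a plain dictionary that can be written to JSON or loaded into a pandas DataFrame for Excel export. Keeping original_text alongside the normalized value lets a reviewer check each conversion against the source.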

Metadata

Slug pepper-oil-scraper

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

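FAQ

What is pepper-oil-scraper?

A dedicated skill for scraping data on the Sichuan pepper oil (花椒油) and green Sichuan pepper oil (藤椒油) industry chain. It covers multi-dimensional data sources, including market size, raw material prices, corporate financial reports, imports and exports, industry reports, and the competitive landscape, and ships with crawler adapters for 20+ key websites. The skill is triggered when the user needs to collect industry data related to Sichuan pepper, green Sichuan pepper, seasoning oil, or Sichuan Pepper Oil. Even if the user only says "scrape data", "fetch reports", or "collect prices", as long as the ... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 127 cumulative downloads to date.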

How do I install pepper-oil-scraper?

Run the command "/install pepper-oil-scraper" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is pepper-oil-scraper free?

Yes. pepper-oil-scraper is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does pepper-oil-scraper support?

pepper-oil-scraper is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed pepper-oil-scraper?

It is developed and maintained by MajorLau (@majorlau). The current version is v1.0.0.
