pepper-oil-scraper

Name: pepper-oil-scraper
Author: majorlau

Description

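A dedicated skill for scraping data on the Sichuan pepper oil (huajiao oil) and green Sichuan pepper oil (tengjiao oil) industry chain. It covers multi-dimensional data sources including market size, raw material prices, corporate financial reports, imports and exports, industry reports, and the competitive landscape, with built-in scraper adapters for 20+ key websites. The skill is triggered when the user needs to collect industry data related to huajiao / tengjiao / seasoning oil / Sichuan Pepper Oil. Even if the user only says "scrape data", "fetch reports", or "collect prices", as long as ...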

Usage Guidance

This package appears to be a legitimate multi-site Python web scraper for pepper/pepper-oil industry data, but review these points before installing:

- Install step mismatch: the skill metadata lists an install of kind "node" while the project is Python. Don't rely on the metadata installer; create a Python virtual environment and run the SKILL.md pip install command yourself (pip install requests beautifulsoup4 lxml pandas openpyxl aiohttp fake-useragent). Avoid using --break-system-packages unless you understand its effects.
- Inspect config/targets.json before running: it can contain the site list, proxy settings, company lists or HS codes. Make sure no private or unexpected endpoints/proxies are configured.
- Run in an isolated environment (container or VM) and as a non-root user to limit the blast radius in case of mistakes.
- Legal/ethical caution: the tool is a crawler. Verify target sites' robots.txt and terms of service, and avoid scraping pages that require authentication or are behind paywalls. The code includes anti-anti-crawl techniques (fake-useragent, proxy support, delay/backoff); be careful and lawful when using them.
- Confirm optional dependencies: the instructions mention playwright for JS-heavy sites. It requires separate installation and can be large, so only install it if needed.

If you want, I can (a) list the exact packages and versions to install in a virtualenv, (b) parse config/targets.json for suspicious proxy/URL entries, or (c) point out any adapters that mention downloading PDFs or interacting with APIs so you can audit those specific behaviors.
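The config inspection suggested above can be sketched as a small Python script. This is only an illustration: the real schema of config/targets.json is not shown on this page, so the sketch walks any JSON structure generically, and the names `find_urls`, `audit_config`, and the sample data are hypothetical.

```python
import json
from urllib.parse import urlparse

URL_SCHEMES = ("http", "https", "socks4", "socks5")


def load_config(path):
    """Load a JSON config file (e.g. config/targets.json) as Python objects."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_urls(node, path=""):
    """Recursively yield (json_path, value) for every string that parses as a URL."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_urls(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_urls(value, f"{path}[{index}]")
    elif isinstance(node, str):
        parsed = urlparse(node)
        if parsed.scheme in URL_SCHEMES and parsed.netloc:
            yield path, node


def audit_config(config, allowed_hosts):
    """Return URL entries that sit under a 'proxy' key or point at a host not on the allow-list."""
    flagged = []
    for path, url in find_urls(config):
        host = urlparse(url).hostname or ""
        is_proxy = "proxy" in path.lower()
        is_allowed = any(host == h or host.endswith("." + h) for h in allowed_hosts)
        if is_proxy or not is_allowed:
            flagged.append((path, url))
    return flagged


# Hypothetical sample standing in for the real file contents.
sample = {
    "sites": [
        {"name": "cninfo", "url": "https://www.cninfo.com.cn/"},
        {"name": "unknown", "url": "http://example.invalid/data"},
    ],
    "proxy": {"http": "socks5://10.0.0.5:1080"},
}
flagged = audit_config(sample, allowed_hosts={"cninfo.com.cn", "forestry.gov.cn"})
for json_path, url in flagged:
    print(f"review: {json_path} -> {url}")
```

To audit the real file, pass `load_config("config/targets.json")` instead of `sample` and extend the allow-list with the hosts you expect the skill to contact.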

Capability Analysis

Type: OpenClaw Skill
Name: pepper-oil-scraper
Version: 1.0.0

The pepper-oil-scraper bundle is a comprehensive and well-structured toolset for gathering industry data related to Sichuan pepper oil. It contains 26 specialized adapters in the 'scripts/adapters/' directory for scraping market prices, corporate filings, and government reports from legitimate sources like cninfo.com.cn and forestry.gov.cn. The code follows standard scraping practices, including rate limiting, user-agent rotation, and data standardization, with no evidence of malicious intent, data exfiltration, or prompt injection.
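For readers unfamiliar with the rate-limiting pattern mentioned above, here is a minimal sketch of retrying with capped exponential backoff and random jitter. It is not taken from the skill's code; `backoff_delay`, `fetch_with_retries`, and their parameters are hypothetical names chosen for illustration.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Seconds to wait before retrying after failed attempt `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`; the actual wait is drawn
    uniformly from [0, ceiling] so concurrent workers do not retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


def fetch_with_retries(fetch, url, max_attempts=4, sleep=time.sleep, rng=random):
    """Call fetch(url); on any exception wait and retry, re-raising the last error."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, rng=rng))


# Demonstration with a fake fetcher that fails twice, then succeeds.
calls, waits = [], []


def flaky(url):
    calls.append(url)
    if len(calls) < 3:
        raise IOError("temporary failure")
    return "ok"


result = fetch_with_retries(flaky, "https://example.invalid/page", sleep=waits.append)
```

Passing `sleep` and `rng` as parameters keeps the function testable without real waiting; in a real run the defaults apply and `fetch` would be whatever HTTP call the adapter makes.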

Capability Assessment

✓ Purpose & Capability

The name and description claim a Python-based scraper for pepper/pepper-oil industry data from ~20 sites. The package contains many Python scraper adapters, a main scheduler, data-cleaner and export tools, and a config/targets.json listing sites, all consistent with the stated purpose.

✓ Instruction Scope

SKILL.md gives concrete pip install commands and examples to run the Python scripts. The runtime instructions only describe scraping public websites and producing local JSON/XLSX outputs; they do not instruct reading unrelated system files or sending data to hidden endpoints. They do suggest optional proxy and playwright usage for JS sites.

⚠ Install Mechanism

The metadata.install entry uses kind: "node" (id: pip-deps) despite the project being pure Python and the SKILL.md instructing pip installs. This mismatch looks like a packaging/metadata error and may mean the declared install step won't run as intended. The SKILL.md itself asks the user to run pip install (no external archive downloads or obscure URLs), so risk is low but the metadata inconsistency should be fixed or the user should manually install Python deps in a venv.

✓ Credentials

The skill requests no environment variables or credentials. The code reads only its included config/targets.json and writes outputs to the configured output directory. No credentials or unrelated secrets are requested or referenced in the provided files.

✓ Persistence & Privilege

Flags are default (always:false, user-invocable:true). The skill does not request persistent platform privileges or modify other skills/configs. It runs local Python scripts and saves results to the filesystem only.

Version History

v1.0.0

pepper-oil-scraper 1.0.0

- Initial release of a specialized scraper for pepper oil and Sichuan pepper industry chain data.
- Covers multi-dimensional sources including market size, raw material prices, company reports, imports/exports, industry analysis, and competition.
- Contains adapters for 20+ major Chinese and global data sites, supporting both category-based and site-specific scraping.
- Supports robust anti-crawling strategies: randomized delays, fake user agents, referer header, proxy pool support, and JS-rendering with playwright.
- Built-in tools for standardized data output and Excel report exporting.
- Data outputs include source_url, crawl_time, and original_text fields, with unified data units.
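As an illustration of what a standardized output row with unified units might look like, here is a small Python sketch. Only the field names source_url, crawl_time, and original_text come from the release notes; the `value` and `unit` fields, the conversion table, and the choice of CNY/kg as the target unit are assumptions made for this example.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

# Hypothetical conversion factors to CNY per kilogram.
# 1 jin = 0.5 kg, so a per-jin price doubles; 1 t = 1000 kg.
TO_CNY_PER_KG = {"CNY/kg": 1.0, "CNY/jin": 2.0, "CNY/t": 0.001}


@dataclass
class Record:
    source_url: str      # page the value was scraped from
    crawl_time: str      # ISO 8601 timestamp of the crawl (UTC)
    original_text: str   # raw text the value was parsed from
    value: float         # assumed field: price normalized to CNY/kg
    unit: str            # assumed field: always "CNY/kg" after normalization


def make_record(source_url, original_text, value, unit, now=None):
    """Build a Record, converting `value` from `unit` to CNY/kg."""
    now = now or datetime.now(timezone.utc)
    factor = TO_CNY_PER_KG[unit]  # raises KeyError for an unknown unit
    return Record(source_url, now.isoformat(), original_text, value * factor, "CNY/kg")


record = make_record(
    "https://example.invalid/prices",
    "花椒 45元/斤",
    45.0,
    "CNY/jin",
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
row = asdict(record)  # plain dict, ready for JSON or a pandas DataFrame
```

Keeping `original_text` next to the normalized value lets a reviewer check each conversion against what the source page actually said.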

Metadata

Slug pepper-oil-scraper

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is pepper-oil-scraper?

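A dedicated skill for scraping data on the Sichuan pepper oil (huajiao oil) and green Sichuan pepper oil (tengjiao oil) industry chain. It covers multi-dimensional data sources including market size, raw material prices, corporate financial reports, imports and exports, industry reports, and the competitive landscape, with built-in scraper adapters for 20+ key websites. The skill is triggered when the user needs to collect industry data related to huajiao / tengjiao / seasoning oil / Sichuan Pepper Oil. Even if the user only says "scrape data", "fetch reports", or "collect prices", as long as ... It is an AI Agent Skill for Claude Code / OpenClaw, with 127 downloads so far.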

How do I install pepper-oil-scraper?

Run "/install pepper-oil-scraper" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is pepper-oil-scraper free?

Yes, pepper-oil-scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does pepper-oil-scraper support?

pepper-oil-scraper is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created pepper-oil-scraper?

It is built and maintained by MajorLau (@majorlau); the current version is v1.0.0.
