Downloads: 72
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install panscrapling-web-scraper
Description
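A powerful web scraping skill. Built on Scrapling, it automatically bypasses Cloudflare and other anti-bot systems. Trigger words: 抓取网页 (scrape a web page), 爬取 (crawl), scrape, fetch, 抓取内容 (scrape content), 提取网页 (extract a web page), 获取页面 (get a page). Use cases: (1) scraping pages protected by Cloudflare, (2) extracting page content, (3) collecting web data, (4) scraping dynamically rendered pages. Automatic installation: ...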
Usage Guidance
Before installing or running this skill:
- Expect it to modify your system: setup.py will try to install Homebrew (via curl|bash), install Python 3.11 via Homebrew, pip-install packages (with an online fallback), and download Playwright/Patchright browser binaries. Run only on machines where you accept these changes.
- The SKILL.md claims 'embedded wheels' and 'offline install', but the installer falls back to online network installs. If you need offline usage, verify that the wheels and packaged files are present in the provided wheels/ directory.
- There are no integrity checks for downloaded scripts or packages. If you want to proceed, review the included wheels and the full setup.py and fetch.py contents locally, or run the installer in a sandboxed/VM environment.
- The installer is macOS/Homebrew-centric; on other OSes it may fail or behave unexpectedly. There is no OS restriction declared.
- The skill claims 'automatic bypass of Cloudflare/Turnstile'. That capability can be used for legitimate scraping but also to circumvent protections. Ensure your intended scraping is legal and compliant with target site policies.
- If you cannot review the code and wheels yourself, avoid installing. Otherwise, prefer running this inside an isolated environment (container or VM) and inspect network activity during the first run. If you need offline assurance, confirm the wheels/ directory contains the expected .whl files and do not let the script run its online fallback.
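The check for expected .whl files can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill: the function names are hypothetical, and the list of expected packages is an assumption you would replace with whatever the skill's documentation says it bundles. It relies only on the standard wheel filename convention, in which the filename starts with the distribution name followed by a hyphen.

```python
import re
from pathlib import Path


def normalize(name):
    """Normalize a distribution name so 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def missing_wheels(wheels_dir, expected):
    """Return the expected package names that have no .whl file in wheels_dir."""
    root = Path(wheels_dir)
    present = set()
    if root.is_dir():
        for wheel in root.glob("*.whl"):
            # Wheel filenames begin with "<distribution>-<version>-".
            present.add(normalize(wheel.name.split("-")[0]))
    return sorted(normalize(p) for p in expected if normalize(p) not in present)
```

For example, `missing_wheels("wheels", ["scrapling", "playwright", "patchright"])` returns the names from that (assumed) list that are absent. An empty result only shows the files exist; it says nothing about their contents, so it complements rather than replaces a manual review.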
Capability Analysis
Type: OpenClaw Skill
Name: panscrapling-web-scraper
Version: 1.0.0
The skill performs high-risk system-level operations in `scripts/setup.py`, including the automated installation of Homebrew via a `curl | bash` command from a remote GitHub URL and the installation of multiple browser engines and Python packages. While these actions are consistent with the stated goal of setting up a stealthy web scraping environment, the automated execution of remote scripts and broad environment modifications represent a significant attack surface. No clear evidence of intentional malice, such as data exfiltration or backdoors, was found in the provided code.
Capability Assessment
Purpose & Capability
The code and instructions align with a browser-driven web scraper that uses Scrapling/Playwright/Patchright and installs Python and browsers. However, the SKILL.md claims fully embedded, offline operation, while setup.py contains explicit online fallbacks (installing Homebrew via curl, pip installs from an online index) and Homebrew-only install paths. These fallbacks are inconsistent with the 'offline embedded' claim, and the macOS/Homebrew-centric install logic is inconsistent with the skill metadata, which declares no OS restriction.
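One way to close the gap between Homebrew-only install paths and the missing OS restriction is a platform guard that runs before any install step. The sketch below is an assumption about how such a guard could look, not code from the skill. The function names are hypothetical, and it treats macOS as the only supported platform because the review describes the installer as macOS/Homebrew-centric.

```python
import platform


def installer_supported(system_name=None):
    """Return True only on macOS, where the Homebrew-based steps are assumed to apply."""
    name = platform.system() if system_name is None else system_name
    return name == "Darwin"  # platform.system() reports "Darwin" on macOS


def describe_support(system_name=None):
    """Return a human-readable message explaining what the guard decided."""
    name = platform.system() if system_name is None else system_name
    if installer_supported(name):
        return f"{name}: Homebrew-based install steps are expected to apply"
    return f"{name}: not macOS; skip the Homebrew steps and install prerequisites manually"
```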
Instruction Scope
Runtime instructions and scripts will automatically run system-level installation steps: installing Homebrew, installing Python, pip installing packages, and downloading browser binaries. SKILL.md asserts 'no third-party API calls' and 'all requests through local browser', but the installer performs network operations (curl Homebrew install, pip fallback to remote indices, Playwright/patchright downloads). The installer also attempts to modify system state without an explicit interactive consent step in the skill instructions.
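An explicit consent step of the kind the installer lacks could look like the following. This is a hypothetical sketch: `confirm` and `run_steps` are illustrative names, and the step descriptions are supplied by the caller. Each system-modifying step runs only after the user approves it, and the sequence stops at the first refusal.

```python
def confirm(action, reader=input):
    """Ask the user to approve one action; anything other than y/yes counts as no."""
    answer = reader(f"About to {action}. Continue? [y/N] ")
    return answer.strip().lower() in ("y", "yes")


def run_steps(steps, reader=input):
    """Run each (description, callable) pair only after the user approves it."""
    completed = []
    for description, step in steps:
        if not confirm(description, reader):
            break  # stop at the first declined step; later steps may depend on it
        step()
        completed.append(description)
    return completed
```

Defaulting to "no" on empty input means an unattended run cannot silently modify the host.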
Install Mechanism
There is no registry install spec. Instead, the included setup.py runs subprocesses that perform network installs: curl | bash for Homebrew, pip installs, and Playwright/Patchright browser downloads. Fetching the Homebrew installer from GitHub raw and using a Tsinghua PyPI mirror are common practices, but both are online actions executed without signature or checksum verification. The script prefers local wheels first, which is the safer path, but explicitly falls back to online installation.
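The missing integrity check can be illustrated with a checksum gate: download the content first, compare its SHA-256 digest against a value pinned in advance, and execute only on a match. The sketch below is an assumption about how that could be done, not the skill's code. The function names are hypothetical, and the pinned digest would have to come from a source you trust.

```python
import hashlib
import hmac


def sha256_matches(data, expected_hex):
    """Compare the SHA-256 digest of data against a pinned hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def run_if_verified(data, expected_hex, runner):
    """Pass data to runner only if its digest matches; otherwise refuse."""
    if not sha256_matches(data, expected_hex):
        raise ValueError("SHA-256 mismatch; refusing to execute downloaded content")
    return runner(data)
```

This differs from curl | bash in that the bytes that are verified are the same bytes that are executed, so a changed or tampered script fails before it runs.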
Credentials
The skill requests no environment variables or credentials, which is appropriate for a scraper. However, it does require filesystem and network access to install software and to download browser binaries; that is expected for this functionality but is impactful and should be considered by the user.
Persistence & Privilege
The skill is not force-enabled (always:false) and does not request credentials, but its installer changes the host system by installing Homebrew/Python and downloading browsers. These are persistent system modifications beyond the skill's own files. This is functionally expected for a scraper needing a browser runtime but increases the blast radius and should be noted.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install panscrapling-web-scraper
- After installation, invoke the skill by name or use /panscrapling-web-scraper
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Panscrapling Web Scraper 1.0.0
- Initial release with powerful web scraping capabilities based on Scrapling.
- Automatically bypasses Cloudflare/anti-bot protections, including Turnstile.
- Fully embedded distribution: includes all Python dependencies and supports offline installation.
- Features fast, stealthy, and dynamic scraping modes for a variety of web pages.
- Automatic installation of Python 3.10+, dependencies, and browser components on first use.
- CLI tool supports element extraction, Markdown output, media/meta extraction, and setup.
Frequently Asked Questions
What is Panscrapling Web Scraper?
A powerful web scraping skill built on Scrapling that automatically bypasses Cloudflare and other anti-bot systems. Typical uses include scraping Cloudflare-protected pages, extracting page content, collecting web data, and scraping dynamically rendered pages. It is an AI Agent Skill for Claude Code / OpenClaw, with 72 downloads so far.
How do I install Panscrapling Web Scraper?
Run "/install panscrapling-web-scraper" in the OpenClaw or Claude Code chat to install it in one step. On first use, the skill's installer sets up Python, dependencies, and browser components automatically.
Is Panscrapling Web Scraper free?
Yes, Panscrapling Web Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Panscrapling Web Scraper support?
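Panscrapling Web Scraper declares no OS restriction, so it can be installed anywhere OpenClaw / Claude Code is available. Its installer is macOS/Homebrew-centric, however, and may fail or behave unexpectedly on other operating systems.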
Who created Panscrapling Web Scraper?
It is built and maintained by dashiming (@dashiming); the current version is v1.0.0.