Panscrapling Web Scraper

Name: Panscrapling Web Scraper
Author: dashiming

by dashiming · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install panscrapling-web-scraper

Description

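A powerful web scraping skill. Built on Scrapling, it automatically bypasses Cloudflare and other anti-scraping systems. Trigger words: 抓取网页 (scrape a web page), 爬取 (crawl), scrape, fetch, 抓取内容 (grab content), 提取网页 (extract a web page), 获取页面 (get a page). Use cases: (1) scraping pages protected by Cloudflare, (2) extracting page content, (3) collecting web data, (4) scraping dynamically rendered pages. Automatic installation: ...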

Usage Guidance

Before installing or running this skill:
- Expect it to modify your system: setup.py will try to install Homebrew (via curl|bash), install Python 3.11 via Homebrew, pip-install packages (with an online fallback), and download Playwright/Patchright browser binaries. Run only on machines where you accept these changes.
- The SKILL.md claims 'embedded wheels' and 'offline install', but the installer falls back to online network installs. Verify the wheels/packaged files are present in the provided wheels/ directory if you need offline usage.
- There are no integrity checks for downloaded scripts or packages. If you want to proceed, review the included wheels and the full setup.py and fetch.py contents locally, or run the installer in a sandboxed/VM environment.
- The installer is macOS/Homebrew-centric; on other OSes it may fail or behave unexpectedly. There is no OS restriction declared.
- The skill claims 'automatic bypass of Cloudflare/Turnstile'. That capability can be used for legitimate scraping but also to circumvent protections. Ensure your intended scraping is legal and compliant with target site policies.
- If you cannot review code/wheels yourself, avoid installing. Prefer running this inside an isolated environment (container/VM) and inspect network activity during the first run. If you need offline assurance, confirm the wheels/ directory contains the expected whl files and avoid letting the script run its online fallback.
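One way to act on the advice about confirming the wheels/ directory is a small inventory script, sketched below. It is not part of the skill: the function names are hypothetical, and the expected package names in the usage note (scrapling, playwright, patchright) are an assumption based on the libraries named in this review. It relies on the wheel filename convention that the distribution name comes before the first hyphen.

```python
import hashlib
from pathlib import Path


def inventory_wheels(wheels_dir):
    """Return {wheel filename: SHA-256 hex digest} for each .whl file in wheels_dir.

    Returns an empty dict if the directory does not exist.
    """
    directory = Path(wheels_dir)
    if not directory.is_dir():
        return {}
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(directory.glob("*.whl"))
    }


def missing_packages(inventory, expected):
    """Return the expected package names that have no matching wheel file.

    Wheel filenames begin with the distribution name, in which '-' is
    written as '_', followed by a hyphen and the version.
    """
    present = {name.split("-", 1)[0].lower() for name in inventory}
    return [pkg for pkg in expected if pkg.lower().replace("-", "_") not in present]
```

For example, `missing_packages(inventory_wheels("wheels"), ["scrapling", "playwright", "patchright"])` would list any of those packages with no local wheel, which is the situation in which the installer's online fallback would be triggered. The digests can be recorded and compared against a trusted source if one is available.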

Capability Analysis

Type: OpenClaw Skill
Name: panscrapling-web-scraper
Version: 1.0.0

The skill performs high-risk system-level operations in `scripts/setup.py`, including the automated installation of Homebrew via a `curl | bash` command from a remote GitHub URL and the installation of multiple browser engines and Python packages. While these actions are consistent with the stated goal of setting up a stealthy web scraping environment, the automated execution of remote scripts and broad environment modifications represent a significant attack surface. No clear evidence of intentional malice, such as data exfiltration or backdoors, was found in the provided code.

Capability Assessment

ℹ Purpose & Capability

The code and instructions align with a browser-driven web scraper that uses Scrapling/Playwright/Patchright and installs Python and browsers. However, the SKILL.md claims fully-embedded/offline operation while the setup.py contains explicit online fallback behavior (install Homebrew via curl, pip from an online index) and Homebrew-only install paths — yet the skill metadata has no OS restriction. The macOS/Homebrew-centric install logic and online fallbacks are inconsistent with the 'offline embedded' claim and the lack of OS limitation.
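The missing OS restriction could be addressed with a platform guard that runs before any Homebrew step. The following is a minimal sketch, not code from the skill: `check_platform` and `SUPPORTED_SYSTEMS` are hypothetical names, and limiting the set to macOS is an assumption drawn from the review's description of the installer as macOS/Homebrew-centric.

```python
import platform

# Assumption: only macOS is treated as supported, because the review
# describes the install logic as macOS/Homebrew-centric.
SUPPORTED_SYSTEMS = {"Darwin"}


def check_platform(system_name=None):
    """Return (ok, message) describing whether the install path applies.

    system_name defaults to platform.system(); it can be passed explicitly
    to test the guard without changing machines.
    """
    name = platform.system() if system_name is None else system_name
    if name in SUPPORTED_SYSTEMS:
        return True, f"{name}: Homebrew-based install path applies"
    return False, f"{name or 'unknown'}: not a supported platform, stopping before any install step"
```

An installer could call `check_platform()` first and exit with the returned message when the first element is false, rather than failing partway through a Homebrew command.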

⚠ Instruction Scope

Runtime instructions and scripts will automatically run system-level installation steps: installing Homebrew, installing Python, pip installing packages, and downloading browser binaries. SKILL.md asserts 'no third-party API calls' and 'all requests through local browser', but the installer performs network operations (curl Homebrew install, pip fallback to remote indices, Playwright/patchright downloads). The installer also attempts to modify system state without an explicit interactive consent step in the skill instructions.
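The explicit consent step that the review finds missing might look like the sketch below. It is illustrative only: `confirm_steps` and `PLANNED_STEPS` are hypothetical names, and the step descriptions are paraphrased from this review rather than taken from setup.py.

```python
# Step descriptions paraphrased from the review, not read from setup.py.
PLANNED_STEPS = [
    "Install Homebrew (remote script via curl | bash)",
    "Install Python via Homebrew",
    "pip-install packages (online fallback if local wheels are missing)",
    "Download Playwright/Patchright browser binaries",
]


def confirm_steps(steps, ask=input):
    """List the planned system changes and return True only on an explicit 'yes'.

    ask is the prompt function; it defaults to input() and can be replaced
    in tests or non-interactive runs.
    """
    print("The installer would perform these system-level changes:")
    for number, step in enumerate(steps, start=1):
        print(f"  {number}. {step}")
    answer = ask("Proceed? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Requiring the full word "yes", and treating anything else (including an empty reply) as a refusal, keeps an accidental Enter keypress from starting system-wide installs.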

⚠ Install Mechanism

There is no registry install spec; instead the included setup.py runs subprocesses that execute network installs (curl | bash for Homebrew, pip installs, playwright/patchright browser downloads). Using GitHub raw for Homebrew installation and a Tsinghua PyPI mirror are common but are online network actions executed without integrity checks. The script first prefers local wheels (good) but explicitly falls back to online installation; downloads are executed without signature or checksum verification.
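The absent checksum verification can be illustrated with a short sketch that compares downloaded bytes against a pinned SHA-256 digest before they are used. This is not the skill's code; `verify_sha256` and `require_verified` are hypothetical helpers, and the pinned digest is assumed to come from a source the user already trusts.

```python
import hashlib
import hmac


def verify_sha256(payload: bytes, expected_hex: str) -> bool:
    """Return True only if payload hashes to the pinned SHA-256 hex digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # Normalize whitespace and case in the pinned value before comparing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def require_verified(payload: bytes, expected_hex: str) -> bytes:
    """Return payload unchanged if its digest matches; otherwise raise ValueError."""
    if not verify_sha256(payload, expected_hex):
        raise ValueError("SHA-256 mismatch: refusing to use downloaded content")
    return payload
```

The key design point is ordering: the content is fetched into memory or a file, verified, and only then executed or installed, which a direct `curl | bash` pipeline cannot do. For Python packages, pip's hash-checking mode (`--require-hashes` with hashes listed in a requirements file) provides the same kind of check.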

✓ Credentials

The skill requests no environment variables or credentials, which is appropriate for a scraper. However, it does require filesystem and network access to install software and to download browser binaries; that is expected for this functionality but is impactful and should be considered by the user.

ℹ Persistence & Privilege

The skill is not force-enabled (always:false) and does not request credentials, but its installer will change the host system by installing Homebrew/Python and downloading browsers—persistent system modifications beyond the skill's own files. This is functionally expected for a scraper needing a browser runtime but increases the blast radius and should be noted.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install panscrapling-web-scraper
After installation, invoke the skill by name or use /panscrapling-web-scraper
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Panscrapling Web Scraper 1.0.0
- Initial release with powerful web scraping capabilities based on Scrapling.
- Automatically bypasses Cloudflare/anti-bot protections, including Turnstile.
- Fully embedded distribution: includes all Python dependencies and supports offline installation.
- Features fast, stealthy, and dynamic scraping modes for a variety of web pages.
- Automatic installation of Python 3.10+, dependencies, and browser components on first use.
- CLI tool supports element extraction, Markdown output, media/meta extraction, and setup.

Metadata

Slug panscrapling-web-scraper

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Panscrapling Web Scraper?

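A powerful web scraping skill. Built on Scrapling, it automatically bypasses Cloudflare and other anti-scraping systems. Trigger words: 抓取网页 (scrape a web page), 爬取 (crawl), scrape, fetch, 抓取内容 (grab content), 提取网页 (extract a web page), 获取页面 (get a page). Use cases: (1) scraping pages protected by Cloudflare, (2) extracting page content, (3) collecting web data, (4) scraping dynamically rendered pages. Automatic installation: ... It is an AI Agent Skill for Claude Code / OpenClaw, with 72 downloads so far.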

How do I install Panscrapling Web Scraper?

Run "/install panscrapling-web-scraper" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Panscrapling Web Scraper free?

Yes, Panscrapling Web Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Panscrapling Web Scraper support?

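Panscrapling Web Scraper is listed as cross-platform and declares no OS restriction, so it can be installed anywhere OpenClaw / Claude Code is available. Note, however, that the capability assessment on this page describes its installer as macOS/Homebrew-centric, so it may fail or behave unexpectedly on other operating systems.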

Who created Panscrapling Web Scraper?

It is built and maintained by dashiming (@dashiming); the current version is v1.0.0.
