Panscrapling Web Scraper

Name: Panscrapling Web Scraper
Author: dashiming

By dashiming · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install panscrapling-web-scraper

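Description

A powerful web scraping skill. Built on Scrapling, it automatically bypasses Cloudflare and other anti-scraping systems. Trigger words: 抓取网页 (scrape a web page), 爬取 (crawl), scrape, fetch, 抓取内容 (scrape content), 提取网页 (extract a web page), 获取页面 (get a page). Use cases: (1) scraping pages protected by Cloudflare (2) extracting page content (3) collecting web data (4) scraping dynamically rendered pages. Automatic installation: ...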

Security Recommendations

Before installing or running this skill:
- Expect it to modify your system: setup.py will try to install Homebrew (via curl|bash), install Python 3.11 via Homebrew, pip-install packages (with an online fallback), and download Playwright/Patchright browser binaries. Run it only on machines where you accept these changes.
- The SKILL.md claims 'embedded wheels' and 'offline install', but the installer falls back to online network installs. If you need offline usage, verify that the wheels and packaged files are present in the provided wheels/ directory.
- There are no integrity checks for downloaded scripts or packages. If you want to proceed, review the included wheels and the full setup.py and fetch.py contents locally, or run the installer in a sandboxed/VM environment.
- The installer is macOS/Homebrew-centric; on other OSes it may fail or behave unexpectedly. No OS restriction is declared.
- The skill claims 'automatic bypass of Cloudflare/Turnstile'. That capability can be used for legitimate scraping but also to circumvent protections. Ensure your intended scraping is legal and compliant with target site policies.
- If you cannot review the code and wheels yourself, avoid installing. Prefer running the skill inside an isolated environment (container/VM) and inspect network activity during the first run. If you need offline assurance, confirm the wheels/ directory contains the expected .whl files and do not let the script run its online fallback.
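The wheels/ check recommended above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill: the function name `find_missing_wheels` and the list of expected packages are assumptions, and the real contents of wheels/ should be compared against whatever setup.py actually installs.

```python
from pathlib import Path


def find_missing_wheels(wheels_dir, expected_packages):
    """Return the expected package names that have no .whl file in wheels_dir.

    Hypothetical helper: the expected package list is supplied by the caller.
    """
    wheels_dir = Path(wheels_dir)
    if not wheels_dir.is_dir():
        # No directory at all means every expected package is missing.
        return sorted(expected_packages)

    # A wheel filename starts with the distribution name, followed by "-version-...".
    # Distribution names use "_" where the project name may use "-", so normalise both.
    present = {
        p.name.split("-")[0].lower().replace("_", "-")
        for p in wheels_dir.glob("*.whl")
    }
    return sorted(
        pkg for pkg in expected_packages
        if pkg.lower().replace("_", "-") not in present
    )
```

Calling `find_missing_wheels("wheels", ["scrapling", "playwright", "patchright"])` before running the installer would list any package that could only be satisfied by the online fallback. The three package names are taken from the analysis on this page and may not match the full dependency set.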

Functional Analysis

Type: OpenClaw Skill
Name: panscrapling-web-scraper
Version: 1.0.0

The skill performs high-risk system-level operations in `scripts/setup.py`, including the automated installation of Homebrew via a `curl | bash` command from a remote GitHub URL and the installation of multiple browser engines and Python packages. While these actions are consistent with the stated goal of setting up a stealthy web scraping environment, the automated execution of remote scripts and broad environment modifications represent a significant attack surface. No clear evidence of intentional malice, such as data exfiltration or backdoors, was found in the provided code.

Capability Assessment

ℹ Purpose & Capability

The code and instructions align with a browser-driven web scraper that uses Scrapling/Playwright/Patchright and installs Python and browsers. However, the SKILL.md claims fully-embedded/offline operation while the setup.py contains explicit online fallback behavior (install Homebrew via curl, pip from an online index) and Homebrew-only install paths — yet the skill metadata has no OS restriction. The macOS/Homebrew-centric install logic and online fallbacks are inconsistent with the 'offline embedded' claim and the lack of OS limitation.
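One way an installer could make the missing OS restriction explicit is a platform guard that runs before any Homebrew step. The sketch below is an assumption about how such a guard might look, not code from the skill; `SUPPORTED_SYSTEMS` and `platform_supported` are hypothetical names, and limiting support to macOS simply mirrors the Homebrew-centric logic described above.

```python
import platform

# Assumption: the Homebrew-based install path is only expected to work on macOS,
# which platform.system() reports as "Darwin".
SUPPORTED_SYSTEMS = {"Darwin"}


def platform_supported(system=None):
    """Return True if the given (or current) OS is in the supported set."""
    system = platform.system() if system is None else system
    return system in SUPPORTED_SYSTEMS
```

An installer using this guard would exit with a clear message when `platform_supported()` returns False, rather than failing partway through a Homebrew command.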

⚠ Instruction Scope

Runtime instructions and scripts will automatically run system-level installation steps: installing Homebrew, installing Python, pip installing packages, and downloading browser binaries. SKILL.md asserts 'no third-party API calls' and 'all requests through local browser', but the installer performs network operations (curl Homebrew install, pip fallback to remote indices, Playwright/patchright downloads). The installer also attempts to modify system state without an explicit interactive consent step in the skill instructions.
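The missing consent step could be as small as listing the planned changes and requiring an explicit "yes". The following is a minimal sketch under that assumption; `confirm_install` is a hypothetical helper, and the `ask` parameter exists only so the prompt can be replaced in tests or non-interactive runs.

```python
def confirm_install(steps, ask=input):
    """Print the planned system changes and return True only on an explicit yes."""
    print("The installer will:")
    for step in steps:
        print(f"  - {step}")
    answer = ask("Proceed? [y/N] ")
    # Anything other than "y" or "yes" (including an empty reply) is a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

The list passed in would name each action the analysis identifies: installing Homebrew, installing Python, pip-installing packages, and downloading browser binaries.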

⚠ Install Mechanism

There is no registry install spec; instead, the included setup.py runs subprocesses that execute network installs (curl | bash for Homebrew, pip installs, playwright/patchright browser downloads). Fetching the Homebrew installer from GitHub raw and using a Tsinghua PyPI mirror are common practices, but both are online network actions executed without integrity checks. The script prefers local wheels first (good) but explicitly falls back to online installation, and downloads are executed without signature or checksum verification.
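For comparison, a checksum check on a downloaded file takes only a few lines with the standard library. This is a generic sketch of SHA-256 verification, not something the skill's setup.py does; `sha256_of` and `verify_download` are hypothetical names, and the expected digest would have to come from a source you trust independently of the download.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Return True if the file's digest matches the expected hex string."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

A script would refuse to execute or install the file whenever `verify_download` returns False. For Python packages specifically, pip's hash-checking mode offers a comparable guarantee without custom code.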

✓ Credentials

The skill requests no environment variables or credentials, which is appropriate for a scraper. However, it does require filesystem and network access to install software and to download browser binaries; that is expected for this functionality but is impactful and should be considered by the user.

ℹ Persistence & Privilege

The skill is not force-enabled (always:false) and does not request credentials, but its installer will change the host system by installing Homebrew/Python and downloading browsers—persistent system modifications beyond the skill's own files. This is functionally expected for a scraper needing a browser runtime but increases the blast radius and should be noted.

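How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install panscrapling-web-scraper
Once installation finishes, invoke the Skill by name or trigger it with /panscrapling-web-scraper
Provide the inputs described in the Skill's parameter documentation to receive structured output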

Version History

v1.0.0

Panscrapling Web Scraper 1.0.0
- Initial release with powerful web scraping capabilities based on Scrapling.
- Automatically bypasses Cloudflare/anti-bot protections, including Turnstile.
- Fully embedded distribution: includes all Python dependencies and supports offline installation.
- Features fast, stealthy, and dynamic scraping modes for a variety of web pages.
- Automatic installation of Python 3.10+, dependencies, and browser components on first use.
- CLI tool supports element extraction, Markdown output, media/meta extraction, and setup.

Metadata

Slug panscrapling-web-scraper

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

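FAQ

What is Panscrapling Web Scraper?

A powerful web scraping skill. Built on Scrapling, it automatically bypasses Cloudflare and other anti-scraping systems. Trigger words: 抓取网页 (scrape a web page), 爬取 (crawl), scrape, fetch, 抓取内容 (scrape content), 提取网页 (extract a web page), 获取页面 (get a page). Use cases: (1) scraping pages protected by Cloudflare (2) extracting page content (3) collecting web data (4) scraping dynamically rendered pages. Automatic installation: ... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 72 times so far.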

How do I install Panscrapling Web Scraper?

Run the command "/install panscrapling-web-scraper" in the OpenClaw or Claude Code chat box to install it in one step, with no additional configuration.

Is Panscrapling Web Scraper free?

Yes. Panscrapling Web Scraper is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does Panscrapling Web Scraper support?

Panscrapling Web Scraper is listed as cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Panscrapling Web Scraper?

It is developed and maintained by dashiming (@dashiming); the current version is v1.0.0.