Scrapling Fetch Pro
Author: shuxiangfanclaw · GitHub · v1.2.0 · MIT-0
Total downloads: 131 · Favorites: 0 · Current installs: 0 · Versions: 2
Install in OpenClaw
/install scrapling-fetch-pro
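Description
A professional web scraping tool with full support for WeChat Official Account articles, automatic mode detection, and noise cleanup. Suited to scraping blogs, news, announcements, and sites with anti-scraping protection.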
Security guidance
Things to consider before installing/using this skill:
- Provenance: the package has no homepage and an unknown source/owner. Prefer code from known sources.
- Version/metadata mismatch: SKILL.md claims v1.2.0 while _meta.json and the script header show v1.1.0 — this could indicate sloppy packaging or partial updates.
- Promised "Cloudflare Turnstile" bypass is not implemented in the visible code; stealth behavior is delegated to scrapling.fetchers.StealthyFetcher. Inspect that external library before trusting the bypass claim.
- Dependencies: Playwright will download browser binaries at runtime and execute page JavaScript (normal for stealth scraping). Run it in a sandboxed environment and be aware of the large network and download side effects.
- Legal/ethical risk: scraping WeChat and sites protected by anti-bot measures may violate terms of service or local law. Ensure you have the right to scrape target sites.
- Recommended actions: review the scrapling package (scrapling.fetchers) source, verify the StealthyFetcher implementation, run the tool in an isolated environment (container/VM), and only provide it access to target URLs you control or are permitted to scrape.
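The version mismatch flagged above can be checked mechanically before installing. The following is a minimal sketch, not part of the skill: `find_versions` and `versions_agree` are hypothetical helpers, and the assumption that each file contains a plain `x.y.z` version string is illustrative, since the bundle's actual file layouts are not shown here.

```python
import re
from typing import Dict, Set, Tuple

# Matches version strings such as "1.2.0" or "v1.2.0" (assumed format).
VERSION_RE = re.compile(r"\bv?(\d+\.\d+\.\d+)\b")


def find_versions(text: str) -> Set[str]:
    """Return every x.y.z-looking version string found in the text."""
    return set(VERSION_RE.findall(text))


def versions_agree(sources: Dict[str, str]) -> Tuple[bool, Dict[str, Set[str]]]:
    """Compare version strings across named file contents.

    Returns (True, found) when at most one distinct version appears
    across all sources, otherwise (False, found).
    """
    found = {name: find_versions(text) for name, text in sources.items()}
    distinct: Set[str] = set()
    for versions in found.values():
        distinct |= versions
    return len(distinct) <= 1, found
```

Passing the contents of `SKILL.md`, `_meta.json`, and the script header as the `sources` values would report `False` for the 1.2.0 versus 1.1.0 discrepancy described in the list above.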
Functional analysis
Type: OpenClaw Skill
Name: scrapling-fetch-pro
Version: 1.2.0
The skill bundle is a specialized web scraping tool designed to extract content from websites, with specific optimizations for WeChat Official Accounts and anti-bot bypass (Cloudflare). The Python script `scripts/scrapling_fetch.py` uses legitimate libraries like `scrapling`, `playwright`, and `beautifulsoup4` to perform its stated functions. There is no evidence of data exfiltration, unauthorized execution, or malicious intent; the stealth and evasion capabilities are transparently documented as features for legitimate scraping purposes.
Capability assessment
Purpose & Capability
Name/description claim a professional scraper with WeChat and anti-bot bypass features; the included script implements selector-based scraping, WeChat noise removal, basic/stealth modes and Markdown output, which is coherent with the stated purpose. However the README claims automatic Cloudflare Turnstile bypass and other advanced anti-bot techniques while the script delegates stealth behavior to an external StealthyFetcher (scrapling.fetchers) — the bypass behavior is not visible in the shipped code. Also metadata files and the script show inconsistent version numbers (SKILL.md says 1.2.0, _meta.json and script header show 1.1.0), and source/homepage are unknown.
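To make "selector-based scraping" concrete, here is a rough sketch of a selector fallback chain: try candidate containers in priority order and keep the first one that yields text. It is not the skill's code. The shipped script reportedly uses `beautifulsoup4`, whereas this sketch uses only the standard library, matches on element `id` alone, and assumes well-formed markup. `IdTextExtractor`, `first_match`, and the ids in the usage note are all hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional

# Elements that never have a closing tag, so they must not affect depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IdTextExtractor(HTMLParser):
    """Collect the text inside the first element whose id equals target_id."""

    def __init__(self, target_id: str) -> None:
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.done or tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and not self.done:
            self.chunks.append(data)


def first_match(html: str, candidate_ids: List[str]) -> Optional[str]:
    """Return whitespace-normalized text of the first candidate id that has any."""
    for candidate in candidate_ids:
        parser = IdTextExtractor(candidate)
        parser.feed(html)
        parser.close()
        text = " ".join(" ".join(parser.chunks).split())
        if text:
            return text
    return None
```

For example, `first_match(page_html, ["article-body", "js_content"])` would return the text of `#article-body` when it exists and fall back to `#js_content` otherwise. A longer candidate list widens site coverage at the cost of more parsing passes.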
Instruction Scope
SKILL.md and references instruct running the included Python script and describe modes/flags; they do not direct the agent to read unrelated files, exfiltrate environment variables, or call unrelated external endpoints. The runtime instructions are narrowly scoped to scraping tasks. Note: they refer to Sessions and Cloudflare bypass in prose but do not include concrete config or credential usage in the included files.
Install Mechanism
There is no install spec (instruction-only plus a code file). That is the lowest install risk, but the script depends on several heavy packages (playwright, patchright, scrapling, html2text, beautifulsoup4, lxml). Playwright in particular typically downloads browser binaries at runtime, which has additional network and file-system implications. Because no install instructions or provenance are provided, it is unclear how those dependencies should be installed or whether the 'scrapling' package (and its StealthyFetcher) is trustworthy.
Credentials
The skill declares no required environment variables, credentials, or config paths and the code does not read env vars. That is proportionate to the stated purpose. Note that stealth scraping may require cookies/sessions for logged-in pages (the docs mention Sessions) but no session-handling credentials are requested by the skill as packaged.
Persistence & Privilege
The skill is not marked always:true and does not request persistent system modifications. It is user-invocable and can be called autonomously by the agent (default behavior), which is normal. No code attempts to modify other skills or global agent settings.
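How to use
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install scrapling-fetch-pro
- Once installation finishes, invoke the Skill by name or trigger it with /scrapling-fetch-pro
- Supply the inputs described in the Skill's parameter documentation to get structured output.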
Version history
v1.2.0
License changed from MIT-0 to MIT (attribution required)
v1.1.0
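- Added full support for scraping WeChat Official Account articles, with dedicated title selectors
- Introduced automatic mode detection, which picks a fetch mode (basic/stealth/auto) based on the URL
- Strengthened noise cleanup: ads, toolbars, and other clutter are removed from Official Account articles automatically
- Increased the number of body-content selectors to 16, improving compatibility across sites
- Added Cloudflare Turnstile bypass and browser fingerprint spoofing to strengthen anti-bot evasion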
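The v1.1.0 entry on automatic mode detection says the fetch mode is chosen from the URL. One plausible shape for that logic is sketched below. `choose_mode` and `STEALTH_HOSTS` are hypothetical names, and routing `mp.weixin.qq.com` to stealth mode is an assumption for illustration; the script's actual rules are not shown in this listing.

```python
from urllib.parse import urlparse

# Hypothetical list: hosts assumed to need a browser-based (stealth) fetch.
STEALTH_HOSTS = ("mp.weixin.qq.com",)

VALID_MODES = ("basic", "stealth", "auto")


def choose_mode(url: str, requested: str = "auto") -> str:
    """Resolve the fetch mode for a URL.

    An explicit "basic" or "stealth" request is honored as given;
    "auto" picks stealth for hosts in STEALTH_HOSTS (or their
    subdomains) and basic for everything else.
    """
    if requested not in VALID_MODES:
        raise ValueError(f"unknown mode: {requested!r}")
    if requested != "auto":
        return requested
    host = (urlparse(url).hostname or "").lower()
    for known in STEALTH_HOSTS:
        if host == known or host.endswith("." + known):
            return "stealth"
    return "basic"
```

Matching on the parsed hostname rather than on a substring of the whole URL avoids false positives when a protected domain merely appears in a path or query string.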
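FAQ
What is Scrapling Fetch Pro?
A professional web scraping tool with full support for WeChat Official Account articles, automatic mode detection, and noise cleanup. Suited to scraping blogs, news, announcements, and sites with anti-scraping protection. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 131 total downloads to date.
How do I install Scrapling Fetch Pro?
Run the command /install scrapling-fetch-pro in an OpenClaw or Claude Code chat box for one-step installation; no additional configuration is needed.
Is Scrapling Fetch Pro free?
Yes. Scrapling Fetch Pro is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does Scrapling Fetch Pro support?
Scrapling Fetch Pro is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Scrapling Fetch Pro?
It is developed and maintained by shuxiangfanclaw (@shuxiangfanclaw); the current version is v1.2.0.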