
FlowCrawl — Stealth Web Scraper That Bypasses Everything

Author: windseeker1111 · GitHub · v1.1.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 419
Favorites: 0
Current installs: 0
Versions: 4
Install in OpenClaw
/install flowcrawl
Description
Stealth web scraper. Give it any URL and it punches through Cloudflare, bot detection, and WAFs automatically using a 3-tier cascade (plain HTTP → TLS spoof...
Usage instructions (SKILL.md)

FlowCrawl

Scrape any website. Bypass any bot protection. Free.

Install Scrapling First

pip install scrapling

Scrapling installs Playwright automatically on first run. That's the only dependency.

Quick Usage

# Single URL — prints clean markdown to stdout
python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com

# Spider the whole site
python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com --deep

# Deep crawl with limits, save and combine
python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com --deep --limit 30 --combine

# JSON output — pipe into anything
python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py https://example.com --json
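
The `--deep` and `--limit` flags above imply a crawl that follows internal links until a page budget is spent. A minimal sketch of that idea follows, assuming a breadth-first traversal bounded by hop depth and page count (defaults 3 and 50, taken from the options table). The names `crawl_order` and `get_links` are hypothetical, and this is not taken from `flowcrawl.py`; link extraction is injected so the example runs without network access.

```python
from collections import deque
from typing import Callable, Iterable, List
from urllib.parse import urldefrag, urljoin, urlparse


def crawl_order(start: str,
                get_links: Callable[[str], Iterable[str]],
                max_depth: int = 3,
                limit: int = 50) -> List[str]:
    """Return URLs in breadth-first visit order, staying on the start host.

    get_links is a hypothetical callable that returns the hrefs found on a page.
    """
    origin = urlparse(start).netloc
    seen = {start}
    queue = deque([(start, 0)])
    visited: List[str] = []
    while queue and len(visited) < limit:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # page is kept, but its links are not followed
        for href in get_links(url):
            # Resolve relative links and drop #fragments before deduplicating.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc != origin or target in seen:
                continue  # skip external links and already-queued pages
            seen.add(target)
            queue.append((target, depth + 1))
    return visited


# A tiny in-memory "site" standing in for real pages.
SITE = {
    "https://example.com/": ["/a", "/b", "https://other.example/x"],
    "https://example.com/a": ["/c", "/#top"],
    "https://example.com/b": ["/a"],
    "https://example.com/c": ["/d"],
}

pages = crawl_order("https://example.com/", lambda u: SITE.get(u, []),
                    max_depth=2, limit=30)
print(pages)
```

Breadth-first order means that when the page limit cuts the crawl short, the pages closest to the start URL are the ones kept.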

Add Alias (Recommended)

echo 'alias flowcrawl="python3 ~/clawd/skills/flowcrawl/scripts/flowcrawl.py"' >> ~/.zshrc
source ~/.zshrc

Then just: flowcrawl https://example.com

How It Works

FlowCrawl uses a 3-tier fetcher cascade. Starts fast, escalates only when blocked:

Tier  Method               Handles
1     Plain HTTP           Most sites, instant
2     Stealth + TLS spoof  Cloudflare, Imperva, basic WAFs
3     Full JS execution    SPAs, heavy JS, aggressive bot detection

Auto-detects blocking (403, 503, "Just a moment...") and escalates silently.
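
The escalation logic described above can be sketched generically. The example below assumes only the block signals named in the text (HTTP 403, HTTP 503, and a "Just a moment..." page) and treats each tier as an opaque callable. `Response`, `looks_blocked`, and `fetch_with_cascade` are hypothetical names, the tiers are stubs, and nothing here is taken from `flowcrawl.py` or the Scrapling API.

```python
from typing import Callable, NamedTuple, Sequence, Tuple


class Response(NamedTuple):
    status: int
    body: str


# Signals named in the text: 403, 503, and the "Just a moment..." interstitial.
BLOCK_STATUSES = {403, 503}
BLOCK_MARKERS = ("just a moment...",)

Fetcher = Callable[[str], Response]


def looks_blocked(resp: Response) -> bool:
    """Heuristic: blocked if the status or the start of the body matches a signal."""
    if resp.status in BLOCK_STATUSES:
        return True
    head = resp.body[:2048].lower()
    return any(marker in head for marker in BLOCK_MARKERS)


def fetch_with_cascade(url: str,
                       tiers: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, Response]:
    """Try tiers in order and return the first response that does not look blocked.

    If every tier looks blocked, the last attempt is returned so the caller can inspect it.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    for name, fetch in tiers:
        result = (name, fetch(url))
        if not looks_blocked(result[1]):
            return result
    return result


# Stub tiers that simulate a site rejecting the first two attempts.
tiers = [
    ("plain", lambda u: Response(403, "Forbidden")),
    ("stealth", lambda u: Response(200, "<title>Just a moment...</title>")),
    ("browser", lambda u: Response(200, "<h1>Hello</h1>")),
]
tier_name, resp = fetch_with_cascade("https://example.com", tiers)
print(tier_name, resp.status)
```

Keeping detection separate from the fetchers means the cheap tier is always tried first and the more expensive ones run only when a block signal appears.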

All Options

Flag             Description                                 Default
--deep           Spider whole site following internal links  off
--depth N        Max hop depth from start URL                3
--limit N        Max pages to crawl                          50
--combine        Merge all pages into one file               off
--format md|txt  Output format                               md
--output DIR     Output directory                            ./flowcrawl-output
--json           Structured JSON output                      off
--quiet          Suppress progress logs                      off
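
For readers who want to see how the flags and defaults above fit together, here is a sketch of an equivalent `argparse` definition. It mirrors only what the options table states; the positional `url` argument is inferred from the usage examples, and `build_parser` is a hypothetical name. The real script may define its parser differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags and defaults (a sketch)."""
    parser = argparse.ArgumentParser(prog="flowcrawl")
    parser.add_argument("url", help="start URL")
    parser.add_argument("--deep", action="store_true",
                        help="spider whole site following internal links")
    parser.add_argument("--depth", type=int, default=3, metavar="N",
                        help="max hop depth from start URL")
    parser.add_argument("--limit", type=int, default=50, metavar="N",
                        help="max pages to crawl")
    parser.add_argument("--combine", action="store_true",
                        help="merge all pages into one file")
    parser.add_argument("--format", choices=["md", "txt"], default="md",
                        help="output format")
    parser.add_argument("--output", default="./flowcrawl-output", metavar="DIR",
                        help="output directory")
    parser.add_argument("--json", action="store_true",
                        help="structured JSON output")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress progress logs")
    return parser


# Parse the "deep crawl with limits" command line from Quick Usage.
args = build_parser().parse_args(
    ["https://example.com", "--deep", "--limit", "30", "--combine"]
)
print(args)
```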
Security recommendations
This skill is coherent with its stated aim of bypassing bot protections, but that purpose is inherently risky and may violate site terms or laws. Before installing:

  1. Decide whether evading WAFs/Cloudflare is appropriate and legal for your use case. Don’t use it on sites you don’t own or without permission.
  2. Review the scrapling project source and trustworthiness (pip package + GitHub repo), because installing it will bring in Playwright and download browser binaries.
  3. Be aware that the README suggests modifying ~/.zshrc (it adds an alias); only do this if you want that persistent change.
  4. Run in an isolated environment (VM/container) if you want to reduce the risk of surprising downloads or side effects.
  5. If you plan to use this in production or in an automated agent, consider legal/ethical review and logging/limits to avoid abusive scraping.

If you want a lower-risk option, prefer tools that respect robots.txt and avoid active fingerprint spoofing.
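
As an illustration of the lower-risk approach mentioned above, the sketch below checks a URL against robots.txt rules before fetching, using Python's standard `urllib.robotparser`. The helper name `allowed_by_robots`, the sample rules, and the `example-bot` user agent are assumptions made for the example; the rules are parsed from a string so it runs without network access.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules for illustration only.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(allowed_by_robots(ROBOTS, "example-bot", "https://example.com/docs"))
print(allowed_by_robots(ROBOTS, "example-bot", "https://example.com/private/data"))
```

In real use the robots.txt text would come from the target site's own /robots.txt, and a crawler would skip any URL for which this check returns False.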
Functional analysis
Type: OpenClaw Skill
Name: flowcrawl
Version: 1.1.0

FlowCrawl is a web scraping utility that implements a three-tier escalation strategy (plain HTTP, TLS spoofing, and full JS execution) using the 'scrapling' library to bypass bot protections. The Python script in `scripts/flowcrawl.py` contains standard crawling logic, markdown extraction, and local file management without any evidence of data exfiltration, unauthorized network calls, or malicious execution. While `SKILL.md` suggests adding a shell alias to `~/.zshrc`, this is presented as a documented convenience for CLI usage rather than a hidden persistence mechanism.
Capability assessment
Purpose & Capability
The name/description (stealth scraper that 'punches through Cloudflare/WAFs') align with the included code and SKILL.md: the CLI uses a three-tier escalation (plain HTTP → stealth/TLS spoof → full JS via Playwright). No unrelated credentials or config are requested. The claim 'No CDP Chrome' is potentially misleading because Playwright and stealth tooling are used—functionally this is a browser-automation based bypass stack, which matches the stated purpose but the marketing is aggressive and possibly inaccurate.
Instruction Scope
SKILL.md instructs the user to pip install scrapling (which will pull Playwright and stealth plugins) and to add an alias to the user's shell rc (~/.zshrc). The runtime instructions and code explicitly escalate to evasion techniques (TLS fingerprint spoofing, stealth plugins, full JS execution) to bypass protections — behavior that intentionally evades server-side defenses and could violate terms of service or laws. The skill does not attempt to read unrelated local files, nor does it exfiltrate data to external endpoints, but it does modify user shell config via the recommended alias and triggers external downloads when installed or run.
Install Mechanism
There is no registry install spec, but SKILL.md requires 'pip install scrapling'. Scrapling will install Playwright and (on first run) download browser binaries — a network-driven install that writes binaries to disk. The lack of a formal install spec in the registry plus the implicit heavy runtime dependency (Playwright/browser downloads) is a practical installation risk and should be made explicit to users. The pip/Playwright download is from public registries, not an unknown URL, but can be large and perform additional network activity.
Credentials
The skill requests no environment variables, no credentials, and no special config paths. That is proportionate to a local scraper tool. There are no declared requirements for unrelated secrets or remote service keys.
Persistence & Privilege
The skill is user-invocable and not 'always: true' (no elevated persistent privilege). However SKILL.md recommends adding an alias to ~/.zshrc which writes to the user's shell config — a mild, user-visible persistence action. Playwright will also place browser artifacts on disk. The skill does not modify other skills or system-wide OpenClaw settings.
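How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install flowcrawl
  3. Once installation finishes, invoke the Skill by name or trigger it with /flowcrawl.
  4. Provide the inputs described in the Skill's parameter documentation to get structured output.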
Version history
v1.1.0
Stealth web scraper. Punches through Cloudflare, bot detection, and WAFs using a 3-tier cascade (plain HTTP, TLS spoof, full JS). No API keys, no proxies, no CDP Chrome. Free from the Flow team.
v1.0.2
Version 1.0.2 of FlowCrawl
- No file changes were detected in this version.
- Functionality, documentation, and options remain unchanged.
v1.0.1
- Updated SKILL.md with improved description and branding.
- Clarified usage and description to emphasize FlowCrawl’s ability to bypass bot protection.
- Adjusted skill name casing and authorship notes.
- No code changes; documentation only.
v1.0.0
Initial release of FlowCrawl, a stealth web scraper that bypasses Cloudflare and bot protections.
- Introduces a 3-tier cascade for web scraping: plain HTTP → TLS fingerprint spoofing → full JS execution.
- Requires Scrapling (installs Playwright on first use) as the only dependency.
- Offers CLI usage for scraping single URLs, deep site crawling, output in markdown or JSON, and output combining.
- Includes flags for crawl depth, page limits, output format, and quiet mode.
- Automatically detects and escalates around site blocks, supporting most modern anti-bot protections.
Metadata
Slug: flowcrawl
Version: 1.1.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 4
FAQ

What is FlowCrawl?

Stealth web scraper. Give it any URL and it punches through Cloudflare, bot detection, and WAFs automatically using a 3-tier cascade (plain HTTP → TLS spoof... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 419 total downloads to date.

How do I install FlowCrawl?

Run the command "/install flowcrawl" in the OpenClaw or Claude Code chat box to install it in one step. The skill itself needs no further configuration, but SKILL.md asks you to run pip install scrapling first.

Is FlowCrawl free?

Yes. FlowCrawl is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does FlowCrawl support?

FlowCrawl is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops FlowCrawl?

It is developed and maintained by windseeker1111 (@windseeker1111); the current version is v1.1.0.
