jaceagentic

CRE Scraper

Author: jaceagentic · GitHub ↗ · v2.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 96
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install cre-scraper
Description
Scrapes commercial real estate listings from Crexi and LoopNet using Claude in Chrome on a Mac Mini with residential IP. Bypasses Cloudflare bot protection....
Instructions (SKILL.md)

CRE Scraper v2.0

Scrape commercial real estate listings from Crexi and LoopNet using Claude in Chrome.

Architecture

Mac Mini (residential IP + Chrome)
  → /scrape-crexi or /scrape-loopnet slash commands
  → ~/.openclaw/workspace/data/properties.db
  → rsync to VPS staging
  → sync-properties.py → Command Center dashboard

Requirements

  • macOS with Claude Code installed
  • Claude in Chrome browser extension active
  • Logged into Crexi (crexi.com) and LoopNet (loopnet.com) in Chrome
  • SSH key authorized on VPS
  • chromeEnabled: true in ~/.claude/settings.json

Usage

Run Crexi scrape (all 21 combinations):

~/.openclaw/skills/cre-scraper/run-scrape.sh

Run enrichment on unenriched properties:

~/.openclaw/skills/cre-scraper/enrich-batch.sh [batch_size]

Or inside Claude Code:

/scrape-crexi
/scrape-loopnet

Configuration

  • States: FL, GA, NC, TN, AL, LA, ID
  • Asset types: rv_park, self_storage, marina
  • Price range: $800K–$3M
  • Min units: 50+ (when known)
  • Value-add threshold: VAS ≥ 40
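
The "21 combinations" mentioned under Usage presumably correspond to the 7 states times the 3 asset types listed above. The following is a minimal Python sketch of that search grid and of a filter applying the configured limits. The listing keys (`state`, `asset_type`, `price`, `units`, `vas`) and the function names are hypothetical and are not taken from the skill's code.

```python
from itertools import product

# Values copied from the Configuration list above.
STATES = ["FL", "GA", "NC", "TN", "AL", "LA", "ID"]
ASSET_TYPES = ["rv_park", "self_storage", "marina"]
PRICE_MIN, PRICE_MAX = 800_000, 3_000_000
MIN_UNITS = 50
VAS_THRESHOLD = 40


def search_grid():
    """Return every (state, asset_type) pair to search."""
    return list(product(STATES, ASSET_TYPES))


def matches_config(listing):
    """Check a listing dict (hypothetical keys) against the configured limits."""
    if listing["state"] not in STATES:
        return False
    if listing["asset_type"] not in ASSET_TYPES:
        return False
    if not (PRICE_MIN <= listing["price"] <= PRICE_MAX):
        return False
    # "Min units: 50+ (when known)": only enforce when a unit count is present.
    units = listing.get("units")
    if units is not None and units < MIN_UNITS:
        return False
    return True


def is_value_add(listing):
    """True when the Value-Add Score (VAS) meets the configured threshold."""
    vas = listing.get("vas")
    return vas is not None and vas >= VAS_THRESHOLD
```

Treating an unknown unit count as passing is one reading of "when known"; a stricter filter could reject such listings instead.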

What gets scraped

Per listing:

  • Address, city, state, zip
  • Asking price, cap rate, NOI, occupancy
  • Units/pads/slips, SF, year built, acreage
  • Pro-forma cap rate and NOI
  • Broker name, firm, full phone (click-reveal)
  • Description and investment highlights
  • AI analysis: IRR, DSCR, Cash-on-Cash, Value-Add Score, AI Confidence

Cron schedule (launchd)

  • 7:00am — Crexi scrape (ai.crexi.scraper)
  • 8:00am — LoopNet scrape (ai.loopnet.scraper)
  • Midnight — Enrichment batch (ai.crexi.enricher)
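
As an illustration of how one of these schedule entries could be expressed as a launchd job, the sketch below builds a property list with Python's standard `plistlib`. Only the label and the 7:00am time come from the list above. The absolute script path uses a placeholder user directory (launchd does not expand `~`), and the helper name `build_launchd_job` is hypothetical; the skill's actual plist files are not shown in this document.

```python
import plistlib


def build_launchd_job(label, script_path, hour, minute=0):
    """Build a launchd job dict that runs a script daily at hour:minute."""
    return {
        "Label": label,
        "ProgramArguments": ["/bin/bash", script_path],
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }


# Placeholder path: replace /Users/example with the real home directory.
job = build_launchd_job(
    "ai.crexi.scraper",
    "/Users/example/.openclaw/skills/cre-scraper/run-scrape.sh",
    hour=7,
)

# Serialize to XML plist bytes, suitable for writing to a .plist file.
plist_bytes = plistlib.dumps(job)
```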

Trigger phrases

  • "scrape new deals"
  • "run the Crexi scraper"
  • "find new RV parks in Florida"
  • "check LoopNet for self storage in Tennessee"
  • "enrich unenriched properties"
  • "sync deals to dashboard"

Output

Properties saved to ~/.openclaw/workspace/data/properties.db and synced to OpenClaw Command Center dashboard via sync-properties.py.

Security recommendations
Do not install or run this skill until you have confirmed a few things. Key concerns:

  1. The package includes a session.json with Cloudflare clearance and many cookies. That lets the scraper bypass bot protections and may include someone else's session or sensitive tokens; don't use a session file you don't fully trust.
  2. The scripts rsync the local DB to a hardcoded remote host and SSH in to run a remote script, which sends scraped data (and could send session-derived data) to an unknown server. If you intend to use this, replace the remote host with a server you control, remove or regenerate the included session.json, and only grant SSH access to a dedicated key/account.
  3. The code expects LOOPNET_EMAIL / LOOPNET_PASS env vars, but they are not declared in the registry metadata. Supplying credentials to undeclared code is risky.
  4. Check the legal and ToS implications of scraping these sites.

To make the skill safer: remove the bundled session.json, add clear declarations for required env vars and remote hosts, and change the rsync/ssh endpoints to your own infrastructure. If you cannot verify the owner or purpose of the remote VPS, treat this skill as unsafe.
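
One way to act on the advice about remote hosts is to scan the bundled shell scripts for rsync/ssh/scp targets before running anything. The sketch below is a rough heuristic under stated assumptions: the regular expressions and the function name `find_remote_endpoints` are illustrative, and a clean result does not prove that a script is safe.

```python
import re

# Lines that invoke a remote-transfer or remote-shell command.
TRANSFER_CMD_RE = re.compile(r"\b(?:rsync|ssh|scp)\b")
# user@host style targets, e.g. deploy@203.0.113.10
USER_HOST_RE = re.compile(r"[A-Za-z_][\w.-]*@[\w.-]+")
# Bare IPv4 literals (no range validation; heuristic only).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_remote_endpoints(script_text):
    """Return (line_number, target) pairs for remote hosts used by rsync/ssh/scp."""
    findings = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        if not TRANSFER_CMD_RE.search(line):
            continue
        targets = [m.group(0) for m in USER_HOST_RE.finditer(line)]
        if not targets:
            # Fall back to bare IP addresses when no user@host form is present.
            targets = IPV4_RE.findall(line)
        findings.extend((lineno, target) for target in targets)
    return findings
```

For example, passing the contents of run-scrape.sh to `find_remote_endpoints` would list each line that names a remote target, so the host can be reviewed and replaced with infrastructure you control.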
Functional analysis
Type: OpenClaw Skill Name: cre-scraper Version: 2.0.0 The skill bundle contains scripts (run-scrape.sh, enrich-batch.sh) that automatically sync a local SQLite database containing scraped real estate data and broker contacts to a hardcoded external IP address (187.77.140.113) via rsync and execute remote commands via SSH as the root user. It also includes a session.json file containing active session cookies and internal configuration data for Crexi, which could lead to session hijacking if the bundle is shared. While these behaviors are consistent with the stated purpose of syncing to a 'Command Center' dashboard, the use of a hardcoded Brazilian IP and root SSH access are high-risk patterns that facilitate unauthorized data exfiltration.
Capability assessment
Purpose & Capability
Name/description (CRE scraping) matches the code: Playwright-based scrapers for Crexi and LoopNet, local SQLite storage, and syncing to a dashboard. However the SKILL.md and scripts expect an external VPS (rsync/ssh) for the Command Center; the registry metadata claimed no config paths or env vars but the instructions require ~/.claude/settings.json, a session.json, and an SSH-authorized key on a remote host. That mismatch (metadata vs. declared requirements) is a red flag.
Instruction Scope
SKILL.md and scripts instruct the agent to use a saved browser session (session.json with cookies and cf_clearance), click reveal phone buttons, intercept API responses, and then rsync the DB to a hardcoded remote host and run a remote sync script. scrape.py also reads LOOPNET_EMAIL / LOOPNET_PASS from the environment even though those env vars are not declared. The skill therefore reads and uses browser session data and credentials and transmits scraped data (including click-revealed phone numbers) to an external host, so its scope exceeds that of a simple local scraping helper.
Install Mechanism
No external install spec (instruction-only) so nothing is downloaded at install time — lower risk in that sense. But the package ships an included session.json (cookies/Cloudflare tokens), Python scripts, and shell scripts that will be present on disk and executed. The inclusion of a pre-populated session.json (with cf_clearance and many domain cookies) is unusual and risky because it embeds session state that can be used to bypass protections.
Credentials
Declared registry metadata lists no env/config requirements, but SKILL.md and code require or expect: ~/.claude/settings.json, a session.json, Chrome with the Claude extension, an SSH key authorized on an external VPS, and LoopNet credentials via LOOPNET_EMAIL/LOOPNET_PASS. The skill also unsets ANTHROPIC_API_KEY in run-scrape.sh. Requesting an SSH key and syncing sensitive scraped data to a remote host identified only by an IP address is disproportionate for a local scraper unless you explicitly control that remote host.
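
The mismatch between declared and actual environment variables can be checked mechanically. The following is a minimal sketch, assuming the Python sources read variables through `os.environ` or `os.getenv` with literal names; the regular expression and the function name `undeclared_env_vars` are hypothetical, and dynamically built names would be missed.

```python
import re

# Matches os.environ.get("NAME"), os.getenv("NAME") and os.environ["NAME"].
ENV_READ_RE = re.compile(
    r"""os\.(?:environ\.get|getenv)\(\s*["']([A-Z_][A-Z0-9_]*)["']"""
    r"""|os\.environ\[\s*["']([A-Z_][A-Z0-9_]*)["']\s*\]"""
)


def undeclared_env_vars(source, declared):
    """Return env var names read in `source` that are absent from `declared`."""
    # Each match is a 2-tuple; exactly one group is non-empty.
    used = {first or second for first, second in ENV_READ_RE.findall(source)}
    return sorted(used - set(declared))
```

Running this over scrape.py with an empty `declared` list would be expected to surface names such as LOOPNET_EMAIL and LOOPNET_PASS if they are read in one of the matched forms.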
Persistence & Privilege
always:false (good), but the skill is designed to be run regularly (launchd cron entries suggested) and will autonomously push data to an external VPS when invoked. Autonomous invocation combined with automatic rsync/ssh to an unknown third party increases blast radius — the skill would repeatedly transmit scraped leads (including phone numbers and any intercepted session data) off-host.
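How to use
  1. Make sure OpenClaw is installed (local or Docker deployment)
  2. Enter the install command in the chat box: /install cre-scraper
  3. Once installed, invoke the Skill by name or trigger it with /cre-scraper
  4. Provide the inputs described in the Skill's parameter notes to get structured output
Version history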
v2.0.0
Full rewrite: Claude in Chrome approach bypasses Cloudflare, direct properties table, AI analysis, value-add scoring, Crexi + LoopNet support
Metadata
Slug: cre-scraper
Version: 2.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 1
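FAQ

What is CRE Scraper?

Scrapes commercial real estate listings from Crexi and LoopNet using Claude in Chrome on a Mac Mini with residential IP. Bypasses Cloudflare bot protection.... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 96 total downloads to date.

How do I install CRE Scraper?

Run the command "/install cre-scraper" in the OpenClaw or Claude Code chat box for a one-step install, with no additional configuration required.

Is CRE Scraper free?

Yes. CRE Scraper is completely free, is released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does CRE Scraper support?

CRE Scraper runs cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed (cross-platform).

Who develops CRE Scraper?

It is developed and maintained by jaceagentic (@jaceagentic); the current version is v2.0.0.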
