jaceagentic

CRE Scraper

by jaceagentic · GitHub ↗ · v2.0.0 · MIT-0
cross-platform ⚠ suspicious
96 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install cre-scraper
Description
Scrapes commercial real estate listings from Crexi and LoopNet using Claude in Chrome on a Mac Mini with residential IP. Bypasses Cloudflare bot protection....
README (SKILL.md)

CRE Scraper v2.0

Scrape commercial real estate listings from Crexi and LoopNet using Claude in Chrome.

Architecture

Mac Mini (residential IP + Chrome)
  → /scrape-crexi or /scrape-loopnet slash commands
  → ~/.openclaw/workspace/data/properties.db
  → rsync to VPS staging
  → sync-properties.py → Command Center dashboard

Requirements

  • macOS with Claude Code installed
  • Claude in Chrome browser extension active
  • Logged into Crexi (crexi.com) and LoopNet (loopnet.com) in Chrome
  • SSH key authorized on VPS
  • chromeEnabled: true in ~/.claude/settings.json

Usage

Run Crexi scrape (all 21 combinations):

~/.openclaw/skills/cre-scraper/run-scrape.sh

Run enrichment on unenriched properties:

~/.openclaw/skills/cre-scraper/enrich-batch.sh [batch_size]

Or inside Claude Code:

/scrape-crexi
/scrape-loopnet

Configuration

  • States: FL, GA, NC, TN, AL, LA, ID
  • Asset types: rv_park, self_storage, marina
  • Price range: $800K–$3M
  • Min units: 50 (applied only when the unit count is known)
  • Value-add threshold: Value-Add Score (VAS) ≥ 40
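
The "21 combinations" mentioned under Usage presumably come from pairing the 7 states with the 3 asset types. The sketch below enumerates those pairs and applies the price and unit filters. It is an illustration only: the function names are hypothetical, and the skill's own iteration and filtering code is not shown in this listing.

```python
from itertools import product

# Values copied from the Configuration list.
STATES = ["FL", "GA", "NC", "TN", "AL", "LA", "ID"]
ASSET_TYPES = ["rv_park", "self_storage", "marina"]
PRICE_MIN, PRICE_MAX = 800_000, 3_000_000
MIN_UNITS = 50


def search_combinations():
    """Return every (state, asset_type) pair: 7 states x 3 asset types."""
    return list(product(STATES, ASSET_TYPES))


def passes_filters(price, units=None):
    """Hypothetical filter: price must be in range; units are checked only when known."""
    if not (PRICE_MIN <= price <= PRICE_MAX):
        return False
    if units is not None and units < MIN_UNITS:
        return False
    return True
```

Calling `len(search_combinations())` gives 21, which matches the count in the README if the pairing assumption is right.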

What gets scraped

Per listing:

  • Address, city, state, zip
  • Asking price, cap rate, NOI, occupancy
  • Units/pads/slips, SF, year built, acreage
  • Pro-forma cap rate and NOI
  • Broker name, firm, full phone (click-reveal)
  • Description and investment highlights
  • AI analysis: IRR, DSCR, Cash-on-Cash, Value-Add Score, AI Confidence
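
For readers unfamiliar with the metrics in the list, here is a minimal sketch of the standard textbook definitions of cap rate, DSCR, and cash-on-cash return. How the skill's AI analysis actually computes them (and how it derives IRR, the Value-Add Score, or AI Confidence) is not documented here, so the function names and inputs are assumptions.

```python
def cap_rate(noi, price):
    """Capitalization rate: annual net operating income divided by price."""
    return noi / price


def dscr(noi, annual_debt_service):
    """Debt service coverage ratio: NOI divided by annual debt payments."""
    return noi / annual_debt_service


def cash_on_cash(noi, annual_debt_service, cash_invested):
    """Pre-tax cash flow after debt service, divided by the cash invested."""
    return (noi - annual_debt_service) / cash_invested
```

As a worked example, a $2,000,000 listing with $160,000 NOI, $100,000 of annual debt service, and $500,000 of cash invested has an 8% cap rate, a DSCR of 1.6, and a 12% cash-on-cash return.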

Cron schedule (launchd)

  • 7:00am — Crexi scrape (ai.crexi.scraper)
  • 8:00am — LoopNet scrape (ai.loopnet.scraper)
  • Midnight — Enrichment batch (ai.crexi.enricher)
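
The skill's launchd plists are not shown in this listing. As a rough sketch of what a job like the 7:00am entry could look like, the snippet below builds a plist with Python's standard `plistlib`, using the real launchd keys `Label`, `ProgramArguments`, and `StartCalendarInterval`. The helper name `launchd_job` is hypothetical, and pairing the `ai.crexi.scraper` label with `run-scrape.sh` is an assumption based on the Usage section.

```python
import plistlib
from pathlib import Path


def launchd_job(label, script, hour, minute=0):
    """Return XML plist bytes for a job that runs `script` daily at hour:minute."""
    job = {
        "Label": label,
        "ProgramArguments": [str(script)],
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }
    return plistlib.dumps(job)


# Assumed mapping: the Crexi label runs the run-scrape.sh script at 7:00am.
crexi_plist = launchd_job(
    "ai.crexi.scraper",
    Path.home() / ".openclaw/skills/cre-scraper/run-scrape.sh",
    hour=7,
)
```

A file like this would normally be saved under `~/Library/LaunchAgents/` and loaded with `launchctl`. Review what the target script does before scheduling it.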

Trigger phrases

  • "scrape new deals"
  • "run the Crexi scraper"
  • "find new RV parks in Florida"
  • "check LoopNet for self storage in Tennessee"
  • "enrich unenriched properties"
  • "sync deals to dashboard"

Output

Properties saved to ~/.openclaw/workspace/data/properties.db and synced to OpenClaw Command Center dashboard via sync-properties.py.
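
The database schema is not documented here. One way to inspect the output without guessing at column names is to list each table and its row count with the standard `sqlite3` module. The helper below is a hypothetical sketch, not part of the skill.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        if not row[0].startswith("sqlite_")  # skip SQLite's internal tables
    ]
    # Table names come from sqlite_master itself, so they are only quoted here.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }
```

To use it against the real file, open the connection with `sqlite3.connect(path)` where `path` points to `~/.openclaw/workspace/data/properties.db` (expanded to an absolute path), then pass the connection to `table_row_counts`.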

Usage Guidance
Do not install or run this skill until you have confirmed the points below.
  1. The package includes a session.json with a Cloudflare clearance token and many cookies. It lets the scraper bypass bot protections and may contain someone else's session or sensitive tokens; do not use a session file you do not fully trust.
  2. The scripts rsync the local DB to a hardcoded remote host and SSH in to run a remote script, which sends scraped data (and possibly session-derived data) to an unknown server. If you intend to use the skill, replace the remote host with a server you control, remove or regenerate the bundled session.json, and grant SSH access only to a dedicated key and account.
  3. The code expects LOOPNET_EMAIL / LOOPNET_PASS environment variables that are not declared in the registry metadata; supplying credentials to undeclared code is risky.
  4. Check the legal and terms-of-service implications of scraping these sites.
To make the skill safer, remove the bundled session.json, declare the required env vars and remote hosts, and point the rsync/ssh endpoints at your own infrastructure. If you cannot verify the owner or purpose of the remote VPS, treat this skill as unsafe.
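Before running any script from the bundle, it can help to list the lines that contact a remote machine. The following is a minimal sketch of such a check; the function name and the patterns are illustrative assumptions, and a regex scan like this will not catch endpoints built from variables or hidden in other files.

```python
import re

# Dotted-quad IPv4 literal (no range validation; this is only a review aid).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Lines that start with a remote-transfer or remote-shell command.
REMOTE_CMD = re.compile(r"^\s*(?:rsync|ssh|scp)\b")


def find_remote_endpoints(script_text):
    """Return (line_number, line) pairs that run rsync/ssh/scp or contain an IPv4 literal."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if REMOTE_CMD.search(line) or IPV4.search(line):
            hits.append((number, line.strip()))
    return hits
```

Reading `run-scrape.sh` and `enrich-batch.sh` into a string and passing each to `find_remote_endpoints` would surface the lines to review or rewrite by hand.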
Capability Analysis
Type: OpenClaw Skill · Name: cre-scraper · Version: 2.0.0
The skill bundle contains scripts (run-scrape.sh, enrich-batch.sh) that automatically sync a local SQLite database containing scraped real estate data and broker contacts to a hardcoded external IP address (187.77.140.113) via rsync and execute remote commands via SSH as the root user. It also includes a session.json file containing active session cookies and internal configuration data for Crexi, which could lead to session hijacking if the bundle is shared. While these behaviors are consistent with the stated purpose of syncing to a 'Command Center' dashboard, the use of a hardcoded Brazilian IP and root SSH access are high-risk patterns that facilitate unauthorized data exfiltration.
Capability Assessment
Purpose & Capability
Name/description (CRE scraping) matches the code: Playwright-based scrapers for Crexi and LoopNet, local SQLite storage, and syncing to a dashboard. However the SKILL.md and scripts expect an external VPS (rsync/ssh) for the Command Center; the registry metadata claimed no config paths or env vars but the instructions require ~/.claude/settings.json, a session.json, and an SSH-authorized key on a remote host. That mismatch (metadata vs. declared requirements) is a red flag.
Instruction Scope
SKILL.md and the scripts instruct the agent to use a saved browser session (session.json with cookies and cf_clearance), click the reveal-phone buttons, intercept API responses, and then rsync the DB to a hardcoded remote host and run a remote sync script there. scrape.py also reads LOOPNET_EMAIL / LOOPNET_PASS from the environment even though those env vars are not declared. The skill therefore reads browser session data and credentials and transmits scraped data (including click-revealed phone numbers) to an external host, which exceeds the scope of a simple local scraping helper.
Install Mechanism
No external install spec (instruction-only) so nothing is downloaded at install time — lower risk in that sense. But the package ships an included session.json (cookies/Cloudflare tokens), Python scripts, and shell scripts that will be present on disk and executed. The inclusion of a pre-populated session.json (with cf_clearance and many domain cookies) is unusual and risky because it embeds session state that can be used to bypass protections.
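To see what a bundled session file contains without echoing any secret values, one option is to list only the cookie domains and names. The sketch below assumes session.json follows the Playwright storage-state layout (a top-level "cookies" array of objects with "name" and "domain" keys). That layout is an assumption, since the file itself is not shown here, and the function name is hypothetical.

```python
import json


def summarize_cookies(session_text):
    """Return sorted (domain, name) pairs from a storage-state JSON string.

    Cookie values are deliberately left out so tokens are never printed.
    """
    data = json.loads(session_text)
    cookies = data.get("cookies", [])
    return sorted({(c.get("domain", ""), c.get("name", "")) for c in cookies})
```

An entry such as `cf_clearance` in the result indicates the file embeds a Cloudflare clearance token, in which case the file should be deleted and regenerated from your own browser session.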
Credentials
The registry metadata declares no env/config requirements, but SKILL.md and the code expect ~/.claude/settings.json, a session.json, Chrome with the Claude extension, an SSH key authorized on an external VPS, and LoopNet credentials via LOOPNET_EMAIL/LOOPNET_PASS. The skill also unsets ANTHROPIC_API_KEY in run-scrape.sh. Requesting an SSH key and syncing sensitive scraped data to a remote host identified only by a hardcoded IP address is disproportionate for a local scraper unless you explicitly control that remote host.
Persistence & Privilege
always:false (good), but the skill is designed to be run regularly (launchd cron entries suggested) and will autonomously push data to an external VPS when invoked. Autonomous invocation combined with automatic rsync/ssh to an unknown third party increases blast radius — the skill would repeatedly transmit scraped leads (including phone numbers and any intercepted session data) off-host.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install cre-scraper
  3. After installation, invoke the skill by name or use /cre-scraper
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.0.0
Full rewrite: Claude in Chrome approach bypasses Cloudflare, direct properties table, AI analysis, value-add scoring, Crexi + LoopNet support
Metadata
Slug: cre-scraper
Version: 2.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1
Frequently Asked Questions

What is CRE Scraper?

CRE Scraper is an AI Agent Skill for Claude Code / OpenClaw that scrapes commercial real estate listings from Crexi and LoopNet using Claude in Chrome on a Mac Mini with a residential IP, bypassing Cloudflare bot protection. It has 96 downloads so far.

How do I install CRE Scraper?

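Run "/install cre-scraper" in the OpenClaw or Claude Code chat to install it in one step. Before the skill can run, the README's requirements still apply: the Claude in Chrome extension, logged-in Crexi and LoopNet sessions, and an SSH key authorized on the VPS.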

Is CRE Scraper free?

Yes, CRE Scraper is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does CRE Scraper support?

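The listing tags CRE Scraper as cross-platform, meaning it runs anywhere OpenClaw / Claude Code is available. The README's requirements, however, list macOS with Claude Code and the Claude in Chrome extension.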

Who created CRE Scraper?

It is built and maintained by jaceagentic (@jaceagentic); the current version is v2.0.0.
