Downloads: 140
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install real-estate-crawler
Description
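A comprehensive crawler skill for real-estate agency websites. It supports scraping data from Anjuke (安居客), Beike Zhaofang (贝壳找房), Lianjia (链家), and Soufun (搜房网), and includes anti-crawler bypass strategies and data-extraction features.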
Usage Guidance
This skill appears to be what it claims: a set of crawlers and agent-browser scripts that intentionally attempt to bypass anti‑bot defenses. Before installing or running it, consider:
1) Legal/ToS risk: bypassing anti‑bot measures or scraping protected content can violate website terms or local law; use it only for authorized, lawful purposes.
2) Privacy/exfiltration risk: the code saves session cookies and screenshots, and contains examples that would send CAPTCHA images to external APIs if you configure them; do not add third‑party API keys unless you trust the service.
3) Trust and isolation: run it in an isolated environment (container/VM), and review and clean the proxy list and any configured endpoints.
4) Credentials: the tool does not require secrets by default, but any API keys or session cookies you add grant access to site sessions; store them securely and keep them to a minimum.
5) Audit before use: inspect any proxy URLs and any captcha‑solver endpoints you plan to use.
If you need a safer alternative, prefer official APIs or licensed data providers over bypassing site protections.
Capability Analysis
Type: OpenClaw Skill
Name: real-estate-crawler
Version: 1.0.0
The skill bundle contains a critical security vulnerability in 'main.py', where user-provided command-line arguments are concatenated into shell strings and executed via 'subprocess.run(shell=True)', creating a high risk of Remote Code Execution (RCE) through shell injection. While the bundle provides extensive tools and documentation for bypassing anti-crawler mechanisms on Chinese real estate websites (e.g., 'scripts/bypass_ke.sh', 'docs/anti_crawler_guide.md'), its behavior appears aligned with its stated purpose of data collection. There is no evidence of intentional malice, such as credential theft or unauthorized data exfiltration, but the lack of input sanitization in the execution logic warrants a suspicious classification.
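The shell-injection pattern described above, and one common way to avoid it, can be sketched as follows. This is an illustration only, not the code in 'main.py': the function names, the `crawler.py` script name, and the `--city` flag are hypothetical, and a small child process that echoes its argument stands in for the real crawler.

```python
import subprocess
import sys

# Stand-in for the real crawler script (assumption): a child process that
# simply prints the first argument it receives.
ECHO_ARG = "import sys; print(sys.argv[1])"


def build_shell_string_unsafe(city: str) -> str:
    # Vulnerable pattern: user input is concatenated into one shell string.
    # If this string were run with shell=True, "; echo INJECTED" would be
    # interpreted by the shell as a second command.
    return "python3 crawler.py --city " + city


def run_with_arg_list(city: str) -> str:
    # Safer pattern: pass an argument list and keep shell=False (the default).
    # The value reaches the child as a single argv entry, uninterpreted.
    cmd = [sys.executable, "-c", ECHO_ARG, city]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    payload = "beijing; echo INJECTED"
    print(build_shell_string_unsafe(payload))  # the string a shell would parse
    print(run_with_arg_list(payload))          # the payload arrives as plain text
```

With the argument-list form, the payload is treated as data rather than shell syntax. Validating inputs against an allow-list (for example, known city names) would be a reasonable additional layer.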
Capability Assessment
Purpose & Capability
Name/description match the delivered artifacts: Python crawlers, shell scripts, config, and docs all target Anjuke, Ke, Lianjia, Soufun. Declared binary requirements (python3, agent-browser) are appropriate for the implementation. No unrelated credentials, binaries, or install steps are requested.
Instruction Scope
SKILL.md and scripts explicitly instruct running agent-browser and python scripts to fetch pages, set headers/cookies, save sessions, take screenshots, and pause for manual CAPTCHA handling. These instructions are within the stated purpose but also describe active anti‑detection techniques (UA/device spoofing, cookie/session reuse, proxy rotation, CAPTCHA handling). The docs include an example of posting CAPTCHA images to a third‑party API — optional but could transmit site screenshots to external endpoints if enabled.
Install Mechanism
No install spec is present (instruction-only in registry), and included files are local scripts/code. There are no downloads from arbitrary URLs or extract/install steps in the package metadata. The only external tool dependency is agent-browser (expected) and standard Python packages.
Credentials
The skill declares no required environment variables or credentials. Config includes optional fields for a CAPTCHA API URL and api_key (both default to None) and a proxy list placeholder. These optional fields are sensible for the feature set but mean that if you populate them you may expose keys or send data to third‑party services — not required for basic operation.
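Before populating the optional fields described above, it can help to list which external hosts a configuration would contact. The sketch below assumes a config loaded into a plain dict; the key names `captcha_api_url`, `captcha_api_key`, and `proxies` are hypothetical and may not match the skill's actual config file.

```python
from urllib.parse import urlparse


def find_external_hosts(config: dict) -> list[str]:
    """Return the distinct hostnames the config would send traffic to."""
    candidates = []
    captcha_url = config.get("captcha_api_url")  # hypothetical key name
    if captcha_url:
        candidates.append(captcha_url)
    candidates.extend(config.get("proxies") or [])  # hypothetical key name

    hosts = []
    for url in candidates:
        # Assumes each entry carries a scheme (http://, https://, ...);
        # entries without one yield no hostname and are skipped here.
        host = urlparse(url).hostname
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def has_secret(config: dict) -> bool:
    """True if an API key has been filled in (hypothetical key name)."""
    return bool(config.get("captcha_api_key"))


if __name__ == "__main__":
    defaults = {"captcha_api_url": None, "captcha_api_key": None, "proxies": []}
    print(find_external_hosts(defaults), has_secret(defaults))
```

An empty host list together with `has_secret` returning `False` corresponds to the default state the assessment describes, where no data leaves for third-party services.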
Persistence & Privilege
The skill does not request always:true, does not change other skills, and uses normal file writes for session and output files (e.g., session JSON, screenshots, PDF). That behavior is expected for a crawler and confined to its own workspace/files.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install real-estate-crawler
- After installation, invoke the skill by name or use /real-estate-crawler
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Real Estate Crawler v1.0.0
- Initial release supporting data crawling from Anjuke, Beike (Ke), Lianjia, and Soufun real estate platforms.
- Implements anti-crawling bypass strategies: fingerprint spoofing, random delay, cookie/session management, proxies, and basic captcha handling.
- Provides multiple crawl modes: Python requests, agent-browser (headless browser automation), and hybrid.
- Extracts key property data including price, area, location, type, age, decoration, and images.
- Supports data export in JSON, CSV, Excel, HTML report, and visual charts.
- Includes detailed usage instructions, script examples, configuration options, and troubleshooting guides.
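As a rough illustration of the JSON and CSV export listed above, the following sketch serializes property records with the Python standard library only. The function names and the record field names (`price`, `area`, `location`) are assumptions for illustration; the skill's own export code and schema may differ.

```python
import csv
import io
import json


def to_json(records: list[dict]) -> str:
    # ensure_ascii=False keeps Chinese place names readable in the output.
    return json.dumps(records, ensure_ascii=False, indent=2)


def to_csv(records: list[dict], fields: list[str]) -> str:
    # Write to an in-memory buffer; keys not listed in `fields` are ignored,
    # and missing keys are written as empty cells.
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{"price": 3500000, "area": 89.5, "location": "Chaoyang"}]
    print(to_json(sample))
    print(to_csv(sample, ["price", "area", "location"]))
```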
Frequently Asked Questions
What is Real Estate Crawler?
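Real Estate Crawler is a comprehensive crawler skill for real-estate agency websites. It supports scraping data from Anjuke (安居客), Beike Zhaofang (贝壳找房), Lianjia (链家), and Soufun (搜房网), and includes anti-crawler bypass strategies and data-extraction features. It is an AI Agent Skill for Claude Code / OpenClaw, with 140 downloads so far.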
How do I install Real Estate Crawler?
Run "/install real-estate-crawler" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Real Estate Crawler free?
Yes, Real Estate Crawler is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Real Estate Crawler support?
Real Estate Crawler is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Real Estate Crawler?
It is built and maintained by h8296699 (@h8296699); the current version is v1.0.0.