Downloads: 110
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install real-estate-spider
Description
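A general-purpose crawler skill for scraping data from Chinese real estate agency websites (Anjuke, Soufun, Beike, Lianjia), with anti-crawling strategies and automatic data extraction.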
Usage Guidance
This skill appears to do what it claims (crawler + anti‑bot workarounds), but it handles and stores sensitive session cookies, recommends using proxies and third‑party CAPTCHA‑solving services, and writes session/cookie files to disk. Before installing or running it:
1) Review the shell scripts and Python code line by line, especially any code that saves or loads session files or performs network uploads.
2) Do not paste real authentication cookies or API keys into the scripts unless you understand the risks; prefer manual CAPTCHA handling.
3) Run it initially in an isolated environment (VM or container) under a non‑privileged user account.
4) Replace the example proxy endpoints and CAPTCHA service URLs with vetted providers only if necessary, store API keys in a secure place, and add explicit environment variable support for them.
5) Verify the legality of scraping the target sites in your jurisdiction and check their terms of service.
If you are not comfortable auditing the code, treat this skill as high‑risk and avoid providing real credentials or cookies.
Capability Analysis
Type: OpenClaw Skill
Name: real-estate-spider
Version: 1.0.0
This bundle is a functional real estate data crawler designed for Chinese platforms such as Lianjia, Beike, and Anjuke. It utilizes standard Python scraping libraries (requests, BeautifulSoup) and the OpenClaw agent-browser tool to extract property details while managing anti-bot measures like User-Agent rotation, random delays, and session persistence. The code logic in main.py and the scripts (e.g., real_estate_crawler.py) is transparently aligned with the stated purpose, and no evidence of data exfiltration, credential theft, or unauthorized system access was found.
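The User-Agent rotation and random delays mentioned above can be illustrated with a minimal sketch. This is not the skill's actual code: the `USER_AGENTS` pool, the function names, and the 1 to 3 second delay range are all assumptions made for illustration.

```python
import random

# Hypothetical User-Agent pool; the skill's real list is not shown on this page.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]


def build_headers(rng: random.Random) -> dict:
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


def next_delay(rng: random.Random, min_s: float = 1.0, max_s: float = 3.0) -> float:
    """Return a random pause, in seconds, between min_s and max_s."""
    return rng.uniform(min_s, max_s)


rng = random.Random(0)  # seeded only to make the example repeatable
headers = build_headers(rng)
delay = next_delay(rng)
# A caller would time.sleep(delay) before sending a request with `headers`.
```

Passing the random generator in explicitly keeps both helpers easy to test without touching the network or the clock.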
Capability Assessment
Purpose & Capability
Name/description, required binaries (python3, agent-browser), and the included Python and shell scripts all align with a web‑crawling skill for Anjuke, Beike (ke.com), Lianjia, and Soufun. The presence of agent-browser scripts and Python crawlers is expected for the claimed functionality.
Instruction Scope
SKILL.md and included scripts explicitly instruct the agent to: set cookies (including names like lianjia_ssid), save and restore browser session files, simulate device fingerprints, rotate proxies, capture screenshots, and optionally send captcha images to third‑party CAPTCHA solving endpoints. Those actions go beyond simple data collection because they encourage reuse of authenticated sessions and transmission of potentially sensitive artifacts to external services.
Install Mechanism
No remote install/download URLs or package installers are used; the skill is distributed as source files and shell scripts (no install spec). That lowers supply‑chain risk compared to arbitrary remote downloads. The scripts do, however, call 'agent-browser' and rely on a local Python environment.
Credentials
The skill requests no declared environment variables or credentials, but its behavior relies on sensitive data: saved session files and cookies, optional CAPTCHA API keys, and proxy endpoints. The SKILL.md and config reference using/setting cookie values and an API key for captcha solving (example Authorization header), yet no env var is declared for that key—this mismatch and the fact that session/cookie files will be written to disk are proportionality and secrecy handling concerns.
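One way to close the gap described above would be to read the CAPTCHA API key from an environment variable instead of embedding it in a script or config file. The sketch below assumes a variable named `CAPTCHA_API_KEY`; that name and the `load_captcha_key` helper are hypothetical and do not appear in the skill.

```python
import os

# Hypothetical variable name; the skill itself declares no environment variables.
ENV_VAR = "CAPTCHA_API_KEY"


def load_captcha_key(env=None):
    """Return the API key from the environment, or None if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    return key or None


key = load_captcha_key()
if key is None:
    # No key configured: fall back to manual CAPTCHA handling.
    mode = "manual"
else:
    mode = "service"
```

Returning `None` for a missing key lets the caller default to manual CAPTCHA handling, which is the option the usage guidance prefers.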
Persistence & Privilege
The skill is not always-enabled and is user-invocable; it runs shell commands and spawns agent-browser subprocesses (main.py uses subprocess for scripts). That behavior is expected for this class of skill, but because it reads/writes session files and can execute shell scripts, run it with the same care you would give any code that manipulates cookies or spawns processes.
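Because session and cookie files end up on disk, it may help to restrict who can read them. The following is a minimal sketch for POSIX systems, not the skill's own code: `save_session`, `load_session`, the file name, and the placeholder cookie are assumptions for illustration.

```python
import json
import os
import stat
import tempfile


def save_session(path: str, cookies: dict) -> None:
    """Write cookies as JSON to a file only the owner can read or write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(cookies, fh)
    # Also tighten permissions if the file already existed with a looser mode.
    os.chmod(path, 0o600)


def load_session(path: str) -> dict:
    """Read cookies back from a session file written by save_session."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    session_path = os.path.join(tmp, "session.json")
    save_session(session_path, {"example_cookie": "placeholder"})
    mode = stat.S_IMODE(os.stat(session_path).st_mode)
    restored = load_session(session_path)
```

On Windows the permission bits behave differently, so this approach only gives meaningful protection on POSIX file systems.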
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install real-estate-spider
- After installation, invoke the skill by name or use
/real-estate-spider
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Real Estate Spider – a universal crawler for major Chinese real estate sites.
- Supports data extraction from Anjuke, Soufun, Beike (ke.com), and Lianjia.
- Includes anti-crawling strategies: browser fingerprint simulation, random delay, session & cookie management, optional proxy IP, and captcha handling.
- Extracts core real estate info: price, area, location, type, decoration, and year built.
- Allows export in JSON, CSV, Excel, and supports visualization.
- Provides usage scripts for both Python and agent-browser automation.
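The JSON and CSV exports listed above can be sketched with the Python standard library. The field names mirror the extracted attributes in the release notes, but the exact keys, the helper names, and the sample record are assumptions; the skill's real output schema is not shown on this page.

```python
import csv
import io
import json

# Assumed column names, based on the attributes listed in the release notes.
FIELDS = ["price", "area", "location", "type", "decoration", "year_built"]


def to_json(records: list) -> str:
    """Serialize listing records to JSON, keeping non-ASCII text readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)


def to_csv(records: list) -> str:
    """Serialize listing records to CSV with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells rather than raising an error.
        writer.writerow({name: record.get(name, "") for name in FIELDS})
    return buf.getvalue()


# Placeholder record for illustration only, not real listing data.
records = [
    {"price": 100, "area": 50.0, "location": "Example District",
     "type": "apartment", "decoration": "furnished", "year_built": 2000},
]
json_text = to_json(records)
csv_text = to_csv(records)
```

Fixing the column order in `FIELDS` keeps CSV files consistent across runs even when individual listings lack some attributes.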
Frequently Asked Questions
What is Real Estate Spider?
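Real Estate Spider is a general-purpose crawler skill for scraping data from Chinese real estate agency websites (Anjuke, Soufun, Beike, Lianjia), with anti-crawling strategies and automatic data extraction. It is an AI agent skill for Claude Code / OpenClaw, with 110 downloads so far.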
How do I install Real Estate Spider?
Run "/install real-estate-spider" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Real Estate Spider free?
Yes, Real Estate Spider is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Real Estate Spider support?
Real Estate Spider is cross-platform and runs anywhere OpenClaw or Claude Code is available.
Who created Real Estate Spider?
It is built and maintained by h8296699 (@h8296699); the current version is v1.0.0.