Beike Xiaoqu Research

by eddiexux · GitHub ↗ · v2.2.0 · MIT-0
cross-platform ⚠ suspicious
139 Downloads · 0 Stars · 0 Active Installs · 4 Versions
Install in OpenClaw
/install beike-xiaoqu-research
Description
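Scrapes Beike (贝壳找房, ke.com) data through the mcp-chrome plugin, in two modes: (1) four-step deep research on a single xiaoqu (residential community): details / active listings / transactions; (2) batch discovery and filtering of xiaoqu by area, followed by a PAL MCP multi-model consensus evaluation of the candidate list. Use cases: looking up Beike xiaoqu information, discovering candidate xiaoqu by area, multi-model evaluation of home-buying options. Requires the user to have installed mcp...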
README (SKILL.md)

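Beike Xiaoqu Research Tool v2.2

I/O Contract (Inputs / Outputs / Errors)

Agents must read this section before invoking the Skill, so that parameters and return structures match expectations.

Mode A: Single-Xiaoqu Deep Research

Input parameters (environment variables / script arguments)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| $1 (NAME) | string | required | Xiaoqu name (in Chinese), used for the rs{xiaoqu_name} search |
| $2 (CITY) | string | sh | City prefix |
| $3 (OUTDIR) | string | /tmp/beike | Output directory |
| BEIKE_HEADLESS | env | 0 | Set to 1 for headless mode; on a captcha the script exits with code 4 instead of hanging |

Successful output (the following files are generated under OUTDIR)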

{
  "status": "ok",
  "mode": "single_xiaoqu",
  "xiaoqu_name": "东方花园三期",
  "xiaoqu_id": "5011102207315",
  "files": {
    "xiaoqu_json": "{OUTDIR}/东方花园三期_xiaoqu.json",
    "ershou_json": "{OUTDIR}/东方花园三期_ershou.json",
    "chengjiao_json": "{OUTDIR}/东方花园三期_chengjiao.json"
  }
}

*_xiaoqu.json field schema:

{
  "avg_price": 78000,
  "building_type": "板楼",
  "total_units": "320户",
  "total_buildings": "8栋",
  "green_rate": "35%",
  "far": "1.8",
  "ownership": "商品房",
  "built_year": "2008-2012",
  "mgmt_fee": "2.5元/月/㎡",
  "mgmt_company": "绿城物业",
  "developer": "绿城集团",
  "followers": "125",
  "on_sale": "5",
  "sold_90d": "2",
  "views_30d": "38",
  "metros": ["🚇9号线 七宝站 688m"],
  "_parse_source": "css+innerText"
}
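
Most numeric fields in the schema above arrive as strings with units attached ("320户", "35%"). A consumer may want plain numbers. The following is a minimal Python sketch of one way to coerce them; the helper name `leading_number` and the choice of fields are assumptions for illustration and are not part of the skill's scripts.

```python
import re

def leading_number(value):
    """Return the first number found in a scraped value, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", str(value))
    if match is None:
        return None
    text = match.group(0)
    return float(text) if "." in text else int(text)

# Subset of the sample record from the *_xiaoqu.json schema.
record = {
    "avg_price": 78000,
    "total_units": "320户",
    "green_rate": "35%",
    "far": "1.8",
    "mgmt_fee": "2.5元/月/㎡",
    "on_sale": "5",
}

# Hypothetical normalized view: every field reduced to its leading number.
numeric = {key: leading_number(value) for key, value in record.items()}
print(numeric)
```

Range-valued fields such as built_year ("2008-2012") would only yield their first year with this helper, so they are better handled separately.
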

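Mode B: Region Batch Discovery

Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| $1 (BOARDS) | string | required | Comma-separated boards (板块), e.g. qibao,gumei |
| $2 (CITY) | string | sh | City prefix |
| $3 (OUTDIR) | string | /tmp/beike_discover | Output directory |
| --district | flag | minhang | District path segment, e.g. pudong |
| --no-detail | flag | off | Discovery only; do not fetch details |
| --consensus | flag | off | Run the PAL MCP evaluation after discovery |
| BEIKE_HEADLESS | env | 0 | Headless mode |

Successful output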

{
  "status": "ok",
  "mode": "region_discover",
  "city": "sh",
  "district": "minhang",
  "boards": ["qibao", "gumei"],
  "total_candidates": 8,
  "candidates": [
    {
      "name": "好世鹿鸣苑",
      "board": "qibao",
      "price": 68000,
      "year": "2010-2012",
      "on_sale": 5,
      "sold_90d": 2,
      "metro": "1号线 莘庄站 688m",
      "xiaoqu_id": "5011000015858"
    }
  ],
  "csv_file": "{OUTDIR}/candidates_2026-03-23.csv"
}
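
A caller will usually want typed records rather than raw dictionaries. The sketch below, in Python, loads a result shaped like the sample above and orders candidates by 90-day sales. The `Candidate` dataclass and `load_candidates` are hypothetical names, and the second candidate in the payload is invented purely so the sort has something to compare.

```python
import json
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    board: str
    price: int
    year: str
    on_sale: int
    sold_90d: int
    metro: str
    xiaoqu_id: str

def load_candidates(payload: str) -> List[Candidate]:
    """Parse a region_discover result and return its candidates as typed records."""
    result = json.loads(payload)
    if result.get("status") != "ok":
        raise ValueError(f"unexpected status: {result.get('status')!r}")
    return [Candidate(**item) for item in result["candidates"]]

payload = json.dumps({
    "status": "ok",
    "mode": "region_discover",
    "candidates": [
        # Copied from the sample output.
        {"name": "好世鹿鸣苑", "board": "qibao", "price": 68000, "year": "2010-2012",
         "on_sale": 5, "sold_90d": 2, "metro": "1号线 莘庄站 688m",
         "xiaoqu_id": "5011000015858"},
        # Invented placeholder entry, for illustration only.
        {"name": "示例小区", "board": "gumei", "price": 72000, "year": "2008",
         "on_sale": 3, "sold_90d": 4, "metro": "", "xiaoqu_id": "0000000000000"},
    ],
})

candidates = load_candidates(payload)
by_liquidity = sorted(candidates, key=lambda c: c.sold_90d, reverse=True)
print([c.name for c in by_liquidity])
```
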

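Error status codes

| Exit code | Status string | Description |
| --- | --- | --- |
| 0 | ok | Success |
| 1 | dependency_missing | mcp-chrome not running / dependency missing |
| 2 | captcha_detected | Captcha (interactive mode; waits for the user to resolve it) |
| 3 | parse_failed | Key field parsing failed (Beike page structure changed) |
| 4 | captcha_headless | Captcha encountered in headless mode; cannot be handled |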
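
An agent wrapping the scripts can mirror this table in code so that a return code maps to its status string in one place. The following Python sketch shows one possible shape; `ExitCode` and `describe` are illustrative names, not something the skill ships.

```python
from enum import IntEnum

class ExitCode(IntEnum):
    """Exit codes as listed in the error status table."""
    OK = 0
    DEPENDENCY_MISSING = 1
    CAPTCHA_DETECTED = 2
    PARSE_FAILED = 3
    CAPTCHA_HEADLESS = 4

def describe(returncode: int) -> str:
    """Map a script exit code to its status string; unknown codes are flagged."""
    try:
        return ExitCode(returncode).name.lower()
    except ValueError:
        return f"unknown_exit_{returncode}"

print(describe(3))
```

In practice the argument would come from something like `subprocess.run([...]).returncode` after invoking fetch_xiaoqu.sh or region_discover.sh.
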

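Two Working Modes

| Mode | Trigger scenario | Script |
| --- | --- | --- |
| A. Single-xiaoqu deep research | "Look up information on xiaoqu XX", "Take a look at 东方花园三期" | fetch_xiaoqu.sh |
| B. Region batch discovery | "Find xiaoqu near 七宝/古美 that meet my criteria", "Help me discover candidate xiaoqu" | region_discover.sh |

Both modes can finish with a PAL MCP Consensus evaluation (consensus_analyze.py).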


Prerequisite Checks

1. mcp-chrome connection

curl -s http://127.0.0.1:12306/mcp -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"agent","version":"1.0"}}}' \
  | grep -o '"name":"ChromeMcpServer"' && echo "✅ Connected" || echo "❌ Check the plugin"

Get the SESSION_ID (fetch it dynamically; do not hard-code it):

SESSION_ID=$(curl -s -i -X POST "http://127.0.0.1:12306/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"agent","version":"1.0"}}}' \
  | grep -i "mcp-session-id:" | awk '{print $2}' | tr -d '\r\n')
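
The same header extraction can be done in Python when the raw `curl -i` output is already in hand. This is a minimal sketch under that assumption; `extract_session_id` is a hypothetical helper and the session value in the sample is made up.

```python
from typing import Optional

def extract_session_id(raw_response: str) -> Optional[str]:
    """Return the mcp-session-id header value from raw `curl -i` output, or None."""
    for line in raw_response.splitlines():
        if not line.strip():
            break  # the first blank line ends the header section
        name, _, value = line.partition(":")
        if name.strip().lower() == "mcp-session-id":
            return value.strip()
    return None

# Illustrative response; the header value is a placeholder, not a real session.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/event-stream\r\n"
    "Mcp-Session-Id: abc-123\r\n"
    "\r\n"
    "event: message\r\n"
)
print(extract_session_id(sample))
```

Matching the header name case-insensitively mirrors the `grep -i` in the shell version.
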

2. PAL MCP availability (required for Consensus mode)

mcporter list pal 2>&1 | grep -q "chat" && echo "✅ PAL MCP OK" || echo "❌ Check ~/.mcporter/mcporter.json"

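Beike URL Rules

# City prefixes: sh = Shanghai, bj = Beijing, sz = Shenzhen, gz = Guangzhou

# Area lists (xiaoqu discovery)
By board:        https://{city}.ke.com/xiaoqu/minhang/{board_pinyin}/
Whole district:  https://{city}.ke.com/xiaoqu/{district_pinyin}/

# Common Shanghai boards
七宝=qibao  古美=gumei  金汇=jinhui  龙柏=longbai
莘庄=xinzhuang  漕河泾=caohejing  虹桥=hongqiao

# Single-xiaoqu operations
Detail page:   https://{city}.ke.com/xiaoqu/{xiaoqu_id}/
Listings:      https://{city}.ke.com/ershoufang/rs{xiaoqu_name}/
Transactions:  https://{city}.ke.com/chengjiao/rs{xiaoqu_name}/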
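
The URL rules above can be captured as small builder functions. The Python sketch below is one way to do it; the function names are hypothetical, the `district` parameter is an assumption that follows the --district flag, and percent-encoding the Chinese name with `urllib.parse.quote` is also an assumption (a browser would normally encode it on navigation anyway).

```python
from urllib.parse import quote

def board_list_url(city: str, board: str, district: str = "minhang") -> str:
    """List page for one board inside a district."""
    return f"https://{city}.ke.com/xiaoqu/{district}/{board}/"

def detail_url(city: str, xiaoqu_id: str) -> str:
    """Detail page for a xiaoqu, addressed by numeric ID."""
    return f"https://{city}.ke.com/xiaoqu/{xiaoqu_id}/"

def listings_url(city: str, name: str) -> str:
    """Active listings, addressed by an rs{name} search."""
    return f"https://{city}.ke.com/ershoufang/rs{quote(name)}/"

def transactions_url(city: str, name: str) -> str:
    """Transaction records, addressed by an rs{name} search."""
    return f"https://{city}.ke.com/chengjiao/rs{quote(name)}/"

print(detail_url("sh", "5011102207315"))
print(board_list_url("sh", "qibao"))
```
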

Mode A: Single-Xiaoqu Deep Research (Four Steps)

Step 1 – Look up the xiaoqu ID

navigate → ershoufang/rs{xiaoqu_name}/
wait 7s
JS: return Array.from(document.querySelectorAll('a'))
      .find(a => a.href.includes('/xiaoqu/') && /\/\d+\//.test(a.href))?.href || ''
# The numeric part of the URL is the xiaoqu ID
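
Once the href is returned, the ID still has to be cut out of it. A short Python sketch, assuming the href has the detail-page shape from the URL rules; `extract_xiaoqu_id` is a hypothetical helper, and it is slightly stricter than the JS test above because it requires the digits to follow /xiaoqu/ directly.

```python
import re
from typing import Optional

XIAOQU_ID_RE = re.compile(r"/xiaoqu/(\d+)/")

def extract_xiaoqu_id(href: str) -> Optional[str]:
    """Return the numeric xiaoqu ID from a detail-page href, or None if absent."""
    match = XIAOQU_ID_RE.search(href)
    return match.group(1) if match else None

print(extract_xiaoqu_id("https://sh.ke.com/xiaoqu/5011102207315/"))
```

The ID is kept as a string, matching how xiaoqu_id appears in the output JSON.
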

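Step 2 – Xiaoqu detail page

navigate → xiaoqu/{xiaoqu_id}/
wait 7s
JS: return document.body.innerText
→ parse_beike.py xiaoqu

Key parsing notes

  • Beike uses \xa0 (non-breaking space); regexes should use [\s\xa0]* instead of a plain space
  • Field format: field name\nvalue (separated by a newline, not on the same line)
  • Active listing count: formatted as N套\n在售二手房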
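
The three notes above translate into two small regex helpers. This Python sketch is not the logic of parse_beike.py, which is not shown here; `field_value`, `on_sale_count`, and the field labels in the sample text are assumptions chosen to illustrate the label-then-newline-then-value layout.

```python
import re
from typing import Optional

def field_value(page_text: str, field_name: str) -> Optional[str]:
    """Return the line that follows a label line, tolerating non-breaking-space padding."""
    pattern = re.compile(
        rf"^[\s\xa0]*{re.escape(field_name)}[\s\xa0]*\n[\s\xa0]*(.+?)[\s\xa0]*$",
        re.MULTILINE,
    )
    match = pattern.search(page_text)
    return match.group(1) if match else None

def on_sale_count(page_text: str) -> Optional[int]:
    """Parse the 'N套' line that sits directly above '在售二手房'."""
    match = re.search(r"(\d+)套[\s\xa0]*\n[\s\xa0]*在售二手房", page_text)
    return int(match.group(1)) if match else None

# Hand-written sample in the described layout; labels are illustrative.
sample = "建筑类型\n\xa0板楼\xa0\n房屋总数\n320户\n5套\n\xa0在售二手房\n"
print(field_value(sample, "建筑类型"), on_sale_count(sample))
```
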

Step 3 – Active second-hand listings

navigate → ershoufang/rs{xiaoqu_name}/
wait 7s
JS: return document.body.innerText
→ parse_beike.py ershou

Step 4 – Transaction records

navigate → chengjiao/rs{xiaoqu_name}/
wait 7s  # this page triggers captchas most often; sleeping 10s between Steps 2/3 is recommended
JS: return document.body.innerText
→ parse_beike.py chengjiao

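Mode B: Region Batch Discovery

B1 – Read the board's xiaoqu list

navigate → https://sh.ke.com/xiaoqu/minhang/{board_pinyin}/
wait 7s
JS: return document.body.innerText
→ parse_beike.py region_list  # outputs the list of matching xiaoqu as JSON

List page format (verified):

小区名                     ← line i-1 (xiaoqu name)
 90天成交X套               ← line i (marker line: X units sold in 90 days)
 闵行\xa0板块\xa0 /\xa0年份 ← line i+1 (district, board, build year)
近地铁XX站                 (optional: near metro station XX)
均价元/m2                  (average price in 元/m2)
月份参考均价               (month of the reference average price)
N套                        (N units)
在售二手房                 (second-hand homes on sale)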
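
The marker-line layout above suggests a simple scan: find each "90天成交X套" line, take the line before it as the name and the line after it for board and year. The Python sketch below follows that idea; it is not the region_list parser itself, and the sample text (including the "年建成" suffix on the year) is an assumption about what the page prints.

```python
import re

MARKER_RE = re.compile(r"90天成交(\d+)套")

def parse_region_list(page_text):
    """Extract name, 90-day sales, board, and year around each marker line."""
    lines = [line.replace("\xa0", " ").strip() for line in page_text.splitlines()]
    results = []
    for i, line in enumerate(lines):
        marker = MARKER_RE.fullmatch(line)
        if marker is None or i == 0 or i + 1 >= len(lines):
            continue
        # Line i+1 looks like "district board / year".
        location, _, year_part = lines[i + 1].partition("/")
        parts = location.split()
        years = re.findall(r"\d{4}", year_part)
        results.append({
            "name": lines[i - 1],
            "sold_90d": int(marker.group(1)),
            "board": parts[1] if len(parts) > 1 else "",
            "year": "-".join(years),
        })
    return results

# Hand-written sample in the documented layout.
sample = (
    "好世鹿鸣苑\n 90天成交2套\n 闵行\xa0七宝\xa0 /\xa02010-2012年建成\n"
    "68000元/m2\n5套\n在售二手房\n"
)
print(parse_region_list(sample))
```

Here the board comes back as the Chinese name printed on the page; mapping it to the pinyin slug used in the output JSON would be a separate step.
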

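Filter criteria (adjustable via parameters):

  • Latest construction year ≥ 2005 (relatively new)
  • Average price 40,000–110,000 元/㎡
  • Active listings ≥ 2
  • Excluded: villas (别墅), apartments (公寓), towers (大厦), office buildings (写字楼)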
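
Expressed as code, the criteria become a single predicate with the thresholds as parameters. This Python sketch assumes candidates shaped like the region_discover output and assumes the exclusion keywords are matched against the xiaoqu name, which the criteria do not state explicitly.

```python
import re

EXCLUDED_KEYWORDS = ("别墅", "公寓", "大厦", "写字楼")

def latest_year(year_text):
    """Return the newest four-digit year in a string such as '2010-2012', or None."""
    years = re.findall(r"\d{4}", year_text)
    return max(map(int, years)) if years else None

def passes_filter(candidate, min_year=2005, min_price=40_000,
                  max_price=110_000, min_on_sale=2):
    """Apply the four documented criteria to one candidate dict."""
    newest = latest_year(candidate["year"])
    return (
        newest is not None
        and newest >= min_year
        and min_price <= candidate["price"] <= max_price
        and candidate["on_sale"] >= min_on_sale
        and not any(word in candidate["name"] for word in EXCLUDED_KEYWORDS)
    )

# Candidate taken from the sample region_discover output.
candidate = {"name": "好世鹿鸣苑", "price": 68000, "year": "2010-2012", "on_sale": 5}
print(passes_filter(candidate))
```
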

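B2 – Batch-fetch details

For each candidate xiaoqu that passed the filter, loop over Mode A Steps 1–2 (through the detail page), sleeping 10s between xiaoqu to lower the captcha risk.

One-shot script:

bash scripts/region_discover.sh qibao,gumei,jinhui sh /tmp/result/
# Arguments: board list (comma-separated)  city  output directory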

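Captcha Handling Protocol

Detection: grep -q "请在下图\|请按语序" page.txt

After a captcha is detected:
1. Stop immediately → tell the user "⚠️ Chrome is showing a captcha; please complete the click challenge manually"
2. Wait for the user to confirm
3. Re-read the current page directly (do not navigate again)
4. Continue once verification has passed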
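
The protocol can be sketched as a small loop that re-reads the page until the captcha markers disappear. In this Python sketch the page reader and the user prompt are injected as callables so the logic stays testable; the function names are hypothetical, the limit of three attempts follows the v2.1 changelog, and returning code 2 once attempts run out is an assumption.

```python
CAPTCHA_MARKERS = ("请在下图", "请按语序")

def has_captcha(page_text):
    """Same check as the grep above: either marker phrase means a captcha page."""
    return any(marker in page_text for marker in CAPTCHA_MARKERS)

def read_with_captcha_handling(read_page, wait_for_user, headless=False, max_attempts=3):
    """Return (exit_code, text); re-reads the current page without navigating again."""
    text = read_page()
    attempts = 0
    while has_captcha(text):
        if headless:
            return 4, text  # captcha_headless: nobody can solve it
        if attempts >= max_attempts:
            return 2, text  # captcha_detected: still unresolved
        wait_for_user("⚠️ Chrome is showing a captcha; please complete it manually")
        attempts += 1
        text = read_page()
    return 0, text

# Simulated session: the first read hits a captcha, the second succeeds.
pages = iter(["请在下图依次点击", "小区详情"])
prompts = []
code, text = read_with_captcha_handling(lambda: next(pages), prompts.append)
print(code, text)
```

In a real run, `read_page` would fetch innerText from the existing tab and `wait_for_user` would block until the user confirms.
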

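PAL MCP Consensus Evaluation (optional final step)

Once the JSON data for each xiaoqu has been collected, call consensus_analyze.py to have multiple AI models score them from different angles:

python3 scripts/consensus_analyze.py \
  --data-dir /tmp/result/ \
  --requirements "3 bedrooms at 120㎡ or larger, budget within 13 million CNY, titled parking space, within 1km of a metro station, built after 2005" \
  --models "gemini-3-pro-preview,auto" \
  --output /tmp/result/consensus_report.md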

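Consensus Flow

  1. Aggregate all xiaoqu data into a structured summary
  2. Build a unified evaluation prompt (including the requirements)
  3. Model 1 (conservative / price-oriented) gives scores + reasons
  4. Model 2 (liquidity / investment-oriented) gives scores + reasons
  5. Combine the two perspectives into a final ranked recommendation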
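
How step 5 merges the two perspectives is not specified here. One simple possibility, shown purely as an assumption, is to average each xiaoqu's score across models and sort best-first. The Python sketch below does that; `final_ranking`, the model labels, and the scores are all invented for illustration.

```python
from statistics import mean

def final_ranking(scores_by_model):
    """Average each xiaoqu's score across models, then sort best-first (ties by name)."""
    names = set().union(*(scores.keys() for scores in scores_by_model.values()))
    combined = {
        name: mean(s[name] for s in scores_by_model.values() if name in s)
        for name in names
    }
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))

# Invented scores on a 0-10 scale, one dict per model perspective.
scores_by_model = {
    "model_1_price": {"东方花园三期": 8, "好世鹿鸣苑": 6},
    "model_2_liquidity": {"东方花园三期": 6, "好世鹿鸣苑": 9},
}
ranking = final_ranking(scores_by_model)
print(ranking)
```

A plain mean treats both perspectives equally; weighting one model more heavily would be an equally defensible choice.
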

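Analysis Dimensions (Reference)

| Dimension | Focus | Weight |
| --- | --- | --- |
| Liquidity | Units sold in 90 days + time to sell | ★★★★★ |
| Price fit | Whether the total price of a 120㎡ three-bedroom fits the budget | ★★★★★ |
| Parking | Number of parking-space entries in transaction records; mentions of "产权车位" (titled parking space) in listings | ★★★★☆ |
| Metro distance | Nearest station < 1km preferred | ★★★★☆ |
| Building age | 2008+ counts as relatively new; slab buildings (板楼) preferred over towers (塔楼) | ★★★☆☆ |
| Floor area ratio | < 2.0 counts as low density | ★★★☆☆ |
| Property management quality | 绿城 / 仁恒 / 万科 > ordinary | ★★☆☆☆ |
| School district | If children are planned, needs separate verification | ★★★☆☆ |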
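
The star weights can be read as numeric weights in a weighted mean. The table does not say the models use them this way, so the Python sketch below is only an assumed interpretation: each filled star counts as one point, and per-dimension scores (on any common scale) are combined proportionally.

```python
# Star counts copied from the table (one point per filled star).
WEIGHTS = {
    "liquidity": 5,
    "price_fit": 5,
    "parking": 4,
    "metro_distance": 4,
    "building_age": 3,
    "floor_area_ratio": 3,
    "property_management": 2,
    "school_district": 3,
}

def weighted_score(scores):
    """Weighted mean of per-dimension scores; dimensions without a score are skipped."""
    used = {dim: weight for dim, weight in WEIGHTS.items() if dim in scores}
    if not used:
        raise ValueError("no known dimensions in scores")
    return sum(scores[dim] * weight for dim, weight in used.items()) / sum(used.values())

# Invented scores on a 0-10 scale for two dimensions only.
print(weighted_score({"liquidity": 10, "property_management": 3}))
```

Skipping unscored dimensions keeps a missing data point (for example, no school information) from dragging the total toward zero.
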
Usage Guidance
Before installing or running this skill, check the following:

  1. Dependencies: ensure curl, python3 (and optionally jq) and mcporter are installed and that the mcp-chrome plugin is running on 127.0.0.1:12306. The registry metadata omits these requirements, and the scripts will fail without them.
  2. mcporter / PAL endpoints: consensus mode calls mcporter -> pal.chat. Inspect ~/.mcporter/mcporter.json to see which remote model servers will receive your scraped data; only enable consensus if those endpoints are trusted.
  3. Browser privacy: the skill reads the active Chrome tab(s) via the plugin and can therefore access data visible in your logged-in session (including personal pages). Do not run it while sensitive sessions are open, or run it in a dedicated profile/browser instance.
  4. Captcha & headless: the BEIKE_HEADLESS env var controls behavior; headless mode exits on captchas, while interactive mode pauses for manual captcha resolution.
  5. Metadata mismatch: the package declares no required env/binaries but the scripts require several tools; treat that as a sign to manually review the scripts before running.
  6. Least privilege: avoid unattended/autonomous runs that use consensus mode on full scraped dumps; consider running discovery and reviewing the candidate JSONs locally first, then invoking consensus manually if desired.

If you are not comfortable with these data flows, do not install or run the skill until you can audit the mcporter config and run it in an isolated environment.
Capability Analysis
Type: OpenClaw Skill
Name: beike-xiaoqu-research
Version: 2.2.0

The skill bundle is a legitimate tool for scraping and analyzing real estate data from Beike (ke.com). It uses the mcp-chrome plugin for browser automation and the pal-mcp-server (via mcporter) for multi-model AI evaluation of the gathered data. The scripts (fetch_xiaoqu.sh, region_discover.sh, parse_beike.py, and consensus_analyze.py) are well-structured, documented, and their logic is strictly confined to the stated purpose of property research, including robust handling for captchas and parsing errors. No evidence of data exfiltration, malicious execution, or prompt injection was found.
Capability Assessment
Purpose & Capability
The skill's name/description matches its behavior: it scrapes 贝壳 (ke.com) via a local mcp-chrome plugin and optionally sends data to PAL MCP for multi-model evaluation. However the registry metadata claims no required binaries/env but the scripts clearly require curl, python3, (optionally jq), the local mcp-chrome plugin (127.0.0.1:12306) and mcporter (with ~/.mcporter/mcporter.json). That mismatch between declared requirements and actual files is an incoherence that could surprise users.
Instruction Scope
SKILL.md and the included scripts instruct the agent to: connect to a local MCP server, create/switch browser tabs, navigate to ke.com pages, execute JavaScript to read page innerText and CSS-selected fields, save outputs to /tmp, parse and aggregate them, and (optionally) call mcporter/pal.chat. All of those steps are within the stated purpose (web scraping + evaluation). Important: the skill reads whatever is visible in the browser tab (including any logged-in session content, cookies implicitly used by the browser, screenshots), and it includes an explicit protocol for handling captchas (pauses for manual intervention).
Install Mechanism
Instruction-only skill with shipped scripts (no remote download/install spec). Nothing in the package attempts to fetch or execute code from arbitrary remote URLs during install. This is lower risk compared to skills that download archives at install time.
Credentials
The skill does not declare required env vars or binaries in its registry metadata, but the scripts use BEIKE_HEADLESS, expect mcporter configuration (~/.mcporter/mcporter.json), and call local endpoints (http://127.0.0.1:12306). It doesn't ask for unrelated cloud credentials, which is good, but it WILL read browser-page content and then (when consensus mode is enabled) send scraped candidate data to whatever PAL MCP endpoints the user's mcporter is configured to use. That data path to remote LLM providers is important to verify before use.
Persistence & Privilege
always:false and no modification of other skills detected. The scripts create/use a browser tab (they try to create an isolated tab) and write files under /tmp — expected for a scraper. However because the skill can be invoked autonomously by the agent (disable-model-invocation is false by default), it could read pages from the user's live browser session without further prompts unless policy UI prevents that; combined with the ability to forward data to configured PAL servers, this increases blast radius if invoked without review.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install beike-xiaoqu-research
  3. After installation, invoke the skill by name or use /beike-xiaoqu-research
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.2.0
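v2.2 P1 fixes: 1) BEIKE_HEADLESS=1 enables headless runs (exit 4); 2) CSS selector extraction first, with innerText as fallback (css_merge mode); 3) SKILL.md gains an I/O schema contract section (input parameter tables / output JSON / error code enum)
v2.1.0
v2.1 P0 fixes: 1) removed the hard-coded minhang and added the --district parameter; 2) parse failures now fail explicitly (exit 3 + stderr warning); 3) a dedicated isolated tab is created to prevent hijacking; 4) captcha detection loops at most 3 times
v2.0.0
v2: added the region batch discovery mode (region_discover.sh), the region_list parser, and PAL MCP multi-model consensus evaluation (consensus_analyze.py)
v1.0.0
Initial release: three-step flow that scrapes xiaoqu details / active listings / transaction records, with the captcha handling protocol, the parse_beike.py parser, and the fetch_xiaoqu.sh one-shot script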
Metadata
Slug beike-xiaoqu-research
Version 2.2.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 4
Frequently Asked Questions

What is Beike Xiaoqu Research?

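Scrapes Beike (贝壳找房, ke.com) data through the mcp-chrome plugin, in two modes: (1) four-step deep research on a single xiaoqu (residential community): details / active listings / transactions; (2) batch discovery and filtering of xiaoqu by area, followed by a PAL MCP multi-model consensus evaluation of the candidate list. Use cases: looking up Beike xiaoqu information, discovering candidate xiaoqu by area, multi-model evaluation of home-buying options. Requires the user to have installed mcp... It is an AI Agent Skill for Claude Code / OpenClaw, with 139 downloads so far.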

How do I install Beike Xiaoqu Research?

Run "/install beike-xiaoqu-research" in the OpenClaw or Claude Code chat to install it in one step. The scripts additionally expect curl, python3, mcporter, and a running mcp-chrome plugin to be present already.

Is Beike Xiaoqu Research free?

Yes, Beike Xiaoqu Research is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Beike Xiaoqu Research support?

Beike Xiaoqu Research is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Beike Xiaoqu Research?

It is built and maintained by eddiexux (@eddiexux); the current version is v2.2.0.
