xueqiu-collector

Name: xueqiu-collector
Author: zhangjia-ie

By zhangjia-ie · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 125 · Current installs: 0 · Versions: 1

Install in OpenClaw

/install xueqiu-collector

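Description

A Skill for collecting the complete post history of a Xueqiu user. It collects all posts from any Xueqiu user (including full body text, image downloads, and OCR), automatically runs V4 rule-based analysis (post type / investment relevance / sentiment / trading intent / topic tags / quality score), stores the results in a SQLite database, and exports JSON + Markdown backups. Trigger phrases: 采集雪球, 雪球帖子采集, 爬取雪球, 收集雪球, 雪...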

Safe Usage Recommendations

Before installing or running this skill:
- Understand it needs access to a real Edge browser profile (login state) to work reliably. That profile contains all browser cookies and sessions; prefer using a dedicated Edge profile created only for scraping rather than your primary browser profile.
- The skill will run npx/playwright-cli and drive Edge; ensure you trust the machine and review the commands you will run. Playwright may download browser binaries if missing.
- The package writes logs, images and a SQLite DB to local disk (data/ and logs/ under the skill). Review those files for sensitive content and consider where you store/back them up.
- Confirm scraping Xueqiu is permitted under the site's terms and that you have the right to collect the targeted users' posts.
- Note the registry metadata omits the Edge profile/config requirement; this mismatch is likely an oversight but worth verifying with the publisher.
- If you are concerned about exposure, run the skill in a sandboxed VM or create a throwaway Edge profile (logged in only to the specific Xueqiu account) and inspect the code (collect.py/check_env.py/analyze.py) before use. If you need higher assurance, ask the publisher to declare required config paths and explain why full profile access is necessary.

Functional Analysis

Type: OpenClaw Skill
Name: xueqiu-collector
Version: 1.0.0

The skill is a functional Xueqiu scraper that requires high-privilege access to the user's Edge browser profile (including session cookies) to bypass anti-bot measures. While this behavior is aligned with the stated purpose, the script `collect.py` lacks input sanitization for the `author` parameter, which is used to construct file paths, creating a potential path traversal vulnerability during data export. Additionally, the tool relies on executing shell commands via `subprocess` and `npx`, which increases the risk if the AI agent is manipulated into using malicious arguments.
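The two risks named above (an unsanitized `author` value used in file paths, and `subprocess` calls that could receive hostile arguments) can be mitigated along the following lines. This is a minimal sketch, not the skill's actual code: `safe_author_dir`, `build_collect_command`, and the `--author` flag are hypothetical names chosen for illustration, since the real command-line interface of `collect.py` is not shown on this page.

```python
import re
import sys
from pathlib import Path

# Anything other than word characters (including CJK), dots, and hyphens
# is replaced, so path separators never survive.
_UNSAFE = re.compile(r"[^\w.-]+", re.UNICODE)


def safe_author_dir(base_dir: Path, author: str) -> Path:
    """Map an untrusted author string to a directory strictly inside base_dir."""
    cleaned = _UNSAFE.sub("_", author).strip("._")
    if not cleaned:
        raise ValueError(f"unusable author value: {author!r}")
    base = base_dir.resolve()
    target = (base / cleaned).resolve()
    # Defence in depth: reject anything that still escapes the base directory.
    if not target.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"author escapes export directory: {author!r}")
    return target


def build_collect_command(author: str, script: str = "collect.py") -> list[str]:
    """Build an argument list (never a shell string) for a collector run.

    The "--author" flag is an assumption made for this sketch.
    """
    if author.startswith("-"):
        # Prevents the value from being parsed as an extra option.
        raise ValueError(f"author must not start with '-': {author!r}")
    return [sys.executable, script, "--author", author]
```

Passing the returned list to `subprocess.run(cmd, check=True)` without `shell=True` keeps the author value as a single argument, so shell metacharacters in it are not interpreted.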

Capability Assessment

ℹ Purpose & Capability

Name/description claim to scrape Xueqiu posts and run local rule-based analysis; the scripts implement exactly that using playwright-cli, Edge profile, and local SQLite/JSON output. That capability set is coherent with the stated purpose. Minor mismatch: registry metadata lists no required config paths or credentials, but the tool clearly expects an Edge profile (login state) and npx/playwright available.

ℹ Instruction Scope

SKILL.md and scripts instruct running check_env.py, collect.py and analyze.py which will: drive Edge via playwright-cli, save snapshots, download images, run OCR, write logs, and persist data to SQLite/JSON/Markdown. All of this is within the stated scraping/analysis scope. The instructions explicitly require mounting a real Edge profile (to reuse login state), which lets the tool access cookies and other profile data beyond just Xueqiu session—this is functional for bypassing captchas but increases privacy risk.

✓ Install Mechanism

There is no automated install spec — this is an instruction+script bundle. It relies on existing npx/playwright-cli and local Edge; no obscure external downloads or URL-based installers appear in the package. Running npx/playwright may cause local browser installation via Playwright, but that is standard and traceable.

⚠ Credentials

Metadata declares no required env vars or config paths, yet scripts actively probe environment variables and multiple user directories to locate npx and Edge profile, and expect a path to an Edge profile folder (which contains cookies, local storage, etc.). Access to a full browser profile is sensitive and broader than 'just Xueqiu credentials'. The skill will also write logs and a DB under the skill's data/logs directories. The lack of declared required config paths in registry metadata is a notable omission.
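One way to narrow this exposure is to require a single, explicitly declared profile path instead of probing user directories. The sketch below assumes a hypothetical environment variable, `XUEQIU_EDGE_PROFILE`, which the skill does not define; it only illustrates the "declare the config path, then fail loudly if it is missing" approach.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical variable name for this sketch; not part of the published skill.
PROFILE_ENV_VAR = "XUEQIU_EDGE_PROFILE"


def resolve_edge_profile(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the declared Edge profile directory, or raise if it is not usable."""
    env = os.environ if env is None else env
    raw = env.get(PROFILE_ENV_VAR)
    if not raw:
        raise RuntimeError(
            f"{PROFILE_ENV_VAR} is not set; point it at a dedicated Edge profile"
        )
    profile = Path(raw).expanduser()
    if not profile.is_dir():
        raise RuntimeError(f"Edge profile directory does not exist: {profile}")
    return profile
```

Because nothing is discovered implicitly, the user decides exactly which profile (ideally a throwaway one) the scraper can read.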

✓ Persistence & Privilege

The skill does not request 'always: true' or other elevated installation privileges. It stores output (DB/JSON/MD/images) and logs under the project/data and project/logs directories, which is expected for a scraper. It does not modify other skills or system-wide agent settings.

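How to Use

1. Make sure OpenClaw is installed (local or Docker deployment).
2. Enter the install command in the chat box: /install xueqiu-collector
3. Once installed, invoke the Skill by name or trigger it with /xueqiu-collector
4. Provide the inputs described in the Skill's parameter documentation to get structured output.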

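Version History

v1.0.0

xueqiu-collector v1.0.0 initial release
- Collects all posts from any Xueqiu user (including full body text, image downloads, and image OCR)
- Automatically analyzes post type, investment relevance, sentiment, trading intent, topic tags, and quality score according to the V4 rules
- Stores collected results in a SQLite database and supports export to JSON and Markdown (full and per-category)
- Provides standard workflows for full/incremental collection, full-text backfill, and batch analysis
- Built-in measures for coping with anti-scraping defenses (request delays, retries, resumable collection), plus logging and environment checks
- Supports reusing a real logged-in Edge browser session to avoid captchas
- Ships with detailed parameter documentation, path configuration, output structure, and notes on common pitfalls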
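The request delay, retry, and resumable-collection behavior listed above can be pictured roughly as follows. This is an illustrative sketch under stated assumptions, not the skill's implementation: `fetch_with_retry`, `collect`, the JSON checkpoint file, and the delay values are all hypothetical, and `fetch` stands in for whatever function actually loads a post.

```python
import json
import random
import time
from pathlib import Path
from typing import Callable, Dict, Iterable


def fetch_with_retry(fetch: Callable[[str], str], post_id: str,
                     retries: int = 3, base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch(post_id), retrying with exponential backoff plus jitter."""
    for attempt in range(retries):
        try:
            return fetch(post_id)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
    raise ValueError("retries must be at least 1")


def collect(post_ids: Iterable[str], fetch: Callable[[str], str],
            checkpoint: Path, delay: float = 2.0,
            sleep: Callable[[float], None] = time.sleep) -> Dict[str, str]:
    """Fetch each post once, skipping IDs already recorded in the checkpoint."""
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    results: Dict[str, str] = {}
    for post_id in post_ids:
        if post_id in done:
            continue  # already collected in an earlier run
        results[post_id] = fetch_with_retry(fetch, post_id, sleep=sleep)
        done.add(post_id)
        # Persist progress after every post so an interrupted run can resume.
        checkpoint.write_text(json.dumps(sorted(done)))
        sleep(delay)  # fixed pause between requests
    return results
```

Injecting `sleep` as a parameter keeps the pacing logic testable without real waiting; a second call with the same checkpoint file only fetches IDs that were not completed before.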

Metadata

Slug xueqiu-collector

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

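FAQ

What is xueqiu-collector?

A Skill for collecting the complete post history of a Xueqiu user. It collects all posts from any Xueqiu user (including full body text, image downloads, and OCR), automatically runs V4 rule-based analysis (post type / investment relevance / sentiment / trading intent / topic tags / quality score), stores the results in a SQLite database, and exports JSON + Markdown backups. Trigger phrases: 采集雪球, 雪球帖子采集, 爬取雪球, 收集雪球, 雪... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 125 times in total.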

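How do I install xueqiu-collector?

Run the command "/install xueqiu-collector" in the OpenClaw or Claude Code chat box to install it in one step. The install command itself needs no extra configuration, but, as the capability assessment notes, running the skill requires npx/playwright-cli and a path to a logged-in Edge profile.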

Is xueqiu-collector free?

Yes. xueqiu-collector is completely free and released under the MIT-0 license; you can download, install, and use it freely.

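Which platforms does xueqiu-collector support?

xueqiu-collector is tagged cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed, provided a local Edge browser and npx/playwright-cli are available.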

Who developed xueqiu-collector?

It is developed and maintained by zhangjia-ie (@zhangjia-ie). The current version is v1.0.0.