
xueqiu-collector

Name: xueqiu-collector
Author: zhangjia-ie

by zhangjia-ie · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

125 Downloads

Install in OpenClaw

/install xueqiu-collector

Description

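A skill for full collection of Xueqiu posts. It collects all posts of any Xueqiu user (including full body text, image downloads, and OCR), automatically runs V4 rule-based analysis (post type / investment relevance / sentiment / trading intent / topic tags / quality score), stores the results in a SQLite database, and exports JSON + Markdown backups. Trigger phrases: 采集雪球 (collect Xueqiu), 雪球帖子采集 (Xueqiu post collection), 爬取雪球 (crawl Xueqiu), 收集雪球 (gather Xueqiu), 雪...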

Usage Guidance

Before installing or running this skill:
- Understand that it needs access to a real Edge browser profile (login state) to work reliably. That profile contains all browser cookies and sessions, so prefer a dedicated Edge profile created only for scraping over your primary browser profile.
- The skill runs npx/playwright-cli and drives Edge; make sure you trust the machine and review the commands before you run them. Playwright may download browser binaries if they are missing.
- The package writes logs, images, and a SQLite DB to local disk (data/ and logs/ under the skill). Review those files for sensitive content and consider where you store and back them up.
- Confirm that scraping Xueqiu is permitted under the site's terms and that you have the right to collect the targeted users' posts.
- Note that the registry metadata omits the Edge profile/config requirement. This mismatch is likely an oversight but worth verifying with the publisher.
- If you are concerned about exposure, run the skill in a sandboxed VM or create a throwaway Edge profile (logged in only to the specific Xueqiu account), and inspect the code (collect.py, check_env.py, analyze.py) before use. If you need higher assurance, ask the publisher to declare the required config paths and explain why full profile access is necessary.

Capability Analysis

Type: OpenClaw Skill
Name: xueqiu-collector
Version: 1.0.0

The skill is a functional Xueqiu scraper that requires high-privilege access to the user's Edge browser profile (including session cookies) to bypass anti-bot measures. While this behavior is aligned with the stated purpose, the script `collect.py` lacks input sanitization for the `author` parameter, which is used to construct file paths, creating a potential path traversal vulnerability during data export. Additionally, the tool relies on executing shell commands via `subprocess` and `npx`, which increases the risk if the AI agent is manipulated into using malicious arguments.
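The path traversal concern can be illustrated with a minimal sketch of how an `author` value might be sanitized before it is used in an export path. This is not the code in `collect.py`; the function name `safe_author_dir`, the allow-list pattern, and the `data` base directory are assumptions made for illustration only.

```python
import re
from pathlib import Path

# Hypothetical allow-list: keep word characters (including CJK) and hyphens,
# replace everything else (dots, slashes, backslashes, spaces) with "_".
_UNSAFE = re.compile(r"[^\w\-]")


def safe_author_dir(base_dir: Path, author: str) -> Path:
    """Return an export directory for `author` that stays inside `base_dir`."""
    cleaned = _UNSAFE.sub("_", author).strip("_")
    if not cleaned:
        raise ValueError("author is empty after sanitization")
    base = base_dir.resolve()
    target = (base / cleaned).resolve()
    # Defense in depth: refuse any path that is not a descendant of base.
    if base not in target.parents:
        raise ValueError(f"refusing to write outside {base}")
    return target


if __name__ == "__main__":
    # "../../etc/passwd" cannot escape the base directory after cleaning.
    print(safe_author_dir(Path("data"), "../../etc/passwd"))
```

Replacing dots and separators removes `..` segments before the path is built, and the final containment check catches anything the pattern misses.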

Capability Assessment

ℹ Purpose & Capability

Name/description claim to scrape Xueqiu posts and run local rule-based analysis; the scripts implement exactly that using playwright-cli, Edge profile, and local SQLite/JSON output. That capability set is coherent with the stated purpose. Minor mismatch: registry metadata lists no required config paths or credentials, but the tool clearly expects an Edge profile (login state) and npx/playwright available.

ℹ Instruction Scope

SKILL.md and the scripts instruct the user to run check_env.py, collect.py, and analyze.py, which drive Edge via playwright-cli, save snapshots, download images, run OCR, write logs, and persist data to SQLite/JSON/Markdown. All of this is within the stated scraping/analysis scope. The instructions explicitly require mounting a real Edge profile (to reuse login state), which lets the tool access cookies and other profile data beyond the Xueqiu session itself. This helps bypass captchas but increases privacy risk.
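Because the scripts drive external tools through `subprocess`, one common way to limit the impact of manipulated arguments is to pass an argument list and never a shell string. The sketch below shows that pattern in general terms; `build_argv` and `run_tool` are hypothetical helpers, not functions from this skill, and no playwright-cli flags are assumed.

```python
import subprocess
from typing import List, Sequence


def build_argv(executable: str, args: Sequence[str]) -> List[str]:
    """Build an argument vector; every element reaches the program verbatim."""
    argv = [executable]
    for arg in args:
        if not isinstance(arg, str):
            raise TypeError("arguments must be strings")
        if "\x00" in arg or "\n" in arg:
            raise ValueError("control characters are not allowed in arguments")
        argv.append(arg)
    return argv


def run_tool(argv: Sequence[str], timeout: float = 120.0) -> subprocess.CompletedProcess:
    # shell=False is the default: no shell parses the arguments, so
    # metacharacters such as ";" or "&&" are not interpreted as commands.
    return subprocess.run(
        list(argv), check=True, capture_output=True, text=True, timeout=timeout
    )
```

With a list and no shell, a value such as `"a; rm -rf /"` is delivered to the tool as one literal argument. The tool itself still has to treat that argument safely.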

✓ Install Mechanism

There is no automated install spec — this is an instruction+script bundle. It relies on existing npx/playwright-cli and local Edge; no obscure external downloads or URL-based installers appear in the package. Running npx/playwright may cause local browser installation via Playwright, but that is standard and traceable.

⚠ Credentials

Metadata declares no required env vars or config paths, yet scripts actively probe environment variables and multiple user directories to locate npx and Edge profile, and expect a path to an Edge profile folder (which contains cookies, local storage, etc.). Access to a full browser profile is sensitive and broader than 'just Xueqiu credentials'. The skill will also write logs and a DB under the skill's data/logs directories. The lack of declared required config paths in registry metadata is a notable omission.
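One alternative to probing multiple user directories is to require an explicitly declared profile path. The following sketch assumes a hypothetical environment variable named `XUEQIU_EDGE_PROFILE`; neither that name nor these helper functions come from the skill, and the actual discovery logic in `check_env.py` may differ.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical variable name, used here only for illustration.
PROFILE_ENV_VAR = "XUEQIU_EDGE_PROFILE"


def resolve_profile_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Return the explicitly configured Edge profile directory, or fail loudly."""
    raw = env.get(PROFILE_ENV_VAR, "").strip()
    if not raw:
        raise RuntimeError(f"{PROFILE_ENV_VAR} is not set; refusing to guess a profile")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"{path} is not an existing directory")
    return path


def resolve_npx() -> Optional[str]:
    """Look up npx on PATH only, instead of scanning user directories."""
    return shutil.which("npx")
```

Failing when the variable is unset keeps the tool from silently falling back to the user's primary profile, and it gives the registry metadata a single config path to declare.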

✓ Persistence & Privilege

The skill does not request 'always: true' or other elevated installation privileges. It stores output (DB/JSON/MD/images) and logs under the project/data and project/logs directories, which is expected for a scraper. It does not modify other skills or system-wide agent settings.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install xueqiu-collector
After installation, invoke the skill by name or use /xueqiu-collector
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

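xueqiu-collector v1.0.0, initial release
- Collects all posts of any Xueqiu user (including full body text, image downloads, and image OCR)
- Automatically analyzes post type, investment relevance, sentiment, trading intent, topic tags, and quality score according to the V4 rules
- Stores collected results in a SQLite database and supports export to JSON and Markdown (full and per-category)
- Provides standard workflows for full/incremental collection, text backfill, and batch analysis
- Built-in measures for coping with anti-scraping defenses (request delays, retries, resumable collection), plus logging and environment checks
- Supports reusing a real user's logged-in Edge browser session to avoid captchas
- Ships with detailed parameter descriptions, path configuration, output structure, and notes on common pitfalls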
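The request delays, retries, and resumable collection listed above follow a common pattern, sketched below under stated assumptions. The function names, the JSON checkpoint format, and the delay values are all hypothetical and are not taken from `collect.py`.

```python
import json
import random
import time
from pathlib import Path
from typing import Callable, Iterable, Set, TypeVar

T = TypeVar("T")


def with_retries(fetch: Callable[[], T], attempts: int = 3,
                 base_delay: float = 2.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fetch(), retrying with exponential backoff plus jitter on failure."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 1))
    raise ValueError("attempts must be >= 1")


def load_done(checkpoint: Path) -> Set[str]:
    """Read the set of already collected post IDs, if a checkpoint exists."""
    if checkpoint.exists():
        return set(json.loads(checkpoint.read_text(encoding="utf-8")))
    return set()


def collect(post_ids: Iterable[str], fetch_post: Callable[[str], object],
            checkpoint: Path, delay: float = 1.0,
            sleep: Callable[[float], None] = time.sleep) -> Set[str]:
    """Fetch each post once, skipping IDs recorded by an earlier run."""
    done = load_done(checkpoint)
    for post_id in post_ids:
        if post_id in done:
            continue  # resume: this post was collected previously
        with_retries(lambda: fetch_post(post_id), sleep=sleep)
        done.add(post_id)
        checkpoint.write_text(json.dumps(sorted(done)), encoding="utf-8")
        sleep(delay)  # fixed pause between requests
    return done
```

Writing the checkpoint after every post means an interrupted run loses at most the post in flight, at the cost of one small file write per request.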

Metadata

Slug xueqiu-collector

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is xueqiu-collector?

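A skill for full collection of Xueqiu posts. It collects all posts of any Xueqiu user (including full body text, image downloads, and OCR), automatically runs V4 rule-based analysis (post type / investment relevance / sentiment / trading intent / topic tags / quality score), stores the results in a SQLite database, and exports JSON + Markdown backups. Trigger phrases: 采集雪球 (collect Xueqiu), 雪球帖子采集 (Xueqiu post collection), 爬取雪球 (crawl Xueqiu), 收集雪球 (gather Xueqiu), 雪... It is an AI Agent Skill for Claude Code / OpenClaw, with 125 downloads so far.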

How do I install xueqiu-collector?

Run "/install xueqiu-collector" in the OpenClaw or Claude Code chat to install it in one step. Running the skill additionally requires npx/playwright-cli and a logged-in Edge profile, as described under Usage Guidance.

Is xueqiu-collector free?

Yes, xueqiu-collector is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does xueqiu-collector support?

xueqiu-collector is listed as cross-platform and runs wherever OpenClaw / Claude Code is available. It drives a local Edge browser, so Edge must be available on the machine.

Who created xueqiu-collector?

It is built and maintained by zhangjia-ie (@zhangjia-ie); the current version is v1.0.0.
