Description

Collect recent high-interaction Xiaohongshu notes and cleaned comments using demand-style keywords for small-scale user need discovery and analysis.

Usage (SKILL.md)

Xiaohongshu Skill

Name: Xiaohongshu Demand Discovery
Author: zev55555

This is a Python, Playwright-based OpenClaw skill for Xiaohongshu/Rednote public-content workflows. It reads page data from Xiaohongshu web pages, mainly through window.__INITIAL_STATE__, and returns structured JSON.

It includes the original capabilities:

QR-code login and login status checking
Keyword search with sort/type/time filters
Note detail extraction
Comment loading for note detail pages
User profile extraction
Explore feed extraction
Optional publishing and interaction commands retained from the upstream skill

It also adds:

Xiaohongshu Demand Discovery Collector

The demand discovery mode searches demand-style keywords, collects recent high-interaction notes and comments, cleans the data, removes raw user identity fields, and writes structured files for later LLM demand analysis or product-manager agent workflows.

What This Skill Is

This skill is a Xiaohongshu content and comment collection tool. The demand discovery collector can:

Search recent notes using demand-oriented keywords such as 求推荐, 避雷, 平替, 真实测评, 后悔买, 踩坑
Filter search results to notes from the last week
Sort by most comments by default
Visit note detail pages
Load comments
Prefer notes published within the recent --days window
Preserve notes whose publish time cannot be parsed, marking them as publish_time_unknown
Clean comments by removing empty, duplicate, short-noise, and obvious advertising comments
Save structured output files for downstream analysis
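
The comment-cleaning step in the list above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function name, the minimum-length threshold, and the advertising patterns are all assumptions chosen for the example.

```python
import re

# Hypothetical threshold and ad markers; the skill's real rules are not documented here.
MIN_LENGTH = 2
AD_PATTERNS = [r"加微信", r"私信我", r"https?://"]


def clean_comments(raw_comments):
    """Drop empty, short-noise, ad-like, and duplicate comments, keeping first-seen order."""
    seen = set()
    cleaned = []
    for text in raw_comments:
        # Collapse whitespace so that trivially different copies compare equal.
        normalized = re.sub(r"\s+", " ", text or "").strip()
        if not normalized:
            continue  # empty comment
        if len(normalized) < MIN_LENGTH:
            continue  # short noise
        if any(re.search(pattern, normalized, re.IGNORECASE) for pattern in AD_PATTERNS):
            continue  # obvious advertising
        if normalized in seen:
            continue  # duplicate
        seen.add(normalized)
        cleaned.append(normalized)
    return cleaned
```

Keeping the rules as plain data (a threshold and a pattern list) makes it easy to review what is being filtered out before trusting downstream analysis.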

What This Skill Is Not

This skill is not:

A tool for bypassing Xiaohongshu login, captcha, rate limits, or risk controls
An automatic like/comment/follow/publish bot for demand discovery
An LLM demand analysis tool
A product-manager agent
A complete SaaS product
An unrestricted high-volume crawler

Use it for learning, research, internal product validation, and small-scale public-content collection.

When OpenClaw Should Use It

Prefer demand-discovery when the user says things like:

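“帮我跑一次小红书需求发现” (Run a Xiaohongshu demand discovery pass for me)
“抓小红书最近几天高互动笔记和评论” (Grab high-interaction Xiaohongshu notes and comments from the last few days)
“用小红书评论区挖用户需求” (Mine user needs from Xiaohongshu comment sections)
“分析小红书用户在求推荐、避雷、平替里的需求” (Analyze what Xiaohongshu users need in 求推荐, 避雷, and 平替 posts)
“采集小红书热门笔记评论，后面喂给 LLM 分析” (Collect comments on popular Xiaohongshu notes to feed to an LLM for analysis later)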

Use search when the user only wants to search a Xiaohongshu keyword:

python -m scripts search "关键词"

Use feed when the user provides a concrete note id/link context and wants note detail or comments:

python -m scripts feed <feed_id> <xsec_token> --load-comments --max-comments=20

Use qrcode or check-login when login state is unknown or expired.

Do not use interaction or publishing commands for demand discovery. The collector itself does not call comment.py, interact.py, or publish.py.

Installation

Run all commands from the skill root directory.

pip install -r requirements.txt
playwright install chromium

On Linux/WSL, Chromium dependencies may also be required:

playwright install-deps chromium

Login

First, log in by QR code:

python -m scripts qrcode --headless=false

Check login status:

python -m scripts check-login

If cookies expire, run the QR-code login again.

Command Reference

Search

python -m scripts search "美食" --sort-by=最新 --note-type=图文 --publish-time=一周内 --limit=10

Common search options:

--sort-by: 综合, 最新, 最多点赞, 最多评论, 最多收藏
--note-type: 不限, 视频, 图文
--publish-time: 不限, 一天内, 一周内, 半年内
--search-scope: 不限, 已看过, 未看过, 已关注
--location: 不限, 同城, 附近
--limit: returned result limit
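
When the search command is driven from another script, it can help to validate option values against the lists above before launching it. The sketch below only builds an argument list; the helper name `build_search_argv` is hypothetical and is not part of the skill.

```python
import sys

# Allowed values copied from the option list above.
ALLOWED = {
    "sort-by": ["综合", "最新", "最多点赞", "最多评论", "最多收藏"],
    "note-type": ["不限", "视频", "图文"],
    "publish-time": ["不限", "一天内", "一周内", "半年内"],
    "search-scope": ["不限", "已看过", "未看过", "已关注"],
    "location": ["不限", "同城", "附近"],
}


def build_search_argv(keyword, limit=None, **filters):
    """Build the argv for `python -m scripts search`, rejecting unknown options or values."""
    argv = [sys.executable, "-m", "scripts", "search", keyword]
    for name, value in filters.items():
        flag = name.replace("_", "-")  # sort_by -> sort-by
        if flag not in ALLOWED:
            raise ValueError(f"unknown option: --{flag}")
        if value not in ALLOWED[flag]:
            raise ValueError(f"invalid value for --{flag}: {value}")
        argv.append(f"--{flag}={value}")
    if limit is not None:
        argv.append(f"--limit={int(limit)}")
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from the skill root directory.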

Feed Detail

python -m scripts feed <feed_id> <xsec_token>
python -m scripts feed <feed_id> <xsec_token> --load-comments --max-comments=20

Explore Feed

python -m scripts explore --limit=20

The explore feed exists, but demand discovery should prefer keyword search in the first version.

User Profile

python -m scripts user <user_id> [xsec_token]
python -m scripts me

Demand Discovery Collector

Basic command:

python -m scripts demand-discovery

Small-scale test:

python -m scripts demand-discovery --keywords "求推荐" --posts-per-keyword 1 --search-limit 3 --max-comments 5 --headless=false

Specify multiple keywords:

python -m scripts demand-discovery --keywords "求推荐,避雷,平替" --posts-per-keyword 2 --search-limit 5 --max-comments 10 --headless=false

Use a keyword file:

python -m scripts demand-discovery --keywords-file keywords.txt
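
A keywords file is UTF-8 text with one keyword per line, while --keywords takes a comma-separated string. A minimal sketch of how the two inputs might be normalized into one list follows. Merging both sources, skipping blank entries, and dropping duplicates are assumptions made for illustration, and `parse_keywords` is a hypothetical helper rather than the skill's own function.

```python
from pathlib import Path


def parse_keywords(keywords=None, keywords_file=None):
    """Combine comma-separated keywords and a one-per-line UTF-8 file, preserving order."""
    items = []
    if keywords:
        items.extend(keywords.split(","))
    if keywords_file:
        items.extend(Path(keywords_file).read_text(encoding="utf-8").splitlines())

    result = []
    for item in items:
        keyword = item.strip()
        # Skip blanks and repeated keywords (assumed behaviour).
        if keyword and keyword not in result:
            result.append(keyword)
    return result
```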

Important parameters:

--keywords: comma-separated keywords
--keywords-file: UTF-8 text file, one keyword per line
--days: recent-day window for note filtering, default 3
--search-publish-time: Xiaohongshu search time filter, default 一周内
--sort-by: default 最多评论, also supports 最多点赞
--note-type: default 不限
--posts-per-keyword: notes saved per keyword, default 3
--search-limit: search results inspected per keyword, default 8
--max-comments: valid comments saved per note, default 20
--output-dir: output directory; default data/demand_discovery/<timestamp>/
--timezone: default Asia/Shanghai
--headless: true or false
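
The interaction between --days and unparseable publish times can be sketched as below. Several details are assumptions for illustration: that the raw publish time arrives as epoch milliseconds, that the window is a rolling one measured back from the current time, and that a fixed UTC+8 offset is an acceptable stand-in for the default Asia/Shanghai timezone. The function name and the "outside_window" label are hypothetical; only publish_time_unknown comes from the skill's documentation.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset used as a stand-in for the default Asia/Shanghai timezone.
SHANGHAI = timezone(timedelta(hours=8))


def classify_publish_time(publish_ms, now, days=3):
    """Return 'recent', 'outside_window', or 'publish_time_unknown' for a raw timestamp."""
    try:
        # Assumption: the raw value is epoch milliseconds (int or numeric string).
        published = datetime.fromtimestamp(int(publish_ms) / 1000, tz=SHANGHAI)
    except (TypeError, ValueError, OverflowError, OSError):
        # Unparseable times are kept and labelled rather than dropped.
        return "publish_time_unknown"
    cutoff = now - timedelta(days=days)
    return "recent" if published >= cutoff else "outside_window"
```

Here `now` must be a timezone-aware datetime, for example `datetime.now(SHANGHAI)`.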

Default demand keywords:

求推荐
避雷
平替
真实测评
后悔买
踩坑
好用吗
怎么选
值不值得买
学生党
新手必备
替代品
不好用
怎么解决

Output files:

notes_clean.jsonl: one note-level record per saved/attempted note
comments_clean.jsonl: cleaned comment-level records
collection_summary.json: machine-readable summary and counters
collector_report.md: human-readable report
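
For downstream analysis, the two .jsonl files can be read line by line. The sketch below is a generic JSON Lines reader plus a grouping helper. The record field used for grouping (for example a note identifier) is not specified in this document, so the key is left as a caller-supplied parameter, and both helper names are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path


def read_jsonl(path):
    """Yield one dict per non-empty line of a UTF-8 JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def group_by(records, key):
    """Group records by the value of `key`; records missing the key go under None."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get(key)].append(record)
    return dict(groups)
```

Inspect one record from comments_clean.jsonl first to confirm which field links a comment to its note before grouping on it.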

The collector uses one browser session per run:

XiaohongshuClient.start()
LoginAction.check_login_status()
Reused SearchAction
Reused FeedDetailAction
XiaohongshuClient.close()

Privacy And Data Safety

Demand discovery output must not save raw Xiaohongshu usernames, nicknames, avatars, or profile links. It writes author_hash instead:

sha256("xiaohongshu:" + raw_author_id)

The preferred raw author id is user_id. If unavailable, the collector may hash another available author field such as nickname or profile link, then discard the original value.
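
The hashing rule above can be sketched in a few lines. UTF-8 encoding and a hexadecimal digest are assumptions, since the document gives only the formula. The dictionary keys in `pick_raw_author_id` are hypothetical names for the fallback fields, not the skill's actual schema.

```python
import hashlib


def pick_raw_author_id(author):
    """Prefer user_id, then fall back to other available fields (hypothetical key names)."""
    for key in ("user_id", "nickname", "profile_link"):
        value = author.get(key)
        if value:
            return str(value)
    return None


def author_hash(raw_author_id):
    """Compute sha256("xiaohongshu:" + raw_author_id) as a hex string."""
    payload = ("xiaohongshu:" + raw_author_id).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Only the hash should be written to the output record; the raw value is discarded after hashing. Note that an unsalted hash of a short identifier supports deduplication but is not strong anonymization.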

Safety And Compliance Boundaries

Collect only publicly accessible content.
Do not bypass login, captcha, rate limits, or platform risk controls.
If captcha is triggered, stop and ask the user to handle it manually.
If login/cookie is invalid, stop and ask the user to log in again.
Do not save raw usernames, nicknames, avatars, or profile links.
Use author_hash only for deduplication and structured analysis.
Do not call comment.py, interact.py, or publish.py for demand discovery.
Do not run large high-frequency collections.
Keep first-version usage small, conservative, and reviewable.

Troubleshooting

Not logged in: run python -m scripts qrcode --headless=false
Cookie expired: log in again by QR code
Captcha triggered: stop collection, wait, and handle verification manually in visible browser mode
Empty comments output: reduce batch size, test with --headless=false, and confirm comments load on the note page
Too many publish_time_unknown: Xiaohongshu may have changed detail fields; inspect raw note detail structure before relying on recent-day filtering
Detail failures from search results: demand-discovery uses pc_search as xsec_source; if this becomes unstable, test the original feed command behavior with pc_feed

Safe Usage Advice

Install only if you trust this publisher with your Xiaohongshu session and are comfortable with a skill that can both collect public content and perform live account actions. Prefer using only demand-discovery/search/feed commands, avoid comment/interact/publish commands unless you explicitly intend them, protect or delete ~/.xiaohongshu/cookies.json when done, and pin/update dependencies before serious use.
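
One way to protect the cookie file on Linux or macOS is to restrict it to owner read/write after logging in. The sketch below is an optional hardening step and is not something the skill is documented to do itself; the helper name is hypothetical. On Windows, `os.chmod` only toggles the read-only flag, so this has little effect there.

```python
import os
import stat
from pathlib import Path

# Cookie location as stated in this document.
COOKIE_PATH = Path.home() / ".xiaohongshu" / "cookies.json"


def harden_cookie_file(path=COOKIE_PATH):
    """Restrict the cookie file to owner read/write (0o600); return False if it is missing."""
    path = Path(path)
    if not path.exists():
        return False
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return True
```

To remove the session entirely when finished, `COOKIE_PATH.unlink(missing_ok=True)` deletes the file.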

Capability Assessment

⚠ Purpose & Capability

The demand-discovery collector is coherent and documented, but the package also exposes comment, reply, like, collect, and publish commands for a live Xiaohongshu account, which is broader than the main research/data-collection purpose.

⚠ Instruction Scope

Docs tell agents not to use publishing or interaction commands for demand discovery, but those commands remain invokable and some perform live actions without a final interactive confirmation.

ℹ Install Mechanism

Installation uses normal Python requirements and Playwright Chromium setup; dependencies are unpinned and include scanner-reported vulnerable version ranges.

ℹ Credentials

Playwright browser automation and local output files are expected for Xiaohongshu collection, but the default run persists collected notes/comments, source URLs, and author hashes under the skill directory.

⚠ Persistence & Privilege

The client saves Xiaohongshu browser cookies, including session material, to ~/.xiaohongshu/cookies.json without an explicit permission-hardening step, and live account actions reuse that session.

Version History

v0.1.0

Xiaohongshu Demand Discovery Collector initial release.

Adds a demand discovery mode for collecting recent high-interaction Xiaohongshu notes and comments using demand-related keywords, optimized for downstream LLM analysis.
Provides QR-code login, keyword search with filters, note detail, comment loading, user profile, and explore feed extraction.
Demand discovery cleans and deduplicates comments, removes user identity fields, and outputs structured files.
Designed for small-scale research or product validation use; does not bypass login, captchas, or risk controls.
Emphasizes privacy (no raw usernames/avatars saved) and conservative, reviewable collection.
Offers clear CLI usage, parameters, and troubleshooting instructions.

Metadata

Slug xiaohongshu-demand-discovery-skill

Version 0.1.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is Xiaohongshu Demand Discovery?

Collect recent high-interaction Xiaohongshu notes and cleaned comments using demand-style keywords for small-scale user need discovery and analysis. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 42 cumulative downloads to date.

How do I install Xiaohongshu Demand Discovery?

Run the command "/install xiaohongshu-demand-discovery-skill" in the OpenClaw or Claude Code chat box to install it in one step, with no extra configuration.

Is Xiaohongshu Demand Discovery free?

Yes. Xiaohongshu Demand Discovery is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does Xiaohongshu Demand Discovery support?

Xiaohongshu Demand Discovery is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Xiaohongshu Demand Discovery?

It is developed and maintained by Zev (@zev55555); the current version is v0.1.0.
