公众号作者文章抓取 (WeChat Official Account Author Article Fetcher)

Name: 公众号作者文章抓取
Author: dazhuangjammy

Description

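Uses a local WeChat Official Account scraper to identify and batch-fetch the historical articles of a given Official Account author, and outputs Markdown, HTML, and articles.json for building an author corpus, style analysis, imitation-writing templates, fact-checking, and content archiving. Whenever the user mentions “fetch the articles of some Official Account”, “download the latest 20/50/100 Official Account articles”, “I'll give you one link, and you keep going and pull down all of this account's articles...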

Usage Guidance

This skill bundles a local scraper and will run code on your machine: the launcher creates a Python virtualenv and pip-installs dependencies (including Playwright), then drives a headless or visible browser to log into your WeChat Official Account by QR code scan and downloads articles to a local folder. Before installing or running: (1) inspect the included files (scripts/main.py, run_fetcher.sh, config.json) and confirm you trust them; (2) be prepared to scan a QR code from a trusted UI; do not scan QR images from untrusted sources or send your login artifacts to others; (3) run this on a machine and account you control (avoid shared machines), because the tool stores login caches (.playwright-profile, login_artifacts); (4) be aware that pip install fetches packages from PyPI and that Playwright may download browser runtimes; (5) note that SKILL.md triggers automatically on casual phrasing; to avoid accidental runs, require explicit confirmation before executing the fetch workflow. If you need higher assurance, run the tool inside an isolated VM or container and/or review the full main.py for any data-exfiltration code (we reviewed the bundle and found behavior consistent with a local scraper).
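One way to approach the file inspection suggested in step (1) is a quick pattern scan before reading the code in full. The sketch below is only a starting point: the pattern list is an illustrative assumption, not an exhaustive audit, and a clean result does not prove the bundle is safe.

```python
import re
from pathlib import Path

# Illustrative patterns only (an assumption, not a complete audit list).
SUSPECT_PATTERNS = {
    "outbound POST": re.compile(r"\b(?:requests|httpx)\.post\s*\("),
    "raw socket": re.compile(r"\bsocket\.socket\s*\("),
    "shell execution": re.compile(r"\bos\.system\s*\(|shell\s*=\s*True"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, label) for every line matching a suspect pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


def scan_bundle(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py and .sh file under root; report only files with hits."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in {".py", ".sh"}:
            hits = scan_text(path.read_text(encoding="utf-8", errors="replace"))
            if hits:
                report[str(path)] = hits
    return report
```

Calling `scan_bundle` on the unpacked skill directory before the first run lists the lines worth reading first; any hit still needs manual review to decide whether it is benign.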

Capability Analysis

Type: OpenClaw Skill

Name: wechat-articles-crawler

Version: 0.1.0

The skill bundle is a legitimate tool designed to automate the fetching and archiving of WeChat Official Account articles into Markdown and HTML formats. It uses Playwright for browser automation to handle WeChat's login requirements (via QR code) and session management, with all sensitive data (cookies and profiles) stored strictly in local directories (`.playwright-profile` and `login_artifacts`). The code in `main.py` includes explicit privacy safeguards, such as a `clear-login` command to wipe session data and instructions in `SKILL.md` advising the agent not to exfiltrate or display tokens. System calls via `subprocess` are limited to standard UI interactions like macOS `osascript` for folder selection and `qlmanage` for displaying QR codes.

Capability Assessment

✓ Purpose & Capability

Name/description match the provided files and behavior. The skill includes a local Python fetcher, a shell launcher, and configuration; requiring no cloud credentials or unrelated binaries is consistent with a local Playwright-based scraper.

ℹ Instruction Scope

SKILL.md instructs the agent to run the bundled scripts, manage config.json, ensure login via QR code, poll login-status, and send the generated QR image to the user when appropriate; all of this is coherent with the fetcher's purpose. Two points to note: (1) the skill is written to trigger on casual user phrasing (e.g., "把这个号最近的文章弄下来", roughly "grab this account's recent articles"), which could cause it to activate more often than a user expects; (2) the workflow involves exposing generated QR image/text files to an agent or user for scanning, which is functionally necessary but should be handled carefully to avoid unintended disclosure.
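To reduce accidental activation from casual phrasing, an agent wrapper could require an explicit confirmation before starting a fetch. The sketch below is one possible gate; the function names, the accepted replies, and the shape of the returned plan are all assumptions made for illustration and are not part of the skill.

```python
# Replies accepted as explicit consent (an illustrative assumption).
AFFIRMATIVE_REPLIES = {"yes", "y", "confirm", "ok", "是", "好", "确认"}


def is_confirmed(reply: str) -> bool:
    """Treat only an explicit, unambiguous reply as consent to run the fetch."""
    return reply.strip().lower() in AFFIRMATIVE_REPLIES


def plan_fetch(url: str, count: int, reply: str) -> dict:
    """Return a run plan only when the user has explicitly confirmed."""
    if not is_confirmed(reply):
        return {"run": False, "reason": "awaiting explicit confirmation"}
    return {"run": True, "url": url, "count": count}
```

Anything other than an exact affirmative, including hedged answers such as "maybe later", leaves the workflow idle, which errs on the side of not logging in or downloading without consent.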

ℹ Install Mechanism

There is no external install spec in the registry metadata, but the included run_fetcher.sh will create a Python virtualenv and pip-install requirements (Playwright, httpx, BeautifulSoup, etc.). Installing packages from PyPI and using Playwright is expected for this tool; Playwright may require separate browser runtime installation (the README documents this). This is moderate-risk but proportional to the stated purpose and uses standard package sources (requirements.txt).
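For readers who want to see what such a bootstrap typically involves before letting the launcher run, here is a sketch of the equivalent commands, built but not executed. It is not a transcription of run_fetcher.sh: the `.venv` location inside the project, the POSIX `bin/python` layout, and the choice of Chromium as the browser runtime are assumptions.

```python
import sys
from pathlib import Path


def bootstrap_commands(project_dir: str, base_python: str = sys.executable) -> list[list[str]]:
    """Build (without running) the commands a venv-based launcher would issue."""
    project = Path(project_dir)
    venv_dir = project / ".venv"
    # POSIX layout (assumption); on Windows the interpreter is Scripts\python.exe.
    venv_python = str(venv_dir / "bin" / "python")
    return [
        # 1. Create the isolated environment.
        [base_python, "-m", "venv", str(venv_dir)],
        # 2. Install the pinned dependencies from PyPI.
        [venv_python, "-m", "pip", "install", "-r", str(project / "requirements.txt")],
        # 3. Download a browser runtime for Playwright (Chromium assumed here).
        [venv_python, "-m", "playwright", "install", "chromium"],
    ]
```

Printing the returned list makes the network-touching steps (2 and 3) visible up front; each command could then be passed to `subprocess.run(cmd, check=True)` once reviewed.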

✓ Credentials

The skill does not request unrelated environment variables or secrets. It optionally respects environment overrides for profile and artifact dirs, and it stores login artifacts locally (.playwright-profile, login_artifacts). The tool returns local paths (profile_dir, login_artifacts_dir) in its JSON output — these are not credentials by themselves but could help locate sensitive files, so agents should avoid exposing token/cookie contents; the SKILL.md explicitly warns not to reveal cookie/token contents.
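An agent that relays the tool's JSON output could scrub it first. The sketch below masks values under sensitive-looking keys and shortens absolute paths under the home directory. Only `profile_dir` and `login_artifacts_dir` are named in the assessment above; the key hints, the `[redacted]` marker, and the sample structure in the usage note are assumptions.

```python
from pathlib import Path

# Substrings that mark a key as sensitive (an illustrative assumption).
SENSITIVE_KEY_HINTS = ("cookie", "token", "secret")


def redact(value, home=None):
    """Recursively mask sensitive values and shorten home-directory paths."""
    # POSIX-style paths are assumed when trimming the home prefix.
    home = (home or str(Path.home())).rstrip("/")
    if isinstance(value, dict):
        return {
            key: "[redacted]"
            if any(hint in str(key).lower() for hint in SENSITIVE_KEY_HINTS)
            else redact(item, home)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, home) for item in value]
    if isinstance(value, str) and (value == home or value.startswith(home + "/")):
        return "~" + value[len(home):]
    return value
```

For example, `redact({"profile_dir": "/home/u/skill/.playwright-profile"}, home="/home/u")` yields a `~/skill/.playwright-profile` path, which still tells the user where to look without exposing the full directory layout.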

✓ Persistence & Privilege

The skill does not request always:true or other elevated platform privileges. It writes/reads its own runtime artifacts (.venv, .playwright-profile, login_artifacts) within the project; this is expected for a local scraper and does not modify other skills or global agent settings.
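A small inventory helper can show which of these runtime artifacts currently exist in a project directory, for instance to verify that login caches are gone after clearing the login state. This is a sketch under the assumption that the three directories sit directly under the project root; the helper name is hypothetical.

```python
from pathlib import Path

# Directory names taken from the assessment above.
RUNTIME_ARTIFACTS = (".venv", ".playwright-profile", "login_artifacts")
# Subset that holds login state.
LOGIN_ARTIFACTS = (".playwright-profile", "login_artifacts")


def existing_artifacts(project_dir: str, names=RUNTIME_ARTIFACTS) -> list[str]:
    """Return the artifact names that are present under project_dir."""
    root = Path(project_dir)
    return [name for name in names if (root / name).exists()]
```

If `existing_artifacts(project_dir, LOGIN_ARTIFACTS)` still returns entries after a cleanup, the remaining directories can be inspected or removed by hand.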

Version History

v0.1.0

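公众号作者文章抓取 v0.1.0: initial release.
- Provides a local tool for batch-fetching WeChat Official Account articles; given an article link, it automatically identifies the target account and downloads its most recent N articles.
- Outputs Markdown, HTML, and articles.json files by default, for use in corpus building, style analysis, imitation writing, archiving, and similar scenarios.
- Integrates local login-state management: handles QR-code login and caching automatically and supports multiple QR-code display modes.
- Invoked in one step through the script bundled in the main directory; no additional third-party projects need to be installed.
- Enforces strict privacy and security boundaries: login caches stay within the local project directory, and clearing the login state is supported.
- Standardizes error handling and the result-reporting format, improving user experience and stability.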

Metadata

Slug wechat-articles-crawler

Version 0.1.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is 公众号作者文章抓取?

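Uses a local WeChat Official Account scraper to identify and batch-fetch the historical articles of a given Official Account author, and outputs Markdown, HTML, and articles.json for building an author corpus, style analysis, imitation-writing templates, fact-checking, and content archiving. Whenever the user mentions “fetch the articles of some Official Account”, “download the latest 20/50/100 Official Account articles”, “I'll give you one link, and you keep going and pull down all of this account's articles... It is an AI Agent Skill for Claude Code / OpenClaw, with 158 downloads so far.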

How do I install 公众号作者文章抓取?

Run "/install wechat-articles-crawler" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is 公众号作者文章抓取 free?

Yes, 公众号作者文章抓取 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 公众号作者文章抓取 support?

公众号作者文章抓取 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 公众号作者文章抓取?

It is built and maintained by 大壮/Jammy (@dazhuangjammy); the current version is v0.1.0.
