
Liuzln Openclaw Skills Wechat Article Fetcher

by robertstarry-gif · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ✓ Security Clean
Downloads: 96 · Stars: 0 · Active Installs: 1 · Versions: 1
Install in OpenClaw
/install liuzln-openclaw-skills-wechat-article-fetcher
Description
Fetch and save WeChat Official Account articles with full content and images. Supports any WeChat article URL, automatic image download, JSON export, and ful...
README (SKILL.md)

WeChat Article Fetcher Skill

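A skill for scraping WeChat Official Account articles. It accepts any WeChat Official Account article URL, downloads images automatically, exports JSON, and saves a full-page screenshot.

Features

  • ✅ Supports any WeChat Official Account article URL
  • ✅ Automatically extracts the title, author, and publish time
  • ✅ Extracts the full article content
  • ✅ Automatically downloads and saves article images
  • ✅ Saves a full-page screenshot
  • ✅ Exports results as JSON
  • ✅ Supports virtual environments (such as playwright-env)
  • ✅ Headless mode (for deployment on servers without a UI)
  • ✅ Command-line tool support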

Quick Start

1. Use the command-line tool

# Fetch a single article
python3 skills/wechat-article-fetcher/scripts/fetch.py \
  https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw

# Specify the output directory
python3 skills/wechat-article-fetcher/scripts/fetch.py \
  https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw \
  --output ./my_articles

# Do not save images
python3 skills/wechat-article-fetcher/scripts/fetch.py \
  https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw \
  --no-images

2. Run in a virtual environment

# Option 1: activate the virtual environment, then run
source playwright-env/bin/activate
python3 skills/wechat-article-fetcher/scripts/fetch.py <url>

# Option 2: use the provided wrapper script
python3 skills/wechat-article-fetcher/scripts/run_in_venv.py \
  <url> --venv /path/to/playwright-env

3. Use from code

from wechat_article_fetcher import fetch_article

# Fetch the article
result = fetch_article("https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw")

print(f"Title: {result['title']}")
print(f"Author: {result['author']}")
print(f"Content length: {result['length']}")
print(f"Image count: {result['images_count']}")

Command-Line Tools

fetch.py - main fetching tool

python3 skills/wechat-article-fetcher/scripts/fetch.py [OPTIONS] URL

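Arguments:
  URL                   WeChat Official Account article URL (required)

Options:
  -o, --output PATH     Output directory (default: ./wechat_articles)
  --no-images           Do not save images
  --no-screenshot       Do not save a screenshot
  --headless BOOLEAN    Headless mode (default: true)
  --timeout INTEGER     Timeout in milliseconds (default: 60000)
  -h, --help            Show help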
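
The option table above could be mirrored with `argparse` roughly as follows. This is only a sketch of how such a parser might look, not the actual contents of fetch.py; the names `build_parser` and `str_to_bool`, and the accepted boolean spellings, are assumptions made for illustration.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Interpret common true/false spellings for --headless (hypothetical helper)."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (assumed layout)."""
    parser = argparse.ArgumentParser(prog="fetch.py")
    parser.add_argument("url", metavar="URL", help="WeChat Official Account article URL")
    parser.add_argument("-o", "--output", default="./wechat_articles", help="output directory")
    parser.add_argument("--no-images", action="store_true", help="do not save images")
    parser.add_argument("--no-screenshot", action="store_true", help="do not save a screenshot")
    parser.add_argument("--headless", type=str_to_bool, default=True, help="headless mode")
    parser.add_argument("--timeout", type=int, default=60000, help="timeout in milliseconds")
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["https://mp.weixin.qq.com/s/xxx", "--timeout", "90000"])
print(args.output, args.headless, args.timeout)
```

Parsing `--headless` through a converter rather than `type=bool` avoids the common pitfall where any non-empty string, including "false", evaluates to true.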

batch_fetch.py - batch fetching tool

# Read the URL list from a file
python3 skills/wechat-article-fetcher/scripts/batch_fetch.py \
  --urls-file urls.txt

# Or pass multiple URLs directly on the command line
python3 skills/wechat-article-fetcher/scripts/batch_fetch.py \
  --urls "url1" "url2" "url3"

run_in_venv.py - virtual environment runner

python3 skills/wechat-article-fetcher/scripts/run_in_venv.py \
  <url> --venv /path/to/playwright-env

Output

Each fetch creates a directory named after the timestamp:

wechat_articles/
└── 20260302_125500/
    ├── article.json          # Results in JSON format
    ├── article.png           # Full-page screenshot
    └── images/               # Article images directory
        ├── 001_image1.jpg
        ├── 002_image2.png
        └── ...
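
The naming scheme in the tree above can be reproduced with the standard library. The sketch below assumes the directory name is a `YYYYMMDD_HHMMSS` local timestamp and that image files get a zero-padded, 1-based prefix, as the example listing suggests; `make_run_dir` and `image_filename` are hypothetical helpers, not functions from the skill.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir(base: Path, now: datetime) -> Path:
    """Return base/YYYYMMDD_HHMMSS, matching the documented directory naming."""
    return base / now.strftime("%Y%m%d_%H%M%S")


def image_filename(index: int, original_name: str) -> str:
    """Prefix the original file name with a zero-padded, 1-based index."""
    return f"{index:03d}_{original_name}"


run_dir = make_run_dir(Path("wechat_articles"), datetime(2026, 3, 2, 12, 55, 0))
print(run_dir.as_posix())               # wechat_articles/20260302_125500
print(image_filename(1, "image1.jpg"))  # 001_image1.jpg
```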

article.json structure

{
  "title": "Article title",
  "author": "Author name",
  "publish_date": "2026-03-02",
  "url": "https://mp.weixin.qq.com/s/...",
  "content": "Full article content...",
  "images": [
    {
      "index": 1,
      "url": "https://mmbiz.qpic.cn/...",
      "alt": "Image description",
      "filename": "001_image.jpg",
      "success": true
    }
  ],
  "images_count": 5,
  "images_dir": "wechat_articles/20260302_125500/images",
  "fetch_time": "2026-03-02T12:55:00.000000",
  "length": 15000
}
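
One way to use this structure is to check which image downloads failed after a run. The sketch below parses an inline sample shaped like the JSON above, so it runs without a real article.json; `failed_images` is a hypothetical helper, and the exact meaning of `success` is inferred from the field name.

```python
import json

# Abbreviated sample shaped like article.json; values are placeholders.
sample = """
{
  "title": "Article title",
  "images": [
    {"index": 1, "url": "https://mmbiz.qpic.cn/...", "filename": "001_image.jpg", "success": true},
    {"index": 2, "url": "https://mmbiz.qpic.cn/...", "filename": "002_image.png", "success": false}
  ],
  "images_count": 2,
  "length": 15000
}
"""


def failed_images(article: dict) -> list[int]:
    """Return the index of every image entry whose download did not succeed."""
    return [img["index"] for img in article.get("images", []) if not img.get("success")]


article = json.loads(sample)
print(article["title"], failed_images(article))
```

For a real run, the same function could be applied to the result of `json.load` on the article.json file inside the timestamped output directory.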

Configuration File

You can create a config.json to customize the default settings:

{
  "headless": true,
  "timeout": 60000,
  "output_dir": "./wechat_articles",
  "save_images": true,
  "save_screenshot": true,
  "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36..."
}
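
How the skill loads config.json is not described here. As a rough sketch, user settings could be overlaid on the documented defaults like this; `DEFAULTS`, `merge_config`, and the decision to reject unknown keys are assumptions for illustration only.

```python
import json

# Defaults taken from the documented option values; user_agent has no documented default.
DEFAULTS = {
    "headless": True,
    "timeout": 60000,
    "output_dir": "./wechat_articles",
    "save_images": True,
    "save_screenshot": True,
}


def merge_config(overrides_json: str) -> dict:
    """Overlay user-supplied JSON onto the defaults, rejecting unknown keys."""
    overrides = json.loads(overrides_json)
    unknown = set(overrides) - set(DEFAULTS) - {"user_agent"}
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


config = merge_config('{"timeout": 90000, "save_images": false}')
print(config["timeout"], config["save_images"], config["headless"])
```

Rejecting unknown keys is a design choice that surfaces typos such as "time_out" early instead of silently ignoring them.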

Usage Examples

Example 1: Fetch a single article

python3 skills/wechat-article-fetcher/scripts/fetch.py \
  https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw

Example 2: Batch fetching

Create urls.txt:

https://mp.weixin.qq.com/s/xxx
https://mp.weixin.qq.com/s/yyy
https://mp.weixin.qq.com/s/zzz

Then run:

python3 skills/wechat-article-fetcher/scripts/batch_fetch.py \
  --urls-file urls.txt
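
How batch_fetch.py parses the URL file is not documented here. A one-URL-per-line file like the example could be read as sketched below; `parse_url_list` is a hypothetical helper, and skipping blank lines and duplicates is an assumption rather than confirmed behavior of the tool.

```python
def parse_url_list(text: str) -> list[str]:
    """Return non-empty lines, stripped, with duplicates removed in first-seen order."""
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


sample = """https://mp.weixin.qq.com/s/xxx

https://mp.weixin.qq.com/s/yyy
https://mp.weixin.qq.com/s/xxx
"""
print(parse_url_list(sample))
```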

Example 3: Run in a virtual environment

python3 skills/wechat-article-fetcher/scripts/run_in_venv.py \
  https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw \
  --venv /home/user/playwright-env

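Best Practices

  1. Use a virtual environment: isolate dependencies and avoid conflicts
  2. Set a reasonable timeout: adjust it to your network conditions
  3. Add a delay when batch fetching: avoid putting pressure on the server
  4. Check the output regularly: make sure images and content are complete
  5. Follow robots.txt: respect the site's crawling rules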
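
The delay recommended in item 3 could be added with a small wrapper around whatever fetch function is in use. This is a minimal sketch: `fetch_with_delay` is hypothetical, the 3-second default is an arbitrary illustration, and a stand-in fetcher replaces the real one so the example runs without a browser or network access.

```python
import time
from typing import Callable, Iterable


def fetch_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], dict],
    delay_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Call fetch on each URL, pausing between consecutive requests."""
    results = []
    for position, url in enumerate(urls):
        if position > 0:
            sleep(delay_seconds)  # no pause before the first request
        results.append(fetch(url))
    return results


# Record the pauses instead of sleeping, and use a stand-in fetcher.
pauses = []
results = fetch_with_delay(
    ["https://mp.weixin.qq.com/s/xxx", "https://mp.weixin.qq.com/s/yyy"],
    fetch=lambda url: {"url": url},
    delay_seconds=3.0,
    sleep=pauses.append,
)
print(len(results), pauses)
```

In real use, the `fetch` argument could be the `fetch_article` function shown earlier, with `sleep` left at its default.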

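Troubleshooting

Problem: Module not found

Solution: Make sure you are running in the correct virtual environment, or install the dependencies:

pip install playwright
playwright install chromium

Problem: Browser fails to start

Solution: Install the system dependencies: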

sudo apt install -y libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 \
  libcups2 libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 \
  libxfixes3 libxrandr2 libgbm1 libasound2

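Problem: Image download fails

Solution: Check your network connection, or use --no-images to skip image downloads.

Problem: Page load times out

Solution: Increase the timeout:

python3 skills/wechat-article-fetcher/scripts/fetch.py \
  <url> --timeout 90000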

Changelog

v1.0.0

  • Initial release
  • Single-article fetching
  • Image download
  • JSON export
  • Full-page screenshots
  • Virtual environment support
  • Command-line tool
  • Batch fetching tool
Usage Guidance
This skill appears to do what it says (scrape WeChat article pages using Playwright). Before installing or running: (1) install Playwright and the browser (pip install playwright; playwright install chromium) in an isolated virtualenv; (2) review and, if needed, change the VENV_PATH default in fetch_direct.py if you use that helper (it defaults to /opt/playwright-env); (3) be aware the scripts write files (screenshots, images, JSON) into local directories—store them in an appropriate location and avoid running as root; (4) respect site terms and robots.txt when scraping; and (5) if you need higher assurance, run the code in a disposable environment and inspect network activity during a test run.
Capability Analysis
Type: OpenClaw Skill
Name: liuzln-openclaw-skills-wechat-article-fetcher
Version: 1.0.0

The skill bundle is a legitimate tool for scraping and archiving WeChat Official Account articles using Playwright. It includes features for extracting text, downloading images, and taking full-page screenshots, with support for batch processing and virtual environments. The implementation includes basic security checks, such as validating that URLs belong to the 'mp.weixin.qq.com' domain in `scripts/fetch.py`, and uses safe subprocess calls with argument lists in `wechat_article_fetcher.py` and `scripts/run_in_venv.py` to manage execution.
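
A domain check of the kind described above might look like the following. This is a sketch under assumptions, not the code in `scripts/fetch.py`: `is_wechat_article_url` is a hypothetical name, and comparing the parsed hostname exactly (rather than searching the URL string) is one reasonable way to do it.

```python
from urllib.parse import urlparse


def is_wechat_article_url(url: str) -> bool:
    """Accept only http(s) URLs whose host is exactly mp.weixin.qq.com."""
    parsed = urlparse(url)
    return parsed.scheme in {"http", "https"} and parsed.hostname == "mp.weixin.qq.com"


print(is_wechat_article_url("https://mp.weixin.qq.com/s/xxx"))
print(is_wechat_article_url("https://mp.weixin.qq.com.evil.example/s/xxx"))
```

Matching on the parsed hostname rejects look-alike hosts and URLs that merely contain the domain in a path or query string, which a substring test would let through.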
Capability Assessment
Purpose & Capability
Name/description match the included files and behavior: scripts and modules implement page loading, content extraction, image download, screenshots, and JSON export for mp.weixin.qq.com articles. No unrelated services or credentials are requested.
Instruction Scope
SKILL.md and the code instruct the agent to run local Python scripts and Playwright to visit WeChat article URLs, extract DOM content, download images, and write local files. The instructions do not ask the agent to read unrelated system files, exfiltrate data to external endpoints, or access other credentials.
Install Mechanism
No install spec is provided (instruction-only), which minimizes automatic risk but means the user must install dependencies (playwright and browsers) manually. The code expects Playwright and a browser runtime; this is reasonable for the stated purpose but worth noting since the skill will not auto-install its runtime.
Credentials
The skill does not declare or require environment variables, secrets, or external API keys. The only configuration is optional: paths, timeouts, headless flag, and a hard-coded VENV_PATH default in fetch_direct.py (/opt/playwright-env) which is configurable by the user. No credentials or unrelated secrets are requested.
Persistence & Privilege
The skill is not marked always:true and does not modify other skills or system-wide agent settings. It runs as normal CLI/Python code and writes outputs to local directories under the user-specified output path.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install liuzln-openclaw-skills-wechat-article-fetcher
  3. After installation, invoke the skill by name or use /liuzln-openclaw-skills-wechat-article-fetcher
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
WeChat Article Fetcher Skill v1.0.0 - Initial release.
  • Fetch any WeChat Official Account article by URL.
  • Automatically download article images.
  • Export article data as JSON.
  • Save full-page screenshots of articles.
  • Support for virtual environments and headless mode.
  • Command-line and batch processing tools included.
Metadata
Slug liuzln-openclaw-skills-wechat-article-fetcher
Version 1.0.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is Liuzln Openclaw Skills Wechat Article Fetcher?

Fetch and save WeChat Official Account articles with full content and images. Supports any WeChat article URL, automatic image download, JSON export, and ful... It is an AI Agent Skill for Claude Code / OpenClaw, with 96 downloads so far.

How do I install Liuzln Openclaw Skills Wechat Article Fetcher?

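Run "/install liuzln-openclaw-skills-wechat-article-fetcher" in the OpenClaw or Claude Code chat to install it. The skill does not install its own runtime, so Playwright and a Chromium browser must be installed separately (pip install playwright; playwright install chromium).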

Is Liuzln Openclaw Skills Wechat Article Fetcher free?

Yes, Liuzln Openclaw Skills Wechat Article Fetcher is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Liuzln Openclaw Skills Wechat Article Fetcher support?

Liuzln Openclaw Skills Wechat Article Fetcher is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Liuzln Openclaw Skills Wechat Article Fetcher?

It is built and maintained by robertstarry-gif (@robertstarry-gif); the current version is v1.0.0.
