wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support

Author: 3511815125

by Yu Jia Li · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Downloads: 327

Install in OpenClaw

/install web-fetch-vx

Description

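Built on a three-engine design, this skill extracts clean content from WeChat articles, news pages, and blog pages. It supports title, author, and date metadata, multiple output formats, and batch processing.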

Usage Guidance

The skill appears to do what it says, but provenance and implementation details are missing. Before installing or using it:
1) Ask the publisher for the source repository or a definitive install/run plan (how are the listed dependencies and the 'browser' engine provided?).
2) Verify where the code would execute (your machine, OpenClaw-hosted runner, third-party server) and whether extracted content will be transmitted off-site.
3) Confirm the runtime has a headless browser if you expect JS-rendered pages to work.
4) Test with non-sensitive URLs first and avoid sending private pages or secrets through proxy parameters until you trust the implementation.
If the publisher cannot provide a code repo or clear install/run instructions, treat the skill as untrusted.

Capability Analysis

Type: OpenClaw Skill
Name: web-fetch-vx
Version: 1.0.0

The provided files consist of metadata and documentation for a web content extraction tool designed to scrape and clean articles from sources like WeChat and news sites. The SKILL.md file provides standard technical instructions, usage examples, and configuration parameters (e.g., URL, extractMode, proxy) consistent with its stated purpose. There is no executable code present in the snippet, and the documentation contains no evidence of prompt injection, malicious instructions, or data exfiltration attempts.

Capability Assessment

ℹ Purpose & Capability

Name/description and the SKILL.md align: it is a web content extractor for WeChat/news/blogs and lists reasonable features (Readability-like extraction, metadata, multi-format output, batch support). Declared dependencies (readability, firecrawl, defuddle) are plausible for this purpose. However, the skill advertises a 'browser' engine (for JS-rendered pages) but does not declare any binaries (headless browser, chrome, puppeteer) or an install spec—an implementation that supports a browser engine would normally require those, so this is an unexplained gap.

✓ Instruction Scope

SKILL.md contains concrete runtime instructions/examples limited to fetching and extracting public web content. It explicitly excludes login/paywalled/captcha-protected content and states to respect robots.txt. It does not instruct reading unrelated files or environment variables, nor sending data to unexpected external endpoints. The skill allows user-supplied proxy and user-agent configuration, which is reasonable for a fetcher but gives the caller control over network routing.
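As an illustration of what "respect robots.txt" can mean in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not taken from the skill, whose implementation is not published. The robots.txt body, the `is_allowed` helper, and the user-agent string are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real fetcher would download it from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "web-fetch-vx" is used as an illustrative user-agent string only.
print(is_allowed(ROBOTS_TXT, "web-fetch-vx", "https://example.com/articles/1"))
print(is_allowed(ROBOTS_TXT, "web-fetch-vx", "https://example.com/private/page"))
```

A fetcher that lets the caller override the user agent should run this check with the user agent it will actually send, since robots.txt rules can differ per agent.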

⚠ Install Mechanism

This is an instruction-only skill with no install spec and no code files, but SKILL.md lists NPM-like dependencies and describes multiple engines, including a browser engine. There is no guidance on where those packages come from, no URLs or package manager instructions, and no declared required binaries (e.g., headless chrome). Because of that inconsistency, it is unclear how or where the declared functionality would be satisfied. A consumer should ask for implementation/install details before trusting it.

✓ Credentials

The skill requires no environment variables or credentials and does not request access to system config paths. It exposes parameters for proxy and user-agent; those are user-supplied options and not implicit requests for secrets. This is proportionate to the stated functionality. Note: using a proxy or remote execution environment could expose extracted content to third parties if misconfigured by the user.
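One way a caller might guard against leaking secrets through the proxy parameter is to inspect the proxy URL before passing it on. The sketch below uses only the standard library. The helper names `proxy_has_credentials` and `redact_proxy` and the example proxy address are hypothetical and not part of the skill.

```python
from urllib.parse import urlsplit


def proxy_has_credentials(proxy_url: str) -> bool:
    """Return True if the proxy URL embeds a username or password."""
    parts = urlsplit(proxy_url)
    return parts.username is not None or parts.password is not None


def redact_proxy(proxy_url: str) -> str:
    """Return the proxy URL with any embedded credentials removed (for logging)."""
    parts = urlsplit(proxy_url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return f"{parts.scheme}://{host}"


# Hypothetical proxy address with embedded credentials.
proxy = "http://user:secret@proxy.example.com:8080"
if proxy_has_credentials(proxy):
    print("proxy contains credentials; logging as", redact_proxy(proxy))
```

Checking before the call keeps the decision with the user, which matches the point above that these options are user-supplied rather than requested by the skill.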

✓ Persistence & Privilege

Skill flags show no elevated privileges: always is false, there is no install spec and therefore no persistent binaries are created, and the skill does not ask to modify other skills or system-wide settings. Autonomous model invocation is enabled (platform default). Combined with the other issues this increases impact, but it is not itself a misconfiguration.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install web-fetch-vx
After installation, invoke the skill by name or use /web-fetch-vx
Provide required inputs per the skill's parameter spec and get structured output
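The listing names URL, extractMode, and proxy as parameters but does not publish the parameter spec. The following is a sketch of how a caller might validate inputs before invoking the skill. The key names `url`, `extractMode`, and `proxy`, the mode strings in `ALLOWED_MODES`, and the `build_request` helper are assumptions for illustration; check SKILL.md for the real spec.

```python
import json
from typing import Optional
from urllib.parse import urlsplit

# Assumed mode strings: the listing names Markdown, JSON, and plain text
# as output formats but does not give the exact values.
ALLOWED_MODES = {"markdown", "json", "text"}


def build_request(url: str, extract_mode: str = "markdown",
                  proxy: Optional[str] = None) -> dict:
    """Validate inputs and return a parameter dict (hypothetical shape)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if extract_mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported extract mode: {extract_mode!r}")
    request = {"url": url, "extractMode": extract_mode}
    if proxy is not None:
        request["proxy"] = proxy
    return request


# Placeholder article URL, not a real page.
print(json.dumps(build_request("https://mp.weixin.qq.com/s/EXAMPLE", "json")))
```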

Version History

v1.0.0

- Major update to version 2.0 with a new name: “Web Content Extractor - 网页内容提取器”.
- Now uses a 3-engine (Readability + Firecrawl + Defuddle) architecture for improved extraction success and versatility.
- Enhanced support for WeChat articles, news sites, and blogs, with automatic ad/sidebar removal.
- Clean output in Markdown, JSON, or plain text, with batch processing and metadata extraction.
- Adds advanced options: custom User-Agent, proxy, and caching controls.
- Includes comprehensive usage scenarios, performance metrics, and troubleshooting guide.
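The changelog does not say how the three engines cooperate. One common pattern, shown here purely as an assumption, is to try them in order and keep the first non-empty result. The engine functions below are stand-in stubs, not the real Readability, Firecrawl, or Defuddle APIs, and `extract_with_fallback` is a hypothetical name.

```python
from typing import Callable, Optional, Sequence, Tuple

# An engine is a (name, function) pair; the function returns extracted text or None.
Engine = Tuple[str, Callable[[str], Optional[str]]]


def extract_with_fallback(html: str, engines: Sequence[Engine]) -> Tuple[str, str]:
    """Try each engine in order and return (engine_name, content) from the first success."""
    errors = []
    for name, engine in engines:
        try:
            content = engine(html)
        except Exception as exc:  # an engine failure should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if content and content.strip():
            return name, content
        errors.append(f"{name}: empty result")
    raise RuntimeError("all engines failed: " + "; ".join(errors))


# Stub engines standing in for the three real libraries.
def failing_engine(html: str) -> Optional[str]:
    raise ValueError("could not parse")


def empty_engine(html: str) -> Optional[str]:
    return ""


def working_engine(html: str) -> Optional[str]:
    return "Clean article text"


ENGINES = [("first", failing_engine), ("second", empty_engine), ("third", working_engine)]
print(extract_with_fallback("<html>...</html>", ENGINES))
```

Collecting the per-engine errors rather than discarding them makes it easier to tell a page that no engine can handle from a single misconfigured engine.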

Metadata

Slug web-fetch-vx

Version 1.0.0

License MIT-0

All-time Installs 2

Active Installs 2

Total Versions 1

Frequently Asked Questions

What is wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support?

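Built on a three-engine design, it extracts clean content from WeChat articles, news pages, and blog pages, with title, author, and date metadata, multiple output formats, and batch processing. It is an AI Agent Skill for Claude Code / OpenClaw, with 327 downloads so far.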

How do I install wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support?

Run "/install web-fetch-vx" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support free?

Yes, wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support support?

wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created wechat-article-extraction-mp-weixin-qq-com news-webpage-cleaning blog-post-parsing metadata-extraction-title-author-date multiple-output-formats-markdown-json-plain-text batch-processing-support?

It is built and maintained by Yu Jia Li (@3511815125); the current version is v1.0.0.
