Web Extractor

Name: Web Extractor
Author: kukuxnd

By kukuxNd · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 323

Current installs: 1

Versions: 1

Install in OpenClaw

/install web-extractor

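Description

Uses jina.ai to extract clean text from a web page and has the Agent summarize it. Trigger phrases: 提取网页 (extract web page), 总结新闻 (summarize news), 提取文章 (extract article), 获取页面内容 (get page content)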

Security Recommendations

This skill behaves as advertised: it delegates extraction to r.jina.ai and then summarizes the returned markdown. As a result, the URL you request and the content of the fetched page are retrieved and processed by a third-party service. Before installing or using it, consider:

- Do not send sensitive, private, or internal URLs (intranets, private docs, or cloud metadata endpoints such as 169.254.169.254). Doing so can leak secrets or enable SSRF via the remote extractor.
- Treat r.jina.ai as an external party: any content fetched for summarization is disclosed to it. Verify that you trust the service, or host an extractor locally.
- The skill writes to predictable /tmp filenames. If you must use it, change the workflow to use a secure temporary filename (e.g., mktemp) to avoid collisions or exposure.
- If you need to summarize protected content, fetch the page locally (handling credentials safely), remove sensitive headers or query parameters, and run a local extraction/parsing step instead of sending the raw URL to a public extractor.

For a safer alternative, ask for a version that accepts raw HTML you provide (so you control what is sent externally), or for instructions to run a local HTML-to-text tool rather than delegating fetching to r.jina.ai.
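The advice about internal URLs can be approximated with a pre-flight check that runs before any URL is handed to the remote extractor. The sketch below is an illustration only: `is_safe_public_url` is a hypothetical helper, not part of the skill, and the hostname suffixes it rejects are assumptions about what counts as "internal".

```python
import ipaddress
from urllib.parse import urlsplit


def is_safe_public_url(url: str) -> bool:
    """Hypothetical pre-flight check: True only for URLs that look public."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL (e.g. broken IPv6 brackets)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host:
        return False
    # Embedded credentials would be disclosed to the third-party service.
    if parts.username or parts.password:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: apply a coarse, name-based check (assumed rules).
        lowered = host.lower()
        if lowered == "localhost" or "." not in lowered:
            return False  # single-label names are usually intranet hosts
        return not lowered.endswith((".local", ".internal", ".localhost"))
    # is_global is False for private, loopback and link-local ranges,
    # which covers the 169.254.169.254 metadata address.
    return addr.is_global
```

This check does not resolve DNS, so a public-looking name that points at a private address would still pass; it reduces accidental leaks rather than replacing the judgment described above.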

Functional Analysis

Type: OpenClaw Skill
Name: web-extractor
Version: 1.0.0

The web-extractor skill is designed to fetch and clean web content using the r.jina.ai service for AI summarization. The workflow in SKILL.md uses standard curl commands to retrieve data and store it in temporary files (/tmp/web-content.md), which is consistent with its stated purpose and shows no signs of malicious intent or data exfiltration.
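The fetch step of the workflow described above could look roughly like the following. This is a sketch under stated assumptions, not the contents of SKILL.md: it assumes the request is formed by appending the target URL to the `https://r.jina.ai/` prefix, and the function names are hypothetical.

```python
import urllib.request

READER_PREFIX = "https://r.jina.ai/"  # extractor endpoint named in this listing


def build_reader_url(target_url: str) -> str:
    """Assumed request form: the target URL appended to the reader prefix."""
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) URL")
    return READER_PREFIX + target_url


def fetch_clean_text(target_url: str, timeout: float = 30.0) -> str:
    """Return the extracted markdown. Performs a network request when called."""
    with urllib.request.urlopen(build_reader_url(target_url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `fetch_clean_text("https://example.com/news")` would send that URL to the third-party service, which is exactly the disclosure the security notes on this page describe.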

Capability Assessment

✓ Purpose & Capability

The name/description match the instructions: the SKILL.md tells the agent to fetch a page via r.jina.ai and summarize the resulting markdown. No unrelated binaries, installs, or credentials are requested.

⚠ Instruction Scope

The instructions direct the agent to POST the target page URL to an external service (https://r.jina.ai/...), save the result to /tmp, and then read and summarize that file. This is within the stated function, but it has privacy and security implications that the skill does not address: arbitrary URLs (including internal intranet or metadata endpoints) will be fetched by the remote service, and page contents are disclosed to a third party. The instructions also use a predictable /tmp filename, which can cause local information exposure or race conditions.
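One way to avoid the predictable /tmp filename is to let the operating system choose a unique name, as mktemp does on the command line. The sketch below uses Python's `tempfile.mkstemp`; the helper name and the `web-content-` prefix are illustrative assumptions, not something the skill defines.

```python
import os
import tempfile


def save_to_private_tempfile(text: str, suffix: str = ".md") -> str:
    """Write text to a uniquely named temporary file and return its path."""
    # mkstemp creates the file atomically under an unpredictable name; the
    # file is readable and writable only by the creating user.
    fd, path = tempfile.mkstemp(prefix="web-content-", suffix=suffix)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path
```

The caller owns the returned path and should remove the file (for example with `os.remove`) once the summary has been produced.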

✓ Install Mechanism

Instruction-only skill with no install spec and no code files — nothing is written to disk by an installer. Lowest install risk.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. There is no overbroad credential access declared.

✓ Persistence & Privilege

The skill does not request permanent presence (always: false) and does not modify agent/system configs. Agent-autonomous invocation is allowed by default, which is expected and not by itself a red flag.

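How to Use

1. Make sure OpenClaw is installed (locally or via Docker).
2. Enter the install command in the chat box: /install web-extractor
3. Once installation finishes, invoke the Skill by name or trigger it with /web-extractor
4. Provide the inputs listed in the Skill's parameter description to receive structured output.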

Version History

v1.0.0

- Initial release of web-extractor skill.
- Extracts clean text from web pages using r.jina.ai, removing scripts, navigation, ads, and unnecessary CSS.
- Allows easy summarization of core content by the Agent.
- Supports extracting from any news site, tech blog, or article page.
- Saved content is in pure text format for optimal AI processing.
- Default output path is /tmp/, with customizable file locations.

Metadata

Slug: web-extractor

Version: 1.0.0

License: MIT-0

Total installs: 1

Current installs: 1

Versions released: 1

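FAQ

What is Web Extractor?

It uses jina.ai to extract clean text from a web page and has the Agent summarize it. Trigger phrases: 提取网页 (extract web page), 总结新闻 (summarize news), 提取文章 (extract article), 获取页面内容 (get page content). It is an AI Agent Skill plugin for Claude Code / OpenClaw and has 323 total downloads so far.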

How do I install Web Extractor?

Run the command "/install web-extractor" in the OpenClaw or Claude Code chat box to install it in one step. No additional configuration is required.

Is Web Extractor free?

Yes. Web Extractor is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Web Extractor support?

Web Extractor is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Web Extractor?

It is developed and maintained by kukuxNd (@kukuxnd). The current version is v1.0.0.