
Web Content Extraction Assistant (网页内容提取小助手)

by shuishouxinboda · GitHub ↗ · v1.0.3 · MIT-0
cross-platform ✓ Security Clean
Downloads: 134 · Stars: 1 · Active Installs: 0 · Versions: 4
Install in OpenClaw
/install jiayinclaw-12345
Description
Extracts the title, body text, image links, and other content from a web page URL.
README (SKILL.md)

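Web Content Extractor

This is a practical web content extraction skill that pulls structured information out of any web page.

Features

  • Automatically extracts the page title and metadata
  • Extracts the body text and strips HTML tags
  • Extracts all image links
  • Extracts all external links
  • Supports extracting specified elements
  • Outputs the result as formatted JSON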

Usage

Basic usage

Skill input: https://example.com
Skill output: {"title": "...", "content": "...", "images": [...], "links": [...]}
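
To make the input/output shape above concrete, here is a minimal sketch of the parsing step. It is not the skill's own scripts/extractor.py, which is not shown on this page. As an assumption for illustration, it uses the standard library's html.parser in place of beautifulsoup4 so that it runs without installing anything, it works on an HTML string rather than fetching the URL, and it collects every link rather than only external ones. The names PageExtractor and extract are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Collects the title, visible text, image URLs, and link URLs from HTML."""

    SKIPPED = {"script", "style", "noscript"}  # text inside these is not page content

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.title_parts, self.text_parts = [], []
        self.images, self.links = [], []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img" and attrs.get("src"):
            # Resolve relative paths against the page URL.
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract(html, base_url):
    """Return a dict with the same keys as the skill's documented output."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {
        "title": "".join(parser.title_parts).strip(),
        "content": " ".join(parser.text_parts),
        "images": parser.images,
        "links": parser.links,
    }
```

Calling extract(html, "https://example.com") on a downloaded page and passing the result to json.dumps would give output in the documented shape; the real skill may differ in what it counts as body text.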

Advanced usage

  • Extract specific elements
  • Set a content length limit
  • Customize the output format
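
The README does not document how these options are passed, so the following is only a sketch of what they could mean when applied to a result dict. The function name shape_result and its parameters fields, max_content_len, and pretty are hypothetical and are not taken from the skill.

```python
import json


def shape_result(result, fields=None, max_content_len=None, pretty=False):
    """Return a JSON string with only `fields`, truncating `content` if asked."""
    # Keep every key when no field list is given; otherwise keep the requested ones.
    selected = {k: v for k, v in result.items() if fields is None or k in fields}
    if max_content_len is not None and "content" in selected:
        selected["content"] = selected["content"][:max_content_len]
    # ensure_ascii=False keeps non-ASCII text (for example Chinese) readable.
    return json.dumps(selected, ensure_ascii=False, indent=2 if pretty else None)
```

For example, shape_result(result, fields=["title", "content"], max_content_len=500, pretty=True) would keep two keys, cap the body text at 500 characters, and indent the JSON.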

Technical specifications

  • Language: Python 3
  • Dependencies: requests, beautifulsoup4
  • Network: requires an internet connection
Usage Guidance
This appears to be a straightforward web scraper. Before installing: (1) run it in a sandboxed or virtualenv environment and review/inspect scripts (the code is short and readable); (2) only pass URLs you trust — do not use it on internal dashboards or pages containing secrets; (3) respect robots.txt and site terms; (4) install dependencies via pip in an isolated environment; (5) if you need stronger guarantees, run it with network egress controls so it can only reach target sites.
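
Points (2) and (3) of the guidance above can be partly automated before a URL is handed to the skill. The sketch below is an illustration only and is not part of the skill: looks_public and allowed_by_robots are hypothetical helper names, the host check only inspects the URL text (it does not resolve DNS, so a public-looking hostname that points at an internal address would still pass), and the robots.txt text is assumed to have been fetched already.

```python
import ipaddress
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def looks_public(url):
    """Reject non-HTTP(S) URLs, localhost, and literal non-global IP addresses."""
    parts = urlparse(url)
    host = parts.hostname
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a hostname, not an IP literal; DNS is not resolved here
    return ip.is_global


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Check `url` against the text of a robots.txt file that was already fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A caller would skip any URL for which either function returns False. This complements, rather than replaces, the sandboxing and egress controls recommended above.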
Capability Analysis
Type: OpenClaw Skill
Name: jiayinclaw-12345
Version: 1.0.3
The skill bundle is a standard web content extraction tool that uses the requests and BeautifulSoup4 libraries to scrape titles, text, and images from user-provided URLs. The code in scripts/extractor.py follows its stated purpose without any signs of data exfiltration, malicious execution, or prompt injection attempts.
Capability Assessment
Purpose & Capability
Name/description (extract titles, content, images, links) match the included script and SKILL.md. Required libraries (requests, BeautifulSoup) are appropriate for the stated purpose and no unrelated binaries or credentials are requested.
Instruction Scope
SKILL.md and the script instruct only to fetch the target URL and parse its HTML. The runtime behavior is limited to requesting the provided URL, parsing content, and returning structured data; it does not read local files, access environment variables, or POST data to external endpoints other than the target site.
Install Mechanism
There is no automated install spec (no downloads or installers), which lowers risk. The package includes a Python script and requirements.txt that expect dependencies to be installed via pip; users should ensure dependencies are installed in a controlled environment (virtualenv) before running.
Credentials
The skill requires no environment variables, credentials, or config paths. The permissions indicated (network) are proportional and necessary for fetching webpages.
Persistence & Privilege
The skill does not request always:true, does not modify other skills or system-wide settings, and does not store credentials. Autonomous invocation is allowed by default but presents no additional incoherence here.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install jiayinclaw-12345
  3. After installation, invoke the skill by name or use /jiayinclaw-12345
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.3
  • No file changes detected in this release.
  • Functionality and documentation remain unchanged from the previous version.
v1.0.2
  • Version bump to 1.0.2 with no file or documentation changes.
  • No functional updates or changes in this release.
v1.0.1
  • No changes detected from the previous version.
  • All documentation, code, and configuration remain the same.
v1.0.0
Initial release of web-content-extractor:
  • Extracts structured information (title, metadata, main text, images, links) from any web page URL.
  • Supports extraction of specific elements and output in formatted JSON.
  • Cleans HTML tags from the main content.
  • Allows setting content length limits and customizing output format.
  • Requires Python 3, requests, and beautifulsoup4; needs internet access.
Metadata
Slug jiayinclaw-12345
Version 1.0.3
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 4
Frequently Asked Questions

What is 网页内容提取小助手?

It extracts the title, body text, image links, and other content from a web page URL. It is an AI Agent Skill for Claude Code / OpenClaw, with 134 downloads so far.

How do I install 网页内容提取小助手?

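Run "/install jiayinclaw-12345" in the OpenClaw or Claude Code chat to install it. The skill has no automated install step for its Python dependencies, so install requests and beautifulsoup4 with pip, preferably in a virtualenv, before running it.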

Is 网页内容提取小助手 free?

Yes, 网页内容提取小助手 is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does 网页内容提取小助手 support?

网页内容提取小助手 is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created 网页内容提取小助手?

It is built and maintained by shuishouxinboda (@shuishouxinboda); the current version is v1.0.3.
