lilw-yezi

Webpage Export

Author: Yeziwnl · GitHub · v1.0.1 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 237
Favorites: 0
Current installs: 0
Versions: 2
Install in OpenClaw
/install webpage-export
Description
Export webpages into clean local TXT, DOCX, and PDF files with source metadata, fallback extraction logic, and browser-assisted recovery for difficult pages....
Safe usage recommendations
This skill appears to do what it claims, but take these precautions before installing or using it:
  1. Ensure you have python3, curl, node, the Node 'playwright' package, and Chrome/Chromium (and textutil on macOS) installed. The registry metadata does not list these, so the skill is likely to fail if any are missing.
  2. Run the tool in a controlled workspace (explicit --outdir) and avoid running it against untrusted internal URLs: the headless browser will execute page JavaScript, which can trigger network calls or other side effects originating from the target page.
  3. Because the skill owner is unknown, review and test on safe pages first.
  4. If you need an automated install of dependencies, add or request an install spec from the publisher before wider deployment.
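The dependency precaution can be checked before the skill is used. The following is a minimal sketch, not part of the skill itself: it only looks for executables on PATH, the Chrome/Chromium executable names are assumptions that vary by platform (on macOS, Chrome is usually not on PATH at all), and it does not verify that the Node 'playwright' package is installed.

```python
import shutil

# Command-line tools named in the safety notes.
REQUIRED_TOOLS = ["python3", "curl", "node"]

# Assumed executable names for Chrome/Chromium; adjust for your platform.
CHROME_CANDIDATES = ["google-chrome", "chromium", "chromium-browser", "chrome"]


def missing_tools(required=REQUIRED_TOOLS, any_of=CHROME_CANDIDATES):
    """Return the names of tools that cannot be found on PATH.

    Every entry in `required` must be present. At least one entry in
    `any_of` must be present; if none is, the group is reported as one item.
    """
    missing = [name for name in required if shutil.which(name) is None]
    if any_of and not any(shutil.which(name) for name in any_of):
        missing.append(" or ".join(any_of))
    return missing
```

Calling `missing_tools()` returns an empty list when everything it checks is available; any returned names are worth resolving before running an export.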
Functional analysis
Type: OpenClaw Skill · Name: webpage-export · Version: 1.0.1
The skill bundle provides a tool for exporting webpages using curl, headless Chrome, and Node.js/Playwright. The logic is aligned with the stated purpose, but the script `scripts/export_webpage.py` lacks URL scheme validation, which allows potential Local File Inclusion (LFI) via 'file://' URLs. It also permits writing files to arbitrary locations through the '--outdir' parameter and executes an embedded Node.js script. Both increase the attack surface if an AI agent is manipulated into accessing sensitive local resources.
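The two gaps described above (no URL scheme validation, unrestricted '--outdir') could be closed by a wrapper that validates inputs before handing them to the script. This is a sketch under stated assumptions, not code from `scripts/export_webpage.py`: the function names `validate_url` and `resolve_outdir` and the idea of a fixed workspace root are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlsplit

# Only network schemes are accepted; 'file://' and others are rejected.
ALLOWED_SCHEMES = {"http", "https"}


def validate_url(url: str) -> str:
    """Return the URL unchanged if it is http(s) with a host, else raise."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported URL scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    return url


def resolve_outdir(outdir: str, workspace: str) -> Path:
    """Resolve outdir and require it to stay inside the workspace root."""
    base = Path(workspace).resolve()
    # An absolute outdir replaces base here, so it is checked like any other path.
    target = (base / outdir).resolve()
    if target != base and base not in target.parents:
        raise ValueError(f"output directory escapes workspace: {target}")
    return target
```

Rejecting by allowlist rather than blocking 'file://' specifically also covers other unexpected schemes. This check does not address requests to internal hosts over http(s), which the safety notes warn about separately.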
Capability assessment
Purpose & Capability
Name/description (export webpages to TXT/DOCX/PDF) align with the included script and reference docs. The script fetches arbitrary URLs, extracts text/metadata, optionally uses Chrome/Chromium and Playwright for rendering, and emits JSON metadata — all expected for this purpose. Minor mismatch: registry metadata lists no required binaries, but SKILL.md and the script clearly require python3, curl, node+playwright, and Chrome/Chromium (and textutil on macOS).
Instruction Scope
SKILL.md instructions are narrowly scoped to running scripts/export_webpage.py with flags and reading the included references. The runtime behavior (curl fetch, HTML parsing, optional headless browser execution, local file writes) is explicitly documented. The SKILL.md warns the browser fallback will execute page JavaScript. The instructions do not direct the agent to read unrelated system files or transmit data to unexpected external endpoints.
Install Mechanism
This is an instruction-only skill with a bundled script and no install spec. That is low-risk, but practical friction exists: the skill expects runtime dependencies (python3, curl, node, the Node 'playwright' package, and Chrome/Chromium) without providing automated installation. There's no evidence of downloads from untrusted hosts or hidden installers in the bundle.
Credentials
The skill declares no required credentials or special env vars; the script only reads PATH and HOME from the environment and sets CHROME_BIN for child processes (pointing to a local Chrome path it finds). No secrets or unrelated credentials are requested.
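For readers who want to confirm or tighten this behavior when launching the script themselves, one common pattern is to pass child processes only the variables they need. The sketch below is an illustration, not the skill's actual code: the review states that the script reads PATH and HOME and sets CHROME_BIN, but whether it trims the rest of the environment is not stated, and `build_child_env` is a hypothetical helper name.

```python
import os


def build_child_env(chrome_path=None, parent_env=None):
    """Build a minimal environment for a child process.

    Copies only PATH and HOME from the parent environment and, when a
    Chrome path is supplied, sets CHROME_BIN. Everything else, including
    any secrets in the parent environment, is left out.
    """
    parent = os.environ if parent_env is None else parent_env
    env = {key: parent[key] for key in ("PATH", "HOME") if key in parent}
    if chrome_path:
        env["CHROME_BIN"] = chrome_path
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, so a page rendered in the headless browser never runs alongside unrelated credentials.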
Persistence & Privilege
The skill is not marked always:true and does not request system- or agent-wide configuration changes. It runs on-demand and writes outputs under local output folders; no elevated persistence or cross-skill config writes are present in the provided files.
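How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install webpage-export
  3. Once installation finishes, call the Skill by name or trigger it with /webpage-export
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output.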
Version history
v1.0.1
- Updates default output folder behavior: if no --outdir is specified, outputs now default to a local export folder under the current working directory.
- Adds runtime requirements section clarifying dependencies such as python3, curl, Chrome/Chromium, node, and playwright for various export functions.
- Adds safety and execution notes, particularly regarding headless browser usage and best practices for production environments.
- Example commands and documentation reflect new output and requirements, replacing hardcoded paths with generic, workspace-relative locations.
- No functional code changes; this is a documentation update for improved clarity and user guidance.
v1.0.0
Initial release of webpage-export. Export webpages into clean TXT, DOCX, and PDF files. Preserve source metadata including title, URL, source, and publish-time when available. Add fallback extraction logic and browser-assisted recovery for difficult pages. Support webpage archiving for articles, policy pages, WeChat posts, and official notices.
Metadata
Slug webpage-export
Version 1.0.1
License MIT-0
Total installs 0
Current installs 0
Number of versions 2
FAQ

What is Webpage Export?

Export webpages into clean local TXT, DOCX, and PDF files with source metadata, fallback extraction logic, and browser-assisted recovery for difficult pages.... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 237 total downloads to date.

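How do I install Webpage Export?

Run the command "/install webpage-export" in the OpenClaw or Claude Code chat box to install it in one step. The install itself needs no extra configuration, but the runtime dependencies (python3, curl, node, the Node 'playwright' package, and Chrome/Chromium) are not installed automatically.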

Is Webpage Export free?

Yes. Webpage Export is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Webpage Export support?

Webpage Export is tagged cross-platform and can be used in any environment where OpenClaw or Claude Code is deployed.

Who developed Webpage Export?

It is developed and maintained by Yeziwnl (@lilw-yezi); the current version is v1.0.1.
