← 返回 Skills 市场
282
总下载
0
收藏
0
当前安装
1
版本数
在 OpenClaw 中安装
/install adaptive-web-analyzer
功能描述
通过指定接口获取网页内容,自适应抓取解析关键文本,并使用大模型进行智能梳理总结
使用说明 (SKILL.md)
功能概述
当用户需要获取网页内容并进行智能分析时,本技能将:
- 通过用户指定的API接口或URL获取原始网页内容
- 使用自适应解析器提取关键文本(自动处理反爬、动态渲染、布局变化)
- 将提取的文本发送给大模型进行结构化梳理和总结
- 返回格式化的分析报告
触发场景
用户输入包含以下意图时触发:
- "抓取[某网址]的内容并分析"
- "获取[API接口]的数据并整理"
- "分析网页[URL]的关键信息"
- "爬取[网站]并用AI总结"
- "提取[链接]的文本并梳理"
执行流程
步骤1: 获取网页内容
- 如果用户提供的是API接口:使用HTTP客户端发送请求(支持自定义Headers、Auth)
- 如果用户提供的是普通URL:使用自适应浏览器抓取(处理JavaScript渲染、反爬机制)
- 支持配置:超时时间、重试次数、User-Agent轮换、代理设置
步骤2: 自适应内容解析
使用以下策略提取关键文本:
- 智能选择器:基于内容相似度算法,自动定位正文区域(抗布局变化)
- 反反爬处理:自动绕过Cloudflare等基础防护(遵守robots.txt)
- 动态渲染:对SPA应用使用Playwright等待关键元素加载
- 噪声过滤:自动去除广告、导航栏、页脚等非内容元素
- 多格式支持:HTML、JSON API响应、Markdown页面
步骤3: 内容结构化
提取的文本按以下维度组织:
- 标题/主题
- 关键段落(按重要性排序)
- 列表/表格数据
- 时间戳/元数据
- 链接引用
步骤4: 大模型智能梳理
将结构化文本发送给LLM,执行以下分析:
- 摘要生成:生成3-5句话的核心摘要
- 要点提取:列出3-7个关键要点
- 分类标签:自动标注内容类别(技术/新闻/产品/学术等)
- 情感分析:判断内容倾向(积极/中性/消极)
- 实体识别:提取人名、组织、产品、地点等关键实体
- 行动建议:根据内容类型提供后续建议(如需要)
步骤5: 输出格式化
返回包含以下字段的JSON/Markdown报告:
{
"source_url": "原始链接",
"fetch_time": "抓取时间",
"content_stats": {
"total_chars": "总字符数",
"extracted_chars": "提取字符数",
"confidence_score": "抓取置信度"
},
"analysis": {
"summary": "AI生成的摘要",
"key_points": ["要点1", "要点2", "要点3"],
"category": "内容分类",
"sentiment": "情感倾向",
"entities": {
"persons": ["人物名"],
"organizations": ["组织名"],
"products": ["产品名"]
},
"suggested_actions": ["建议操作1", "建议操作2"]
},
"raw_content_preview": "原始内容前500字(可选)"
}
安全使用建议
This skill appears to do what it says (adaptive scraping + LLM summarization) but contains several broad capabilities you should review before enabling: 1) SKILL.md lists 'system.exec' and 'file.write' permissions — these allow running subprocesses and writing files locally; ensure you trust the code and run it in a sandbox. 2) The skill advertises 'stealth' / anti‑bot bypass features and explicitly mentions bypassing Cloudflare while claiming to respect robots.txt — that's contradictory and could be used to access protected resources. 3) It accepts arbitrary URLs, custom headers/auth, and proxy settings — do not supply sensitive internal URLs or credentials. 4) There is no install spec; required Python deps (playwright, scrapling) are optional but may be needed for full functionality. Recommended actions: review the full agent.py file (especially any parts not shown), run the skill in an isolated environment first, restrict its network access if possible, and avoid providing secrets or internal endpoints to the skill until you are comfortable with its behavior. If you want higher assurance, ask the author for an explicit explanation of why 'system.exec' and 'stealth' are necessary and for a minimal-permission mode.
功能分析
Type: OpenClaw Skill
Name: adaptive-web-analyzer
Version: 1.0.0
The skill requests high-risk permissions including 'system.exec' and 'file.write' in SKILL.md, which are not explicitly utilized in the provided agent.py logic. Furthermore, the analyze_with_llm function in agent.py constructs prompts using unsanitized scraped web content (title, metadata, and body), which presents a vulnerability to indirect prompt injection attacks if a target website contains malicious instructions designed to hijack the agent's LLM. While the code appears to be a legitimate web scraping utility, these broad permissions and lack of input sanitization for the LLM context are significant security risks.
能力评估
Purpose & Capability
Name and description match the included code and instructions: the skill fetches pages (requests/Playwright/optional 'scrapling') extracts text, and sends it to an LLM for summarization. Declared dependencies (requests, beautifulsoup4, html2text, optional playwright/scrapling) are consistent with web scraping and parsing.
Instruction Scope
SKILL.md and agent.py instruct the agent to fetch arbitrary user-provided URLs or API endpoints, support custom headers/auth, proxies, and a 'stealth' anti‑bot mode. SKILL.md explicitly lists permissions including 'system.exec' and 'file.write' (broad privileges). The doc also claims it can '绕过 Cloudflare 等基础防护' while '遵守 robots.txt' — this is internally inconsistent and indicates scope creep (bypass vs respect). Allowing arbitrary URL fetch + proxy/auth + execution privileges increases risk of accessing internal resources or exfiltrating secrets if misused.
Install Mechanism
No install spec is provided (instruction-only), but config.json lists several dependencies (requests, bs4, html2text, optional scrapling and playwright). Because there is no installer, runtime may fail unless these packages are preinstalled. No network download/install from untrusted URLs was found. The optional use of Playwright implies spawning browser processes which is normal for dynamic scraping but increases runtime surface.
Credentials
The skill does not request any environment variables, API keys, or config paths. That is proportionate to its stated purpose. However, SKILL.md allows passing custom auth headers and proxy settings (user-supplied) — users should avoid providing sensitive credentials to the skill unless they trust it.
Persistence & Privilege
always:false (not force-included) and model invocation autonomy is default/expected. The skill does not request modification of other skills or system-wide configs in the provided files. It does request file.write permission which could persist data locally; this is normal for caching/logging but should be noted.
如何使用
- 确保已安装 OpenClaw(本地或 Docker 部署)
- 在对话框中输入安装命令:
/install adaptive-web-analyzer - 安装完成后,直接呼叫该 Skill 的名称或使用
/adaptive-web-analyzer触发 - 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
Parse the content of the specified webpage and return the main information.
使用示例如下:
基础用法:
分析 https://example.com/article 这篇文章的内容
带参数的高级用法:
使用adaptive-web-analyzer技能抓取 https://api.example.com/data,
方法用api,带上Authorization头,分析类型选detailed,
输出格式要json
处理反爬网站:
抓取 https://protected-site.com/info 使用stealth模式,
提取区域选 .main-content,整理关键信息
元数据
常见问题
adaptive-web-analyzer 是什么?
通过指定接口获取网页内容,自适应抓取解析关键文本,并使用大模型进行智能梳理总结. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 282 次。
如何安装 adaptive-web-analyzer?
在 OpenClaw 或 Claude Code 对话框中运行命令「/install adaptive-web-analyzer」即可一键安装,无需额外配置。
adaptive-web-analyzer 是免费的吗?
是的,adaptive-web-analyzer 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。
adaptive-web-analyzer 支持哪些平台?
adaptive-web-analyzer 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。
谁开发了 adaptive-web-analyzer?
由 maplee(@maplee)开发并维护,当前版本 v1.0.0。
推荐 Skills