Journal Deep Intel Extractor

Name: Journal Deep Intel Extractor
Author: chenghan66

By Chenghan66 · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install journal-intel-extractor

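Description

A professional academic intelligence extraction tool. It supports major international journals such as Nature, Science, and Cell, automatically collects Article or Review records added in the past N days, and extracts each record's PMID and full abstract text, providing the core data source for AI-generated lay summaries.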

Usage Instructions (SKILL.md)

🎓 Journal Deep Intel Intelligence Station

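This is an automated intelligence tool tailored for medical and life-science researchers. It addresses the pain point of seeing only titles without understanding the substance behind them: by simulating a deep visit to each record, it builds a complete abstract archive for every new paper.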

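🌟 Core Features

Deep extraction: unlike a typical crawler, this tool opens each PubMed detail page in turn to extract the Abstract.
Precise filtering: uses PubMed's official Publication Type tags to automatically exclude news, editorials, and briefings, keeping only substantive research content.
Time-window monitoring: based on [pdat] logic; supports generating customized literature briefings by week or by month.
AI-friendly output: produces structured JSON data that fits OpenClaw's internal LLM summarization workflow.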
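The filtering and time-window features above can be illustrated with a small query builder. This is a minimal sketch, not the skill's actual code: the function name `build_query`, the `PUBLICATION_TYPES` mapping (in particular, mapping "Article" to PubMed's "Journal Article" type), and the exact way the date range is written are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumed mapping from the skill's `type` parameter to PubMed publication types.
PUBLICATION_TYPES = {"Article": "Journal Article", "Review": "Review"}


def build_query(journal: str, article_type: str, days: int, today: date) -> str:
    """Build a PubMed search term for one journal, one type, and the last `days` days."""
    start = today - timedelta(days=days)
    pub_type = PUBLICATION_TYPES[article_type]
    # PubMed accepts date ranges written as YYYY/MM/DD:YYYY/MM/DD followed by a field tag.
    window = f"{start:%Y/%m/%d}:{today:%Y/%m/%d}[pdat]"
    return f'"{journal}"[Journal] AND "{pub_type}"[Publication Type] AND {window}'


print(build_query("Nature", "Review", 7, date(2024, 1, 8)))
# "Nature"[Journal] AND "Review"[Publication Type] AND 2024/01/01:2024/01/08[pdat]
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the query reproducible, which helps when comparing one week's briefing against another.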

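🛠️ Technical Implementation

Engine: Python 3.x with BeautifulSoup4 for HTML parsing.
Rate control: a built-in 0.5 s delay between requests (rate limiting) to reduce the risk of PubMed temporarily blocking your IP.
Local archive: data is saved automatically under ~/Documents/Journal_Intel/, organized by date and journal name.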
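The rate control and local archive steps might look roughly like the following. This is a sketch under stated assumptions, not the contents of the skill's script: `throttled_fetch` and `save_records` are hypothetical names, the `fetch` callable stands in for whatever function downloads and parses one detail page, and the `<date>/<journal>.json` layout is only one plausible reading of "organized by date and journal name".

```python
import json
import time
from datetime import date
from pathlib import Path
from typing import Callable, Iterable


def throttled_fetch(pmids: Iterable[str], fetch: Callable[[str], dict],
                    delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> list[dict]:
    """Call `fetch` once per PMID, pausing `delay` seconds between requests."""
    records = []
    for index, pmid in enumerate(pmids):
        if index > 0:
            sleep(delay)  # no pause is needed before the very first request
        records.append(fetch(pmid))
    return records


def save_records(records: list[dict], base_dir: Path, journal: str, run_date: date) -> Path:
    """Write records to <base_dir>/<YYYY-MM-DD>/<journal>.json (assumed layout)."""
    target = base_dir / run_date.isoformat() / f"{journal}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps non-ASCII characters in abstracts readable on disk.
    target.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
    return target
```

A caller could pass `Path.home() / "Documents" / "Journal_Intel"` as `base_dir` to match the directory named above. Making `sleep` injectable lets the loop be exercised in tests without actually waiting.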

📖 Example Use Cases

Scenario 1: Nature weekly digest. Parameters: journal="Nature", type="Article", days=7
Scenario 2: Tracking top-tier reviews. Parameters: journal="Science", type="Review", days=30
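The two scenarios differ only in their three parameters. As an illustration of how such parameters could be accepted on a command line, here is a hypothetical `argparse` sketch; the flag names `--journal`, `--type`, and `--days` and the defaults are assumptions, since the listing does not show the script's real interface.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the three parameters used in the scenarios (hypothetical flag names)."""
    parser = argparse.ArgumentParser(description="Collect recent PubMed abstracts for one journal.")
    parser.add_argument("--journal", required=True, help='Journal name, e.g. "Nature"')
    parser.add_argument("--type", dest="article_type", choices=["Article", "Review"],
                        default="Article", help="Publication type to keep")
    parser.add_argument("--days", type=int, default=7, help="Look-back window in days")
    return parser.parse_args(argv)


# Scenario 1: Nature weekly digest
weekly = parse_args(["--journal", "Nature", "--type", "Article", "--days", "7"])
# Scenario 2: tracking reviews in Science over the past month
reviews = parse_args(["--journal", "Science", "--type", "Review", "--days", "30"])
print(weekly.journal, weekly.article_type, weekly.days)
print(reviews.journal, reviews.article_type, reviews.days)
```

Restricting `--type` with `choices` rejects unsupported values up front instead of letting them reach the search step.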

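⚠️ Runtime Notes

Because each detail page is fetched individually, throughput is roughly 1 second per article. If a week has many new items (for example, more than 50), please wait for the script to finish.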

Safe Usage Recommendations

This skill appears to do what it says: it scrapes PubMed search results and article pages for PMIDs, titles, and abstracts and saves them as JSON in ~/Documents/Journal_Intel/. Before installing or running, consider:
(1) The description's phrase “deep access / 全文” may imply retrieving paywalled full text, but this script only fetches PubMed pages/abstracts; if you expect full articles you'll need different code or credentials.
(2) Respect PubMed/NLM terms of use and robots.txt; if you plan frequent runs, increase the delay or use official APIs (e.g., Entrez E-utilities) to avoid throttling.
(3) The SKILL.md assumes a virtualenv (venv/bin/python3) but no install step is provided; you should create a virtualenv and pip install -r requirements.txt before running.
(4) The script writes files to your Documents folder; confirm you’re comfortable with that path and disk use, or modify the save location.
(5) No credentials or external endpoints are requested, and the code does not exfiltrate data beyond contacting PubMed.
If you need stronger guarantees, inspect/modify the script to use Entrez APIs (with an API key) and add explicit error handling and rate limiting.
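For readers who follow the suggestion to switch to the official API, the sketch below shows how an Entrez E-utilities `esearch` request could be assembled and how the PMID list could be read from its JSON response. It only builds the URL and parses a payload; it makes no network call. The helper names `build_esearch_url` and `parse_idlist` and the `retmax` default are assumptions for illustration, and none of this is taken from the skill's code.

```python
import json
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, days: int, api_key: Optional[str] = None,
                      retmax: int = 200) -> str:
    """Build an esearch URL limited to records published in the last `days` days."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",  # filter on publication date
        "reldate": days,     # look-back window relative to today
        "retmax": retmax,
        "retmode": "json",
    }
    if api_key:
        params["api_key"] = api_key  # optional key raises the allowed request rate
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_idlist(payload: str) -> list[str]:
    """Extract the PMID list from an esearch JSON response body."""
    return json.loads(payload)["esearchresult"]["idlist"]


print(build_esearch_url('"Nature"[Journal] AND "Review"[Publication Type]', days=7))
```

The returned PMIDs could then be passed to `efetch` to retrieve abstracts in bulk, which avoids parsing HTML detail pages one at a time.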

Functional Analysis

Type: OpenClaw Skill
Name: journal-intel-extractor
Version: 1.0.0

The skill is a legitimate academic scraping tool designed to extract article abstracts from PubMed. The code in main.py uses standard libraries (requests, BeautifulSoup) to fetch data from a specific domain (pubmed.ncbi.nlm.nih.gov), implements basic rate limiting, and saves the results to a local directory (~/Documents/Journal_Intel) as described in SKILL.md. There are no signs of data exfiltration, malicious execution, or prompt injection.

Capability Assessment

✓ Purpose & Capability

Name/description claim to collect PMIDs and abstracts from major journals; the code queries PubMed and visits PubMed detail pages to extract abstracts and titles. Requested resources (none) match the task. Minor wording mismatch: the README language suggests “deep access” and may be interpreted as retrieving full-text articles, but the implementation only fetches PubMed pages/abstracts.

✓ Instruction Scope

SKILL.md instructs running the Python script with journal/type/days arguments. The runtime behavior is limited to HTTP GETs to pubmed.ncbi.nlm.nih.gov, HTML parsing, and writing a JSON file to ~/Documents/Journal_Intel. The script does not read other files, environment vars, or contact third-party endpoints beyond PubMed.

ℹ Install Mechanism

There is no install spec; the skill is instruction-only but includes requirements.txt and a script. Dependencies (requests, beautifulsoup4, lxml) are reasonable for the task. The SKILL.md entry references venv/bin/python3 but no virtualenv creation step is provided — this is an operational mismatch (not a security issue) you should be aware of.

✓ Credentials

The skill requires no environment variables, no credentials, and does not request unrelated secrets. Network access is only used for PubMed; User-Agent header is hard-coded in the script.

ℹ Persistence & Privilege

The skill writes output files under the user's home Documents folder (~/Documents/Journal_Intel). It is not always-enabled and does not modify other skills or system configuration. Autonomous invocation is allowed (platform default); combined with file writes, consider whether you want the agent to run this without manual review.

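How to Use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install journal-intel-extractor
Once installed, call the Skill by name or trigger it with /journal-intel-extractor
Provide the inputs described in the Skill's parameter notes to receive structured output.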

Version History

v1.0.0

Journal Deep Intel Extractor 1.7.0 introduces deep extraction mode for journal articles.
- Now fetches both titles and article abstracts by navigating to PubMed article detail pages.
- Provides raw materials (abstracts) for AI-generated lay summaries.
- Includes support for filtering by journal name, article type (Article or Review), and days to look back.
- Note: Extraction time increases with the number of articles (about 1 second per article), as each abstract is fetched individually.

Metadata

Slug journal-intel-extractor

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ