Journal Deep Intel Extractor

Author: Chenghan66 · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 99 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install journal-intel-extractor
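Description
A specialized academic intelligence extraction tool. It supports major international journals such as Nature, Science, and Cell, automatically collects Articles or Reviews added within the past N days, and extracts each paper's PMID and full abstract text as source material for AI-generated lay summaries.
Usage (SKILL.md)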

🎓 Journal Deep Intel Intelligence Station

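This is an automated intelligence tool built for researchers in medicine and the life sciences. It addresses the problem of seeing only a title without understanding the substance of a paper: by visiting each paper's PubMed detail page, it builds a complete abstract record for every new publication.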

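🌟 Core Features

  • Deep extraction: unlike a conventional crawler, the tool opens each PubMed detail page in turn and extracts the Abstract.
  • Precise filtering: uses PubMed's official Publication Type tags to drop news items, editorials, and briefs, keeping only substantive research content.
  • Time-window monitoring: based on the [pdat] (publication date) field, it can generate customized literature digests on a weekly or monthly basis.
  • AI-friendly output: produces structured JSON that fits OpenClaw's internal LLM summarization workflow.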
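
The journal, publication-type, and time-window filters above can all be expressed in one PubMed search term. The following is a minimal sketch, not the skill's actual code: the function name `build_pubmed_term` and the mapping from the skill's `type` argument to PubMed publication types are assumptions made for illustration. The field tags `[Journal]`, `[Publication Type]`, and `[pdat]` are standard PubMed search tags.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical mapping from the skill's `type` argument to PubMed publication types.
PUB_TYPES = {"Article": "Journal Article", "Review": "Review"}


def build_pubmed_term(journal: str, article_type: str, days: int,
                      today: Optional[date] = None) -> str:
    """Compose a PubMed search term for one journal, one type, and a look-back window."""
    today = today or date.today()
    start = today - timedelta(days=days)
    # PubMed date ranges use the form YYYY/MM/DD:YYYY/MM/DD[pdat].
    date_range = f"{start:%Y/%m/%d}:{today:%Y/%m/%d}[pdat]"
    pub_type = PUB_TYPES[article_type]
    return f'"{journal}"[Journal] AND "{pub_type}"[Publication Type] AND {date_range}'


print(build_pubmed_term("Nature", "Article", 7, today=date(2024, 1, 8)))
# "Nature"[Journal] AND "Journal Article"[Publication Type] AND 2024/01/01:2024/01/08[pdat]
```

Passing `today` explicitly keeps the window reproducible, which helps when comparing weekly digests.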

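🛠️ Technical Implementation

  1. Engine: Python 3.x with BeautifulSoup4 for HTML parsing.
  2. Rate limiting: a built-in 0.5 s delay between requests, which lowers the risk of PubMed temporarily blocking your IP.
  3. Local archive: data is saved automatically under ~/Documents/Journal_Intel/, organized by date and journal name.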
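
The rate-limiting and archiving steps above can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the skill's implementation: the helper names (`throttled`, `archive_path`, `save_records`), the file-naming scheme, and the record fields are all hypothetical; only the 0.5 s delay and the ~/Documents/Journal_Intel/ root come from the description.

```python
import json
import time
from datetime import date
from pathlib import Path
from typing import Callable, Iterable, Iterator, Optional


def throttled(items: Iterable[str], delay: float = 0.5,
              sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield items one at a time, pausing `delay` seconds between consecutive items."""
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay)  # no pause before the first item
        yield item


def archive_path(journal: str, run_date: date, root: Optional[Path] = None) -> Path:
    """Build a dated, journal-specific output path (the naming scheme is an assumption)."""
    root = root or Path.home() / "Documents" / "Journal_Intel"
    return root / f"{run_date.isoformat()}_{journal.replace(' ', '_')}.json"


def save_records(records: list, path: Path) -> None:
    """Write records as UTF-8 JSON, creating the archive directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
```

A caller would loop over `throttled(pmids)`, fetch and parse one detail page per PMID, and pass the collected records to `save_records`. Injecting `sleep` makes the delay easy to test without waiting.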

📖 Example Scenarios

  • Scenario 1: Nature weekly digest. Parameters: journal="Nature", type="Article", days=7
  • Scenario 2: Tracking top-tier reviews. Parameters: journal="Science", type="Review", days=30
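
For illustration, the two scenarios above could map onto command-line arguments as follows. The flag names (`--journal`, `--type`, `--days`) and defaults are hypothetical; the listing names the three parameters but does not show the script's real interface.

```python
import argparse


def parse_args(argv=None):
    """Parse the three parameters named in the scenarios (flag names are assumptions)."""
    parser = argparse.ArgumentParser(
        description="Fetch recent PubMed abstracts for one journal.")
    parser.add_argument("--journal", required=True, help='Journal name, e.g. "Nature"')
    parser.add_argument("--type", dest="article_type",
                        choices=["Article", "Review"], default="Article")
    parser.add_argument("--days", type=int, default=7, help="Look-back window in days")
    return parser.parse_args(argv)


# Scenario 1: Nature weekly digest
weekly = parse_args(["--journal", "Nature", "--type", "Article", "--days", "7"])
# Scenario 2: tracking top-tier reviews
reviews = parse_args(["--journal", "Science", "--type", "Review", "--days", "30"])
```

Restricting `--type` with `choices` rejects unsupported values before any request is sent.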

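⚠️ Runtime Notes

Because every article requires a separate detail-page request, throughput is roughly 1 second per article. If a week has many new papers (for example, more than 50), expect the run to take about a minute or longer and let the script finish.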

Safe Usage Recommendations
This skill appears to do what it says: it scrapes PubMed search results and article pages for PMIDs, titles, and abstracts and saves them as JSON in ~/Documents/Journal_Intel/. Before installing or running, consider:
  1. The description's phrase “deep access / 全文 (full text)” may imply retrieving paywalled full text, but this script only fetches PubMed pages and abstracts. If you expect full articles, you'll need different code or credentials.
  2. Respect PubMed/NLM terms of use and robots.txt. If you plan frequent runs, increase the delay or use official APIs (e.g., Entrez E-utilities) to avoid throttling.
  3. The SKILL.md assumes a virtualenv (venv/bin/python3) but no install step is provided; you should create a virtualenv and pip install -r requirements.txt before running.
  4. The script writes files to your Documents folder. Confirm you're comfortable with that path and disk use, or modify the save location.
  5. No credentials or external endpoints are requested, and the code does not exfiltrate data beyond contacting PubMed.
If you need stronger guarantees, inspect/modify the script to use Entrez APIs (with an API key) and add explicit error handling and rate limiting.
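
As a rough sketch of the Entrez alternative suggested above, the search step could be issued through the E-utilities `esearch` endpoint instead of scraping the PubMed search page. The function name `build_esearch_url` and the default `retmax` are assumptions; the endpoint and the `db`, `term`, `datetype`, `reldate`, `retmax`, `retmode`, and `api_key` parameters are part of the public E-utilities interface. The code only builds the URL and sends no request.

```python
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, days: int, api_key: Optional[str] = None,
                      retmax: int = 200) -> str:
    """Build an esearch URL for PubMed records published within the last `days` days."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",  # filter on publication date
        "reldate": days,     # look back this many days from today
        "retmax": retmax,    # maximum number of PMIDs to return
        "retmode": "json",
    }
    if api_key:
        params["api_key"] = api_key  # optional; raises the allowed request rate
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_esearch_url('"Nature"[Journal]', 7))
```

The PMIDs returned by `esearch` could then be passed to the companion `efetch` endpoint to retrieve abstracts in batches, which would avoid one HTML request per article.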
Functional Analysis
Type: OpenClaw Skill
Name: journal-intel-extractor
Version: 1.0.0
The skill is a legitimate academic scraping tool designed to extract article abstracts from PubMed. The code in main.py uses standard libraries (requests, BeautifulSoup) to fetch data from a specific domain (pubmed.ncbi.nlm.nih.gov), implements basic rate limiting, and saves the results to a local directory (~/Documents/Journal_Intel) as described in SKILL.md. There are no signs of data exfiltration, malicious execution, or prompt injection.
Capability Assessment
Purpose & Capability
Name/description claim to collect PMIDs and abstracts from major journals; the code queries PubMed and visits PubMed detail pages to extract abstracts and titles. Requested resources (none) match the task. Minor wording mismatch: the README language suggests “deep access” and may be interpreted as retrieving full-text articles, but the implementation only fetches PubMed pages/abstracts.
Instruction Scope
SKILL.md instructs running the Python script with journal/type/days arguments. The runtime behavior is limited to HTTP GETs to pubmed.ncbi.nlm.nih.gov, HTML parsing, and writing a JSON file to ~/Documents/Journal_Intel. The script does not read other files, environment vars, or contact third-party endpoints beyond PubMed.
Install Mechanism
There is no install spec; the skill is instruction-only but includes requirements.txt and a script. Dependencies (requests, beautifulsoup4, lxml) are reasonable for the task. The SKILL.md entry references venv/bin/python3 but no virtualenv creation step is provided — this is an operational mismatch (not a security issue) you should be aware of.
Credentials
The skill requires no environment variables, no credentials, and does not request unrelated secrets. Network access is only used for PubMed; User-Agent header is hard-coded in the script.
Persistence & Privilege
The skill writes output files under the user's home Documents folder (~/Documents/Journal_Intel). It is not always-enabled and does not modify other skills or system configuration. Autonomous invocation is allowed (platform default); combined with file writes, consider whether you want the agent to run this without manual review.
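How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install journal-intel-extractor
  3. Once installed, invoke the Skill by name or trigger it with /journal-intel-extractor
  4. Provide the inputs described in the Skill's parameter documentation to receive structured output.
Version History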
v1.0.0
Journal Deep Intel Extractor 1.7.0 introduces deep extraction mode for journal articles.
  • Now fetches both titles and article abstracts by navigating to PubMed article detail pages.
  • Provides raw materials (abstracts) for AI-generated lay summaries.
  • Includes support for filtering by journal name, article type (Article or Review), and days to look back.
  • Note: Extraction time increases with the number of articles (about 1 second per article), as each abstract is fetched individually.
Metadata
Slug: journal-intel-extractor
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions released: 1
FAQ

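What is Journal Deep Intel Extractor?

A specialized academic intelligence extraction tool. It supports major international journals such as Nature, Science, and Cell, automatically collects Articles or Reviews added within the past N days, and extracts each paper's PMID and full abstract text as source material for AI-generated lay summaries. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 99 times to date.

How do I install Journal Deep Intel Extractor?

Run the command "/install journal-intel-extractor" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.

Is Journal Deep Intel Extractor free?

Yes. Journal Deep Intel Extractor is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Journal Deep Intel Extractor support?

Journal Deep Intel Extractor is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Journal Deep Intel Extractor?

It is developed and maintained by Chenghan66 (@chenghan66). The current version is v1.0.0.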