Description

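Generates and manages a daily research literature report. Activates when the user asks to generate a research daily report, update the literature collection script, or analyze research trends. Supports: (1) automatic fetching of the latest literature from PubMed/bioRxiv/arXiv, (2) semantic filtering for fields such as AI, bioinformatics, pathogens, and fungi, (3) LLM summarization and editorial layout, (4) report output in Chinese format, (5) automatic Zotero import, (6)...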

README (SKILL.md)

Daily Research Literature Report

Name: Literature Daily Report
Author: biociao

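🎯 Core Features

Automatically generates a daily research literature report for the intersection of life sciences and AI. It covers:

📊 Literature collection - fetches the day's newest publications from PubMed, bioRxiv, and arXiv in real time
🔍 Smart filtering - semantic retrieval for focus areas such as AI, bioinformatics, pathogens, and fungi (implemented as keyword matching, see Step 2)
🤖 LLM summarization - automatically generates a 100-250 character core summary (mixed Chinese and English)
📰 Editorial layout - foreword + featured picks + full list + trend summary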

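🚀 Quick Start

Generate today's report

# Option 1: run the script directly
~/.openclaw/skills/literature-daily-report/scripts/literature_collector.py

# Option 2: run from the project directory
cd ~/.openclaw/workspace/projects/literature-collector && ./run.sh

Custom configuration

Edit the configuration file config.yaml: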

output_dir: ~/.openclaw/workspace/literature
search_queries:  # search keyword combinations
  - "metagenomics AND machine learning"
  - "fungal pathogen AND bioinformatics"
  - "single-cell AND deep learning"
high_impact_journals:
  - Nature
  - Cell
  - Science
  - Bioinformatics
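
A minimal sketch of how the parsed configuration could be normalized before use. It assumes config.yaml has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`); the `normalize_config` helper and the `DEFAULTS` mapping are hypothetical and not taken from the actual script.

```python
import os

# Hypothetical defaults for illustration; the real script may use others.
DEFAULTS = {
    "output_dir": "~/.openclaw/workspace/literature",
    "search_queries": [],
    "high_impact_journals": [],
}


def normalize_config(raw):
    """Merge a parsed config.yaml mapping with defaults and expand '~'."""
    cfg = {**DEFAULTS, **(raw or {})}
    cfg["output_dir"] = os.path.expanduser(cfg["output_dir"])
    for key in ("search_queries", "high_impact_journals"):
        if not isinstance(cfg[key], list):
            raise ValueError(f"{key} must be a list")
        cfg[key] = list(cfg[key])  # copy so the defaults are never mutated
    return cfg
```

Validating the list-valued keys up front surfaces a mistyped YAML value (a string instead of a list) before any network request is made.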

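🔧 Workflow

Step 1: Literature collection (`fetch_articles_with_abstracts`)

Fetches the latest articles and their abstracts from three databases:

Source	API	Query window	Notes
PubMed	EUtils	Last 1 day	Published papers, identified by PMID
bioRxiv	REST API	Last 1 day	Preprints, identified by DOI
arXiv	Export API	Last 1 day	CS/Q-Bio categories

Execution logic: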

all_articles = []
for query in SEARCH_QUERIES:
    articles = fetch_pubmed(query)      # PubMed, searched once per query
    biorxiv = fetch_biorxiv()           # bioRxiv
    arxiv = fetch_arxiv()               # arXiv
    all_articles.extend(articles + biorxiv + arxiv)
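
The internals of `fetch_pubmed` are not shown in this document. As a rough sketch of the PubMed step, the NCBI E-utilities ESearch endpoint accepts a relative date window through `reldate` and `datetype`, and returns matching PMIDs under `esearchresult.idlist` when `retmode=json`. The helper names and the `max_results` default below are assumptions for illustration.

```python
import json
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(query, days=1, max_results=50):
    """Build an ESearch URL limited to records from the last `days` days."""
    params = {
        "db": "pubmed",
        "term": query,
        "reldate": days,      # look back this many days
        "datetype": "edat",   # filter on the Entrez (record added) date
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def parse_pmids(esearch_json_text):
    """Extract the PMID list from an ESearch JSON response body."""
    data = json.loads(esearch_json_text)
    return data.get("esearchresult", {}).get("idlist", [])
```

Keeping URL construction and response parsing separate from the HTTP call makes both easy to check without touching the network.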

Step 2: Semantic filtering (`categorize_article`)

Selects target fields by keyword matching:

CATEGORY_KEYWORDS = {
    "单细胞组学": ["single-cell", "scRNA-seq", "spatial transcriptomics"],  # single-cell omics
    "宏基因组学": ["metagenomics", "microbiome", "16S"],                    # metagenomics
    "病原真菌": ["fungal", "pathogen", "Candida", "Aspergillus"],           # pathogenic fungi
    "生信方法": ["bioinformatics", "algorithm", "pipeline", "tool"],        # bioinformatics methods
    "AI/ML": ["machine learning", "deep learning", "transformer"],
    "基因组学": ["genomics", "genome", "pan-genome"],                       # genomics
}
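
A minimal sketch of how keyword matching over this table might work. The real `categorize_article` signature is not shown here, so the `(title, abstract)` parameters and the case-insensitive substring rule are assumptions.

```python
CATEGORY_KEYWORDS = {
    "单细胞组学": ["single-cell", "scRNA-seq", "spatial transcriptomics"],
    "宏基因组学": ["metagenomics", "microbiome", "16S"],
    "病原真菌": ["fungal", "pathogen", "Candida", "Aspergillus"],
    "生信方法": ["bioinformatics", "algorithm", "pipeline", "tool"],
    "AI/ML": ["machine learning", "deep learning", "transformer"],
    "基因组学": ["genomics", "genome", "pan-genome"],
}


def categorize_article(title, abstract=""):
    """Return every category whose keywords occur in the title or abstract."""
    text = f"{title} {abstract}".lower()
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword.lower() in text for keyword in keywords)
    ]
```

Plain substring matching is deliberately simple: a short keyword such as "tool" also matches "toolkit", so a stricter variant could match on word boundaries instead.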

Priority scoring:

High-impact journals (Nature/Cell/Science): +10 points
Method-development articles: +5 points
Hot topics (single-cell/AI): +3 points
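
The three bonuses above are additive, which can be sketched as follows. How the script recognizes a "method-development" article is not stated, so the `METHOD_HINTS` title keywords are a hypothetical stand-in, as is the `priority_score` name.

```python
HIGH_IMPACT_JOURNALS = {"nature", "cell", "science"}
# Hypothetical title hints for method-development papers.
METHOD_HINTS = ("method", "algorithm", "pipeline", "tool", "framework")
HOT_CATEGORIES = {"单细胞组学", "AI/ML"}  # single-cell and AI topics


def priority_score(journal, title, categories):
    """Sum the +10 / +5 / +3 bonuses described in the scoring rules."""
    score = 0
    if journal.strip().lower() in HIGH_IMPACT_JOURNALS:
        score += 10
    if any(hint in title.lower() for hint in METHOD_HINTS):
        score += 5
    if HOT_CATEGORIES & set(categories):
        score += 3
    return score
```

The journal check is an exact match, so sub-journals such as "Nature Communications" do not receive the bonus in this sketch.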

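Step 3: LLM summarization (`generate_summary`)

Generates a structured Chinese summary from the abstract, using the four section labels below (the English glosses in parentheses are not part of the output):

【研究目的】 (research objective)
【样本与方法】 (samples and methods)
【研究结果】 (results)
【创新性】 (novelty)

Optimization notes:

Detect an English abstract → wrap each part in 【】 labels
Keep the length within 5000 characters
Extract key sentences rather than truncating blindly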
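
One simple way to respect a length limit without cutting mid-sentence is to keep whole sentences until the budget is used up. This is a sentence-boundary truncation sketch, not the key-sentence extraction the script itself performs; the `truncate_at_sentence` helper is hypothetical.

```python
import re

# A sentence is a run of text up to and including ., !, ? or the CJK
# equivalents, plus any trailing whitespace.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*\s*")


def truncate_at_sentence(text, max_chars):
    """Keep leading whole sentences of `text` within `max_chars` characters."""
    if len(text) <= max_chars:
        return text
    kept = ""
    for sentence in _SENTENCE.findall(text):
        if len(kept) + len(sentence) > max_chars:
            break
        kept += sentence
    # Fall back to a hard cut when even the first sentence is too long.
    return kept.rstrip() or text[:max_chars]
```

For example, `truncate_at_sentence("First sentence. Second sentence. Third one.", 35)` keeps the first two sentences and drops the third.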

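Step 4: Editorial layout (`generate_mark_report_v2`)

Generates the complete Markdown report structure (the English glosses in trailing parentheses are not part of the output):

# 📚 每日文献速递 - YYYY-MM-DD  (Daily Literature Digest)

## 📰 编辑前言  (Editor's foreword)
- Date statistics
- Distribution by source
- Overview of hot fields

## ⭐ 重点推荐 (8 篇)  (Featured picks, 8 articles)
- Tagged by category
- Structured summary
- DOI link

## 📖 完整文献列表  (Full article list)
- Grouped by source
- Detailed metadata

## 📝 编辑总结  (Editor's summary)
- Analysis of the day's trends
- Editor's comments
- Suggested topics to follow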
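
The four-part structure can be sketched as a small renderer. The article fields (`source`, `title`, `doi`) and the `render_report` name are assumptions for illustration; only the section headings come from the template above.

```python
from collections import Counter


def render_report(articles, featured, report_date):
    """Render the four-part report skeleton as Markdown text."""
    sources = Counter(a["source"] for a in articles)
    lines = [f"# 📚 每日文献速递 - {report_date}", ""]
    # Foreword: total count plus per-source distribution.
    lines += ["## 📰 编辑前言", f"- Total: {len(articles)}"]
    lines += [f"- {name}: {count}" for name, count in sorted(sources.items())]
    # Featured picks, linked through the DOI resolver.
    lines += ["", f"## ⭐ 重点推荐 ({len(featured)} 篇)"]
    lines += [f"- [{a['title']}](https://doi.org/{a['doi']})" for a in featured]
    # Full list, grouped by source.
    lines += ["", "## 📖 完整文献列表"]
    for name in sorted(sources):
        lines.append(f"### {name}")
        lines += [f"- {a['title']}" for a in articles if a["source"] == name]
    # Summary section left empty for the trend analysis text.
    lines += ["", "## 📝 编辑总结", ""]
    return "\n".join(lines)
```

Building the report as a list of lines keeps each section independent, so heading levels can be changed in one place.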

📂 File Organization

literature-daily-report/
├── SKILL.md                    # this document
├── scripts/
│   └── literature_collector.py # main collection script
└── references/
    ├── categories.md           # field classification criteria
    ├── workflows.md            # workflow guide
    └── api_docs.md             # API reference

# Output directory
~/.openclaw/workspace/literature/
├── literature-YYYY-MM-DD.md   # report for the day
└── latest.md                   # index of the latest report

# ClawLib sync directory
~/.openclaw/workspace/ClawLib/科研日报/
├── literature-YYYY-MM-DD.md   # report for the day (synced automatically)
└── latest.md                   # latest report (synced automatically)

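🔄 Automatic Sync (v3.3 full pipeline)

After the report is generated, it is synced automatically to several targets in one pass:

📝 Local report → ~/.openclaw/workspace/literature/literature-{date}.md
📚 ClawLib → ~/.openclaw/workspace/ClawLib/科研日报/
🔍 Zotero → BioCiaoLab Group Library (one collection per category)
🧠 Knowledge graph → research/literature-{date} entity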
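
The two file-based targets (local report and ClawLib mirror) can be sketched with the standard library. This assumes latest.md is a plain copy of the newest report, which the document does not confirm; `publish_report` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def publish_report(report_text, report_date, output_dir, mirror_dirs=()):
    """Write literature-<date>.md and latest.md, then copy both to mirrors."""
    output_dir = Path(output_dir).expanduser()
    output_dir.mkdir(parents=True, exist_ok=True)
    dated = output_dir / f"literature-{report_date}.md"
    latest = output_dir / "latest.md"
    dated.write_text(report_text, encoding="utf-8")
    shutil.copyfile(dated, latest)
    for mirror in mirror_dirs:
        mirror = Path(mirror).expanduser()
        mirror.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(dated, mirror / dated.name)
        shutil.copyfile(latest, mirror / latest.name)
    return dated
```

Passing the mirror directories explicitly makes it visible where a report is copied, rather than hard-coding a sync path.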

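📚 Zotero Integration (v3.3)

Once collection finishes, articles are imported automatically into the BioCiaoLab Group Library:

Features:

PubMed articles: added automatically by PMID
bioRxiv/arXiv preprints: added automatically by DOI
Automatic deduplication (articles already in the library are skipped)
Tags added: literature-daily + date + category (e.g. 单细胞组学, 宏基因组学)
Each article is filed into the collection matching its category (单细胞组学/宏基因组学/病原真菌/AI+ML/生信方法/基因组学, etc.; that is, single-cell omics, metagenomics, pathogenic fungi, AI+ML, bioinformatics methods, genomics)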
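
The tagging and deduplication rules above can be sketched independently of the Zotero API. The article fields (`pmid`, `doi`) and both helper names are assumptions; the actual upload is handled elsewhere and is not shown.

```python
def build_tags(report_date, categories):
    """Tags per the rule: literature-daily + date + each category."""
    return ["literature-daily", report_date, *categories]


def dedupe_articles(articles, known_ids):
    """Drop articles whose PMID or DOI is already known or seen earlier."""
    seen = {identifier.lower() for identifier in known_ids}
    fresh = []
    for article in articles:
        key = (article.get("pmid") or article.get("doi") or "").lower()
        # Articles without a PMID or DOI cannot be added by identifier.
        if not key or key in seen:
            continue
        seen.add(key)
        fresh.append(article)
    return fresh
```

Identifiers are compared case-insensitively because DOIs are case-insensitive, so "10.1/ABC" and "10.1/abc" count as the same item.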

Environment variables (written to ~/.zshrc):

export ZOTERO_API_KEY="your-api-key"
export ZOTERO_USER_ID="your-user-id"       # Zotero user ID
export ZOTERO_GROUP_ID="your-group-id"     # BioCiaoLab Group ID (default: 6489333)

⚠️ The script loads these environment variables from ~/.zshrc automatically, so no manual export is needed.
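
For comparison, a sketch that reads the same three variables from the process environment only and fails fast when one is missing. This is an alternative to loading ~/.zshrc, not a description of what the script does; `load_zotero_settings` is a hypothetical helper, and the group ID default mirrors the value documented above.

```python
import os

REQUIRED_VARS = ("ZOTERO_API_KEY", "ZOTERO_USER_ID")
DEFAULT_GROUP_ID = "6489333"  # documented BioCiaoLab default


def load_zotero_settings(environ=None):
    """Read Zotero settings from the environment, failing on missing keys."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": environ["ZOTERO_API_KEY"],
        "user_id": environ["ZOTERO_USER_ID"],
        "group_id": environ.get("ZOTERO_GROUP_ID", DEFAULT_GROUP_ID),
    }
```

Setting ZOTERO_GROUP_ID explicitly avoids writing to the default group by accident.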

Where to find them:

API Key: https://www.zotero.org/settings/keys/new
User ID: https://www.zotero.org/settings
Group ID: open the BioCiaoLab group page; the numeric part of the URL is the Group ID

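🧠 Knowledge Graph Sync (v3.3)

Once collection finishes, articles are synced automatically to the OpenClaw knowledge graph:

Features:

Each article is written as one fact to the research/literature-YYYY-MM-DD entity
Each fact contains: category label, title, authors, journal, source, PMID/URL
Facts are written one at a time by calling kg.py add

Write location:

entity: research/literature-<date>
category: the article's primary category (单细胞组学/宏基因组学/病原真菌/AI+ML/生信方法/基因组学, etc.)
source: literature-daily-report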
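
A sketch of assembling one fact record with the fields listed above. The command-line arguments of `kg.py add` are not documented here, so this stops at building the record; the `build_fact` helper, the article field names, and the " | " separator are assumptions.

```python
def build_fact(article, report_date):
    """Assemble the entity, category, source, and fact text for one article."""
    pmid = article.get("pmid")
    identifier = f"PMID:{pmid}" if pmid else article.get("url", "")
    fact_text = " | ".join(
        [
            f"[{article['category']}]",
            article["title"],
            article["authors"],
            article["journal"],
            article["source"],
            identifier,
        ]
    )
    return {
        "entity": f"research/literature-{report_date}",
        "category": article["category"],
        "source": "literature-daily-report",
        "fact": fact_text,
    }
```

Keeping record construction separate from the CLI call makes it possible to review exactly what would be written before anything is sent to the knowledge graph.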

🌟 Advanced Usage

Custom search topics

Add new topics to SEARCH_QUERIES:

SEARCH_QUERIES = [
    '(epigenetics[Title/Abstract]) AND ((deep learning[Title/Abstract]))',
    '(CAR-T[Title/Abstract]) AND ((single-cell[Title/Abstract]))',
]
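
Queries in this form can also be generated rather than typed by hand. The sketch below joins terms with AND and tags each with PubMed's `[Title/Abstract]` field; the `title_abstract_query` helper is hypothetical.

```python
def title_abstract_query(*terms):
    """Join terms with AND, restricting each to the Title/Abstract field."""
    if not terms:
        raise ValueError("at least one search term is required")
    return " AND ".join(f"({term}[Title/Abstract])" for term in terms)
```

For instance, `title_abstract_query("epigenetics", "deep learning")` yields `(epigenetics[Title/Abstract]) AND (deep learning[Title/Abstract])`, which is equivalent to the first entry above without the redundant inner parentheses.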

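Adjusting the output format

Edit the heading levels and label style in generate_mark_report_v2().

Scheduled runs

Add a crontab entry to send the report automatically:

# Generate and send to Feishu every day at 9 AM.
# Note: `0 1 * * *` fires at 01:00 host time, which is 09:00 in UTC+8 only when the host clock is UTC;
# use `0 9 * * *` if the host clock is already on local time.
0 1 * * * cd ~/.openclaw/workspace/projects/literature-collector && ./run.sh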
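
The hour in the cron entry only lines up with a 9 AM run under a specific time zone assumption. The sketch below checks that arithmetic, assuming the host clock is UTC and the intended local time is UTC+8; both are assumptions, since the document does not state the host's time zone.

```python
from datetime import datetime, timedelta, timezone

UTC_PLUS_8 = timezone(timedelta(hours=8))  # fixed offset, no daylight saving


def utc_hour_to_utc_plus_8(hour_utc):
    """Convert an hour on a UTC clock to the matching hour in UTC+8."""
    moment = datetime(2024, 1, 1, hour_utc, tzinfo=timezone.utc)
    return moment.astimezone(UTC_PLUS_8).hour
```

Under these assumptions `utc_hour_to_utc_plus_8(1)` gives 9, matching the comment in the crontab entry.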

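🔄 Iteration History

v1.0: basic collection (title + source)
v2.0: added abstract retrieval and smart summarization
v3.0: improved Chinese formatting and structured editorial layout
v3.1: automatic sync to the ClawLib 科研日报 directory
v3.2: added automatic Zotero import (items added by PMID/DOI)
v3.3: added knowledge graph sync; Zotero environment variables are loaded automatically from ~/.zshrc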

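📞 Support

Running into problems? Check:

API rate limiting → increase the time.sleep() delay
No results → broaden the search keywords
Malformed output → inspect the cache state in literature/abstract_cache.json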
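
Beyond a fixed `time.sleep()` delay, rate limiting is commonly handled by retrying with an exponentially growing wait. The sketch below shows that pattern; `call_with_backoff` is a hypothetical helper, and the retry count and base delay are illustrative values.

```python
import time


def call_with_backoff(func, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `func`, retrying on failure with delays of base_delay * 2**attempt."""
    for attempt in range(retries):
        try:
            return func()
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the retry schedule can be checked without actually waiting. In real use it would be better to catch only the specific HTTP or network errors the client raises.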

Usage Guidance

Before installing or running this skill:

1) Treat it as requiring Zotero credentials even though the metadata omits them. Do not place secrets in ~/.zshrc unless you intend to share them. Prefer setting ZOTERO_API_KEY in a restricted environment and verify how the script reads it.
2) Inspect the full literature_collector.py (and any zotero.py / kg.py it calls) for subprocess calls and network endpoints. Those helper scripts (paths under ~/.openclaw/workspace/skills/...) could upload data to external services.
3) Verify the default ZOTERO_GROUP_ID (6489333). If the script uploads to that group by default, you may be sending items to a third-party collection.
4) If you want to proceed, run the script in a sandbox or isolated environment first, with minimal credentials and with network access limited, and monitor exactly what network requests it makes.
5) Ask the publisher/owner for provenance (homepage, source repo) or request that the skill declare its required env vars and endpoints in metadata. Lack of provenance combined with automatic uploading behavior is the primary reason this is suspicious.

Capability Analysis

Type: OpenClaw Skill
Name: biociao-literature-daily-report
Version: 1.1.0

The skill bundle provides a functional tool for scientific literature management but includes high-risk behaviors in `scripts/literature_collector.py`. Specifically, the script executes a shell command (`bash -l -c "env"`) to scrape the user's entire environment for credentials and uses `subprocess.run` to execute external scripts (`zotero.py`, `kg.py`). While these actions are plausibly needed for the stated purpose of syncing data to Zotero and a Knowledge Graph, they represent a broad attack surface for credential harvesting and arbitrary code execution. The script also makes numerous network calls to scientific APIs (PubMed, bioRxiv, arXiv) and interacts with a Zotero group (ID: 6489333).

Capability Assessment

⚠ Purpose & Capability

The skill claims no required environment variables or binaries in the metadata, but the SKILL.md asks users to place Zotero credentials in ~/.zshrc and the included script references Zotero and knowledge‑graph helper scripts. Requesting Zotero API keys is consistent with the stated Zotero integration, but the metadata omission (no declared env vars) is an incoherence: the skill actually needs credentials but does not declare them.

⚠ Instruction Scope

SKILL.md instructs the agent/script to automatically load environment variables from ~/.zshrc, create reports under ~/.openclaw/workspace, sync to a ClawLib directory, push to a Zotero group (BioCiaoLab default group id), and call a knowledge‑graph CLI (kg.py add). Those actions go beyond simple fetching/summarization because they perform writes and network uploads and rely on other scripts (zotero.py, kg.py) at paths under the workspace. The instructions also tell users to write secrets into ~/.zshrc and promise automatic loading — Python code doesn't automatically source shell rc files, so this is inconsistent and could lead to surprising credential access or accidental exfiltration.

✓ Install Mechanism

This is instruction‑only with a bundled Python script; there is no external install/download step or remote archive. That lowers supply‑chain installation risk. However, the script invokes other local helper scripts (zotero.py, kg.py) via fixed paths which could execute arbitrary code if those helper scripts are present or installed later.

⚠ Credentials

The SKILL.md requires sensitive environment variables (ZOTERO_API_KEY, ZOTERO_USER_ID, ZOTERO_GROUP_ID) but the skill metadata did not declare any required env vars or primary credential. The code defaults ZOTERO_GROUP_ID to a specific group id (BioCiaoLab) which means, if the code invokes a Zotero upload helper, collected metadata could be added to that group by default. The script also reads os.environ and may call other scripts that themselves use credentials — overall the credential handling is under‑declared and therefore disproportionate to the metadata.

ℹ Persistence & Privilege

The skill is not always:true and does not claim elevated platform privileges. It writes files into the user's workspace and ClawLib directories and calls local helper scripts; these behaviors are within the expected scope for a sync/reporting tool but do require the skill to be allowed file and network access. There's no explicit modification of other skills' configs, but dependence on workspace helper scripts increases the attack surface (those scripts could be swapped to exfiltrate data).

Version History

v1.1.0

v3.3: added knowledge graph sync; Zotero environment variables are loaded automatically from ~/.zshrc; full sync pipeline (ClawLib + Zotero + knowledge graph)

Metadata

Slug biociao-literature-daily-report

Version 1.1.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Literature Daily Report?

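It generates and manages a daily research literature report. It activates when the user asks to generate a research daily report, update the literature collection script, or analyze research trends. Supports: (1) automatic fetching of the latest literature from PubMed/bioRxiv/arXiv, (2) semantic filtering for fields such as AI, bioinformatics, pathogens, and fungi, (3) LLM summarization and editorial layout, (4) report output in Chinese format, (5) automatic Zotero import, (6)... It is an AI Agent Skill for Claude Code / OpenClaw, with 81 downloads so far.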

How do I install Literature Daily Report?

Run "/install biociao-literature-daily-report" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Literature Daily Report free?

Yes, Literature Daily Report is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Literature Daily Report support?

Literature Daily Report is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Literature Daily Report?

It is built and maintained by Fang, Chao (@biociao); the current version is v1.1.0.
