Description

Automated daily literature search system for academic researchers. Performs scheduled searches across PubMed, OpenAlex, and Semantic Scholar with automatic d...

Usage Instructions (SKILL.md)

Daily Literature Search Skill

Name: Daily Literature Search
Author: wzr101622

Automated literature search system for academic researchers. Performs scheduled searches across multiple databases (PubMed, OpenAlex, Semantic Scholar), automatically deduplicates results, downloads open-access papers, and generates daily reports.

🎯 Use Cases

Daily literature monitoring for specific research topics
Automated paper collection for literature reviews
Stay up to date on the latest publications in your field
Build a personal paper library with automatic categorization

📦 Components

1. Core Search Script (`daily_literature_search.py`)

Main execution script with the following features:

Multi-source search: PubMed, OpenAlex, Semantic Scholar
Automatic deduplication: By DOI (within batch + against local library)
OA detection: Uses Unpaywall API to identify open-access papers
Auto-download: Downloads OA papers from PubMed Central or publisher sites
Smart categorization: Classifies papers by topic (configurable keywords)
Daily reports: Generates Markdown reports with search statistics

2. Upload Analyzer (`analyze_uploaded.py`)

Analyzes and categorizes manually uploaded papers:

Filename-based classification: Uses keyword matching
DOI extraction: From filenames and metadata
Batch processing: Handles multiple files at once
Report generation: Creates categorization summary
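
The DOI-from-filename step could look roughly like the sketch below. It assumes that stored filenames carry the DOI with "/" replaced by "_" (for example `10.1182_blood.2023012345.pdf`); that convention, the regular expression, and the helper name `doi_from_filename` are illustrative assumptions, not taken from `analyze_uploaded.py`.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: the DOI prefix ("10.<registrant>") is followed by "_" or "/"
# and then the DOI suffix. The real script may use a different convention.
DOI_IN_FILENAME = re.compile(r"(10\.\d{4,9})[_/](\S+)")


def doi_from_filename(filename: str) -> Optional[str]:
    """Return a lowercase DOI recovered from a filename, or None if absent."""
    stem = Path(filename).stem  # drop the final extension, e.g. ".pdf"
    match = DOI_IN_FILENAME.search(stem)
    if match is None:
        return None
    prefix, suffix = match.groups()
    return f"{prefix}/{suffix}".lower()


print(doi_from_filename("10.1182_blood.2023012345.pdf"))
print(doi_from_filename("meeting_notes.pdf"))
```

Only the first separator after the prefix is converted back to "/", so a DOI whose suffix itself contains "/" cannot be recovered exactly from such a filename; reading the DOI from PDF metadata would be the fallback in that case.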

⚙️ Configuration

Directory Structure

papers/
├── B-ALL/raw/          # Category 1 (e.g., B-ALL research)
├── MM/raw/             # Category 2 (e.g., Multiple Myeloma)
├── OTHER/raw/          # Other papers
├── daily_search_logs/  # Search logs and reports
└── upload_temp/        # Temporary upload directory

Search Keywords (Customizable)

Edit SEARCH_KEYWORDS in daily_literature_search.py:

SEARCH_KEYWORDS = [
    '"inotuzumab ozogamicin"',
    '"Elranatamab"',
    '"Teclistamab"',
    '"Talquetamab"',
    '"Blinatumomab"',
    '("CAR-T" AND "B-ALL")',
]

Classification Keywords

Edit B_ALL_KEYWORDS and MM_KEYWORDS in analyze_uploaded.py to match your research domains.
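
As a rough illustration of keyword-based classification, the sketch below matches lowercase keywords against a filename or title. The keyword values and the helper name `classify_by_keywords` are hypothetical; the actual contents of `B_ALL_KEYWORDS` and `MM_KEYWORDS` are not shown in this document.

```python
# Hypothetical keyword lists, chosen to match the example search topics.
B_ALL_KEYWORDS = ["b-all", "blinatumomab", "inotuzumab"]
MM_KEYWORDS = ["myeloma", "teclistamab", "elranatamab", "talquetamab"]


def classify_by_keywords(text: str) -> str:
    """Return "B-ALL", "MM", or "OTHER" based on simple substring matching."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in B_ALL_KEYWORDS):
        return "B-ALL"
    if any(keyword in lowered for keyword in MM_KEYWORDS):
        return "MM"
    return "OTHER"


print(classify_by_keywords("Blinatumomab_in_adults.pdf"))
print(classify_by_keywords("Teclistamab in relapsed myeloma"))
print(classify_by_keywords("Unrelated cardiology paper"))
```

In this sketch the first matching category wins, so a paper that mentions both topics lands in B-ALL; reorder the checks if a different priority suits your library.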

🚀 Usage

Manual Execution

# Run daily search
python3 papers/daily_literature_search.py

# Analyze uploaded papers
python3 papers/analyze_uploaded.py

Scheduled Execution (Cron)

Add to crontab for automatic daily searches:

# Daily search at 6:30 AM
30 6 * * * /usr/bin/python3 /path/to/papers/daily_literature_search.py >> /path/to/papers/daily_search_logs/cron.log 2>&1

Configuration Options

Parameter	Default	Description
`MAX_RESULTS_PER_KEYWORD`	10	Max results per keyword per source
`DATE_RANGE_DAYS`	7	Search window (recent N days)
`SOURCES`	`["pm", "oa", "s2"]`	Search databases
`USER_EMAIL`	—	For polite API access (env var)
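
A minimal sketch of how these parameters might be declared and how `DATE_RANGE_DAYS` could translate into a search window. The constant names and defaults come from the table above; the `search_window` helper and the choice to end the window on the current day are assumptions.

```python
import os
from datetime import date, timedelta
from typing import Tuple

MAX_RESULTS_PER_KEYWORD = 10   # max results per keyword per source
DATE_RANGE_DAYS = 7            # search window in days
SOURCES = ["pm", "oa", "s2"]   # PubMed, OpenAlex, Semantic Scholar
USER_EMAIL = os.environ.get("USER_EMAIL", "")  # read from the environment


def search_window(today: date, days: int = DATE_RANGE_DAYS) -> Tuple[date, date]:
    """Return (start, end) dates covering the most recent `days` days."""
    return today - timedelta(days=days), today


start, end = search_window(date(2026, 3, 18))
print(start.isoformat(), end.isoformat())
```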

📊 Output

Daily Report Example

# 📚 每日文献检索报告
**检索日期：** 2026-03-18

## 📊 检索汇总
| 分类 | 检索到 | 成功下载 | 付费墙 |
|------|--------|---------|--------|
| B-ALL | 28 | 0 | 28 |
| MM | 24 | 0 | 24 |
| 总计 | 53 | 0 | 53 |

## 🔀 去重统计
- 原始检索结果：130 篇
- 去重后文献：110 篇
- 批次内重复：2 篇
- 库中已有：18 篇

File Organization

Reports: papers/daily_search_logs/daily_report_YYYY-MM-DD.md
Logs: papers/daily_search_logs/daily_search_YYYY-MM-DD.log
Papers: papers/{CATEGORY}/raw/{DOI}.pdf
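
Because DOIs contain "/", they cannot be used verbatim as filenames. The sketch below shows one way to build the `papers/{CATEGORY}/raw/{DOI}.pdf` path; replacing "/" with "_" and lowercasing the DOI are assumptions, and `paper_path` is a hypothetical helper name.

```python
from pathlib import Path


def paper_path(root: Path, category: str, doi: str) -> Path:
    """Build the storage path for a paper, making the DOI filesystem-safe."""
    # Assumption: "/" becomes "_" so the DOI fits in a single filename.
    safe_name = doi.strip().lower().replace("/", "_")
    return root / category / "raw" / f"{safe_name}.pdf"


print(paper_path(Path("papers"), "MM", "10.1000/XYZ123").as_posix())
```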

🔧 Advanced Features

1. Library Deduplication

Automatically checks new results against existing library:

Scans all category directories for existing DOIs
Extracts DOIs from filenames and historical logs
Skips papers already in library
Reports duplicate statistics
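
The two-level deduplication described above can be sketched as follows. This is an illustrative version, not the script's own code: the result format (a list of dicts with a `doi` key) and the function names are assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common URL or "doi:" prefixes."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi.strip()


def deduplicate(
    results: Iterable[Dict], library_dois: Iterable[str]
) -> Tuple[List[Dict], int, int]:
    """Return (unique papers, in-batch duplicates, already-in-library count)."""
    known: Set[str] = {normalize_doi(d) for d in library_dois}
    seen: Set[str] = set()
    unique: List[Dict] = []
    in_batch = in_library = 0
    for paper in results:
        doi = normalize_doi(paper.get("doi") or "")
        if not doi:
            unique.append(paper)  # no DOI to compare on, so keep the record
        elif doi in known:
            in_library += 1
        elif doi in seen:
            in_batch += 1
        else:
            seen.add(doi)
            unique.append(paper)
    return unique, in_batch, in_library


batch = [
    {"doi": "10.1/a"},
    {"doi": "https://doi.org/10.1/A"},  # same DOI, different spelling
    {"doi": "10.1/b"},                  # already in the library
    {"doi": "10.1/c"},
]
unique, in_batch, in_library = deduplicate(batch, {"10.1/b"})
print(len(unique), in_batch, in_library)
```

Normalizing before comparison matters because the same DOI often arrives in different cases or as a full `https://doi.org/` URL depending on the source.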

2. Open Access Detection

Uses Unpaywall API to identify OA papers:

is_oa, oa_url = check_open_access(doi)
if is_oa:
    download_paper(oa_url, save_path)
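
For orientation, a `check_open_access` helper could be sketched as below using the Unpaywall v2 REST endpoint, which returns an `is_oa` flag and a `best_oa_location` object. This is not the script's implementation: the extra `email` argument, the split into a separate parsing function, and the use of the standard-library `urllib` instead of `requests` are choices made for this sketch.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional, Tuple

UNPAYWALL_URL = "https://api.unpaywall.org/v2/{doi}?email={email}"


def parse_unpaywall(record: dict) -> Tuple[bool, Optional[str]]:
    """Extract (is_oa, best URL) from an Unpaywall JSON record."""
    is_oa = bool(record.get("is_oa"))
    location = record.get("best_oa_location") or {}
    # Prefer a direct PDF link, then fall back to the landing page.
    oa_url = location.get("url_for_pdf") or location.get("url")
    return is_oa, (oa_url if is_oa else None)


def check_open_access(
    doi: str, email: str, timeout: float = 15.0
) -> Tuple[bool, Optional[str]]:
    """Query Unpaywall for a DOI; raises urllib.error.HTTPError on failure."""
    url = UNPAYWALL_URL.format(
        doi=urllib.parse.quote(doi), email=urllib.parse.quote(email)
    )
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_unpaywall(json.load(response))


sample = {"is_oa": True, "best_oa_location": {"url_for_pdf": None, "url": "https://example.org/p"}}
print(parse_unpaywall(sample))
```

Keeping the parsing separate from the network call makes the logic testable offline, which is useful when checking behavior before enabling the scheduled job.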

3. PubMed Central Integration

Automatically tries PMC for biomedical papers:

if pmid and str(pmid).isdigit():
    download_from_pubmed(pmid, save_path)

🛠️ Customization Guide

Change Research Topics

Edit SEARCH_KEYWORDS in daily_literature_search.py
Update category names and keywords
Modify directory structure if needed

Add New Categories

Create new directory: papers/NEW_CATEGORY/raw/
Add classification keywords in classify_paper() function
Update report generation to include new category

Integrate with Notification Systems

Add email/Slack/Discord notifications after search completion:

# At end of main()
send_notification(f"Daily search complete: {results['total']} papers found")
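
`send_notification` is not defined in the snippet above. One possible shape, assuming a Slack-style incoming webhook that accepts a JSON body with a `text` field, is sketched below; the `webhook_url` parameter and the `build_payload` helper are assumptions, and other services expect different payload keys.

```python
import json
import urllib.request


def build_payload(message: str) -> bytes:
    """Encode a message as a Slack-style JSON webhook payload."""
    return json.dumps({"text": message}).encode("utf-8")


def send_notification(message: str, webhook_url: str, timeout: float = 10.0) -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(message),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


print(build_payload("Daily search complete: 53 papers found"))
```

Only send to webhook endpoints you control or trust, since the message may include counts or filenames from your library.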

📋 Requirements

Python Dependencies

pip install requests
# Most other modules are standard library

API Access (Optional but Recommended)

Semantic Scholar API Key: Higher rate limits
OpenAlex API Key: Polite pool access
Unpaywall: Free, no key needed (email required)

Set environment variables:

export SEMANTIC_SCHOLAR_API_KEY="your-key"
export OPENALEX_API_KEY="your-key"
export USER_EMAIL="your-email@example.com"

⚠️ Important Notes

Rate Limits: Respect API rate limits, especially without API keys
Storage: Monitor disk space for downloaded PDFs
Copyright: Only download open-access or legally available papers
Email: Set USER_EMAIL for polite API access
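
One simple way to respect rate limits is to enforce a minimum delay between consecutive requests to the same API. The sketch below shows such a throttle; the `Throttle` class and the one-second interval are illustrative assumptions, not documented limits of any of the services above.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until the minimum interval since the previous call has passed."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=1.0)  # hypothetical: one request per second
throttle.wait()  # the first call returns immediately
```

Calling `throttle.wait()` before each API request keeps the request rate bounded; the injectable `clock` and `sleep` arguments exist so the behavior can be tested without real delays.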

🔄 Version History

1.0.0 (2026-03-18): Initial release
- Multi-source search (PubMed, OpenAlex, Semantic Scholar)
- Automatic deduplication (batch + library)
- OA detection and download
- Smart categorization
- Daily reports with statistics

🤝 Contributing

To contribute improvements:

Fork the skill repository
Test your changes with your own literature search
Submit a pull request describing the improvements

📄 License

This skill is provided as-is for academic research purposes. Users are responsible for compliance with publisher terms and copyright laws.

Security Recommendations

What to check before installing:
- Metadata mismatch: the registry shows no required env vars, but SKILL.md and install.sh expect USER_EMAIL plus optional API keys and notification credentials. Confirm which env vars you must set and why.
- Review install.sh before running it. It will create a .env with placeholders, make directories under your home workspace, install Python packages with pip3, and add a cron job. Run with --dry-run first and inspect the produced .env and config files.
- Inspect and verify the 'literature-review' skill (workspace/skills/literature-review), or ensure the referenced script exists and is trusted: the main script executes that script via subprocess, so any code there will run with your user privileges when this job runs.
- Be cautious with notification webhooks and email credentials: if you fill those in, the skill may send data (e.g., counts or filenames) externally. Only provide webhook or SMTP credentials to endpoints you trust.
- The code has at least one clear bug (a truncated variable name in deduplication) that will likely cause runtime errors; consider running tests (pytest) and doing a dry run before enabling cron.
- If you are not comfortable reviewing code, do not install on a sensitive system. Install into an isolated environment or VM, run manually once, verify behavior and network calls, then enable scheduled runs.

Functional Analysis

Type: OpenClaw Skill
Name: daily-literature-search
Version: 1.0.0

The skill bundle is a legitimate tool for automated academic literature retrieval from PubMed, OpenAlex, and Semantic Scholar. It includes scripts for searching, deduplicating by DOI, and downloading open-access papers using the Unpaywall API. The install.sh script sets up a cron job for daily automation as described in the documentation. No evidence of malicious intent, data exfiltration, or unauthorized execution was found; the code logic aligns with the stated purpose of a research automation tool.

Capability Assessment

ℹ Purpose & Capability

The name and description (daily literature searches, OA download, classification) match the included scripts (search, dedupe, classify, upload analyzer). However, the package metadata/registry claims no required environment variables, while SKILL.md and install.sh expect USER_EMAIL plus optional API keys and notification credentials. This inconsistency should be clarified.

⚠ Instruction Scope

Runtime instructions and scripts perform network searches, OA checks, and downloads (expected). Notably, the main script invokes an external script via subprocess: papers_dir.parent/skills/literature-review/scripts/lit_search.py — that means this skill will execute code from another 'literature-review' skill located in the workspace, so you must trust whatever lives there. The installer writes a .env with notification webhooks and optional SMTP creds; if you populate these, notifications could send data externally. No evidence of hidden endpoints, but the ability to call arbitrary webhook(s) is present if configured.

ℹ Install Mechanism

There is no registry install spec, but an install.sh is provided that copies config.example.yaml, creates directories, writes a .env file, installs Python deps via pip3, and adds a cron job. The install script uses eval for dry-run execution and crontab modification; running it will write files and schedule cron — review it before running. The origin is 'unknown' and there's no homepage link.

⚠ Credentials

Registry metadata lists no required env vars, but SKILL.md and install.sh expect USER_EMAIL and optionally SEMANTIC_SCHOLAR_API_KEY and OPENALEX_API_KEY, plus optional EMAIL_USERNAME/EMAIL_PASSWORD and webhook variables. Requesting email/webhook credentials is reasonable for notifications, but because the registry failed to declare these, there's a mismatch that reduces transparency. Only provide notification/email credentials if you trust the code and wish to enable notifications.

ℹ Persistence & Privilege

The skill is not 'always:true' and allows autonomous invocation (default). The installer writes config, .env, and adds a cron job (persistent presence on the host) — that is expected for scheduled tasks, but you should be aware this makes the skill persistently scheduled to run daily until uninstalled.

Version History

v1.0.0

Automated daily literature search for academic researchers: initial release.
- Searches PubMed, OpenAlex, and Semantic Scholar with configurable keywords
- Automatic deduplication by DOI (within batch and against local library)
- Detects and downloads open-access papers, using Unpaywall and PubMed Central integration
- Smart, configurable topic categorization for collected papers
- Generates daily Markdown reports and logs
- Includes upload analyzer for organizing manually added papers

Metadata

Slug daily-literature-search

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

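FAQ

What is Daily Literature Search?

Automated daily literature search system for academic researchers. Performs scheduled searches across PubMed, OpenAlex, and Semantic Scholar with automatic d... It is an AI Agent Skill plugin for Claude Code / OpenClaw and currently has 239 cumulative downloads.

How do I install Daily Literature Search?

Run the command "/install daily-literature-search" in the OpenClaw or Claude Code dialog for one-click installation; no additional configuration is needed.

Is Daily Literature Search free?

Yes. Daily Literature Search is completely free and is released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Daily Literature Search support?

Daily Literature Search is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Daily Literature Search?

It is developed and maintained by Wzr101622 (@wzr101622); the current version is v1.0.0.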
