Description

Automated daily literature search system for academic researchers. Performs scheduled searches across PubMed, OpenAlex, and Semantic Scholar with automatic d...

README (SKILL.md)

Daily Literature Search Skill

Name: Daily Literature Search
Author: wzr101622

Automated literature search system for academic researchers. Performs scheduled searches across multiple databases (PubMed, OpenAlex, Semantic Scholar), automatically deduplicates results, downloads open-access papers, and generates daily reports.

🎯 Use Cases

Daily literature monitoring for specific research topics
Automated paper collection for literature reviews
Stay updated on latest publications in your field
Build personal paper library with automatic categorization

📦 Components

1. Core Search Script (`daily_literature_search.py`)

Main execution script with the following features:

Multi-source search: PubMed, OpenAlex, Semantic Scholar
Automatic deduplication: By DOI (within batch + against local library)
OA detection: Uses Unpaywall API to identify open-access papers
Auto-download: Downloads OA papers from PubMed Central or publisher sites
Smart categorization: Classifies papers by topic (configurable keywords)
Daily reports: Generates Markdown reports with search statistics

2. Upload Analyzer (`analyze_uploaded.py`)

Analyzes and categorizes manually uploaded papers:

Filename-based classification: Uses keyword matching
DOI extraction: From filenames and metadata
Batch processing: Handles multiple files at once
Report generation: Creates categorization summary

⚙️ Configuration

Directory Structure

papers/
├── B-ALL/raw/          # Category 1 (e.g., B-ALL research)
├── MM/raw/             # Category 2 (e.g., Multiple Myeloma)
├── OTHER/raw/          # Other papers
├── daily_search_logs/  # Search logs and reports
└── upload_temp/        # Temporary upload directory

Search Keywords (Customizable)

Edit SEARCH_KEYWORDS in daily_literature_search.py:

SEARCH_KEYWORDS = [
    '"inotuzumab ozogamicin"',
    '"Elranatamab"',
    '"Teclistamab"',
    '"Talquetamab"',
    '"Blinatumomab"',
    '("CAR-T" AND "B-ALL")',
]

Classification Keywords

Edit B_ALL_KEYWORDS and MM_KEYWORDS in analyze_uploaded.py to match your research domains.
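As a rough illustration of keyword-based classification, the sketch below assumes both lists hold lowercase substrings and that the first matching category wins. The keyword values and the `classify_by_keywords` helper are hypothetical; the real lists and logic live in analyze_uploaded.py and may differ.

```python
# Hypothetical keyword lists; replace with the values from analyze_uploaded.py.
B_ALL_KEYWORDS = ["b-all", "blinatumomab", "inotuzumab"]
MM_KEYWORDS = ["myeloma", "elranatamab", "teclistamab", "talquetamab"]


def classify_by_keywords(text: str) -> str:
    """Return 'B-ALL', 'MM', or 'OTHER' using case-insensitive substring matches."""
    lowered = text.lower()
    # B-ALL is checked first, so a text matching both lists is filed under B-ALL.
    if any(keyword in lowered for keyword in B_ALL_KEYWORDS):
        return "B-ALL"
    if any(keyword in lowered for keyword in MM_KEYWORDS):
        return "MM"
    return "OTHER"


if __name__ == "__main__":
    print(classify_by_keywords("Teclistamab in relapsed multiple myeloma.pdf"))
```

Substring matching is simple but order-sensitive, so put the most specific category first if your topics overlap.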

🚀 Usage

Manual Execution

# Run daily search
python3 papers/daily_literature_search.py

# Analyze uploaded papers
python3 papers/analyze_uploaded.py

Scheduled Execution (Cron)

Add to crontab for automatic daily searches:

# Daily search at 6:30 AM
30 6 * * * /usr/bin/python3 /path/to/papers/daily_literature_search.py >> /path/to/papers/daily_search_logs/cron.log 2>&1

Configuration Options

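| Parameter | Default | Description |
|-----------|---------|-------------|
| `MAX_RESULTS_PER_KEYWORD` | 10 | Maximum results per keyword per source |
| `DATE_RANGE_DAYS` | 7 | Search window (most recent N days) |
| `SOURCES` | `["pm", "oa", "s2"]` | Databases to search |
| `USER_EMAIL` | (none) | Contact email for polite API access (set as an environment variable) |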
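To show how these options might translate into an actual request, here is a sketch that builds a PubMed query URL from the two numeric defaults. It assumes the NCBI E-utilities ESearch endpoint; this README does not say how the script queries PubMed, and `build_pubmed_query` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Values mirror the documented defaults.
MAX_RESULTS_PER_KEYWORD = 10
DATE_RANGE_DAYS = 7

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(keyword: str, email: str = "") -> str:
    """Build an ESearch URL limited to the last DATE_RANGE_DAYS days."""
    params = {
        "db": "pubmed",
        "term": keyword,
        "reldate": DATE_RANGE_DAYS,         # only records from the last N days
        "datetype": "pdat",                 # filter on publication date
        "retmax": MAX_RESULTS_PER_KEYWORD,  # cap results per keyword
        "retmode": "json",
    }
    if email:
        params["email"] = email  # identifies the caller to NCBI
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_pubmed_query('"Blinatumomab"'))
```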

📊 Output

Daily Report Example

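# 📚 Daily Literature Search Report
**Search date:** 2026-03-18

## 📊 Search Summary
| Category | Retrieved | Downloaded | Paywalled |
|----------|-----------|------------|-----------|
| B-ALL | 28 | 0 | 28 |
| MM | 24 | 0 | 24 |
| Total | 53 | 0 | 53 |

## 🔀 Deduplication Statistics
- Raw search results: 130
- After deduplication: 110
- Duplicates within batch: 2
- Already in library: 18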

File Organization

Reports: papers/daily_search_logs/daily_report_YYYY-MM-DD.md
Logs: papers/daily_search_logs/daily_search_YYYY-MM-DD.log
Papers: papers/{CATEGORY}/raw/{DOI}.pdf
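The layout above can be expressed with `pathlib`. This is a sketch under one loud assumption: DOIs contain a slash, which cannot appear in a filename, so the example replaces it with an underscore. The README does not say how the script encodes DOIs, and both helper names are hypothetical.

```python
from datetime import date
from pathlib import Path

PAPERS_DIR = Path("papers")


def report_path(day: date) -> Path:
    """Path of the daily Markdown report for the given day."""
    return PAPERS_DIR / "daily_search_logs" / f"daily_report_{day.isoformat()}.md"


def paper_path(category: str, doi: str) -> Path:
    """Path of a downloaded PDF inside its category directory."""
    # Assumption: "/" in the DOI becomes "_" so the DOI is a valid filename.
    safe_doi = doi.replace("/", "_")
    return PAPERS_DIR / category / "raw" / f"{safe_doi}.pdf"


if __name__ == "__main__":
    print(report_path(date(2026, 3, 18)))
    print(paper_path("MM", "10.1000/xyz123"))
```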

🔧 Advanced Features

1. Library Deduplication

Automatically checks new results against existing library:

Scans all category directories for existing DOIs
Extracts DOIs from filenames and historical logs
Skips papers already in library
Reports duplicate statistics
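The steps above can be sketched as a single pass over the results. This is an illustrative version rather than the script's own code: it assumes each result is a dict with a `doi` key and that the library's DOIs have already been collected into a set. `normalize_doi` and `deduplicate` are hypothetical names.

```python
def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common prefixes so equal DOIs compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def deduplicate(results, library_dois):
    """Drop records repeated within the batch or already present in the library.

    Returns the kept records and a dict of duplicate statistics.
    """
    seen = set()
    kept = []
    stats = {"raw": len(results), "batch_duplicates": 0, "in_library": 0}
    for record in results:
        doi = normalize_doi(record.get("doi") or "")
        if not doi:
            kept.append(record)  # no DOI to compare; keep rather than guess
            continue
        if doi in seen:
            stats["batch_duplicates"] += 1
            continue
        seen.add(doi)
        if doi in library_dois:
            stats["in_library"] += 1
            continue
        kept.append(record)
    stats["unique"] = len(kept)
    return kept, stats
```

With this accounting, the raw count always equals unique plus batch duplicates plus library hits, which matches the arithmetic in the sample report (130 = 110 + 2 + 18).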

2. Open Access Detection

Uses Unpaywall API to identify OA papers:

is_oa, oa_url = check_open_access(doi)
if is_oa:
    download_paper(oa_url, save_path)
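The body of `check_open_access` is not shown in this README. As a sketch of what such a check might involve, the example below builds an Unpaywall v2 lookup URL and interprets the decoded JSON response, using the `is_oa` and `best_oa_location` fields of the Unpaywall response. The helper names are hypothetical, and the network call itself is left out so the logic can be tested offline.

```python
from urllib.parse import quote

UNPAYWALL_URL = "https://api.unpaywall.org/v2/"


def unpaywall_request_url(doi: str, email: str) -> str:
    """Build the lookup URL; Unpaywall requires an email query parameter."""
    return f"{UNPAYWALL_URL}{quote(doi)}?email={quote(email)}"


def parse_unpaywall(record: dict):
    """Return (is_oa, oa_url) from a decoded Unpaywall JSON response.

    is_oa is True only when the record is open access AND a usable URL exists.
    """
    if not record.get("is_oa"):
        return False, None
    location = record.get("best_oa_location") or {}
    # Prefer a direct PDF link and fall back to the landing page.
    oa_url = location.get("url_for_pdf") or location.get("url")
    return bool(oa_url), oa_url
```

Treating "open access but no URL" as not downloadable is a design choice of this sketch; it keeps the caller from attempting a download with nothing to fetch.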

3. PubMed Central Integration

Automatically tries PMC for biomedical papers:

if pmid and str(pmid).isdigit():
    download_from_pubmed(pmid, save_path)

🛠️ Customization Guide

Change Research Topics

Edit SEARCH_KEYWORDS in daily_literature_search.py
Update category names and keywords
Modify directory structure if needed

Add New Categories

Create new directory: papers/NEW_CATEGORY/raw/
Add classification keywords in classify_paper() function
Update report generation to include new category

Integrate with Notification Systems

Add email/Slack/Discord notifications after search completion:

# At end of main()
send_notification(f"Daily search complete: {results['total']} papers found")
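`send_notification` is left for you to implement. One possible version, assuming a Slack incoming webhook (which accepts a JSON body with a `text` field), is sketched below. The two-argument signature and the `build_payload` helper are assumptions; other services expect different payload shapes.

```python
import json
import urllib.request


def build_payload(message: str) -> bytes:
    """Encode the message as the JSON body a Slack incoming webhook expects."""
    return json.dumps({"text": message}).encode("utf-8")


def send_notification(message: str, webhook_url: str) -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(message),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Only pass a webhook URL you control, since anything in the message (counts, filenames) leaves your machine.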

📋 Requirements

Python Dependencies

pip install requests
# Most other modules are standard library

API Access (Optional but Recommended)

Semantic Scholar API Key: Higher rate limits
OpenAlex API Key: Polite pool access
Unpaywall: Free, no key needed (email required)

Set environment variables:

export SEMANTIC_SCHOLAR_API_KEY="your-key"
export OPENALEX_API_KEY="your-key"
export USER_EMAIL="you@example.com"
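On the Python side, these variables could be read as in the sketch below. Failing fast when `USER_EMAIL` is missing is an assumption of this example, not documented behavior of the script, and `load_settings` is a hypothetical helper.

```python
import os


def load_settings(env=None) -> dict:
    """Collect the documented variables; only USER_EMAIL is treated as required."""
    env = os.environ if env is None else env
    email = env.get("USER_EMAIL", "").strip()
    if not email:
        raise RuntimeError("USER_EMAIL is not set; Unpaywall requires an email address")
    return {
        "email": email,
        # Optional keys: None means "fall back to unauthenticated access".
        "semantic_scholar_key": env.get("SEMANTIC_SCHOLAR_API_KEY") or None,
        "openalex_key": env.get("OPENALEX_API_KEY") or None,
    }
```

Accepting an explicit `env` mapping makes the function easy to test without touching the real environment.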

⚠️ Important Notes

Rate Limits: Respect API rate limits, especially without API keys
Storage: Monitor disk space for downloaded PDFs
Copyright: Only download open-access or legally available papers
Email: Set USER_EMAIL for polite API access
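One simple way to respect rate limits is to enforce a minimum gap between consecutive requests to the same API. The `Throttle` class below is a hypothetical sketch; the right interval depends on each provider's published limits and is not specified in this README.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last_call = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
        if delay:
            self._sleep(delay)
        # Record when this call is effectively allowed to proceed.
        self._last_call = now + delay
        return delay
```

Call `throttle.wait()` immediately before each request, using one `Throttle` instance per API so a slow source does not hold back the others.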

🔄 Version History

1.0.0 (2026-03-18): Initial release
- Multi-source search (PubMed, OpenAlex, Semantic Scholar)
- Automatic deduplication (batch + library)
- OA detection and download
- Smart categorization
- Daily reports with statistics

🤝 Contributing

To contribute improvements:

Fork the skill repository
Test changes with your own literature search
Submit pull request with description of improvements

📄 License

This skill is provided as-is for academic research purposes. Users are responsible for compliance with publisher terms and copyright laws.

Usage Guidance

What to check before installing:

- Metadata mismatch: the registry shows no required env vars, but SKILL.md and install.sh expect USER_EMAIL plus optional API keys and notification credentials. Confirm which env vars you must set and why.
- Review install.sh before running it. It creates a .env with placeholders, makes directories under your home workspace, installs Python packages with pip3, and adds a cron job. Run it with --dry-run first and inspect the resulting .env and config files.
- Inspect the 'literature-review' skill (workspace/skills/literature-review), or at least confirm that the referenced script exists and is trusted. The main script executes that script via subprocess, so any code there runs with your user privileges whenever this job runs.
- Be cautious with notification webhooks and email credentials. If you fill those in, the skill may send data (e.g., counts or filenames) externally. Only provide webhook or SMTP credentials for endpoints you trust.
- The code has at least one clear bug (a truncated variable name in deduplication) that will likely cause runtime errors. Consider running the tests (pytest) and doing a dry run before enabling cron.
- If you are not comfortable reviewing code, do not install on a sensitive system. Install into an isolated environment or VM, run manually once, verify behavior and network calls, then enable scheduled runs.

Capability Analysis

Type: OpenClaw Skill
Name: daily-literature-search
Version: 1.0.0

The skill bundle is a legitimate tool for automated academic literature retrieval from PubMed, OpenAlex, and Semantic Scholar. It includes scripts for searching, deduplicating by DOI, and downloading open-access papers using the Unpaywall API. The install.sh script sets up a cron job for daily automation as described in the documentation. No evidence of malicious intent, data exfiltration, or unauthorized execution was found; the code logic aligns with the stated purpose of a research automation tool.

Capability Assessment

ℹ Purpose & Capability

Name/description (daily literature searches, OA download, classification) match the included scripts (search, dedupe, classify, upload analyzer). However the package metadata/registry claims no required environment variables while SKILL.md and install.sh expect USER_EMAIL and optional API keys and notification creds — this is an inconsistency that should be clarified.

⚠ Instruction Scope

Runtime instructions and scripts perform network searches, OA checks, and downloads (expected). Notably, the main script invokes an external script via subprocess: papers_dir.parent/skills/literature-review/scripts/lit_search.py — that means this skill will execute code from another 'literature-review' skill located in the workspace, so you must trust whatever lives there. The installer writes a .env with notification webhooks and optional SMTP creds; if you populate these, notifications could send data externally. No evidence of hidden endpoints, but the ability to call arbitrary webhook(s) is present if configured.

ℹ Install Mechanism

There is no registry install spec, but an install.sh is provided that copies config.example.yaml, creates directories, writes a .env file, installs Python deps via pip3, and adds a cron job. The install script uses eval for dry-run execution and crontab modification; running it will write files and schedule cron — review it before running. The origin is 'unknown' and there's no homepage link.

⚠ Credentials

Registry metadata lists no required env vars, but SKILL.md and install.sh expect USER_EMAIL and optionally SEMANTIC_SCHOLAR_API_KEY and OPENALEX_API_KEY, plus optional EMAIL_USERNAME/EMAIL_PASSWORD and webhook variables. Requesting email/webhook credentials is reasonable for notifications, but because the registry failed to declare these, there's a mismatch that reduces transparency. Only provide notification/email credentials if you trust the code and wish to enable notifications.

ℹ Persistence & Privilege

The skill is not 'always:true' and allows autonomous invocation (default). The installer writes config, .env, and adds a cron job (persistent presence on the host) — that is expected for scheduled tasks, but you should be aware this makes the skill persistently scheduled to run daily until uninstalled.

Version History

v1.0.0

Automated daily literature search for academic researchers: initial release.
- Searches PubMed, OpenAlex, and Semantic Scholar with configurable keywords
- Automatic deduplication by DOI (within batch and against local library)
- Detects and downloads open-access papers, using Unpaywall and PubMed Central integration
- Smart, configurable topic categorization for collected papers
- Generates daily Markdown reports and logs
- Includes upload analyzer for organizing manually added papers

Metadata

Slug daily-literature-search

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Daily Literature Search?

Automated daily literature search system for academic researchers. Performs scheduled searches across PubMed, OpenAlex, and Semantic Scholar with automatic d... It is an AI Agent Skill for Claude Code / OpenClaw, with 239 downloads so far.

How do I install Daily Literature Search?

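Run "/install daily-literature-search" in the OpenClaw or Claude Code chat to install it. The registry listing declares no required environment variables, but SKILL.md and install.sh expect USER_EMAIL and optional API keys, so some configuration remains after installation.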

Is Daily Literature Search free?

Yes, Daily Literature Search is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Daily Literature Search support?

Daily Literature Search runs anywhere OpenClaw / Claude Code is available.

Who created Daily Literature Search?

It is built and maintained by Wzr101622 (@wzr101622); the current version is v1.0.0.
