elio040208

Arxiv Paper Reader

Author: elio040208 · GitHub ↗ · v1.0.3 · MIT-0
Platforms: win32, linux, darwin ✓ Security check passed
Total downloads: 179
Favorites: 0
Current installs: 1
Versions: 4
Install in OpenClaw
/install arxiv-paper-reader
Description
Search arXiv by keyword, filter by submitted date range, fetch arXiv papers from an arXiv ID or URL, convert papers into Markdown and PDF files in the worksp...
Usage instructions (SKILL.md)

arXiv Paper Reader

Use the bundled Python scripts before reasoning about arXiv content. They handle:

  • searching arXiv by keyword
  • filtering keyword results by submitted date range
  • downloading arXiv metadata and paper content
  • converting papers to Markdown and PDF in the workspace
  • syncing configured topics into daily archive folders

Inputs

  • Accept raw arXiv IDs like 1706.03762 or URLs such as https://arxiv.org/abs/1706.03762.
  • Only accept raw IDs or HTTPS arXiv URLs on arxiv.org, www.arxiv.org, or export.arxiv.org.
  • Accept keyword searches such as transformer, diffusion, or computer vision.
  • Accept optional submitted-date windows using YYYY-MM-DD.
  • Do not use category filters or alias-based domain shortcuts; search is intentionally keyword-only.
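
The input rules above can be sketched as a small validator. This is only an illustration, not the bundled scripts' actual code: the function name `is_allowed_arxiv_input` and the ID pattern are assumptions, and the pattern covers only new-style IDs such as 1706.03762 (optionally with a version suffix).

```python
import re
from urllib.parse import urlparse

# Hosts named in the input rules above.
ALLOWED_HOSTS = {"arxiv.org", "www.arxiv.org", "export.arxiv.org"}

# Assumed pattern for new-style IDs (e.g. 1706.03762 or 1706.03762v5).
# Old-style IDs such as hep-th/9901001 are deliberately not covered here.
NEW_STYLE_ID = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")


def is_allowed_arxiv_input(value: str) -> bool:
    """Return True for a raw new-style arXiv ID or an HTTPS arXiv URL."""
    value = value.strip()
    if NEW_STYLE_ID.fullmatch(value):
        return True
    parsed = urlparse(value)
    # Require HTTPS and an exact host match, so look-alike hosts are rejected.
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS
```

Matching the parsed hostname exactly, rather than searching the URL string for "arxiv.org", is what keeps inputs like https://arxiv.org.example.com/ out.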

Search workflow

  1. Pick a Python command:
    • Prefer python
    • Fall back to python3
  2. If the user wants search results or the latest papers for a topic, run:
     python {baseDir}/scripts/search_arxiv.py --query "<keywords>" --limit <n>
  3. Read search_results.md and search_results.json.
  4. Use {baseDir}/references/search-usage.md to present the results.
  5. If the user asks for the latest papers matching a keyword, pass --sort submittedDate.
  6. If the user wants the default best-match ranking, omit --sort and let the script use relevance order.
  7. If the user gives a date window, add --start-date YYYY-MM-DD --end-date YYYY-MM-DD.
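
One way to assemble the search invocation is as an argument list, so the keyword string travels as a single CLI argument and never passes through a shell. The sketch below uses only the flags named in this workflow; the helper names `pick_python` and `build_search_command` are hypothetical and not part of the skill.

```python
import shutil
from typing import List, Optional


def pick_python() -> str:
    """Prefer `python`, fall back to `python3`, as the workflow describes."""
    for name in ("python", "python3"):
        if shutil.which(name):
            return name
    raise RuntimeError("no Python interpreter found on PATH")


def build_search_command(
    python_cmd: str,
    base_dir: str,
    query: str,
    limit: int,
    latest: bool = False,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> List[str]:
    """Build the argument list for search_arxiv.py (hypothetical helper)."""
    cmd = [
        python_cmd,
        f"{base_dir}/scripts/search_arxiv.py",
        "--query", query,  # the whole keyword string stays one argument
        "--limit", str(limit),
    ]
    if latest:
        # Omit --sort entirely to keep the default relevance order.
        cmd += ["--sort", "submittedDate"]
    if start_date and end_date:
        cmd += ["--start-date", start_date, "--end-date", end_date]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` without `shell=True`, which avoids quoting problems with multi-word queries such as "computer vision".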

Topic sync workflow

  1. Tell the user to maintain {rootDir}/topics.json, or seed it from {baseDir}/references/topics.example.json.
  2. For recurring daily updates, run:
     python {baseDir}/scripts/sync_arxiv_topics.py --daily --root-dir <root-dir>
  3. For manual backfill, run:
     python {baseDir}/scripts/sync_arxiv_topics.py --start-date YYYY-MM-DD --end-date YYYY-MM-DD --root-dir <root-dir>
  4. Read <root-dir>/runs/<capture-date>/run_manifest.md first.
  5. Each captured paper lives at topics/<topic-slug>/<capture-date>/<paper-id>__<title-slug>/.
  6. Expect each paper directory to contain paper.pdf, paper.md, metadata.json, and summary.md.
  7. The batch summary is template-based and grounded in the abstract plus converted Markdown; treat it as a review aid, not a substitute for reading the paper.
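
After a sync run, it can be useful to confirm that a paper directory holds the four files listed above. The sketch below is a hypothetical check, not part of the bundled scripts; it assumes the topics/ tree sits directly under the root directory, which the workflow implies but does not state.

```python
from pathlib import Path
from typing import List

# File names listed in the workflow above.
EXPECTED_FILES = ("paper.pdf", "paper.md", "metadata.json", "summary.md")


def paper_dir(root_dir: str, topic_slug: str, capture_date: str,
              paper_id: str, title_slug: str) -> Path:
    """Compose topics/<topic-slug>/<capture-date>/<paper-id>__<title-slug>/.

    Assumes topics/ lives under root_dir (an assumption of this sketch).
    """
    return (Path(root_dir) / "topics" / topic_slug / capture_date
            / f"{paper_id}__{title_slug}")


def missing_files(directory: Path) -> List[str]:
    """Return the expected file names that are absent from a paper directory."""
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty list from `missing_files` means the directory is complete; any names returned point to a paper worth re-syncing or inspecting by hand.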

Fetch workflow

  1. Choose an output directory:
    • If the user gives one, use it.
    • Otherwise write to ./artifacts/arxiv/<paper-id>/ in the current workspace.
  2. Run the converter:
     python {baseDir}/scripts/arxiv_to_md.py <paper-id-or-url> --output-dir <target-dir>
  3. Read the generated paper.pdf, paper.md, and metadata.json.
  4. Summarize the paper in Markdown.
  5. Save the summary to <target-dir>/summary.md if the user asked for files. Otherwise return the summary directly in chat.
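
The output-directory choice and converter call can be sketched as follows. Both helper names are hypothetical, and the sketch assumes the caller already has a plain paper ID (not a URL) when it falls back to the default directory.

```python
from pathlib import Path
from typing import List, Optional


def resolve_output_dir(paper_id: str, user_dir: Optional[str] = None) -> Path:
    """Use the user's directory if given, else ./artifacts/arxiv/<paper-id>/."""
    if user_dir:
        return Path(user_dir)
    return Path("artifacts") / "arxiv" / paper_id


def build_fetch_command(python_cmd: str, base_dir: str,
                        paper_id_or_url: str, target_dir: Path) -> List[str]:
    """Build the argument list for arxiv_to_md.py (hypothetical helper)."""
    return [
        python_cmd,
        f"{base_dir}/scripts/arxiv_to_md.py",
        paper_id_or_url,  # passed as one positional argument
        "--output-dir", str(target_dir),
    ]
```

As with the search command, the list form is meant for `subprocess.run` without a shell, so the ID or URL is never interpreted as shell syntax.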

Summary format

Use the headings in {baseDir}/references/summary-format.md.

Keep the summary grounded in the generated Markdown. If the conversion falls back to abstract-only mode, say so explicitly in the summary.

Safety

  • Pass IDs, URLs, and keywords as single CLI arguments. Do not splice untrusted text into shell pipelines.
  • Only pass raw arXiv IDs or HTTPS arXiv URLs; reject arbitrary third-party URLs.
  • TLS verification is strict. If requests fail because your machine lacks a valid CA bundle, install certifi or fix the system trust store.
  • arXiv source archives are processed in-memory, only .tex members are read, and suspicious paths plus oversized payloads are rejected before parsing.
  • Date windows use arXiv submittedDate and inclusive YYYY-MM-DD boundaries.
  • Do not invent claims that are not supported by paper.md or search_results.md.
  • Do not reintroduce hardcoded category or alias mappings; keep search behavior keyword-only.
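
To make the archive-handling rule concrete, here is a minimal sketch of reading only .tex members from an in-memory archive while rejecting suspicious paths and oversized members. It is not the skill's actual implementation: it assumes the source archive is a tar file, and the function name `read_tex_members` and the size cap are illustrative choices.

```python
import io
import tarfile
from typing import Dict

# Assumed per-file cap for this sketch; not the value used by the scripts.
MAX_MEMBER_BYTES = 5 * 1024 * 1024


def read_tex_members(archive_bytes: bytes) -> Dict[str, str]:
    """Read .tex members from an in-memory tar archive without extracting."""
    texts: Dict[str, str] = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        for member in tar:
            name = member.name
            # Reject absolute paths and parent-directory traversal outright.
            if name.startswith("/") or ".." in name.split("/"):
                raise ValueError(f"suspicious path in archive: {name!r}")
            if not member.isfile() or not name.endswith(".tex"):
                continue  # ignore directories, links, and non-.tex files
            if member.size > MAX_MEMBER_BYTES:
                raise ValueError(f"oversized member in archive: {name!r}")
            handle = tar.extractfile(member)
            if handle is not None:
                texts[name] = handle.read().decode("utf-8", errors="replace")
    return texts
```

Nothing is written to disk here: members are read through `extractfile` into memory, so a hostile member name cannot place a file outside the workspace.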
Safety recommendations
This skill appears coherent and limited to arXiv interactions, but exercise normal caution: 1) run it in a controlled workspace (it will create artifacts/ and monitor topic folders), 2) inspect the bundled scripts yourself before running (they will execute arbitrary Python locally), 3) ensure your environment has a proper CA bundle (the code enforces TLS and references certifi), and 4) don't point it at non-arXiv URLs — the code enforces allowed hosts but follow the SKILL.md rule to only pass raw arXiv IDs or arXiv HTTPS URLs. If you need higher assurance, run the scripts in an isolated environment (container/VM) and review the remaining truncated code paths before granting broad autonomous invocation.
Functional analysis
Type: OpenClaw Skill
Name: arxiv-paper-reader
Version: 1.0.3
The arxiv-paper-reader skill bundle is designed to search, download, and convert arXiv papers into Markdown and PDF formats. The Python scripts (arxiv_api.py, arxiv_to_md.py) include appropriate security measures, such as restricting network requests to official arXiv domains and implementing defenses against path traversal and resource exhaustion (zip bombs) during source archive extraction. The instructions in SKILL.md are well-aligned with the stated purpose and do not contain any malicious prompt-injection attempts.
Capability assessment
Purpose & Capability
Name/description (search/fetch/convert arXiv papers) matches the code and declared requirements. The scripts only require a Python interpreter and interact with arXiv endpoints (export.arxiv.org and arxiv.org). No unrelated binaries, credentials, or config paths are requested.
Instruction Scope
SKILL.md instructs the agent to run the included Python scripts, read/write files under workspace directories (artifacts/, topics.json, runs/), and to only accept raw arXiv IDs or arXiv HTTPS URLs. The instructions emphasize safety (pass args as single CLI args, strict TLS) and tell the agent to read the generated search_results.md/json and produced paper files. There is no instruction to read unrelated system files or environment variables.
Install Mechanism
No install spec — instruction-only with bundled scripts. This is low-risk from an installer perspective because nothing is downloaded or installed automatically. The only runtime requirement is a local Python interpreter.
Credentials
No environment variables, credentials, or config paths are required. The scripts write outputs to workspace subdirectories (artifacts/arxiv* and a configurable root-dir) which is appropriate for the described functionality.
Persistence & Privilege
always is false and the skill does not request persistent platform privileges. It does create and update files under user-specified or default workspace directories (artifacts, runs, sync_state.json). That file I/O is expected for an archiving/syncing tool.
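How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install arxiv-paper-reader
  3. Once installation finishes, invoke the Skill by name or trigger it with /arxiv-paper-reader.
  4. Provide the inputs described in the Skill's parameter notes to get structured output.
Version history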
v1.0.3
arxiv-paper-reader 1.0.3 changelog
  • Added support for filtering keyword searches by submitted date range.
  • Introduced recurring topic sync with daily archive folders and summary generation.
  • Added references/topics.example.json and scripts/sync_arxiv_topics.py to support topic management and sync features.
  • Fetch workflow now produces both Markdown and PDF files for each paper.
  • Updated documentation to reflect new inputs, workflows, and file outputs.
v1.0.2
arxiv-paper-reader 1.0.2
  • Search is now keyword-only: category filters and alias-based shortcuts removed.
  • Only raw arXiv IDs and HTTPS arXiv URLs are accepted for fetching papers; stricter URL validation.
  • Improved safety: third-party URLs are rejected, and stricter TLS/CA handling enforced.
  • arXiv source archives are processed securely: only `.tex` files are read and extra checks are applied.
  • Search command and workflows updated to reflect new restrictions and clearer sorting logic.
v1.0.1
Major update: Adds arXiv search and browsing, improves workflows for both papers and searches.
  • New: Search arXiv by keyword or category, and list the latest papers in a domain using new scripts.
  • Added support for category codes and common AI/CS aliases (e.g., nlp, cv, ml).
  • Clearer workflows: separate paths for searching/browsing and for fetching specific papers.
  • Includes new documentation on search usage and updated summary and safety instructions.
  • Maintains safe CLI argument handling for all new inputs.
v1.0.0
Initial release of arXiv Paper Reader.
  • Fetches arXiv papers by ID or URL and converts them to Markdown in the workspace.
  • Automatically writes a concise summary based on the converted paper.
  • Supports multiple paper inputs, storing each in a separate directory.
  • Ensures summaries remain grounded in the content provided and explicitly notes if only the abstract is available.
  • Designed for safe usage by handling untrusted inputs securely and avoiding unsupported claims.
Metadata
Slug: arxiv-paper-reader
Version: 1.0.3
License: MIT-0
Total installs: 1
Current installs: 1
Versions released: 4
FAQ

What is Arxiv Paper Reader?

Search arXiv by keyword, filter by submitted date range, fetch arXiv papers from an arXiv ID or URL, convert papers into Markdown and PDF files in the worksp... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 179 total downloads to date.

How do I install Arxiv Paper Reader?

Run the command "/install arxiv-paper-reader" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Arxiv Paper Reader free?

Yes. Arxiv Paper Reader is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Arxiv Paper Reader support?

Arxiv Paper Reader is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed (win32, linux, darwin).

Who develops Arxiv Paper Reader?

It is developed and maintained by elio040208 (@elio040208); the current version is v1.0.3.
