# ArXiv Summarizer Orchestrator

Run the full pipeline by composing three sub-skills.

## Sub-skill Order

1. `arxiv-search-collector`
2. `arxiv-paper-processor`
3. `arxiv-batch-reporter`

## Workflow Parameters

- `language`: manual language parameter used by all stages. Default is English when omitted.
- `paper_processing_mode`: `subagent_parallel` or `serial`.
- `max_parallel_papers`: default `5` when `paper_processing_mode=subagent_parallel`.

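
The parameter defaults can be captured in a small validation helper. This is a minimal sketch, not part of the skill: the `WorkflowParams` class and its `effective_parallelism` method are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

VALID_MODES = ("subagent_parallel", "serial")


@dataclass(frozen=True)
class WorkflowParams:
    """Hypothetical container for the three workflow parameters."""

    language: str = "English"  # default when omitted
    paper_processing_mode: str = "subagent_parallel"
    max_parallel_papers: int = 5  # only meaningful in parallel mode

    def __post_init__(self) -> None:
        if self.paper_processing_mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {self.paper_processing_mode!r}")
        if self.max_parallel_papers < 1:
            raise ValueError("max_parallel_papers must be at least 1")

    def effective_parallelism(self) -> int:
        """Serial mode always processes exactly one paper at a time."""
        if self.paper_processing_mode == "serial":
            return 1
        return self.max_parallel_papers
```

For example, `WorkflowParams(language="Chinese", paper_processing_mode="serial")` would describe a serial run whose markdown output is written in Chinese.
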
## Workflow

### Stage A: Collection Setup + Query Retrieval

1. Initialize one run with `arxiv-search-collector/scripts/init_collection_run.py`.
2. Model generates multiple focused queries from original topic and writes a minimal `query_plan.json` (`label` + `query` only).
3. Run `arxiv-search-collector/scripts/fetch_queries_batch.py` with the plan file (recommended).
4. (Optional fallback) call `arxiv-search-collector/scripts/fetch_query_metadata.py` manually for one-by-one fetch.
5. Model reads each indexed query list and decides keep indexes.
6. Merge selected items with `arxiv-search-collector/scripts/merge_selected_papers.py`.
7. If relevance/coverage is still not good, iterate Stage A:
   - generate another query plan with new labels,
   - fetch again,
   - re-merge with `--incremental` and updated `selection-json`,
   - set weak labels to empty keep list (`[]`) to explicitly drop them.

Pass `--language <LANG>` to collector scripts so all generated markdown files in Stage A follow the selected language.
Use serial query fetch in Stage A with conservative controls (for example `--min-interval-sec 5`, `--retry-max 4`).
Default collector settings already include retries/backoff and run-local throttle state (`<run_dir>/.runtime/arxiv_api_state.json`), so manual tuning is usually unnecessary.
Prefer cache reuse (no `--force`) unless query parameters changed or data refresh is required.

Output: one run directory with per-paper metadata subdirectories.

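
As an illustration of the minimal plan file used in Stage A, the sketch below writes a `query_plan.json` whose entries carry only `label` and `query`. The exact schema the collector scripts expect is not shown in this document, so the top-level list layout, the `build_query_plan` and `write_query_plan` helpers, and any queries passed in are assumptions.

```python
from __future__ import annotations

import json
from pathlib import Path


def build_query_plan(queries: dict[str, str]) -> list[dict[str, str]]:
    """Turn {label: query} into minimal plan entries (assumed list layout)."""
    plan = []
    for label, query in queries.items():
        if not label or not query:
            raise ValueError("every entry needs a non-empty label and query")
        # Only the two fields named in the workflow: label and query.
        plan.append({"label": label, "query": query})
    return plan


def write_query_plan(path: Path, queries: dict[str, str]) -> None:
    """Serialize the plan as UTF-8 JSON so non-English queries survive."""
    text = json.dumps(build_query_plan(queries), indent=2, ensure_ascii=False)
    path.write_text(text, encoding="utf-8")
```

When iterating Stage A, a second call with new labels would produce the follow-up plan without touching the first one.
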
### Stage B: Per-paper Artifact Download + Manual Summary

For each paper directory, invoke sub-skill `arxiv-paper-processor` once and let that skill produce `<paper_dir>/summary.md`.

Recommended pre-step for many papers:

- Run one batch artifact download before per-paper reading:

```bash
python3 arxiv-paper-processor/scripts/download_papers_batch.py \
  --run-dir /path/to/run \
  --artifact source_then_pdf \
  --max-workers 3 \
  --min-interval-sec 5 \
  --language <LANG>
```

Per-paper execution steps (inside `arxiv-paper-processor`):

1. If `<paper_dir>/summary.md` already exists and is complete, skip this paper.
2. If usable source (`source/source_extract/*.tex`) or PDF (`source/paper.pdf`) already exists, skip download.
3. If artifacts are missing, download source with `arxiv-paper-processor/scripts/download_arxiv_source.py`.
4. If source is unusable, download PDF with `arxiv-paper-processor/scripts/download_arxiv_pdf.py`.
5. Model reads content and manually writes `<paper_dir>/summary.md` following the reference format, in `language`.
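
The skip and fallback logic in these steps can be sketched as a small decision function. This is illustrative only: `next_action` and its return labels are hypothetical, "complete" is approximated as "the file exists and is non-empty" (the real criterion is not specified here), and "usable source" is approximated by the presence of any `.tex` file.

```python
from pathlib import Path


def next_action(paper_dir: Path) -> str:
    """Return the next step for one paper directory (hypothetical labels)."""
    summary = paper_dir / "summary.md"
    # Step 1: an existing summary means the paper is already done.
    # Assumption: "complete" is approximated as "non-empty".
    if summary.is_file() and summary.stat().st_size > 0:
        return "skip"

    # Step 2: reuse artifacts that are already on disk.
    has_tex = any((paper_dir / "source" / "source_extract").glob("*.tex"))
    has_pdf = (paper_dir / "source" / "paper.pdf").is_file()
    if has_tex or has_pdf:
        return "summarize"

    # Steps 3-4: nothing usable yet, so try the source download first;
    # the PDF download is the fallback if the source turns out unusable.
    return "download_source"
```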

Parallel strategy for many papers:

- Default: `paper_processing_mode=subagent_parallel` with `max_parallel_papers=5`.
- Optional: `paper_processing_mode=serial` to process one paper at a time.
- In parallel mode, run multiple `arxiv-paper-processor` instances in batches; concurrent papers must not exceed `max_parallel_papers`.
- Wait for one batch to finish before starting the next batch.
- In serial mode, run exactly one `arxiv-paper-processor` instance at a time.
- Subagent workers should only own one paper directory each to avoid file conflicts.
- Do not use scripts to auto-compose summary text; scripts are download-only tools.

Output: all paper directories contain `summary.md`.
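
The batching rule (no more than `max_parallel_papers` papers at once, and each batch finishes before the next starts) can be sketched as follows. The `process_paper` callable stands in for one `arxiv-paper-processor` invocation and is hypothetical; a thread pool is used only to make the sketch runnable, not because the orchestrator dispatches subagents this way.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[Path], size: int) -> Iterator[List[Path]]:
    """Yield consecutive groups of at most `size` paper directories."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_in_batches(
    paper_dirs: Sequence[Path],
    process_paper: Callable[[Path], None],  # hypothetical per-paper worker
    max_parallel_papers: int = 5,
) -> None:
    """Process papers batch by batch; serial mode is max_parallel_papers=1."""
    for batch in batches(paper_dirs, max_parallel_papers):
        # One worker per paper, so each worker owns exactly one directory.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            # Consuming the results surfaces worker exceptions; leaving the
            # `with` block waits for the whole batch before the next starts.
            list(pool.map(process_paper, batch))
```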

### Stage C: Bundle + Final Hierarchical Report

1. Run `arxiv-batch-reporter/scripts/collect_summaries_bundle.py --language <LANG>`.
2. Model reads `summaries_bundle.md` and writes `collection_report_template.md` in base dir.
3. In template, each paper leaf entry must include one standalone placeholder line: `{{ARXIV_BRIEF:<arxiv_id>}}`.
4. Run `arxiv-batch-reporter/scripts/render_collection_report.py` to generate final `collection_report.md`.
5. Do not manually paraphrase per-paper conclusion lines in final report; they must come from per-paper `summary.md` section 10 via script injection.

If `language` is non-English (for example Chinese), all intermediate markdown files and final reports should follow that language.
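
To make the placeholder contract concrete, the sketch below replaces each standalone `{{ARXIV_BRIEF:<arxiv_id>}}` line with text looked up by arXiv ID. It is not the implementation of `render_collection_report.py`: the `render_template` function and the `briefs` mapping are hypothetical, and how the real script extracts section 10 from each `summary.md` is not covered.

```python
from __future__ import annotations

import re

# Matches a line that consists of nothing but one placeholder.
PLACEHOLDER = re.compile(r"^\{\{ARXIV_BRIEF:([^{}\s]+)\}\}$")


def render_template(template: str, briefs: dict[str, str]) -> str:
    """Swap standalone placeholder lines for the matching brief text."""
    out = []
    for line in template.splitlines():
        match = PLACEHOLDER.match(line.strip())
        if match is None:
            out.append(line)  # ordinary line, or placeholder used inline
            continue
        arxiv_id = match.group(1)
        if arxiv_id not in briefs:
            # Fail loudly rather than leave a placeholder in the report.
            raise KeyError(f"no brief available for {arxiv_id}")
        out.append(briefs[arxiv_id])
    return "\n".join(out)
```

Only lines that consist solely of the placeholder are replaced, which mirrors the requirement that the placeholder sits on its own line.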

## Periodic Scheduling

This orchestrator is suitable for cron/scheduled execution in OpenClaw:

- Frequency examples: daily, weekly, monthly.
- For rolling windows, use lookback (`1d`, `7d`, `30d`) when initializing runs.
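
A rolling window such as `7d` can be turned into explicit dates before a run is initialized. The sketch below is an assumption about how such a value might be interpreted (a number of days counted back from a reference date); `lookback_window` is a hypothetical helper, and the option name that `init_collection_run.py` uses for lookback is not given here.

```python
from __future__ import annotations

import re
from datetime import date, timedelta


def lookback_window(lookback: str, today: date) -> tuple[date, date]:
    """Convert a value like '7d' into a (start, end) date pair."""
    match = re.fullmatch(r"(\d+)d", lookback)
    if match is None:
        raise ValueError(f"unsupported lookback: {lookback!r}")
    days = int(match.group(1))
    # Assumption: the window ends on the reference date itself.
    return today - timedelta(days=days), today
```

A daily scheduled job would pass `1d` with `date.today()`, a weekly job `7d`, and a monthly job `30d`.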

## Output Layout

`<output-root>/<topic>-<timestamp>-<range>/`

- `task_meta.json`, `task_meta.md`
- `query_results/`, `query_selection/`
- `<arxiv_id>/metadata.md` + downloaded source/pdf + `summary.md`
- `summaries_bundle.md`
- `collection_report_template.md`
- final rendered collection report (e.g. `collection_report.md`)

Use `references/workflow-checklist.md` as execution checklist.

## Related Skills

This is the top-level orchestration skill.

Before using it, install and enable these three sub-skills:

- `arxiv-search-collector`
- `arxiv-paper-processor`
- `arxiv-batch-reporter`

Execution order inside this orchestrator:

1. `arxiv-search-collector` (Stage A)
2. `arxiv-paper-processor` (Stage B)
3. `arxiv-batch-reporter` (Stage C)

To install and use the skill:

- Make sure OpenClaw is installed (local or Docker).
- Run the install command in chat: `/install arxiv-summarizer-orchestrator`
- After installation, invoke the skill by name or use `/arxiv-summarizer-orchestrator`.
- Provide the required inputs per the skill's parameter spec and get structured output.

## What is Arxiv Summarizer Orchestrator?

Orchestrates end-to-end arXiv paper retrieval, processing, and batch reporting with language control and parallel or serial paper handling modes. It is an AI Agent Skill for Claude Code / OpenClaw, with 866 downloads so far.

## How do I install Arxiv Summarizer Orchestrator?

Run `/install arxiv-summarizer-orchestrator` in the OpenClaw or Claude Code chat to install it in one step; no extra setup is required.

## Is Arxiv Summarizer Orchestrator free?

Yes, Arxiv Summarizer Orchestrator is completely free (open-source). You can download, install, and use it at no cost.

## Which platforms does Arxiv Summarizer Orchestrator support?

Arxiv Summarizer Orchestrator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

## Who created Arxiv Summarizer Orchestrator?

It is built and maintained by xukp20 (@xukp20); the current version is v0.1.1.