Description

Automates nightly research by rotating topics, fetching from arXiv, GitHub, Hacker News, Brave Search, and generating structured markdown reports with Telegr...

Usage (SKILL.md)

autoresearch — Nightly Research Pipeline

Name: AutoResearch Pipeline
Author: bowen31337

A zero-cost nightly research aggregator that rotates through 3 topic tracks, pulls from 4 independent sources, synthesises a structured markdown report, and prints a 3-line Telegram teaser to stdout.

Quick Start

# Dry run — fetches real data, prints teaser, no file writes
cd ~/.openclaw/workspace
uv run --with httpx python skills/autoresearch/scripts/run.py --dry-run

# Full run — writes to memory/ and advances state
uv run --with httpx python skills/autoresearch/scripts/run.py

# Force a specific track
uv run --with httpx python skills/autoresearch/scripts/run.py --track crypto

# Verbose output (debug logs to stderr)
uv run --with httpx python skills/autoresearch/scripts/run.py --dry-run --verbose

Tracks

The pipeline rotates through 3 tracks in order (persisted in state.json):

Track	Display Name	Sources Focus
`ai`	AI & Agents	cs.AI/MA/CL/LG arXiv, Python/Rust/TS GitHub, LLM HN keywords
`crypto`	Crypto & DeFi	cs.CR/DC arXiv, Solidity/Rust GitHub, crypto HN keywords
`devtools`	Developer Tools	cs.SE/PL arXiv, Rust/Go/TS/Python GitHub, CLI/editor HN keywords

Sources

Source	API	Auth	Fallback
arXiv	Atom API (`export.arxiv.org`)	None	Returns `[]` on error
GitHub Trending	Public HTML scrape	None	Returns `[]` on structure change
Hacker News	Firebase JSON API	None	Returns partial results
Web Search	Brave Search API	`BRAVE_API_KEY` env	Skipped silently if no key
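As an illustration of the "returns `[]` on error" fallback listed for arXiv, the following is a minimal sketch that parses an Atom feed with the stdlib parser and degrades to an empty list on malformed XML. The function name `parse_arxiv_feed` and the output keys (`title`, `summary`, `url`) are hypothetical and are not taken from `sources.py`.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace used by arXiv's export API feeds.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def parse_arxiv_feed(xml_text: str) -> list:
    """Parse an Atom feed into a list of dicts; return [] if the XML is malformed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Fallback behaviour: a broken response yields no papers instead of crashing.
        return []

    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        summary = entry.findtext("atom:summary", default="", namespaces=ATOM_NS)
        url = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        papers.append({
            # Collapse the line breaks and repeated spaces that Atom text often contains.
            "title": " ".join(title.split()),
            "summary": " ".join(summary.split()),
            "url": url.strip(),
        })
    return papers
```

The same pattern (catch the parse or network error at the fetcher boundary, return an empty or partial list) would let the remaining sources still contribute to the report.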

Output Files

File	Description
`memory/autoresearch-latest.md`	Overwritten each run — latest report
`memory/autoresearch-archive.md`	Append-only — all runs with date markers
`memory/autoresearch-errors.log`	Stderr from cron runs
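The overwrite versus append-only behaviour of the two report files can be sketched as below. This is an assumption-laden illustration: the function name `write_report` and the `## Run YYYY-MM-DD` date-marker format are hypothetical, since the actual marker used by the skill is not documented here.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def write_report(memory_dir: Path, report_md: str, now: Optional[datetime] = None) -> None:
    """Overwrite the latest report and append the same report to the archive."""
    now = now or datetime.now(timezone.utc)
    memory_dir.mkdir(parents=True, exist_ok=True)

    # Latest report: replaced in full on every run.
    (memory_dir / "autoresearch-latest.md").write_text(report_md, encoding="utf-8")

    # Archive: opened in append mode so earlier runs are never rewritten.
    # The marker format below is a placeholder, not the skill's real format.
    marker = f"\n\n## Run {now.date().isoformat()}\n\n"
    with (memory_dir / "autoresearch-archive.md").open("a", encoding="utf-8") as fh:
        fh.write(marker + report_md)
```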

CLI Flags

Flag	Default	Description
`--track ai|crypto|devtools`	Rotate	Override track rotation for this run
`--dry-run`	off	Fetch + synthesise but skip file writes and state advance
`--verbose`	off	Print DEBUG logs to stderr
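A parser matching the three flags above could look like the following sketch. It is not copied from `run.py`; the function name `build_parser` is hypothetical. One detail worth noting is that `argparse` itself exits with status 2 when a `choices` value is rejected, which is consistent with a bad `--track` value being reported as exit code 2.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with the three documented flags."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument(
        "--track",
        choices=["ai", "crypto", "devtools"],
        default=None,  # None means "use the rotation stored in state.json"
        help="Override track rotation for this run",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Fetch and synthesise but skip file writes and state advance",
    )
    parser.add_argument(
        "--verbose",
        action="store_true",
        help="Print DEBUG logs to stderr",
    )
    return parser
```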

Exit Codes

Code	Meaning
`0`	Success
`1`	All sources failed OR disk write failed
`2`	Config/state error (config.json missing, bad --track value)
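The mapping from run outcome to exit code 0 or 1 can be sketched as a small pure function. This assumes that "all sources failed" is detected as every source returning an empty list, which follows from the fallback column in the Sources table but is not confirmed against the code; the names `exit_code_for`, `EXIT_OK`, and `EXIT_RUN_FAILED` are hypothetical.

```python
EXIT_OK = 0
EXIT_RUN_FAILED = 1   # all sources failed, or the disk write failed
EXIT_CONFIG_ERROR = 2  # raised earlier, before any fetching starts


def exit_code_for(results: dict, write_ok: bool = True) -> int:
    """Map per-source results and write status to a process exit code."""
    # Assumption: a source that failed contributes an empty list.
    all_failed = all(len(items) == 0 for items in results.values())
    if all_failed or not write_ok:
        return EXIT_RUN_FAILED
    return EXIT_OK
```

Under this sketch a run where only some sources return data still exits 0, so a single flaky source would not block the nightly report.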

Configuration

Edit config.json to customise per-track queries:

{
  "tracks": {
    "ai": {
      "arxiv_categories": ["cs.AI", "cs.MA", "cs.CL", "cs.LG"],
      "arxiv_keywords": ["agent", "LLM", ...],
      "github_languages": ["python", "rust", "typescript"],
      "github_topics": ["ai-agent", "llm", ...],
      "hn_keywords": ["AI", "GPT", "Claude", ...],
      "web_queries": ["AI agent framework news 2026", ...]
    }
  }
}
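Reading one track's settings from a file shaped like the example above might look like this sketch. The helper name `load_track_config` and the `ConfigError` exception are hypothetical; the idea is that a missing or invalid `config.json` surfaces as a single error type that the entrypoint can translate into exit code 2.

```python
import json
from pathlib import Path


class ConfigError(Exception):
    """Raised for a missing, unreadable, or incomplete config.json."""


def load_track_config(config_path: Path, track: str) -> dict:
    """Return the settings dict for one track, or raise ConfigError."""
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except FileNotFoundError as exc:
        raise ConfigError(f"config not found: {config_path}") from exc
    except json.JSONDecodeError as exc:
        raise ConfigError(f"config is not valid JSON: {exc}") from exc

    try:
        return config["tracks"][track]
    except (KeyError, TypeError) as exc:
        raise ConfigError(f"track not defined in config: {track}") from exc
```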

Cron Integration

# Add to OpenClaw cron: 14:00 UTC, which is 1 AM Sydney the next day during daylight saving (AEDT, UTC+11) and midnight during standard time (AEST, UTC+10)
# The cron wrapper captures stdout and sends to Telegram
0 14 * * * cd ~/.openclaw/workspace && uv run --with httpx python skills/autoresearch/scripts/run.py 2>>~/.openclaw/workspace/memory/autoresearch-errors.log

The script prints a 3-line teaser to stdout:

🔬 **Nightly Research: AI & Agents**
• Top paper: Scaling Laws for Agent Reasoning… — We study how reasoning…
• Trending: microsoft/autogen ⭐342 | HN: Show HN: I built…

The cron agent captures stdout and sends it to Telegram via the message tool.
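The three-line teaser shown above can be assembled roughly as follows. This is a sketch only: the function names, the input dict keys, and the 40-character truncation limit are assumptions, and the real `synthesise.py` may pick and trim items differently.

```python
def truncate(text: str, limit: int) -> str:
    """Collapse whitespace and cut to `limit` characters, adding an ellipsis if cut."""
    text = " ".join(text.split())
    return text if len(text) <= limit else text[:limit].rstrip() + "\u2026"


def build_teaser(track_name: str, paper: dict, repo: dict, hn_title: str) -> str:
    """Build the 3-line Telegram teaser that is printed to stdout."""
    lines = [
        f"\U0001F52C **Nightly Research: {track_name}**",
        f"\u2022 Top paper: {truncate(paper['title'], 40)} \u2014 {truncate(paper['summary'], 40)}",
        f"\u2022 Trending: {repo['name']} \u2b50{repo['stars']} | HN: {truncate(hn_title, 40)}",
    ]
    return "\n".join(lines)
```

Because the cron wrapper forwards stdout to Telegram, anything else the script prints should go to stderr so the teaser stays exactly three lines.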

State

State is persisted in state.json:

{
  "current_track_index": 1,
  "last_run": "2026-03-15T14:02:31.123456+00:00",
  "last_tracks": ["ai"]
}

State only advances on a successful run (exit 0). If all sources fail, state stays at the same track so tomorrow retries the same track.
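The rotation rule can be sketched as two small functions over the state dict. The names `current_track` and `advance` are hypothetical, and treating `last_tracks` as a one-element list holding the track that just ran is an assumption inferred from the example above, not from `state.py`.

```python
from datetime import datetime, timezone
from typing import Optional

TRACKS = ["ai", "crypto", "devtools"]


def current_track(state: dict) -> str:
    """Return the track that the next run should use."""
    return TRACKS[state.get("current_track_index", 0) % len(TRACKS)]


def advance(state: dict, now: Optional[datetime] = None) -> dict:
    """Return the state to persist after a successful run (exit 0)."""
    now = now or datetime.now(timezone.utc)
    index = state.get("current_track_index", 0) % len(TRACKS)
    return {
        "current_track_index": (index + 1) % len(TRACKS),  # wrap back to "ai"
        "last_run": now.isoformat(),
        "last_tracks": [TRACKS[index]],  # assumption: only the most recent track
    }
```

A caller would invoke `advance` and write `state.json` only after the report has been written; on a failed or dry run it would leave the file untouched, so the same track is retried next time.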

Dependencies

httpx — all HTTP (via uv run --with httpx)
xml.etree.ElementTree — arXiv Atom XML parsing (stdlib)
json, re, asyncio, argparse, pathlib — stdlib

No additional dependencies needed. No pyproject.toml required in the skill dir.

Integration with Book Draft

Other cron jobs or agents can read the latest report directly:

cat ~/.openclaw/workspace/memory/autoresearch-latest.md

Or in Python:

from pathlib import Path
report = Path.home() / ".openclaw/workspace/memory/autoresearch-latest.md"
content = report.read_text()

File Structure

skills/autoresearch/
├── SKILL.md          # This file
├── PLAN.md           # Architecture and spec
├── config.json       # Track definitions + source config
├── state.json        # Runtime state (auto-managed)
└── scripts/
    ├── run.py        # CLI entrypoint (main pipeline)
    ├── sources.py    # Data fetchers (arXiv, GitHub, HN, web)
    ├── synthesise.py # Report builder (markdown synthesis)
    └── state.py      # Track rotation state machine

Security Recommendations

This skill appears to implement the advertised nightly research pipeline, but there are two things to check before installing:

1) Chromium requirement: SKILL.md and README claim a Chromium browser is required, but the code does not use browser automation. That requirement is unnecessary and may confuse security policies that gate browser-capable skills. Ask the author to remove the anyBins metadata or justify browser usage.

2) Brave API key handling: the skill will use BRAVE_API_KEY if present (reasonable), but its helper will also scan ~/.openclaw/config.json and ~/.openclaw/agents/main/config.json for keys. If you keep API keys in OpenClaw config files you do not want other skills reading, this is risky. Prefer setting BRAVE_API_KEY explicitly for the agent that will run this skill, or modify the helper to avoid scanning other config files.

Other recommendations:

- Run a dry-run first (the repo includes --dry-run) and inspect the outputs and stderr logs.
- Audit any local OpenClaw config files for sensitive keys and restrict access if needed.
- If you expect strict least-privilege behavior, request the author to remove the config-file probing and the misleading Chromium requirement, and/or run the skill in a sandboxed environment.

I flagged these as 'suspicious' (not malicious): the mismatches look like convenience shortcuts or sloppy metadata rather than intentional exfiltration. If the author provides a clear justification for reading OpenClaw configs (for example: documented operator workflow where a central Brave key is intentionally shared), these concerns would be resolved.
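The suggestion to avoid scanning other config files could be implemented along the lines of this sketch, which resolves the key from the environment only. It is not the skill's actual `web_search_helper`; the function name `brave_api_key` is hypothetical, and the optional `environ` parameter exists only to make the lookup easy to test.

```python
import os
from typing import Mapping, Optional


def brave_api_key(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the Brave key from BRAVE_API_KEY, or None so web search can be skipped."""
    env = os.environ if environ is None else environ
    key = env.get("BRAVE_API_KEY", "").strip()
    # Deliberately no fallback to ~/.openclaw config files: a missing key
    # means "skip web search", matching the documented silent-skip behaviour.
    return key or None
```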

Capability Tags

crypto
requires-sensitive-credentials

Capability Assessment

⚠ Purpose & Capability

The skill's stated purpose (nightly aggregation from arXiv, GitHub, HN, Brave Search) matches the code and files provided. However, the SKILL.md metadata and README assert a need for a Chromium browser (anyBins: chromium-browser/chromium/google-chrome) and mention browser automation; none of the included scripts use a headless browser or drive Chromium. This Chromium requirement is unnecessary for the implemented HTTP + regex + Brave API approach and is inconsistent with the implementation. Also, web_search_helper attempts to read OpenClaw config files to locate a Brave API key. Accessing other agents' config files is not necessary to implement the stated functionality and expands the skill's scope.

ℹ Instruction Scope

Runtime instructions (SKILL.md and examples) correctly describe running the Python scripts, dry-run behavior, file outputs, and that stdout is used for Telegram teasers. The actual code stays within the described sources (arXiv, GitHub, Hacker News, Brave). The one scope creep is web_search_helper: when BRAVE_API_KEY isn't present in environment, it searches for keys inside ~/.openclaw/config.json and ~/.openclaw/agents/main/config.json. That directs the skill to read other agent/system config files — broader than a single-service API key lookup and worth reviewing. Otherwise instructions are concrete (no vague 'gather whatever context you need').

✓ Install Mechanism

There is no install spec (instruction-only skill), which minimizes install-time risk. The code relies on httpx and the runtime note 'uv run --with httpx' matches the dependency model. No remote downloads or arbitrary install/extract operations are present in the manifest.

⚠ Credentials

The skill doesn’t declare required env vars but does read BRAVE_API_KEY (and OPENCLAW_BRAVE_KEY) if present — reasonable for sending queries to Brave Search. However, web_search_helper falls back to reading other OpenClaw config files to extract a Brave key if the environment variables are not set. Attempting to parse other agents' config files to locate API keys is disproportionate to the stated purpose and raises privacy/credential exposure concerns. No other unrelated credentials are requested, and no secrets are hardcoded.

✓ Persistence & Privilege

The skill writes only to the documented workspace paths (memory/autoresearch-latest.md, memory/autoresearch-archive.md) and maintains a small state.json in the skill directory. always:true is not set and the skill does not modify other skills' configs. The only notable persistence-related action beyond the description is reading OpenClaw config files (see Credentials above).

Version History

v1.0.0

- Initial release of autoresearch-pipeline.
- Provides a nightly research aggregator with automated track rotation and Markdown report synthesis.
- Integrates four independent sources: arXiv, GitHub Trending, Hacker News, and Brave Search.
- Outputs both a detailed report and a concise Telegram teaser.
- Includes CLI options for track selection, dry run mode, and verbose logging.
- Requires a Chromium-based browser for web automation tasks.

Metadata

Slug autoresearch-pipeline

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

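FAQ

What is AutoResearch Pipeline?

Automates nightly research by rotating topics, fetching from arXiv, GitHub, Hacker News, Brave Search, and generating structured markdown reports with Telegr... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 104 cumulative downloads to date.

How do I install AutoResearch Pipeline?

Run the command "/install autoresearch-pipeline" in the OpenClaw or Claude Code chat box for a one-click install; no additional configuration is required.

Is AutoResearch Pipeline free?

Yes. AutoResearch Pipeline is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does AutoResearch Pipeline support?

AutoResearch Pipeline is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed AutoResearch Pipeline?

It is developed and maintained by bowen31337 (@bowen31337). The current version is v1.0.0.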