
AutoResearch Pipeline

by bowen31337 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 104 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install autoresearch-pipeline
Description
Automates nightly research by rotating topics, fetching from arXiv, GitHub, Hacker News, Brave Search, and generating structured markdown reports with Telegr...
README (SKILL.md)

autoresearch — Nightly Research Pipeline

A zero-cost nightly research aggregator that rotates through 3 topic tracks, pulls from 4 independent sources, synthesises a structured markdown report, and prints a 3-line Telegram teaser to stdout.

Quick Start

# Dry run — fetches real data, prints teaser, no file writes
cd ~/.openclaw/workspace
uv run --with httpx python skills/autoresearch/scripts/run.py --dry-run

# Full run — writes to memory/ and advances state
uv run --with httpx python skills/autoresearch/scripts/run.py

# Force a specific track
uv run --with httpx python skills/autoresearch/scripts/run.py --track crypto

# Verbose output (debug logs to stderr)
uv run --with httpx python skills/autoresearch/scripts/run.py --dry-run --verbose

Tracks

The pipeline rotates through 3 tracks in order (persisted in state.json):

Track     Display Name     Sources Focus
ai        AI & Agents      cs.AI/MA/CL/LG arXiv, Python/Rust/TS GitHub, LLM HN keywords
crypto    Crypto & DeFi    cs.CR/DC arXiv, Solidity/Rust GitHub, crypto HN keywords
devtools  Developer Tools  cs.SE/PL arXiv, Rust/Go/TS/Python GitHub, CLI/editor HN keywords

Sources

Source           API                          Auth               Fallback
arXiv            Atom API (export.arxiv.org)  None               Returns [] on error
GitHub Trending  Public HTML scrape           None               Returns [] on structure change
Hacker News      Firebase JSON API            None               Returns partial results
Web Search       Brave Search API             BRAVE_API_KEY env  Skipped silently if no key
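
The Fallback column implies that each source degrades on its own rather than aborting the whole run. A minimal sketch of that pattern, assuming hypothetical zero-argument fetcher coroutines (the real functions in sources.py are not shown in this listing, so the names below are illustrative only):

```python
import asyncio


async def gather_sources(fetchers):
    """Run all fetchers concurrently; a failing source yields [] instead of raising.

    `fetchers` maps a source name to a zero-argument coroutine function.
    These names are hypothetical, not the skill's actual API.
    """
    names = list(fetchers)
    results = await asyncio.gather(
        *(fetchers[name]() for name in names), return_exceptions=True
    )
    collected = {}
    for name, result in zip(names, results):
        # With return_exceptions=True, errors come back as values; swap them for [].
        collected[name] = [] if isinstance(result, BaseException) else result
    return collected


def all_failed(collected):
    """True when every source came back empty (the documented exit-code-1 case)."""
    return all(not items for items in collected.values())


async def _ok():
    # Stand-in for a source that responds normally.
    return [{"title": "example item"}]


async def _broken():
    # Stand-in for a source whose upstream call fails.
    raise RuntimeError("simulated upstream failure")


demo = asyncio.run(gather_sources({"arxiv": _ok, "github": _broken}))
```

Here one source fails and the other still contributes its items, so the run would not count as "all sources failed".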

Output Files

File                            Description
memory/autoresearch-latest.md   Overwritten each run: latest report
memory/autoresearch-archive.md  Append-only: all runs with date markers
memory/autoresearch-errors.log  Stderr from cron runs
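
The overwrite-versus-append split above can be sketched as follows. This is an illustration only: the function name `write_report` and the date-marker format are assumptions, since the listing does not show how the skill formats its archive markers.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_report(memory_dir: Path, report: str, now: datetime) -> None:
    """Overwrite the latest report and append the same text to the archive."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    (memory_dir / "autoresearch-latest.md").write_text(report, encoding="utf-8")
    # Assumed marker format; the skill's real date marker may differ.
    marker = f"\n\n--- run {now.date().isoformat()} ---\n\n"
    with (memory_dir / "autoresearch-archive.md").open("a", encoding="utf-8") as fh:
        fh.write(marker + report)


# Two runs against a throwaway directory: latest keeps one, archive keeps both.
with tempfile.TemporaryDirectory() as tmp:
    memory = Path(tmp) / "memory"
    write_report(memory, "# first run", datetime(2026, 3, 15, tzinfo=timezone.utc))
    write_report(memory, "# second run", datetime(2026, 3, 16, tzinfo=timezone.utc))
    latest_text = (memory / "autoresearch-latest.md").read_text(encoding="utf-8")
    archive_text = (memory / "autoresearch-archive.md").read_text(encoding="utf-8")
```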

CLI Flags

Flag                        Default  Description
--track ai|crypto|devtools  Rotate   Override track rotation for this run
--dry-run                   off      Fetch + synthesise but skip file writes and state advance
--verbose                   off      Print DEBUG logs to stderr

Exit Codes

Code Meaning
0 Success
1 All sources failed OR disk write failed
2 Config/state error (config.json missing, bad --track value)
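
The flags and exit codes above map naturally onto argparse, which already exits with status 2 when an argument fails a `choices` check. A minimal sketch, assuming the parser is built roughly like this (the real run.py is not shown here, and `build_parser` is a hypothetical name):

```python
import argparse

TRACKS = ("ai", "crypto", "devtools")


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the three documented flags."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("--track", choices=TRACKS, default=None,
                        help="override track rotation for this run")
    parser.add_argument("--dry-run", action="store_true",
                        help="fetch and synthesise, but skip file writes and state advance")
    parser.add_argument("--verbose", action="store_true",
                        help="print DEBUG logs to stderr")
    return parser


args = build_parser().parse_args(["--track", "crypto", "--dry-run"])

# An unknown track makes argparse print usage to stderr and raise SystemExit(2).
try:
    build_parser().parse_args(["--track", "nope"])
    bad_track_exit = None
except SystemExit as exc:
    bad_track_exit = exc.code
```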

Configuration

Edit config.json to customise per-track queries:

{
  "tracks": {
    "ai": {
      "arxiv_categories": ["cs.AI", "cs.MA", "cs.CL", "cs.LG"],
      "arxiv_keywords": ["agent", "LLM", ...],
      "github_languages": ["python", "rust", "typescript"],
      "github_topics": ["ai-agent", "llm", ...],
      "hn_keywords": ["AI", "GPT", "Claude", ...],
      "web_queries": ["AI agent framework news 2026", ...]
    }
  }
}
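
A loader for this file might validate the selected track before any fetching starts. The sketch below is hypothetical: `load_track_config` and `ConfigError` are illustrative names, and treating every key from the sample as required is an assumption rather than documented behaviour.

```python
import json
import tempfile
from pathlib import Path

# Assumption: every key shown in the sample config is required.
REQUIRED_KEYS = (
    "arxiv_categories", "arxiv_keywords", "github_languages",
    "github_topics", "hn_keywords", "web_queries",
)


class ConfigError(Exception):
    """Missing or malformed config; a caller could map this to exit code 2."""


def load_track_config(config_path: Path, track: str) -> dict:
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        raise ConfigError(f"cannot read {config_path}: {exc}") from exc
    try:
        track_cfg = config["tracks"][track]
    except (KeyError, TypeError) as exc:
        raise ConfigError(f"track {track!r} is not defined") from exc
    if not isinstance(track_cfg, dict):
        raise ConfigError(f"track {track!r} must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in track_cfg]
    if missing:
        raise ConfigError(f"track {track!r} is missing keys: {missing}")
    return track_cfg


# Demo against a throwaway config that defines only the "ai" track.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    sample = {"tracks": {"ai": {key: [] for key in REQUIRED_KEYS}}}
    cfg_path.write_text(json.dumps(sample), encoding="utf-8")
    ai_cfg = load_track_config(cfg_path, "ai")
    try:
        load_track_config(cfg_path, "crypto")
        unknown_track_rejected = False
    except ConfigError:
        unknown_track_rejected = True
```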

Cron Integration

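# Add to OpenClaw cron: 14:00 UTC, which is 1 AM the next day in Sydney during
# daylight saving (AEDT, UTC+11) and midnight during standard time (AEST, UTC+10).
# Assumes the cron schedule is evaluated in UTC.
# The cron wrapper captures stdout and sends it to Telegram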
0 14 * * * cd ~/.openclaw/workspace && uv run --with httpx python skills/autoresearch/scripts/run.py 2>>~/.openclaw/workspace/memory/autoresearch-errors.log

The script prints a 3-line teaser to stdout:

🔬 **Nightly Research: AI & Agents**
• Top paper: Scaling Laws for Agent Reasoning… — We study how reasoning…
• Trending: microsoft/autogen ⭐342 | HN: Show HN: I built…

The cron agent captures stdout and sends it to Telegram via the message tool.
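
To illustrate how a three-line teaser of this shape could be assembled, here is a sketch. The function names, the input dictionaries, and the truncation limits are all assumptions; the skill's synthesise.py may build the teaser differently.

```python
DASH = "\u2014"      # the dash that separates title and summary in the sample teaser
ELLIPSIS = "\u2026"  # the single-character ellipsis used in the sample teaser


def shorten(text: str, limit: int) -> str:
    """Collapse whitespace and cut to at most `limit` characters, marking cuts with an ellipsis."""
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    return text[: limit - 1].rstrip() + ELLIPSIS


def build_teaser(track_name: str, paper: dict, repo: dict, hn_title: str) -> str:
    """Return exactly three lines: header, top paper, trending repo plus HN story."""
    lines = [
        f"🔬 **Nightly Research: {track_name}**",
        f"• Top paper: {shorten(paper['title'], 40)} {DASH} {shorten(paper['summary'], 40)}",
        f"• Trending: {repo['name']} ⭐{repo['stars']} | HN: {shorten(hn_title, 30)}",
    ]
    return "\n".join(lines)


# Placeholder inputs, not real pipeline output.
teaser = build_teaser(
    "AI & Agents",
    {"title": "A placeholder paper title that is long enough to be cut off",
     "summary": "Placeholder summary text."},
    {"name": "owner/repo", "stars": 42},
    "Show HN: placeholder",
)
print(teaser)
```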

State

State is persisted in state.json:

{
  "current_track_index": 1,
  "last_run": "2026-03-15T14:02:31.123456+00:00",
  "last_tracks": ["ai"]
}

State only advances on a successful run (exit 0). If all sources fail, state stays at the same track so tomorrow retries the same track.
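
The advance-on-success rule can be sketched as a pure function. This assumes `current_track_index` points at the next track to run and `last_tracks` records only the track that just ran, which matches the sample state above but is not confirmed by the listing; `next_state` is a hypothetical name.

```python
TRACKS = ["ai", "crypto", "devtools"]


def next_state(state: dict, succeeded: bool, now_iso: str) -> dict:
    """Return the state to persist after a run.

    On failure the state is returned unchanged, so the next run retries
    the same track. On success the index advances and wraps around.
    """
    if not succeeded:
        return dict(state)
    idx = state.get("current_track_index", 0) % len(TRACKS)
    return {
        "current_track_index": (idx + 1) % len(TRACKS),
        "last_run": now_iso,
        # Assumption: only the most recent track is recorded here.
        "last_tracks": [TRACKS[idx]],
    }


start = {"current_track_index": 0, "last_run": None, "last_tracks": []}
after_success = next_state(start, True, "2026-03-15T14:02:31+00:00")
after_failure = next_state(after_success, False, "2026-03-16T14:02:31+00:00")
```

Keeping the transition pure makes `--dry-run` trivial to support: compute the new state, then simply skip writing it.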

Dependencies

  • httpx — all HTTP (via uv run --with httpx)
  • xml.etree.ElementTree — arXiv Atom XML parsing (stdlib)
  • json, re, asyncio, argparse, pathlib — stdlib

No additional dependencies needed. No pyproject.toml required in the skill dir.
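
As an illustration of the stdlib-only arXiv parsing mentioned above, the sketch below reads an Atom feed with xml.etree.ElementTree and returns [] on malformed input, in line with the documented fallback. The function name, the output field names, and the sample feed (including its placeholder ID) are assumptions.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by the elements in an arXiv feed.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def parse_arxiv_feed(xml_text: str) -> list:
    """Extract id, title, and summary from each Atom entry; return [] if the XML is invalid."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return []
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        summary = entry.findtext("atom:summary", default="", namespaces=ATOM_NS)
        paper_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        papers.append({
            # Titles and abstracts are often wrapped across lines; collapse the whitespace.
            "title": " ".join(title.split()),
            "summary": " ".join(summary.split()),
            "id": paper_id.strip(),
        })
    return papers


# Placeholder feed for demonstration; not a real arXiv entry.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>An  Example
      Title</title>
    <summary>Placeholder abstract.</summary>
  </entry>
</feed>"""

papers = parse_arxiv_feed(SAMPLE)
```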

Integration with Book Draft

Other cron jobs or agents can read the latest report directly:

cat ~/.openclaw/workspace/memory/autoresearch-latest.md

Or in Python:

from pathlib import Path
report = Path.home() / ".openclaw/workspace/memory/autoresearch-latest.md"
content = report.read_text()

File Structure

skills/autoresearch/
├── SKILL.md          # This file
├── PLAN.md           # Architecture and spec
├── config.json       # Track definitions + source config
├── state.json        # Runtime state (auto-managed)
└── scripts/
    ├── run.py        # CLI entrypoint (main pipeline)
    ├── sources.py    # Data fetchers (arXiv, GitHub, HN, web)
    ├── synthesise.py # Report builder (markdown synthesis)
    └── state.py      # Track rotation state machine
Usage Guidance
This skill appears to implement the advertised nightly research pipeline, but there are two things to check before installing:

  1. Chromium requirement: SKILL.md and README claim a Chromium browser is required, but the code does not use browser automation. That requirement is unnecessary and may confuse security policies that gate browser-capable skills. Ask the author to remove the anyBins metadata or justify browser usage.
  2. Brave API key handling: the skill will use BRAVE_API_KEY if present (reasonable), but its helper will also scan ~/.openclaw/config.json and ~/.openclaw/agents/main/config.json for keys. If you keep API keys in OpenClaw config files that you do not want other skills reading, this is risky. Prefer setting BRAVE_API_KEY explicitly for the agent that will run this skill, or modify the helper to avoid scanning other config files.

Other recommendations:

  • Run a dry-run first (the repo includes --dry-run) and inspect the outputs and stderr logs.
  • Audit any local OpenClaw config files for sensitive keys and restrict access if needed.
  • If you expect strict least-privilege behavior, ask the author to remove the config-file probing and the misleading Chromium requirement, and/or run the skill in a sandboxed environment.

I flagged these as 'suspicious' (not malicious): the mismatches look like convenience shortcuts or sloppy metadata rather than intentional exfiltration. If the author provides a clear justification for reading OpenClaw configs (for example, a documented operator workflow where a central Brave key is intentionally shared), these concerns would be resolved.
Capability Tags
crypto · requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The skill's stated purpose (nightly aggregation from arXiv, GitHub, HN, Brave Search) matches the code and files provided. However, the SKILL.md metadata and README assert a need for a Chromium browser (anyBins: chromium-browser/chromium/google-chrome) and mention browser automation; none of the included scripts use a headless browser or drive Chromium. This Chromium requirement is unnecessary for the implemented HTTP + regex + Brave API approach and is incoherent with the implementation. Also, web_search_helper attempts to read OpenClaw config files to locate a Brave API key — accessing other agents' config files is not strictly necessary to implement the stated functionality and expands the skill's scope.
Instruction Scope
Runtime instructions (SKILL.md and examples) correctly describe running the Python scripts, dry-run behavior, file outputs, and that stdout is used for Telegram teasers. The actual code stays within the described sources (arXiv, GitHub, Hacker News, Brave). The one scope creep is web_search_helper: when BRAVE_API_KEY isn't present in environment, it searches for keys inside ~/.openclaw/config.json and ~/.openclaw/agents/main/config.json. That directs the skill to read other agent/system config files — broader than a single-service API key lookup and worth reviewing. Otherwise instructions are concrete (no vague 'gather whatever context you need').
Install Mechanism
There is no install spec (instruction-only skill), which minimizes install-time risk. The code relies on httpx and the runtime note 'uv run --with httpx' matches the dependency model. No remote downloads or arbitrary install/extract operations are present in the manifest.
Credentials
The skill doesn’t declare required env vars but does read BRAVE_API_KEY (and OPENCLAW_BRAVE_KEY) if present — reasonable for sending queries to Brave Search. However, web_search_helper falls back to reading other OpenClaw config files to extract a Brave key if the environment variables are not set. Attempting to parse other agents' config files to locate API keys is disproportionate to the stated purpose and raises privacy/credential exposure concerns. No other unrelated credentials are requested, and no secrets are hardcoded.
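
A narrower lookup that stops at environment variables, as the review recommends, could look like the sketch below. It is not the skill's web_search_helper; the function name and the precedence order between the two variables are assumptions.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the review; the order of precedence is an assumption.
KEY_VARS = ("BRAVE_API_KEY", "OPENCLAW_BRAVE_KEY")


def find_brave_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty key from the environment, or None.

    Deliberately does not read any OpenClaw config files: with no key set,
    the web search source is simply skipped.
    """
    for name in KEY_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


key = find_brave_key({"OPENCLAW_BRAVE_KEY": "  example-key  "})
```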
Persistence & Privilege
The skill writes only to the documented workspace paths (memory/autoresearch-latest.md, memory/autoresearch-archive.md) and maintains a small state.json in the skill directory. always:true is not set and the skill does not modify other skills' configs. The only notable persistence-related action beyond the description is reading OpenClaw config files (see Credentials above).
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install autoresearch-pipeline
  3. After installation, invoke the skill by name or use /autoresearch-pipeline
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
  • Initial release of autoresearch-pipeline.
  • Provides a nightly research aggregator with automated track rotation and Markdown report synthesis.
  • Integrates four independent sources: arXiv, GitHub Trending, Hacker News, and Brave Search.
  • Outputs both a detailed report and a concise Telegram teaser.
  • Includes CLI options for track selection, dry run mode, and verbose logging.
  • Requires a Chromium-based browser for web automation tasks.
Metadata
Slug autoresearch-pipeline
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is AutoResearch Pipeline?

Automates nightly research by rotating topics, fetching from arXiv, GitHub, Hacker News, Brave Search, and generating structured markdown reports with Telegr... It is an AI Agent Skill for Claude Code / OpenClaw, with 104 downloads so far.

How do I install AutoResearch Pipeline?

Run "/install autoresearch-pipeline" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is AutoResearch Pipeline free?

Yes, AutoResearch Pipeline is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does AutoResearch Pipeline support?

AutoResearch Pipeline is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created AutoResearch Pipeline?

It is built and maintained by bowen31337 (@bowen31337); the current version is v1.0.0.
