Autoresearch Pilot v1.0
Install: clawhub install autoresearch-pilot
Your co-pilot for Karpathy's autoresearch — autonomous AI-driven LLM training experiments on a single GPU.
Language
Detect from the language of the user's message. Default: English.
How It Works
Autoresearch lets an AI agent modify train.py, run a 5-minute experiment, check whether val_bpb improved, and iterate. This skill helps you set it up, write an effective program.md, and interpret results.
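The modify, run, check, iterate cycle can be sketched as a greedy keep-if-better loop. This is only an illustration: `run_search` and `fake_experiment` are hypothetical names, the scores are made up, and whether the real agent discards non-improving changes in exactly this way is an assumption, not something confirmed by the autoresearch repo.

```python
from typing import Callable

def run_search(run_experiment: Callable[[int], float], n_experiments: int) -> tuple[float, list[int]]:
    """Greedy loop: keep a change only if it lowers val_bpb."""
    best = float("inf")
    kept = []
    for i in range(n_experiments):
        # Stand-in for "edit train.py, train for 5 minutes, evaluate".
        val_bpb = run_experiment(i)
        if val_bpb < best:  # lower val_bpb is better
            best = val_bpb
            kept.append(i)
        # Otherwise the change would be dropped (assumed behavior).
    return best, kept

def fake_experiment(i: int) -> float:
    """Hypothetical scores standing in for real training runs."""
    scores = [1.20, 1.25, 1.15, 1.18, 1.10]
    return scores[i]

best, kept = run_search(fake_experiment, 5)
print(best, kept)  # 1.1 [0, 2, 4]
```

In this toy run, experiments 1 and 3 scored worse than the best so far and are not kept.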
The Three Files
| File | Role | Modified by |
|---|---|---|
| prepare.py | Data prep, tokenizer, utilities | Never (fixed) |
| train.py | Model, optimizer, training loop | The AI agent |
| program.md | Instructions for the AI agent | You (the human) |
Key Concepts
- val_bpb — Validation bits per byte. Lower = better. Vocab-size-independent metric.
- Time budget — Each experiment runs exactly 5 minutes (wall clock). ~100 experiments per night.
- Muon optimizer — Included. Often outperforms AdamW for small models.
- DEPTH — Primary model complexity knob (default 8). Lower for smaller GPUs.
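To make the first two concepts concrete, here is a small sketch of how bits per byte is usually derived from a summed cross-entropy loss, and how the 5-minute budget translates into runs per night. The function names are hypothetical, and the repo's own evaluation code may compute the metric differently. Normalizing by bytes rather than tokens is what makes the number comparable across vocabulary sizes.

```python
import math

def bits_per_byte(total_loss_nats: float, total_bytes: int) -> float:
    """Convert summed cross-entropy (in nats) over a text into bits per byte."""
    return total_loss_nats / (math.log(2) * total_bytes)

def experiments_per_night(hours: float, minutes_per_experiment: float = 5.0) -> int:
    """Count full experiments that fit in the given hours, ignoring overhead between runs."""
    return int(hours * 60 // minutes_per_experiment)

# Example: 1000 tokens at 4 bits each (4 * ln 2 nats), covering 4000 bytes of text.
total_nats = 1000 * 4 * math.log(2)
print(round(bits_per_byte(total_nats, 4000), 6))  # 1.0
print(experiments_per_night(8))  # 96
```

An 8-hour night at 5 minutes per run gives 96 experiments, which matches the "~100 per night" estimate.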
Setup Guide
Walk the user through these steps when they want to start:
- Prerequisites: Python 3.10+, NVIDIA GPU (H100 recommended), uv package manager
- Clone repo: git clone https://github.com/karpathy/autoresearch
- Install: uv sync inside the repo
- Prepare data: uv run prepare.py (one-time, ~2 min)
- Test run: uv run train.py (should complete in ~5 min)
- Point your AI agent at program.md and let it experiment
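Before starting the steps above, a user could run a quick preflight check on their own machine. The sketch below is a hypothetical helper, not part of autoresearch or this skill. It assumes that finding nvidia-smi on the PATH is a reasonable rough proxy for a working NVIDIA driver, which will not hold on every system.

```python
import shutil
import sys

def preflight(min_python: tuple[int, int] = (3, 10)) -> dict[str, bool]:
    """Report which prerequisites appear to be available on this machine."""
    return {
        "python": sys.version_info >= min_python,
        "git": shutil.which("git") is not None,
        "uv": shutil.which("uv") is not None,
        # Rough proxy only: the tool being on PATH does not prove a usable GPU.
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }

for name, ok in preflight().items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

Anything reported as missing should be installed before cloning the repo and running uv sync.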
Small GPU Tips (RTX 3090, MacBook, etc.)
When the user has a smaller GPU, suggest these changes to prepare.py and train.py:
- Use TinyStories dataset (lower entropy, works with small models)
- Lower vocab_size to 4096 or 2048 (or 256 for byte-level)
- Lower MAX_SEQ_LEN to 256
- Lower DEPTH to 4 in train.py
- Use WINDOW_PATTERN of "L" only
- Lower TOTAL_BATCH_SIZE to 2**14
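The tips above can be collected into a single set of suggested overrides to show to the user. This is a sketch only: `SMALL_GPU_OVERRIDES` and `format_suggestions` are hypothetical names, the "dataset" key is a label rather than a constant in the repo, and the user should check the repo to confirm which file defines each setting.

```python
# Suggested starting values for a small GPU, taken from the list above.
SMALL_GPU_OVERRIDES = {
    "dataset": "TinyStories",    # label only, not a repo constant
    "vocab_size": 4096,          # 2048 or 256 (byte-level) are also options
    "MAX_SEQ_LEN": 256,
    "DEPTH": 4,                  # set in train.py
    "WINDOW_PATTERN": "L",
    "TOTAL_BATCH_SIZE": 2**14,   # 16384
}

def format_suggestions(overrides: dict) -> list[str]:
    """Render each override as a 'NAME = value' line for the user to apply by hand."""
    return [f"{name} = {value!r}" for name, value in overrides.items()]

for line in format_suggestions(SMALL_GPU_OVERRIDES):
    print(line)
```

Keeping the values in one place makes it easy to present them as suggestions, in line with the guide-only nature of this skill.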
Writing program.md
When the user asks for help with program.md, help them define:
- Research goal — What to optimize for (speed, quality, efficiency)
- Experiment strategy — What to try first, what to vary
- Success criteria — Target val_bpb or improvement threshold
- Safety guardrails — What the agent should NOT change
Example structure for program.md:
- State the goal clearly
- List allowed modifications (architecture, hyperparams, optimizer)
- Define experiment logging format
- Set a stopping condition (e.g., "stop after 50 experiments with no improvement")
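As an illustration of that structure, the sketch below assembles a program.md skeleton from the four parts. The section titles, the example log format, and the function name `render_program_md` are all assumptions chosen for this example. The autoresearch repo does not prescribe this layout, and program.md is free-form text.

```python
def render_program_md(goal, allowed, forbidden, log_format, stop_after):
    """Build a plain-text program.md skeleton; section titles are illustrative."""
    lines = ["Goal", goal, ""]
    lines.append("Allowed modifications")
    lines.extend(f"- {item}" for item in allowed)
    lines.append("")
    lines.append("Do not change")
    lines.extend(f"- {item}" for item in forbidden)
    lines.append("")
    lines.append("Logging format")
    lines.append(log_format)
    lines.append("")
    lines.append("Stopping condition")
    lines.append(f"Stop after {stop_after} experiments with no improvement in val_bpb.")
    return "\n".join(lines) + "\n"

program_md = render_program_md(
    goal="Minimize val_bpb within the fixed 5-minute budget.",
    allowed=["architecture", "hyperparameters", "optimizer"],
    forbidden=["prepare.py", "the time budget"],
    log_format="experiment number | change made | val_bpb",
    stop_after=50,
)
print(program_md)
```

The user can paste the printed text into program.md and edit it, since this skill does not write files itself.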
Interpreting Results
When the user shares experiment logs:
| Metric | Good | Bad |
|---|---|---|
| val_bpb decreasing | Model is learning | Check for bugs |
| val_bpb plateaued | May need architecture change | Normal for small models |
| Training loss << val loss | Overfitting | Increase regularization |
| NaN loss | Learning rate too high or instability | Lower LR, check gradients |
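The checks in the table can be approximated with a simple rule-based triage of a user's log. This is a sketch under stated assumptions: the thresholds (`plateau_window`, `min_delta`, `overfit_gap`) and the function name `diagnose` are hypothetical and would need tuning for real runs.

```python
import math

def diagnose(val_bpb_history, train_loss=None, val_loss=None,
             plateau_window=5, min_delta=0.001, overfit_gap=0.5):
    """Return a coarse label for a run history; all thresholds are illustrative."""
    values = list(val_bpb_history)
    if any(math.isnan(v) for v in values):
        return "nan"  # likely LR too high or numerical instability
    if train_loss is not None and val_loss is not None:
        if val_loss - train_loss > overfit_gap:
            return "overfitting"  # training loss far below validation loss
    if len(values) <= plateau_window:
        return "too_few_runs"
    earlier_best = min(values[:-plateau_window])
    recent_best = min(values[-plateau_window:])
    if earlier_best - recent_best < min_delta:
        return "plateau"  # no meaningful gain in the last few experiments
    return "improving"

print(diagnose([1.30, 1.25, 1.20, 1.15, 1.10, 1.05, 1.00]))  # improving
print(diagnose([1.30, 1.20, 1.10, 1.10, 1.10, 1.10, 1.10, 1.10]))  # plateau
```

Each label maps to a row of the table: "nan" suggests lowering the learning rate, "overfitting" suggests more regularization, and "plateau" suggests trying an architecture change.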
Quick Commands
| User says | Action |
|---|---|
| "set up autoresearch" | Walk through setup steps |
| "help me write program.md" | Draft research instructions |
| "my val_bpb is X" | Evaluate and suggest next steps |
| "optimize for small GPU" | Suggest parameter changes |
| "what should I try next" | Analyze recent experiments, propose new direction |
Guidelines for Agent
- Read-only guidance — suggest changes, let the user apply them
- Check GPU capability — ask what GPU they have before recommending parameters
- Start simple — recommend TinyStories + DEPTH 4 for first-time users
- Explain val_bpb — many users are new to this metric
- Refer to autoresearch repo — it's the source of truth for all defaults
- No exec — guide only, never run training commands
What This Skill Does NOT Do
- Does NOT run training commands or experiments
- Does NOT modify train.py or prepare.py directly
- Does NOT require an NVIDIA GPU (guidance works for any platform)
- Does NOT access credentials or private data
- Does NOT write any files — pure advisory
More by TommoT2
- setup-doctor — Diagnose and fix OpenClaw setup issues
- context-brief — Persistent context survival across sessions
- model-pilot — Intelligent model routing and cost optimization
Install the full suite:
clawhub install autoresearch-pilot setup-doctor context-brief model-pilot
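- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install autoresearch-pilot
- After installation, invoke the skill by name or trigger it with /autoresearch-pilot
- Provide the required inputs described in the skill's parameter notes to get structured output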
What is Autoresearch Pilot?
Guide for setting up and running Karpathy's autoresearch — autonomous AI-driven LLM training experiments. Helps write program.md, interpret results, and opti... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 132 downloads to date.
How do I install Autoresearch Pilot?
Run the command "/install autoresearch-pilot" in the OpenClaw or Claude Code chat box to install it in one step, with no extra configuration.
Is Autoresearch Pilot free?
Yes. Autoresearch Pilot is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Autoresearch Pilot support?
Autoresearch Pilot is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Autoresearch Pilot?
It is developed and maintained by TommoT2 (@tommot2). The current version is v1.0.0.