tommot2

Autoresearch Pilot

Author: TommoT2 · GitHub · v1.0.0 · MIT-0
cross-platform · ✓ Security check passed
132 total downloads · 0 favorites · 1 current install · 1 version
Install in OpenClaw
/install autoresearch-pilot
Description
Guide for setting up and running Karpathy's autoresearch — autonomous AI-driven LLM training experiments. Helps write program.md, interpret results, and opti...
Usage (SKILL.md)

Autoresearch Pilot v1.0

Install: clawhub install autoresearch-pilot

Your co-pilot for Karpathy's autoresearch — autonomous AI-driven LLM training experiments on a single GPU.

Language

Detect from user's message language. Default: English.

How It Works

Autoresearch lets an AI agent modify train.py, run 5-minute experiments, check if val_bpb improved, and iterate. This skill helps you set it up, write optimal program.md, and interpret results.
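
The loop described above can be sketched as a simple keep-if-better search. This is only an illustration: `run_experiment` is a hypothetical stand-in that returns a simulated val_bpb, whereas in the real project the agent edits train.py and each run is an actual 5-minute training job.

```python
import random


def run_experiment(rng: random.Random, current_best: float) -> float:
    """Hypothetical stand-in for one 5-minute run; returns a simulated val_bpb."""
    return current_best + rng.uniform(-0.02, 0.03)


def research_loop(n_experiments: int, start_bpb: float, seed: int = 0) -> tuple[float, int]:
    """Keep a change only when it lowers val_bpb; otherwise discard it."""
    rng = random.Random(seed)
    best = start_bpb
    kept = 0
    for _ in range(n_experiments):
        candidate = run_experiment(rng, best)
        if candidate < best:  # lower val_bpb is better
            best = candidate
            kept += 1
    return best, kept


best, kept = research_loop(n_experiments=100, start_bpb=1.0)
print(f"best val_bpb: {best:.4f}, changes kept: {kept}")
```

Because a change is kept only when it improves the metric, the best val_bpb can never get worse over the course of a session.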

The Three Files

File | Role | Modified by
prepare.py | Data prep, tokenizer, utilities | Never (fixed)
train.py | Model, optimizer, training loop | The AI agent
program.md | Instructions for the AI agent | You (the human)

Key Concepts

  • val_bpb — Validation bits per byte. Lower = better. Vocab-size-independent metric.
  • Time budget — Each experiment runs exactly 5 minutes (wall clock). ~100 experiments per night.
  • Muon optimizer — Included. Often outperforms AdamW for small models.
  • DEPTH — Primary model complexity knob (default 8). Lower for smaller GPUs.
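
For users new to val_bpb, the following sketch shows one common way to get bits per byte from a summed cross-entropy loss, and why it does not depend on vocabulary size: the loss is normalized by the bytes of text covered, not by the token count. The function name and the sample numbers are hypothetical; the evaluation code in the autoresearch repo is the source of truth for how the metric is actually computed there.

```python
import math


def bits_per_byte(total_nats: float, total_bytes: int) -> float:
    """Convert cross-entropy summed over a validation set (in nats) to bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nats / (math.log(2) * total_bytes)


# Hypothetical numbers: mean loss of 2.9 nats/token over 1,000 tokens spanning 4,200 bytes.
mean_loss, n_tokens, n_bytes = 2.9, 1_000, 4_200
print(f"val_bpb = {bits_per_byte(mean_loss * n_tokens, n_bytes):.3f}")

# Time budget arithmetic, assuming an 8-hour night and no overhead between runs.
experiments_per_night = (60 // 5) * 8
print(f"experiments per night: {experiments_per_night}")
```

The second calculation gives 96 runs, which is consistent with the "~100 experiments per night" figure.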

Setup Guide

Walk the user through these steps when they want to start:

  1. Prerequisites: Python 3.10+, NVIDIA GPU (H100 recommended), uv package manager
  2. Clone repo: git clone https://github.com/karpathy/autoresearch
  3. Install: uv sync inside the repo
  4. Prepare data: uv run prepare.py (one-time, ~2 min)
  5. Test run: uv run train.py (should complete in ~5 min)
  6. Point your AI agent at program.md and let it experiment
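
Before starting the steps above, the prerequisites from step 1 can be checked with a short script. This is a minimal sketch: it only looks for the `uv` and `nvidia-smi` executables on PATH and checks the Python version, and it does not run any training command. Treating `nvidia-smi` on PATH as a sign of a usable NVIDIA driver is an assumption.

```python
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Report which setup prerequisites appear to be present on this machine."""
    return {
        "python>=3.10": sys.version_info >= (3, 10),
        "uv on PATH": shutil.which("uv") is not None,
        # Assumption: nvidia-smi being on PATH indicates an installed NVIDIA driver.
        "nvidia-smi on PATH": shutil.which("nvidia-smi") is not None,
    }


for name, ok in check_prerequisites().items():
    print(f"{'ok' if ok else 'missing':8}{name}")
```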

Small GPU Tips (RTX 3090, MacBook, etc.)

When the user has a smaller GPU, suggest these changes to prepare.py and train.py:

  • Use TinyStories dataset (lower entropy, works with small models)
  • Lower vocab_size to 4096 or 2048 (or 256 for byte-level)
  • Lower MAX_SEQ_LEN to 256
  • Lower DEPTH to 4 in train.py
  • Use WINDOW_PATTERN of "L" only
  • Lower TOTAL_BATCH_SIZE to 2**14
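
The suggestions above can be collected in one place for review before editing anything. The dictionary below is a hypothetical container, not part of the repo; its keys mirror the identifiers named in the list. The sequences-per-step figure assumes TOTAL_BATCH_SIZE is counted in tokens, which should be confirmed against train.py.

```python
# Hypothetical summary of the small-GPU suggestions; apply them by hand in the repo.
SMALL_GPU_OVERRIDES = {
    "dataset": "TinyStories",
    "vocab_size": 4096,        # or 2048, or 256 for byte-level
    "MAX_SEQ_LEN": 256,
    "DEPTH": 4,
    "WINDOW_PATTERN": "L",
    "TOTAL_BATCH_SIZE": 2**14,
}

# Assumption: TOTAL_BATCH_SIZE is measured in tokens per optimizer step.
sequences_per_step = (
    SMALL_GPU_OVERRIDES["TOTAL_BATCH_SIZE"] // SMALL_GPU_OVERRIDES["MAX_SEQ_LEN"]
)

for key, value in SMALL_GPU_OVERRIDES.items():
    print(f"{key} = {value!r}")
print(f"sequences per step: {sequences_per_step}")
```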

Writing program.md

When the user asks for help with program.md, help them define:

  1. Research goal — What to optimize for (speed, quality, efficiency)
  2. Experiment strategy — What to try first, what to vary
  3. Success criteria — Target val_bpb or improvement threshold
  4. Safety guardrails — What the agent should NOT change

Example structure for program.md:

  • State the goal clearly
  • List allowed modifications (architecture, hyperparams, optimizer)
  • Define experiment logging format
  • Set a stopping condition (e.g., "stop after 50 experiments with no improvement")
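
As an illustration of the structure above, the sketch below assembles a program.md draft as a string for the user to review and save. The function name, the section titles, and the sample values are all hypothetical; the repo does not prescribe this layout.

```python
def draft_program_md(goal: str, allowed: list[str], forbidden: list[str],
                     log_format: str, stop_condition: str) -> str:
    """Assemble a program.md draft as text; section titles are illustrative only."""
    sections = [
        ("Goal", [goal]),
        ("Allowed modifications", [f"- {item}" for item in allowed]),
        ("Do not change", [f"- {item}" for item in forbidden]),
        ("Experiment log format", [log_format]),
        ("Stopping condition", [stop_condition]),
    ]
    lines: list[str] = []
    for title, body in sections:
        lines.append(f"## {title}")
        lines.extend(body)
        lines.append("")
    return "\n".join(lines)


draft = draft_program_md(
    goal="Minimize val_bpb within the 5-minute time budget.",
    allowed=["architecture", "hyperparameters", "optimizer"],
    forbidden=["prepare.py"],
    log_format="One line per experiment: change made, val_bpb, kept or reverted.",
    stop_condition="Stop after 50 experiments with no improvement.",
)
print(draft)
```

The function only returns text, so the user stays in control of what is written to disk.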

Interpreting Results

When the user shares experiment logs:

Observation | Likely meaning | Note or next step
val_bpb decreasing | Model is learning | If it is not decreasing, check for bugs
val_bpb plateaued | May need architecture change | Normal for small models
Training loss << val loss | Overfitting | Increase regularization
NaN loss | Learning rate too high or instability | Lower LR, check gradients
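
A first-pass triage of a val_bpb history along the lines of the table can be sketched as follows. The function name, the window size, and the improvement threshold are hypothetical choices for illustration, not values taken from the repo.

```python
import math


def triage(val_bpb_history: list[float], plateau_window: int = 5,
           min_delta: float = 0.001) -> str:
    """Classify a run history as 'nan', 'improving', 'plateaued', or 'insufficient data'."""
    if any(math.isnan(v) for v in val_bpb_history):
        return "nan"  # lower the learning rate and check gradients
    if len(val_bpb_history) <= plateau_window:
        return "insufficient data"
    earlier_best = min(val_bpb_history[:-plateau_window])
    recent_best = min(val_bpb_history[-plateau_window:])
    # Improving only if the recent window beat the earlier best by a real margin.
    if earlier_best - recent_best > min_delta:
        return "improving"
    return "plateaued"


print(triage([1.20, 1.10, 1.00, 0.95, 0.90, 0.85, 0.80]))
print(triage([1.00, 0.90, 0.90, 0.90, 0.90, 0.90, 0.90]))
```

Comparing the best value in the recent window against the best value before it makes the check tolerant of run-to-run noise in individual experiments.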

Quick Commands

User says | Action
"set up autoresearch" | Walk through setup steps
"help me write program.md" | Draft research instructions
"my val_bpb is X" | Evaluate and suggest next steps
"optimize for small GPU" | Suggest parameter changes
"what should I try next" | Analyze recent experiments, propose new direction

Guidelines for Agent

  1. Read-only guidance — suggest changes, let the user apply them
  2. Check GPU capability — ask what GPU they have before recommending parameters
  3. Start simple — recommend TinyStories + DEPTH 4 for first-time users
  4. Explain val_bpb — many users are new to this metric
  5. Refer to autoresearch repo — it's the source of truth for all defaults
  6. No exec — guide only, never run training commands

What This Skill Does NOT Do

  • Does NOT run training commands or experiments
  • Does NOT modify train.py or prepare.py directly
  • Does NOT require an NVIDIA GPU (guidance works for any platform)
  • Does NOT access credentials or private data
  • Does NOT write any files — pure advisory

More by TommoT2

  • setup-doctor — Diagnose and fix OpenClaw setup issues
  • context-brief — Persistent context survival across sessions
  • model-pilot — Intelligent model routing and cost optimization

Install the full suite:

clawhub install autoresearch-pilot setup-doctor context-brief model-pilot
Safe-Use Recommendations
This skill is a textual co‑pilot and does not install or run code by itself, which is good. Before following its instructions: (1) verify the GitHub repository URL and review the repo code (especially scripts like prepare.py/train.py) before running them; (2) confirm what the 'uv' package manager is and inspect any packages it installs; (3) be aware that running training jobs can consume significant GPU/time and may use or generate datasets you should check for licensing/privacy; (4) do not grant the agent remote execution rights or secrets — let it propose changes and run commands only when you explicitly approve and understand them. Overall the skill is coherent and advisory, but exercise normal caution when cloning/running third‑party training code.
Functional Analysis
Type: OpenClaw Skill
Name: autoresearch-pilot
Version: 1.0.0

The autoresearch-pilot skill is a purely advisory guide designed to help users set up and optimize Andrej Karpathy's 'autoresearch' project. The SKILL.md file explicitly instructs the agent to provide read-only guidance, forbids the execution of training commands or file modifications, and contains no indicators of data exfiltration, malicious execution, or prompt injection.
Capability Assessment
Purpose & Capability
Name/description match the instructions: the skill is a textual guide for setting up and running autoresearch. It does not request unrelated credentials, binaries, or config paths, so the capability footprint is proportionate to the stated purpose.
Instruction Scope
SKILL.md gives step-by-step guidance (clone repo, run commands locally, edit program.md) and explicitly says it will not exec or modify files. It does instruct the user/agent to run commands locally, but does not direct reading of unrelated system files or exfiltration of data.
Install Mechanism
No install spec and no code files — the skill is instruction-only, which minimizes risk from installation or on-disk code.
Credentials
The skill declares no required environment variables or credentials. It sensibly lists local prerequisites (Python, GPU) in prose only — there are no disproportionate secret or config requests.
Persistence & Privilege
always is false and the skill is user-invocable. It does not request persistent privileges or modify other skills or system-wide settings.
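How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install autoresearch-pilot
  3. After installation, invoke the Skill by name or trigger it with /autoresearch-pilot
  4. Provide the inputs described in the Skill's parameter notes to get structured output
Version History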
v1.0.0
Initial release. Setup guide, program.md writing, result interpretation, small GPU optimization tips.
Metadata
Slug autoresearch-pilot
Version 1.0.0
License MIT-0
Total installs 1
Current installs 1
Versions 1
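FAQ

What is Autoresearch Pilot?

Autoresearch Pilot is a guide for setting up and running Karpathy's autoresearch (autonomous AI-driven LLM training experiments). Helps write program.md, interpret results, and opti... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 132 total downloads to date.

How do I install Autoresearch Pilot?

Run the command "/install autoresearch-pilot" in the OpenClaw or Claude Code chat box. It installs in one step with no extra configuration.

Is Autoresearch Pilot free?

Yes. Autoresearch Pilot is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Autoresearch Pilot support?

Autoresearch Pilot is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Autoresearch Pilot?

It is developed and maintained by TommoT2 (@tommot2). The current version is v1.0.0.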
