
Autoresearch Pilot

Name: Autoresearch Pilot
Author: tommot2

By TommoT2 · GitHub · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 132

Install in OpenClaw

/install autoresearch-pilot

Description

Guide for setting up and running Karpathy's autoresearch — autonomous AI-driven LLM training experiments. Helps write program.md, interpret results, and opti...

Usage (SKILL.md)

Autoresearch Pilot v1.0

Install: clawhub install autoresearch-pilot

Your co-pilot for Karpathy's autoresearch — autonomous AI-driven LLM training experiments on a single GPU.

Language

Detect from user's message language. Default: English.

How It Works

Autoresearch lets an AI agent modify train.py, run 5-minute experiments, check whether val_bpb improved, and iterate. This skill helps you set it up, write an effective program.md, and interpret the results.

The Three Files

File	Role	Modified by
`prepare.py`	Data prep, tokenizer, utilities	Never (fixed)
`train.py`	Model, optimizer, training loop	The AI agent
`program.md`	Instructions for the AI agent	You (the human)

Key Concepts

val_bpb — Validation bits per byte. Lower = better. Vocab-size-independent metric.
Time budget — Each experiment runs exactly 5 minutes (wall clock). ~100 experiments per night.
Muon optimizer — Included. Often outperforms AdamW for small models.
DEPTH — Primary model complexity knob (default 8). Lower for smaller GPUs.
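
For readers new to the metric, the following is a minimal sketch of how a bits-per-byte figure can be derived from a summed cross-entropy loss, plus the arithmetic behind the experiments-per-night estimate. The function names and example numbers are hypothetical, and the autoresearch repo's own evaluation code may compute val_bpb differently.

```python
import math


def bits_per_byte(total_loss_nats: float, total_bytes: int) -> float:
    """Convert a summed cross-entropy loss (in nats) into bits per byte.

    Dividing by ln(2) converts nats to bits; dividing by the number of raw
    text bytes (rather than tokens) is what makes the metric independent of
    the tokenizer's vocabulary size.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_loss_nats / (math.log(2) * total_bytes)


def experiments_per_night(hours: float, minutes_per_experiment: float = 5.0) -> int:
    """Upper bound on completed experiments, ignoring startup overhead."""
    return int(hours * 60 // minutes_per_experiment)


# Hypothetical numbers: 1,000 tokens at a mean loss of 2.0 nats per token,
# covering 4,000 bytes of raw validation text.
example_bpb = bits_per_byte(total_loss_nats=2.0 * 1000, total_bytes=4000)
print(f"val_bpb = {example_bpb:.4f}")

# An 8-hour night at 5 minutes per experiment.
print(experiments_per_night(8))
```

With a 5-minute budget there are 12 experiments per hour, so an 8-hour night gives at most 96 runs, which matches the rough figure of 100 per night.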

Setup Guide

Walk the user through these steps when they want to start:

Prerequisites: Python 3.10+, NVIDIA GPU (H100 recommended), uv package manager
Clone repo: git clone https://github.com/karpathy/autoresearch
Install: uv sync inside the repo
Prepare data: uv run prepare.py (one-time, ~2 min)
Test run: uv run train.py (should complete in ~5 min)
Point your AI agent at program.md and let it experiment
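
Before starting the steps above, the user may want to confirm that the prerequisites are in place. The script below is a hypothetical read-only check that they can run themselves; it is not part of the autoresearch repo. It assumes that an `nvidia-smi` binary on the PATH is a reasonable proxy for a working NVIDIA driver, which may not hold on every system.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)) -> dict:
    """Report which of the listed prerequisites appear to be present.

    Nothing is installed or modified; the function only inspects the
    running interpreter and looks for executables on the PATH.
    """
    return {
        "python": sys.version_info >= min_python,
        "git": shutil.which("git") is not None,
        "uv": shutil.which("uv") is not None,
        # Assumption: nvidia-smi on the PATH implies an NVIDIA driver.
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:12s} {'found' if ok else 'MISSING'}")
```

Any item reported as MISSING should be resolved before cloning the repo and running `uv sync`.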

Small GPU Tips (RTX 3090, MacBook, etc.)

When the user has a smaller GPU, suggest these changes to prepare.py and train.py:

Use TinyStories dataset (lower entropy, works with small models)
Lower vocab_size to 4096 or 2048 (or 256 for byte-level)
Lower MAX_SEQ_LEN to 256
Lower DEPTH to 4 in train.py
Use WINDOW_PATTERN of "L" only
Lower TOTAL_BATCH_SIZE to 2**14
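
The suggestions above can be collected into one place for discussion with the user. This is an illustrative sketch only: the dictionary is hypothetical, the names mirror the knobs listed above, and which file each constant lives in should be confirmed against the repo. It also assumes TOTAL_BATCH_SIZE is counted in tokens, so dividing by MAX_SEQ_LEN gives the number of sequences per optimizer step.

```python
# Hypothetical starting point for a small GPU; the values come from the
# tips above and are not the repo defaults.
SMALL_GPU_OVERRIDES = {
    "vocab_size": 4096,
    "MAX_SEQ_LEN": 256,
    "DEPTH": 4,
    "WINDOW_PATTERN": "L",
    "TOTAL_BATCH_SIZE": 2**14,
}


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... using the single-set-bit trick."""
    return n > 0 and (n & (n - 1)) == 0


def sequences_per_step(total_batch_size: int, max_seq_len: int) -> int:
    """Sequences per optimizer step, assuming the batch size is in tokens."""
    if total_batch_size % max_seq_len != 0:
        raise ValueError("batch size should be a multiple of the sequence length")
    return total_batch_size // max_seq_len


seqs = sequences_per_step(
    SMALL_GPU_OVERRIDES["TOTAL_BATCH_SIZE"],
    SMALL_GPU_OVERRIDES["MAX_SEQ_LEN"],
)
print(f"{seqs} sequences of {SMALL_GPU_OVERRIDES['MAX_SEQ_LEN']} tokens per step")
```

Under these assumptions, 2**14 tokens at a sequence length of 256 works out to 64 sequences per step.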

Writing program.md

When the user asks for help with program.md, help them define:

Research goal — What to optimize for (speed, quality, efficiency)
Experiment strategy — What to try first, what to vary
Success criteria — Target val_bpb or improvement threshold
Safety guardrails — What the agent should NOT change

Example structure for program.md:

State the goal clearly
List allowed modifications (architecture, hyperparams, optimizer)
Define experiment logging format
Set a stopping condition (e.g., "stop after 50 experiments with no improvement")
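
As a rough illustration of the structure above, the sketch below renders a program.md skeleton as a string and shows one way to express a "no improvement in N experiments" stopping rule. Both helpers, the section titles, and the sample log format are hypothetical; autoresearch does not prescribe them, and the draft is meant for the user to review and save themselves.

```python
def render_program_md(goal, allowed, log_format, stop_condition) -> str:
    """Return a program.md draft as text; nothing is written to disk."""
    lines = ["## Goal", goal, "", "## Allowed modifications"]
    lines += [f"- {item}" for item in allowed]
    lines += ["", "## Logging format", log_format]
    lines += ["", "## Stopping condition", stop_condition]
    return "\n".join(lines) + "\n"


def should_stop(val_bpb_history, patience=50, min_delta=0.0) -> bool:
    """True when none of the last `patience` runs beat the earlier best."""
    if len(val_bpb_history) <= patience:
        return False
    best_before = min(val_bpb_history[:-patience])
    best_recent = min(val_bpb_history[-patience:])
    return best_recent >= best_before - min_delta


draft = render_program_md(
    goal="Minimize val_bpb within the 5-minute budget.",
    allowed=["architecture", "hyperparameters", "optimizer"],
    log_format="One line per experiment: id, change, val_bpb.",
    stop_condition="Stop after 50 experiments with no improvement.",
)
print(draft)
```

The `min_delta` parameter lets the user require a minimum gain before a run counts as an improvement, which connects the stopping condition to the improvement threshold named under success criteria.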

Interpreting Results

When the user shares experiment logs:

Observation	Likely interpretation	Follow-up
val_bpb decreasing	Model is learning	If it is not decreasing, check for bugs
val_bpb plateaued	May need architecture change	Normal for small models
Training loss << val loss	Overfitting	Increase regularization
NaN loss	Learning rate too high or instability	Lower LR, check gradients
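
The checks in the table can be approximated in a few lines. This is a heuristic sketch, not something provided by autoresearch: the function name, the window size, and both tolerances are hypothetical and would need tuning to the user's actual logs. The training and validation losses are assumed to be in the same units.

```python
import math


def diagnose(val_bpb_history, train_loss=None, val_loss=None,
             window=5, plateau_tol=1e-3, gap_tol=0.1):
    """Return short findings that mirror the rows of the table above."""
    values = list(val_bpb_history)
    extras = [x for x in (train_loss, val_loss) if x is not None]
    if any(math.isnan(x) for x in values + extras):
        return ["NaN loss: lower the learning rate and check gradients"]

    findings = []
    if len(values) >= 2 and values[-1] < values[0]:
        findings.append("val_bpb decreasing: the model is learning")
    if len(values) >= window:
        recent = values[-window:]
        if max(recent) - min(recent) < plateau_tol:
            findings.append("val_bpb plateaued: may need an architecture change")
    if train_loss is not None and val_loss is not None:
        if val_loss - train_loss > gap_tol:
            findings.append("training loss well below val loss: possible overfitting")
    return findings


print(diagnose([1.20, 1.10, 1.00]))
print(diagnose([1.00, float("nan")]))
```

A plateau on a small model is often normal, so the plateau finding is a prompt for discussion rather than a verdict.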

Quick Commands

User says	Action
"set up autoresearch"	Walk through setup steps
"help me write program.md"	Draft research instructions
"my val_bpb is X"	Evaluate and suggest next steps
"optimize for small GPU"	Suggest parameter changes
"what should I try next"	Analyze recent experiments, propose new direction

Guidelines for Agent

Read-only guidance — suggest changes, let the user apply them
Check GPU capability — ask what GPU they have before recommending parameters
Start simple — recommend TinyStories + DEPTH 4 for first-time users
Explain val_bpb — many users are new to this metric
Refer to autoresearch repo — it's the source of truth for all defaults
No exec — guide only, never run training commands

What This Skill Does NOT Do

Does NOT run training commands or experiments
Does NOT modify train.py or prepare.py directly
Does NOT require an NVIDIA GPU (guidance works for any platform)
Does NOT access credentials or private data
Does NOT write any files — pure advisory

More by TommoT2

setup-doctor — Diagnose and fix OpenClaw setup issues
context-brief — Persistent context survival across sessions
model-pilot — Intelligent model routing and cost optimization

Install the full suite:

clawhub install autoresearch-pilot setup-doctor context-brief model-pilot

Safe Usage Recommendations

This skill is a textual co‑pilot and does not install or run code by itself, which is good. Before following its instructions: (1) verify the GitHub repository URL and review the repo code (especially scripts like prepare.py/train.py) before running them; (2) confirm what the 'uv' package manager is and inspect any packages it installs; (3) be aware that running training jobs can consume significant GPU/time and may use or generate datasets you should check for licensing/privacy; (4) do not grant the agent remote execution rights or secrets — let it propose changes and run commands only when you explicitly approve and understand them. Overall the skill is coherent and advisory, but exercise normal caution when cloning/running third‑party training code.

Functional Analysis

Type: OpenClaw Skill
Name: autoresearch-pilot
Version: 1.0.0

The autoresearch-pilot skill is a purely advisory guide designed to help users set up and optimize Andrej Karpathy's 'autoresearch' project. The SKILL.md file explicitly instructs the agent to provide read-only guidance, forbids the execution of training commands or file modifications, and contains no indicators of data exfiltration, malicious execution, or prompt injection.

Capability Assessment

✓ Purpose & Capability

Name/description match the instructions: the skill is a textual guide for setting up and running autoresearch. It does not request unrelated credentials, binaries, or config paths, so the capability footprint is proportionate to the stated purpose.

✓ Instruction Scope

SKILL.md gives step-by-step guidance (clone repo, run commands locally, edit program.md) and explicitly says it will not exec or modify files. It does instruct the user/agent to run commands locally, but does not direct reading of unrelated system files or exfiltration of data.

✓ Install Mechanism

No install spec and no code files — the skill is instruction-only, which minimizes risk from installation or on-disk code.

✓ Credentials

The skill declares no required environment variables or credentials. It sensibly lists local prerequisites (Python, GPU) in prose only — there are no disproportionate secret or config requests.

✓ Persistence & Privilege

always is false and the skill is user-invocable. It does not request persistent privileges or modify other skills or system-wide settings.

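How to Use

Make sure OpenClaw is installed (local or Docker deployment)
Enter the install command in the chat box: /install autoresearch-pilot
Once installed, invoke the skill by name or trigger it with /autoresearch-pilot
Provide the required input as described in the skill's parameter notes to get structured output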

Version History

v1.0.0

Initial release. Setup guide, program.md writing, result interpretation, small GPU optimization tips.

Metadata

Slug autoresearch-pilot

Version 1.0.0

License MIT-0

Total installs 1

Current installs 1

Version count 1
