Total downloads: 67
Favorites: 1
Current installs: 0
Versions: 1
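Install in OpenClaw
/install auto-research-agent
Description
An autonomous AI research framework based on the Karpathy AutoRS idea. The AI agent runs the whole cycle on its own: experiment → train → evaluate → iterate → keep the best. A fixed time budget keeps results comparable and supports continuous optimization.
Usage (SKILL.md)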
Autonomous Research Framework
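Inspired by: Karpathy AutoRS
Core idea: give the AI a real research environment and let it experiment, evaluate, and iterate on its own
1. Core design
Research loop
Experiment design → Code change → Training run → Metric evaluation → Result analysis
↑ ↓
←←← Keep/discard → Update context → Next round ←←←
Three-file architecture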
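| File | Role | Edit permission |
|---|---|---|
| prepare.py | Data preparation, utility functions | ❌ Not modified |
| train.py | Model, optimizer, training loop | ✅ Modified by the agent |
| program.md | Agent instructions, experiment goals | ✅ Modified by humans |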
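Fixed budget
- Time budget: every experiment runs for a fixed duration (prevents open-ended training)
- Evaluation metric: one shared metric for comparison (val_loss, val_bpb, etc.)
- Comparability: results obtained under the same budget can be compared directly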
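The fixed-budget rule above can be sketched as a loop that stops on wall-clock time rather than on a step count. This is a minimal illustration, not the skill's actual train.py: `run_with_budget` and `step_fn` are hypothetical names, and the real script may enforce its budget differently.

```python
import time
from typing import Callable


def run_with_budget(step_fn: Callable[[int], float], budget_seconds: float) -> dict:
    """Call step_fn repeatedly until the wall-clock budget is spent.

    step_fn is a hypothetical stand-in for one training step: it receives
    the step index and returns that step's loss.
    """
    start = time.monotonic()  # monotonic clock is immune to system clock changes
    steps = 0
    last_loss = float("nan")
    while time.monotonic() - start < budget_seconds:
        last_loss = step_fn(steps)
        steps += 1
    return {
        "steps": steps,
        "last_loss": last_loss,
        "elapsed": time.monotonic() - start,
    }


if __name__ == "__main__":
    # Toy step function: the loss shrinks as the step index grows.
    result = run_with_budget(lambda i: 1.0 / (i + 1), budget_seconds=0.05)
    print(result)
```

Because every run gets the same duration, a faster configuration simply completes more steps, which is what makes the final metrics comparable across experiments.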
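2. Usage
Starting autonomous research
Follow the instructions in program.md and start a new round of experiments.
First check the current state of train.py, then modify it and run it.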
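Research loop
- Read program.md: understand the current research goal
- Analyze train.py: understand the current implementation
- Design the experiment: state a hypothesis and a planned change
- Run training: within the fixed time budget
- Evaluate the result: compare against the baseline
- Keep or discard: keep improvements, discard regressions
- Record what was learned: update memory/logs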
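The keep-or-discard step of this loop can be sketched as follows, assuming a lower-is-better metric such as val_bpb. `ResearchState`, `research_round`, `propose`, and `evaluate` are hypothetical names introduced for illustration. The metric values are borrowed from the sample log in section 6, and the configurations attached to them are made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResearchState:
    best_config: Dict[str, float]
    best_metric: float  # lower is better, e.g. val_bpb
    history: List[dict] = field(default_factory=list)


def research_round(
    state: ResearchState,
    propose: Callable[[ResearchState], Dict[str, float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> bool:
    """Run one round: propose a change, evaluate it, keep it only if it improves."""
    candidate = propose(state)
    metric = evaluate(candidate)
    kept = metric < state.best_metric
    # Every attempt is logged, including discarded ones.
    state.history.append({"config": candidate, "metric": metric, "kept": kept})
    if kept:
        state.best_config, state.best_metric = candidate, metric
    return kept


if __name__ == "__main__":
    # Stand-in results (no real training happens here).
    fake_results = {(5e-4, 8): 1.189, (5e-4, 12): 1.201}
    state = ResearchState({"lr": 1e-3, "layers": 8}, best_metric=1.234)
    proposals = [{"lr": 5e-4, "layers": 8}, {"lr": 5e-4, "layers": 12}]
    for p in proposals:
        kept = research_round(
            state,
            propose=lambda s, p=p: p,
            evaluate=lambda c: fake_results[(c["lr"], c["layers"])],
        )
        print(p, "kept" if kept else "discarded")
```

In this sketch each candidate is compared against the best result so far, not the original baseline, so a change that beats the baseline can still be discarded.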
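Per-experiment record
## Experiment #[N] - [date and time]
### Hypothesis
[What is being changed this time, and why]
### Changes
[What was changed in train.py]
### Result
- Evaluation metric: [value]
- vs baseline: [+/-%]
### Decision
[Keep/Discard] - [reason]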
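3. program.md template
# Research Program
## Baseline
- Model: [description]
- Optimizer: [description]
- Evaluation metric: val_bpb = [value]
## Research goal
[The problem to solve or the direction to optimize]
## Modifiable scope
- Model architecture (number of layers, hidden dimension, number of attention heads)
- Optimizer (learning rate, betas, weight decay)
- Training parameters (batch_size, seq_len)
- Regularization (dropout, weight_decay)
## Constraints
- Training time: fixed at 5 minutes
- Single GPU
- Modify only train.py
## Current focus
[The agent decides the next experiment direction from the history of results]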
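4. Evaluation metric guide
| Metric | Description | Lower or higher is better? |
|---|---|---|
| val_bpb | Bits per byte on the validation set | Lower is better |
| val_loss | Validation loss | Lower is better |
| test_acc | Test accuracy | Higher is better |
| perplexity | Language-model perplexity | Lower is better |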
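The metric directions in the table can be encoded so that the keep-or-discard decision does not depend on remembering which way each metric runs. The sketch below also shows the standard conversions between these metrics, under the assumption (not stated in the skill) that val_loss is a mean per-token cross-entropy in nats. All function names are hypothetical.

```python
import math

# Direction of improvement for each metric in the table.
LOWER_IS_BETTER = {
    "val_bpb": True,
    "val_loss": True,
    "test_acc": False,
    "perplexity": True,
}


def is_improvement(metric: str, new: float, best: float) -> bool:
    """True if `new` beats `best` in the metric's own direction."""
    if LOWER_IS_BETTER[metric]:
        return new < best
    return new > best


def perplexity_from_loss(loss_nats: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (in nats)."""
    return math.exp(loss_nats)


def bits_per_byte(loss_nats: float, n_tokens: int, n_bytes: int) -> float:
    """Convert mean per-token loss (nats) into bits per byte of raw text.

    Total nats = loss * n_tokens; divide by ln(2) for bits, then by bytes.
    """
    return loss_nats * n_tokens / (n_bytes * math.log(2))


if __name__ == "__main__":
    print(is_improvement("val_bpb", 1.189, 1.234))
    print(is_improvement("test_acc", 0.90, 0.92))
    print(perplexity_from_loss(2.0))
    print(bits_per_byte(2.0, n_tokens=1000, n_bytes=4000))
```

Bits per byte normalizes by raw text length, so it stays comparable when an experiment changes the tokenizer, whereas per-token loss and perplexity do not.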
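5. Experiment strategy
Exploration strategies
- Random perturbation: make small random changes to search for a local optimum
- Gradient direction: adjust course based on what failed in earlier experiments
- Ablation: remove a component and observe the effect
- History review: look for patterns in the past 100 experiments
Avoiding repetition
- Record what has been tried (learning rates, architecture combinations, etc.)
- Do not repeat experiments already shown to be ineffective
- Similar experiments must change at least one key variable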
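One way to avoid repeating experiments is to fingerprint each configuration and check it against the set already tried. This is a sketch under the assumption that an experiment can be summarized as a flat dictionary of settings. `config_key` and `TriedConfigs` are hypothetical helpers, not part of the skill.

```python
import hashlib
import json
from typing import Dict


def config_key(config: dict) -> str:
    """Stable fingerprint: equal settings give the same key regardless of key order."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class TriedConfigs:
    """Remembers which configurations have already been run and their metric."""

    def __init__(self) -> None:
        self._seen: Dict[str, float] = {}

    def already_tried(self, config: dict) -> bool:
        return config_key(config) in self._seen

    def record(self, config: dict, metric: float) -> None:
        self._seen[config_key(config)] = metric


if __name__ == "__main__":
    tried = TriedConfigs()
    tried.record({"lr": 5e-4, "layers": 8}, 1.189)
    # Same settings in a different order count as a repeat.
    print(tried.already_tried({"layers": 8, "lr": 0.0005}))
    # Changing one key variable yields a new experiment.
    print(tried.already_tried({"layers": 12, "lr": 0.0005}))
```

An exact-match fingerprint only catches identical configurations; judging whether two experiments are merely "similar" still needs the agent's own reasoning over the log.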
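6. Log format
Experiment log (experiments.md)
# Experiment log
## Experiment records
| # | Date | Change | Metric | vs baseline | Decision |
|---|------|------|------|--------|------|
| 1 | 2026-04-17 | Initial baseline | 1.234 | - | Baseline |
| 2 | 2026-04-17 | Learning rate 1e-3→5e-4 | 1.189 | -3.6% | ✅ Keep |
| 3 | 2026-04-17 | Layers 8→12 | 1.201 | -2.7% | ❌ Discard |
## Key findings
- Lowering the learning rate helped
- Adding layers is not necessarily better (8→12 beat the baseline but not the result kept from experiment 2)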
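The "vs baseline" column is the relative change of the metric against the baseline value. The sketch below reproduces the two percentages in the sample log and formats a table row. `pct_vs_baseline` and `log_row` are hypothetical helper names.

```python
def pct_vs_baseline(metric: float, baseline: float) -> str:
    """Relative change against the baseline, signed, with one decimal place."""
    change = (metric - baseline) / baseline * 100.0
    return f"{change:+.1f}%"


def log_row(n: int, date: str, change: str, metric: float,
            baseline: float, kept: bool) -> str:
    """Format one row of the experiment table in experiments.md."""
    decision = "✅ Keep" if kept else "❌ Discard"
    return (f"| {n} | {date} | {change} | {metric:.3f} "
            f"| {pct_vs_baseline(metric, baseline)} | {decision} |")


if __name__ == "__main__":
    baseline = 1.234
    print(log_row(2, "2026-04-17", "Learning rate 1e-3→5e-4", 1.189, baseline, True))
    print(log_row(3, "2026-04-17", "Layers 8→12", 1.201, baseline, False))
```

For a lower-is-better metric, a negative percentage means an improvement over the baseline.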
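7. Quick start
- Review program.md: understand the research goal
- Review train.py: understand the current implementation
- Design the first experiment: what to change, and why
- Run training: python train.py
- Record the result: update the experiment log
- Decide the next step: continue or roll back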
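The run-and-record steps can be automated with a small wrapper around `python train.py`. This is a sketch only. It assumes train.py prints a line such as `val_bpb: 1.189`, which the skill does not specify, and it uses a hard timeout of 360 seconds (the 5-minute budget from program.md plus an arbitrary margin). `parse_val_bpb` and `run_training` are hypothetical names.

```python
import re
import subprocess
import sys
from typing import Optional

# Assumed output format: "val_bpb: 1.189" or "val_bpb=1.189".
METRIC_RE = re.compile(r"val_bpb\s*[:=]\s*([0-9]*\.?[0-9]+)")


def parse_val_bpb(output: str) -> Optional[float]:
    """Return the last val_bpb value found in the output, or None."""
    matches = METRIC_RE.findall(output)
    return float(matches[-1]) if matches else None


def run_training(script: str = "train.py",
                 hard_timeout_s: float = 360.0) -> Optional[float]:
    """Run the script in a subprocess; return None on timeout or failure."""
    try:
        proc = subprocess.run(
            [sys.executable, script],
            capture_output=True,
            text=True,
            timeout=hard_timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None
    return parse_val_bpb(proc.stdout)


if __name__ == "__main__":
    sample = "step 10\nval_bpb: 1.234\nstep 20\nval_bpb: 1.189\n"
    print(parse_val_bpb(sample))
```

Returning None for a crash or an overrun lets the caller treat a broken experiment as a discard instead of halting the whole loop.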
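8. Agent instructions
When the user asks to start autonomous research:
- First read program.md to understand the goal
- Analyze the current state of train.py
- Propose a hypothesis for the change
- Execute and record
- Keep iterating
Built on the Karpathy AutoRS idea | OpenClaw Skill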
Safe usage advice
This skill appears coherent for local, autonomous ML experiments: it lets an agent edit train.py, run python train.py, and record results. Before installing or running it:
1) Ensure you have Python and PyTorch (and GPU drivers if you intend to use CUDA); the skill does not declare or install these.
2) Review and approve any code modifications the agent proposes to train.py; the agent is explicitly allowed to change and execute local code.
3) Run in an isolated environment (virtualenv, container) to limit unintended filesystem or resource effects.
4) Confirm compute/time budgets (program.md mentions 5 minutes) to avoid long or costly runs.
5) The skill requests no secrets and has no network calls in provided code, but always review new/modified code for outbound network activity before execution.
Functional analysis
Type: OpenClaw Skill
Name: auto-research-agent
Version: 1.0.0
The skill implements an autonomous research framework (inspired by Karpathy's AutoRS) that explicitly instructs the AI agent to iteratively modify and execute Python code (train.py). This creates an autonomous write-and-execute loop which is a high-risk capability (RCE by design), even though it is aligned with the stated purpose of model optimization. No evidence of intentional malice, data exfiltration, or backdoors was found in SKILL.md or the provided Python template.
Capability assessment
Purpose & Capability
The skill name/description (autonomous research) match the provided files and runtime behavior: reading program.md, allowing the agent to modify train.py, and running experiments. One minor mismatch: the skill does not declare runtime dependencies (train.py imports torch), so it implicitly requires Python and PyTorch (and optionally a GPU) even though the metadata lists no required binaries or install steps.
Instruction Scope
SKILL.md explicitly instructs the agent to read program.md, analyze/modify train.py, run training, evaluate, and log results. Those actions are in-scope for an autonomous research agent. The instructions do not ask the agent to read unrelated files, send data to external endpoints, or access credentials.
Install Mechanism
There is no install spec (instruction-only), which keeps risk low. However, the included train.py requires PyTorch and a Python interpreter; the skill does not declare or install these dependencies. This is a practical omission rather than an evident malicious install step.
Credentials
The skill requests no environment variables, no credentials, and no config paths. That is proportionate to its stated purpose (local experiments). There are no signs of extraneous secret access or credential collection.
Persistence & Privilege
always is false and the skill can be invoked by the model (default). That autonomous invocation is expected for an agent skill and is not combined with broad credential access or privileged system modifications. The skill does allow (and instructs) modifying train.py, which is within its stated scope.
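How to use
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install auto-research-agent
- After installation, call the Skill by name or trigger it with /auto-research-agent
- Provide the inputs described in the Skill's parameter notes to get structured output
Version history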
v1.0.0
Autonomous Research Agent v1.0.0 released.
- Introduced a research framework inspired by Karpathy AutoRS, enabling agents to autonomously perform experiment→train→evaluate→iterate cycles.
- Supports fixed time budgets for experiments, ensuring results are directly comparable.
- Three-file architecture: data/tools (`prepare.py`), agent-editable code (`train.py`), and human-editable research program (`program.md`).
- Provides templates and guidelines for systematic experimental logging and result evaluation.
- Includes clear instructions and best practices for initiating and recording research workflows.
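FAQ
What is Auto Research Agent?
An autonomous AI research framework based on the Karpathy AutoRS idea. The AI agent runs the whole cycle on its own: experiment → train → evaluate → iterate → keep the best. A fixed time budget keeps results comparable and supports continuous optimization. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 67 cumulative downloads so far.
How do I install Auto Research Agent?
Run the command "/install auto-research-agent" in the OpenClaw or Claude Code chat box for a one-step install. The skill itself needs no extra configuration, but it does not install Python or PyTorch, which train.py requires.
Is Auto Research Agent free?
Yes. Auto Research Agent is completely free and released under the MIT-0 license; you may download, install, and use it freely.
Which platforms does Auto Research Agent support?
Auto Research Agent is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Auto Research Agent?
It is developed and maintained by SMS (@smseow001); the current version is v1.0.0.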