Description

Karpathy-style autonomous self-research loop for AI agents. The agent proposes a change to its own SOUL.md, scripts, or behavior, tests it, evaluates the res...

Usage Instructions (SKILL.md)

agent-autoresearch

Name: Agent Autoresearch
Author: duolahypercho

Any agent can run this. The experiment is always: change something → measure it → keep what works.

The Core Idea

Karpathy's insight: give an agent a fixed time budget, let it modify one file, measure if things got better, keep or discard, repeat.

Applied to agents: your workspace plays the role of train.py, the one thing the loop is allowed to modify. Your SOUL.md, scripts, and skills are the experiment substrate.

PROPOSE → IMPLEMENT → MEASURE → KEEP/KILL → INTEGRATE → REPEAT

You are not just optimizing content. You are optimizing the agent itself.

What Can Be Mutated

The agent can propose changes to any file it owns:

Category	Examples
Behavior	New response patterns, different tone, new check routines
Workflow	New scripts, automations, cron jobs, notification flows
Memory	Updated MEMORY.md entries, new daily conventions
Identity	Revised SOUL.md directives, new operational rules
Skills	New skill installations, skill configurations
Quality	New validation logic, error handling patterns

The agent cannot mutate: safety rules, constitution, security boundaries, or files it doesn't own.

Project Structure

agent-autoresearch/
├── SKILL.md                    ← you are here
├── program.md                  ← 🧠 the experiment agent's instructions
├── prepare.py                  ← establish baseline metrics
├── evolve.py                   ← integrate KEEP verdict into agent files
├── analyze.py                  ← compute verdict from measurements
├── baseline.json               ← current agent baseline (performance + strategy)
├── results.tsv                 ← all experiment results (append-only log)
└── experiments/
    ├── meta.json               ← experiment state (next_exp_id, kill_streak)
    ├── active.md               ← one active experiment at a time
    └── archive/                ← completed experiments

🚀 Quick Start

# 1. Establish baseline (measure current agent performance)
python3 prepare.py --metric task_completion_rate --baseline 0.75

# 2. Read the experiment brief
cat program.md

# 3. Start the experiment loop
#    Agent reads program.md, proposes a self-improvement, implements it,
#    measures results, and executes KEEP/KILL verdict.

# Check current state
python3 prepare.py --status

Baseline Metrics

Track what matters for the agent's mission. Examples:

Mission	Metric	How to Measure
Task completion	`task_completion_rate`	% tasks completed vs assigned
Response quality	`output_quality_score`	Human rating 1-10 or diff-based
Speed	`avg_response_time_s`	Seconds per response
Self-improvement	`learnings_logged`	Entries added to MEMORY.md per week
Autonomy	`escalations_to_human`	Times human was unnecessarily interrupted

Establish baseline with ≥ 10 measurements before running experiments.
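The minimum-sample rule above can be sketched as a small guard. This is an illustrative sketch only, not the logic in prepare.py: the function name `compute_baseline`, the constant `MIN_MEASUREMENTS`, and the choice of the arithmetic mean as the aggregate are all assumptions.

```python
from statistics import mean

# Threshold taken from the rule above; the name is hypothetical.
MIN_MEASUREMENTS = 10


def compute_baseline(measurements):
    """Return the mean of the measurements, refusing to run with fewer than 10.

    Using the mean is an assumption; the skill does not say how
    individual measurements are aggregated into a baseline.
    """
    if len(measurements) < MIN_MEASUREMENTS:
        raise ValueError(
            f"need at least {MIN_MEASUREMENTS} measurements, got {len(measurements)}"
        )
    return mean(measurements)


# Ten hypothetical task_completion_rate samples.
samples = [0.70, 0.80, 0.75, 0.72, 0.78, 0.74, 0.76, 0.77, 0.73, 0.75]
baseline = compute_baseline(samples)
print(f"baseline task_completion_rate: {baseline:.2f}")
```

Failing loudly on a short sample keeps an experiment from being judged against a noisy baseline.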

Verdict Logic

improvement = (experiment_score - baseline_score) / baseline_score

≥ +10%  → KEEP  (integrate the change into the agent)
≤ -10%  → KILL  (discard, revert to previous state)
between -10% and +10% → MODIFY (extend the evaluation once, or treat as KILL)

For quality/rating metrics (higher is better), the thresholds above apply as written. For cost/latency metrics (lower is better), flip the sign of improvement before applying them.
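The verdict rule can be written out as a short function. This is a minimal sketch under stated assumptions, not the code in analyze.py: the name `compute_verdict` and the parameters `lower_is_better` and `threshold` are hypothetical.

```python
def compute_verdict(baseline_score, experiment_score,
                    lower_is_better=False, threshold=0.10):
    """Return (verdict, improvement) using the relative-improvement rule.

    improvement = (experiment_score - baseline_score) / baseline_score,
    with the sign flipped for lower-is-better metrics such as latency.
    """
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    improvement = (experiment_score - baseline_score) / baseline_score
    if lower_is_better:
        improvement = -improvement
    if improvement >= threshold:
        verdict = "KEEP"
    elif improvement <= -threshold:
        verdict = "KILL"
    else:
        verdict = "MODIFY"
    return verdict, improvement


# Higher-is-better metric: 0.75 -> 0.90 is a +20% improvement.
print(compute_verdict(0.75, 0.90))
# Lower-is-better metric: 2.0 s -> 1.5 s counts as a +25% improvement.
print(compute_verdict(2.0, 1.5, lower_is_better=True))
```

Results that land exactly on a threshold are sensitive to floating-point rounding, so a real implementation may want to round the improvement before comparing.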

Key Rules

✅ One mutation at a time: test exactly one change per experiment
❌ No experiments without a baseline: collect at least 10 measurements first
❌ No vibes verdicts: decide from actual measurements
❌ Never mutate safety or constitution files
✅ Kill streak of 3 or more: pause and wait for human review
❌ No infinite MODIFY: at most one extension per experiment
❌ Never revert a KEEP: only a newer KEEP overrides it
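The kill-streak rule behaves like a circuit breaker. A rough sketch of how it could be tracked follows. The `kill_streak` field comes from experiments/meta.json as described in the project structure; the function name, the `paused` flag, and the choice to reset the streak on KEEP while leaving it unchanged on MODIFY are assumptions.

```python
KILL_STREAK_LIMIT = 3  # from the rule above


def update_kill_streak(meta, verdict):
    """Return a copy of the meta dict with kill_streak updated.

    Assumptions: KEEP resets the streak, MODIFY leaves it unchanged,
    and 'paused' is a hypothetical flag telling the loop to stop and
    wait for human review.
    """
    updated = dict(meta)
    if verdict == "KILL":
        updated["kill_streak"] = meta.get("kill_streak", 0) + 1
    elif verdict == "KEEP":
        updated["kill_streak"] = 0
    updated["paused"] = updated.get("kill_streak", 0) >= KILL_STREAK_LIMIT
    return updated


meta = {"next_exp_id": 4, "kill_streak": 0}
for verdict in ["KILL", "KILL", "KILL"]:
    meta = update_kill_streak(meta, verdict)
print(meta)  # kill_streak is 3 and paused is True
```

Returning a new dict instead of mutating in place makes it easy to write the updated state to disk only after the verdict has been fully executed.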

Commands

Command	What
`python3 prepare.py --status`	Check current state
`python3 prepare.py --metric X --baseline Y`	Establish baseline
`python3 analyze.py experiments/active.md --auto`	Compute verdict
`python3 evolve.py experiments/active.md`	Execute KEEP verdict
`python3 evolve.py experiments/active.md --kill`	Execute KILL verdict

Security

Agents can only mutate files within their own workspace
Safety rules and constitution are always excluded from mutation
External API calls require human approval
Destructive operations (rm, git reset --hard) require explicit confirmation

Safe Usage Recommendations

This package implements an autonomous self-experimentation loop and is broadly coherent with its description, but it requires careful review and hardening before you run it on any real agent workspace. Before installing or running:

- Audit evolve.py's restore/update logic. The revert/update code copies backup filenames directly to the workspace root (dst = filename) and extracts file paths from markdown. Ensure filenames are validated (no absolute paths, no '..' segments) and that only an explicit allowlist of files (e.g., SOUL.md, specific scripts) may be modified.
- Require a human approval step (enforced by code) for any mutation that would: change cron jobs, install/uninstall skills, call external APIs, or touch files outside a small, explicit workspace subdirectory.
- Run the tool first in an isolated test workspace or container with no sensitive files, credentials, or access to external services. Confirm dry-run behavior and use --dry-run where available.
- Use filesystem permissions or OS-level sandboxing to prevent the script from overwriting system or other-agent files.
- If you plan to allow autonomous runs, add code-level assurances: sanitize paths, verify affected_files are subpaths of an allowed directory, deny changes to any files matching 'constitution', 'IDENTITY', 'credentials', or other sensitive names, and log human approvals to an auditable file.

What would change this assessment: explicit path-sanitization and allowlisting of modifiable files in the code, and an enforced human-approval mechanism for any external API/cross-workspace effects would reduce the risk and could make the skill 'benign'. Without those, treat it as suspicious and run only in tightly controlled, isolated environments.
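The path checks recommended above might look roughly like the following. This is a sketch of one possible hardening, not code from evolve.py: the function name `is_safe_target`, the contents of `ALLOWED_FILES`, and the `DENIED_SUBSTRINGS` list are hypothetical and would need to match the real workspace.

```python
from pathlib import Path

# Hypothetical allowlist of workspace-relative files the loop may modify.
ALLOWED_FILES = {"SOUL.md", "MEMORY.md"}
# Sensitive name fragments, compared case-insensitively.
DENIED_SUBSTRINGS = ("constitution", "identity", "credentials")


def is_safe_target(workspace, candidate):
    """Return True only if candidate is an allowlisted file inside workspace."""
    rel = Path(candidate)
    # Reject absolute paths and any '..' traversal segments outright.
    if rel.is_absolute() or ".." in rel.parts:
        return False
    root = Path(workspace).resolve()
    resolved = (root / rel).resolve()
    # The resolved target must sit strictly below the workspace root.
    if root not in resolved.parents:
        return False
    lowered = rel.as_posix().lower()
    if any(word in lowered for word in DENIED_SUBSTRINGS):
        return False
    return rel.as_posix() in ALLOWED_FILES


print(is_safe_target("/tmp/ws", "SOUL.md"))        # allowlisted, inside workspace
print(is_safe_target("/tmp/ws", "../etc/passwd"))  # traversal, rejected
```

Checking both the raw path parts and the resolved location is deliberate: the first catches obvious traversal in the input, while the second catches symlinks that point outside the workspace.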

Functional Analysis

Type: OpenClaw Skill
Name: agent-autoresearch
Version: 1.2.0

The skill bundle implements an autonomous 'self-research' loop that allows an AI agent to modify its own source code, operational instructions (SOUL.md), and workflows (evolve.py, program.md). While the stated intent is self-optimization, the capability for an agent to perform arbitrary file mutations and potentially establish persistence via cron jobs (mentioned in SKILL.md) represents a high-risk surface. There is no evidence of intentional malice or data exfiltration, but the framework provides the necessary primitives for an agent to bypass its original constraints if it generates or is prompted with harmful mutations.

Capability Assessment

✓ Purpose & Capability

Name, description, SKILL.md, and the included scripts (prepare.py, analyze.py, evolve.py, plus helpers) are coherent: they implement a propose→implement→measure→keep/kill experiment loop that reads/writes baseline, meta, results, and experiment files. No unrelated environment variables, binaries, or external installs are requested.

⚠ Instruction Scope

SKILL.md and program.md instruct the agent to autonomously mutate its SOUL.md, scripts, and workspace files, and tell it 'do NOT pause to ask permission' once running. Although the prose claims safety rails (do not mutate constitution/safety files, require human approval for external API calls), the scripts themselves do not enforce these constraints. For example, evolve.py restores files from experiments/backups/EXP-XXX/ by copying backup filenames directly to dst = filename without sanitizing or validating paths, and affected-files parsing pulls paths out of markdown links. That combination creates a risk that an experiment could stage arbitrary paths (including relative traversal) and cause the script to overwrite files outside the intended scope. The instructions also encourage modifying cron jobs, installing skills, and other actions that can have side effects outside the workspace; there is no programmatic check to block or require explicit human approval for such mutations.

✓ Install Mechanism

No install spec or external downloads; this is an instruction-plus-source bundle. Nothing is being pulled from the network during install, which lowers supply-chain risk.

✓ Credentials

The skill requests no environment variables, credentials, or config paths. That is proportionate to the stated local-workspace experimentation purpose. However, the code can mutate files and create cron jobs/skill changes per the docs, and there is no enforcement that external credentials or secrets are not modified if present in files.

⚠ Persistence & Privilege

always:false (not force-included) and default model invocation are unchanged, but the skill is explicitly designed to run a continuous autonomous loop and to integrate KEEP verdicts into agent files (SOUL.md, scripts, baseline.json). Because there are no strong programmatic safeguards in the code to prevent unsafe mutations or to enforce human approvals, an autonomously-invoked loop has a significant blast radius: it can write persistent changes to the agent workspace and potentially modify startup/crontab-like definitions if the agent implements them. This combination — autonomy + self-modification + lack of sanitization — is a notable risk.

Version History

v1.2.0

v1.2: Full rebrand — agent-general self-research loop, not content-specific. Any agent can now use this to evolve its own SOUL.md, scripts, memory, and workflows via the Karpathy experiment pattern.

v1.1.0

v1.1: Full rewrite — Karpathy-style minimal structure with program.md agent brief, prepare.py + evolve.py, TSV results log, kill streak circuit breaker, cleaner file layout

v1.0.0

Initial release: Karpathy-style autoresearch loop for agents. Autonomous experiment loop (act → evaluate → verdict → evolve), KEEP/MODIFY/KILL verdicts, champion playbook versioning, memory management, reference Python scripts.

Metadata

Slug agent-autoresearch

Version 1.2.0

License MIT-0

Total installs 1

Current installs 1

Versions released 3
