Description

Multi-agent auto-evolution system with hybrid mode — orchestrate review-execute-audit loops with 4 roles (Coordinator, Reviewer, Executor, Auditor). Supports...

README (SKILL.md)

auto-evolution

Name: Auto Evolution (Hybrid Mode)
Author: cjboy007

Category: Agent Orchestration / Meta-Skill
Version: 0.7.0

Description

Multi-agent auto-evolution system — a coordinator agent drives an autonomous review → execute → audit loop by spawning specialized sub-agents for each role.

This is a meta-skill: it doesn't handle business logic. It orchestrates the loop so complex tasks get completed autonomously with dual quality gates (pre-execution review + post-execution audit).

Architecture (4 Roles)

Role	Responsibility	When Spawned	Recommended Model
Coordinator	Drives the loop, updates task state, spawns sub-agents	Always (heartbeat/cron)	Any (cost-efficient)
Reviewer	Pre-execution review, generates detailed subtasks	Complex tasks only	Strong (Sonnet/GPT-5.4)
Executor	Implements one subtask, runs verification	After review approves	Cost-effective (Qwen3.5-Plus)
Auditor	Post-execution audit, decides pass/retry	After execution completes	Strong (Sonnet/GPT-5.4)

Why 4 roles?

Reviewer and Auditor are both quality gates but serve different purposes
Reviewer ensures the plan is sound before work starts
Auditor verifies the result matches the plan after work completes
Executor is pure labor — follows instructions, no judgment needed

Cost control: Only Reviewer and Auditor need strong models. Coordinator and Executor can use cheap models.

🔄 Hybrid Mode (v2.0)

Task Complexity Assessment (5 dimensions, 1-5 points each):

Dimension	1 point	3 points	5 points
Code Lines	<100	200-500	>1000
Files	1-2	5-10	>20
Risk	Docs/Test	Feature improvement	Architecture change
Dependencies	None	3-5	Cross-system
Innovation	Routine fix	Feature enhancement	New feature

Task Classification:

Total Score	Task Type	Subtask Mode	Flow
5-10	Simple	Manual	Executor only
11-17	Medium	Manual (recommended) or Auto	Optional Reviewer
18-25	Complex	Auto (required)	Reviewer → Executor → Auditor
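The scoring and classification tables above can be sketched as a small function. This is an illustrative sketch only: `classifyTask` and the dimension keys (`codeLines`, `files`, `risk`, `dependencies`, `innovation`) are hypothetical names, not identifiers confirmed to exist in `create-task.js`. The thresholds and flow labels are copied from the tables.

```javascript
// Hypothetical helper: key names are illustrative, not taken from the skill's scripts.
const DIMENSIONS = ["codeLines", "files", "risk", "dependencies", "innovation"];

function classifyTask(scores) {
  let total = 0;
  for (const name of DIMENSIONS) {
    const value = scores[name];
    // Each dimension is scored from 1 to 5 points.
    if (!Number.isInteger(value) || value < 1 || value > 5) {
      throw new RangeError(`${name} must be an integer from 1 to 5`);
    }
    total += value;
  }
  // Thresholds follow the Task Classification table: 5-10, 11-17, 18-25.
  if (total <= 10) return { total, type: "Simple", flow: "Executor only" };
  if (total <= 17) return { total, type: "Medium", flow: "Optional Reviewer" };
  return { total, type: "Complex", flow: "Reviewer → Executor → Auditor" };
}

// 5 + 3 + 5 + 3 + 5 = 21, which falls in the Complex band.
console.log(classifyTask({ codeLines: 5, files: 3, risk: 5, dependencies: 3, innovation: 5 }));
```

Rejecting out-of-range scores up front keeps the total inside 5-25, so every valid input maps to exactly one row of the classification table.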

Usage:

# Create task (interactive)
node scripts/create-task.js

# Start Reviewer (complex tasks only)
node scripts/start-reviewer.js <task-id>

Core Modules

File	Purpose
`scripts/heartbeat-coordinator.js`	Coordinator: scan tasks → spawn Reviewer/Executor/Auditor
`scripts/monitor.js`	Monitor: detect stuck tasks, clean orphaned locks
`scripts/pack-skill.js`	Package completed tasks → skill directories
`config/task-schema.json`	Task file JSON Schema

Setup

1. Initialize workspace

mkdir -p evolution/tasks evolution/archive evolution/test-results

2. Create a task

cp skills/auto-evolution/references/task-example.json evolution/tasks/task-001.json
# Edit with your goal and subtasks

3. Configure the coordinator

Option A: Heartbeat (recommended — in your agent's HEARTBEAT.md)

## Evolution Loop
1. Run `node skills/auto-evolution/scripts/heartbeat-coordinator.js`
2. Parse output: if phase=review → spawn Reviewer sub-agent
3. Apply review → if phase=execute → spawn Executor sub-agent
4. Apply execution → if phase=audit → spawn Auditor sub-agent
5. Apply audit → done for this tick
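The "parse output" step in the loop above might look like the sketch below. It assumes the coordinator prints a token of the form `phase=<name>` somewhere in its output; the real output format is not documented here, and `roleForOutput` is a hypothetical helper.

```javascript
// Assumption: the coordinator's stdout contains a token such as "phase=review".
const ROLE_FOR_PHASE = new Map([
  ["review", "Reviewer"],
  ["execute", "Executor"],
  ["audit", "Auditor"],
]);

function roleForOutput(stdout) {
  const match = /\bphase=(\w+)/.exec(stdout);
  if (!match) return null; // no phase reported: nothing to spawn this tick
  return ROLE_FOR_PHASE.get(match[1]) ?? null; // unknown phases are ignored
}

console.log(roleForOutput("task-001 phase=review"));
```

Returning `null` for missing or unknown phases lets the caller treat "nothing to do" and "unrecognized output" the same way: skip this tick and wait for the next heartbeat.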

Option B: Cron

openclaw cron add --agent <your-agent> \
  --name "evolution-coordinator" \
  --every 5m \
  --session isolated \
  --timeout-seconds 300 \
  --message "Evolution heartbeat: scan and process tasks."

4. (Optional) Configure the monitor

openclaw cron add --agent <any-agent> \
  --name "evolution-monitor" \
  --every 10m \
  --session isolated \
  --timeout-seconds 120 \
  --message "Run: node skills/auto-evolution/scripts/monitor.js"

5. Configure models (optional)

Edit evolution/config/models.json to customize which models are used for each role:

{
  "roles": {
    "reviewer": "google/gemini-3.1-pro",
    "executor": "aiberm/gpt-5.4",
    "auditor": "google/gemini-3.1-pro",
    "coordinator": "bailian/qwen3.5-plus"
  }
}

By default, the scripts read role-to-model assignments from this config file; no environment variables are needed.
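As a rough illustration of how a script could look up a role's model from this file, the sketch below resolves a role against a parsed config object. `resolveModel` and its `fallback` parameter are hypothetical; the skill's own loader may behave differently.

```javascript
// Hypothetical helper: returns the configured model for a role, or a fallback.
function resolveModel(config, role, fallback = null) {
  const model = config?.roles?.[role];
  return typeof model === "string" && model.length > 0 ? model : fallback;
}

// Inline stand-in for a partial evolution/config/models.json.
const exampleConfig = JSON.parse(`{
  "roles": {
    "reviewer": "google/gemini-3.1-pro",
    "executor": "aiberm/gpt-5.4"
  }
}`);

console.log(resolveModel(exampleConfig, "reviewer")); // google/gemini-3.1-pro
console.log(resolveModel(exampleConfig, "auditor", "google/gemini-3.1-pro")); // falls back
```

Keeping the lookup keyed by role rather than by model ID is what makes swapping a model a one-line config change.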

6. Environment variables (optional)

export OPENCLAW_WORKSPACE=/path/to/workspace
export EVOLUTION_TASKS_DIR=/path/to/tasks
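One plausible way for every script to agree on the tasks directory is a single resolver like the sketch below. The precedence shown (`EVOLUTION_TASKS_DIR` first, then `evolution/tasks` under `OPENCLAW_WORKSPACE`, then a caller-supplied default workspace) is an assumption, and `resolveTasksDir` is a hypothetical name.

```javascript
const path = require("node:path");

// Assumed precedence: explicit tasks dir > workspace-derived dir > supplied default.
function resolveTasksDir(env, defaultWorkspace) {
  if (env.EVOLUTION_TASKS_DIR) return path.resolve(env.EVOLUTION_TASKS_DIR);
  const workspace = env.OPENCLAW_WORKSPACE || defaultWorkspace;
  // "evolution/tasks" mirrors the directory created in step 1 of Setup.
  return path.resolve(workspace, "evolution", "tasks");
}

console.log(resolveTasksDir(process.env, process.cwd()));
```

Sharing one resolver across scripts avoids the situation where tasks are created in one directory and scanned in another.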

How It Works

Full Loop

Coordinator heartbeat
  → finds task (priority: reviewed > executing > pending)
  → if pending: spawn Reviewer → reviewed
  → if reviewed: spawn Executor → executing
  → if executing: spawn Auditor → pending (next) or completed ✅
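The selection step of the loop above can be sketched as follows. `pickNextTask` is a hypothetical helper; it applies the stated priority (reviewed > executing > pending) and the status-to-role mapping from the diagram, using the `task_id` and `status` fields of the task file.

```javascript
// Lower number = handled first, per "reviewed > executing > pending".
const STATUS_PRIORITY = { reviewed: 0, executing: 1, pending: 2 };
// Which sub-agent the coordinator spawns for a task in each status.
const ROLE_FOR_STATUS = { pending: "Reviewer", reviewed: "Executor", executing: "Auditor" };

function pickNextTask(tasks) {
  // Ignore statuses the coordinator does not act on (e.g. completed, packaged).
  const actionable = tasks.filter((t) => Object.hasOwn(STATUS_PRIORITY, t.status));
  if (actionable.length === 0) return null;
  actionable.sort((a, b) => STATUS_PRIORITY[a.status] - STATUS_PRIORITY[b.status]);
  const task = actionable[0];
  return { taskId: task.task_id, spawn: ROLE_FOR_STATUS[task.status] };
}

console.log(pickNextTask([
  { task_id: "task-001", status: "pending" },
  { task_id: "task-002", status: "reviewed" },
]));
```

Preferring tasks that are already further along means in-flight work finishes before new work is reviewed.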

State Machine

pending → reviewed → executing → pending (next subtask)
                         → completed (all done)
                         → packaged ✅
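The state machine above can be expressed as a transition table, shown here as a minimal sketch. It reads the diagram as: `executing` returns to `pending` when subtasks remain or moves to `completed` when all are done, and `completed` then moves to `packaged`. `canTransition` and `statusAfterAudit` are hypothetical helpers, and retry handling after a failed audit is not modeled.

```javascript
// Allowed status changes, as read from the diagram.
const TRANSITIONS = {
  pending: ["reviewed"],
  reviewed: ["executing"],
  executing: ["pending", "completed"],
  completed: ["packaged"],
  packaged: [],
};

function canTransition(from, to) {
  return Object.hasOwn(TRANSITIONS, from) && TRANSITIONS[from].includes(to);
}

// After a passing audit: loop back for the next subtask, or finish.
function statusAfterAudit(completedSubtasks, totalSubtasks) {
  return completedSubtasks >= totalSubtasks ? "completed" : "pending";
}

console.log(canTransition("executing", "completed")); // true
console.log(statusAfterAudit(1, 3)); // pending
```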

Key Rules

One subtask per iteration — keeps cycles fast and reviewable
Dual quality gates — Reviewer (before) + Auditor (after)
Only mark completed when all subtasks done
If Reviewer/Auditor API fails → wait and retry next heartbeat
Monitor auto-resets tasks stuck > 10 minutes
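The 10-minute stuck rule could be checked along the lines of the sketch below. It assumes each task records an `updated_at` ISO timestamp and that only in-progress statuses (`reviewed`, `executing`) can be stuck. Both are assumptions: the task format shown in this document has no such field, and `monitor.js` may use lock files or file modification times instead.

```javascript
const STUCK_AFTER_MS = 10 * 60 * 1000; // "stuck > 10 minutes"

// Assumption: `updated_at` is a hypothetical ISO-8601 timestamp on each task.
function findStuckTasks(tasks, now = Date.now()) {
  return tasks.filter((task) => {
    if (task.status !== "reviewed" && task.status !== "executing") return false;
    const updated = Date.parse(task.updated_at);
    // Tasks without a parseable timestamp are skipped rather than reset.
    return Number.isFinite(updated) && now - updated > STUCK_AFTER_MS;
  });
}

const now = Date.parse("2024-01-01T00:11:00Z");
console.log(findStuckTasks(
  [{ task_id: "task-001", status: "executing", updated_at: "2024-01-01T00:00:00Z" }],
  now
).map((t) => t.task_id));
```

Passing `now` as a parameter keeps the check deterministic and easy to test.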

Task File Format

See references/task-example.json for a complete example.

Required fields:

{
  "task_id": "task-001",
  "status": "pending",
  "goal": "What to build",
  "current_iteration": 0,
  "max_iterations": 10,
  "context": {
    "subtasks": ["Step 1", "Step 2", "Step 3"]
  },
  "history": []
}
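Before handing a task file to the coordinator, it can help to check the required fields. The sketch below is a hand-written approximation based only on the example above; the authoritative rules live in `config/task-schema.json`, and `validateTask` is a hypothetical helper.

```javascript
// Returns a list of problems; an empty list means the required fields look valid.
function validateTask(task) {
  const errors = [];
  for (const field of ["task_id", "status", "goal"]) {
    if (typeof task[field] !== "string" || task[field] === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  for (const field of ["current_iteration", "max_iterations"]) {
    if (!Number.isInteger(task[field]) || task[field] < 0) {
      errors.push(`${field} must be a non-negative integer`);
    }
  }
  if (!Array.isArray(task.context?.subtasks)) errors.push("context.subtasks must be an array");
  if (!Array.isArray(task.history)) errors.push("history must be an array");
  return errors;
}

console.log(validateTask({ task_id: "task-001", status: "pending" }));
```

Collecting all errors instead of throwing on the first one gives a complete report in a single pass.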

CLI Usage

# Scan and output next phase prompt
node scripts/heartbeat-coordinator.js

# Apply review result
node scripts/heartbeat-coordinator.js apply-review task-001.json review.txt

# Apply execution result
node scripts/heartbeat-coordinator.js apply-exec task-001.json exec.txt

# Apply audit result
node scripts/heartbeat-coordinator.js apply-audit task-001.json audit.txt

# Run monitor
node scripts/monitor.js

# Package completed tasks
node scripts/pack-skill.js
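When these commands are driven from another Node.js script, passing arguments as an array avoids having a shell interpret file names. The sketch below is one way to do that with `execFileSync` from Node's `child_process` module. `buildApplyArgs` and `runApply` are hypothetical wrappers, not part of the skill.

```javascript
const { execFileSync } = require("node:child_process");

// The three apply-* subcommands listed above.
const APPLY_STEPS = new Set(["apply-review", "apply-exec", "apply-audit"]);

function buildApplyArgs(step, taskFile, resultFile) {
  if (!APPLY_STEPS.has(step)) throw new Error(`unknown step: ${step}`);
  return ["scripts/heartbeat-coordinator.js", step, taskFile, resultFile];
}

// Hypothetical wrapper: arguments go to node as an array, so no shell parses them.
function runApply(step, taskFile, resultFile) {
  return execFileSync("node", buildApplyArgs(step, taskFile, resultFile), { encoding: "utf8" });
}

console.log(buildApplyArgs("apply-review", "task-001.json", "review.txt"));
```

Validating the subcommand against a fixed set means only the documented `apply-*` operations can be invoked through the wrapper.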

Design Philosophy

4-role architecture — Coordinator drives, Reviewer/Executor/Auditor specialize
Dual quality gates — Review before, audit after — never skip either
Model-agnostic — swap any model for any role
One subtask per tick — predictable, reviewable, won't timeout
Self-healing — monitor detects and fixes stuck states
Cost-efficient — strong models only where judgment matters (Reviewer, Auditor)

Usage Guidance

This skill implements a multi-agent orchestration system and will read, write, move, and delete files in your OpenClaw workspace (tasks, locks, archive, skills). Before installing or running:

1. Run it in an isolated test workspace (set OPENCLAW_WORKSPACE to a disposable directory) to observe behavior.
2. Inspect and test create-task, heartbeat-coordinator, monitor, pack-skill, and start-reviewer in isolation. Note that pack-skill deletes task files after archiving.
3. Fix or align TASKS_DIR resolution if you expect tasks in a single location (create-task uses a __dirname-relative path while other scripts use WORKSPACE-derived paths).
4. Remove or review any use of child_process.execSync and ensure sessions_spawn integration is implemented safely (avoid executing untrusted strings).
5. Back up any existing tasks/skills before first run, and prefer running with least privilege and in an isolated agent/session.

If you need higher assurance, ask the author for provenance (homepage/repo) and a unified versioned release.

Capability Analysis

Type: OpenClaw Skill
Name: auto-evolution
Version: 2.0.0

The 'auto-evolution' skill bundle is a sophisticated multi-agent orchestration framework designed to automate complex tasks through a structured review-execute-audit loop. The core logic is implemented in Node.js scripts such as `heartbeat-coordinator.js`, which manages task transitions and generates prompts for specialized sub-agent roles (Reviewer, Executor, Auditor), and `monitor.js`, which handles task timeouts and lock cleanup. While the system facilitates autonomous code execution and file management (e.g., in `pack-skill.js`), the behavior is entirely consistent with the stated purpose of a meta-skill for agentic workflow automation, and no evidence of malicious intent, data exfiltration, or unauthorized access was found.

Capability Assessment

ℹ Purpose & Capability

The skill's name/description (multi-agent orchestration with Coordinator/Reviewer/Executor/Auditor) aligns with the included scripts which implement heartbeat, monitor, reviewer starter, packer, and task creation. However there are manifest/version mismatches (SKILL.md v0.7.0 vs registry 2.0.0 vs package.json 0.5.7) and README install instructions that reference an external GitHub repo; these inconsistencies reduce confidence in provenance but do not by themselves contradict the claimed purpose.

⚠ Instruction Scope

Scripts read and write many workspace files and can delete/move task files (pack-skill deletes tasks after archiving) and remove lock files (monitor). create-task.js writes tasks relative to the skill directory (path.join(__dirname,'../tasks')), while heartbeat/monitor/pack-skill resolve TASKS_DIR from OPENCLAW_WORKSPACE or defaults under ~/.openclaw/... — this inconsistent TASKS_DIR handling can cause tasks to be created in one location and processed (or not) in another. start-reviewer.js includes a child_process execSync import and pseudocode describing sessions_spawn but does not implement secure spawning; reviewers are expected to run as sub-agents but the integration is manual/underspecified. Overall the runtime instructions and scripts have broad discretion to modify the user's workspace and perform destructive actions (unlink), which is proportional to packaging tasks but should be explicit to users.

✓ Install Mechanism

No install spec and no external downloads; all behavior is from included scripts (Node.js). That lowers supply-chain concern — nothing is fetched from unknown URLs during install. However, running the scripts will perform filesystem changes.

ℹ Credentials

The registry metadata declares no required environment variables or credentials, and SKILL.md documents optional env vars (OPENCLAW_WORKSPACE, EVOLUTION_TASKS_DIR, EVOLUTION_SKILLS_DIR, EVOLUTION_ARCHIVE_DIR). This is reasonable, but the scripts rely on these vars (or defaults) to determine filesystem targets — the absence of explicit required env vars in metadata is acceptable but users should be aware the scripts will operate on paths derived from these variables (and defaults under ~/.openclaw). No network credentials are requested by the skill files.

⚠ Persistence & Privilege

always is false and model invocation is allowed (normal). However pack-skill.js will move completed tasks into an archive and will reference and potentially write into the user's SKILLS_DIR (e.g., update skill directories). Monitor scripts will remove lock files and reset task states. These behaviors modify the user's workspace and can delete/move files; combined with autonomous use (agent can spawn scripts via heartbeat/cron), this elevates the practical privilege and warrants running in an isolated/test workspace first.

Version History

v2.0.0

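Hybrid mode implementation: supports manual subtasks for simple tasks and automatic subtask generation for complex tasks. Adds a 5-dimension complexity scoring model, a task creation tool, and a Reviewer launcher tool. Fully backward compatible.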

v0.7.0

Model configuration via roles instead of hardcoded IDs. Add config/models.json for centralized model management. Easier model swapping and A/B testing.

v0.6.0

Add Auditor role for dual quality gates: Reviewer (pre-exec) + Auditor (post-exec). Coordinator now spawns 3 sub-agent roles. Full loop: review → execute → audit.

v0.5.7

Initial public release: multi-agent orchestration with review-execute-audit loop. Model-agnostic, self-healing, auto-packaging.

Metadata

Slug auto-evolution

Version 2.0.0

License MIT-0

All-time Installs 3

Active Installs 3

Total Versions 4

Frequently Asked Questions

What is Auto Evolution (Hybrid Mode)?

Multi-agent auto-evolution system with hybrid mode — orchestrate review-execute-audit loops with 4 roles (Coordinator, Reviewer, Executor, Auditor). Supports... It is an AI Agent Skill for Claude Code / OpenClaw, with 147 downloads so far.

How do I install Auto Evolution (Hybrid Mode)?

Run "/install auto-evolution" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Auto Evolution (Hybrid Mode) free?

Yes, Auto Evolution (Hybrid Mode) is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Auto Evolution (Hybrid Mode) support?

Auto Evolution (Hybrid Mode) is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Auto Evolution (Hybrid Mode)?

It is built and maintained by Jaden's built a claw (@cjboy007); the current version is v2.0.0.
