Improvement Orchestrator
/install auto-improvement-orchestrator
Coordinates the full improvement pipeline: Generator → Discriminator → Evaluator → Executor → Gate.
When to Use
- Run a full improvement cycle on one or more skills
- Coordinate the 5-stage pipeline end-to-end (with optional evaluator)
- Retry failed improvements with trace-aware feedback (Ralph Wiggum loop)
When NOT to Use
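- Only want to check a skill's quality score → use improvement-learner
- Only want to score candidates manually → use improvement-discriminator
- Only want to change a single file → use improvement-executor
- Only want to look up benchmark data → use benchmark-store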
Pipeline
propose → discriminate → evaluate* → execute → gate (7-layer)
↻ Ralph Wiggum: fail → inject trace → retry (max N)
* evaluate skipped if: no --task-suite, OR low-risk docs/reference/guardrail (adaptive complexity)
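The retry line in the diagram above can be sketched as a small loop. This is a minimal illustration, not the code in orchestrate.py: the function name `run_with_retries`, the `attempt_once` callback, and the `"decision"` / `"trace"` keys are hypothetical, and the sketch assumes that `max_retries` counts total attempts and that only a `revert` decision triggers a retry.

```python
from typing import Callable, Optional


def run_with_retries(
    attempt_once: Callable[[Optional[dict]], dict],
    max_retries: int = 3,
) -> dict:
    """Repeat one pipeline attempt until it stops failing or attempts run out.

    attempt_once receives the failure trace of the previous attempt
    (None on the first attempt) and returns a dict with a "decision" key
    and, on failure, a "trace" key. Both key names are assumptions.
    """
    trace: Optional[dict] = None
    result: dict = {"decision": "no_candidates", "attempts": 0}
    for attempt in range(1, max_retries + 1):
        result = attempt_once(trace)
        result["attempts"] = attempt
        if result["decision"] != "revert":
            break  # success or a non-retryable outcome: stop looping
        trace = result.get("trace")  # inject the failure trace into the next attempt
    return result
```

Passing the trace back into the next attempt is what makes the retry "trace-aware": each new attempt can see why the previous one failed.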
Adaptive Complexity Skip: candidates with risk_level=low AND category in (docs, reference, guardrail) skip the evaluator stage entirely. Other categories always run evaluator when --task-suite is provided.
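The skip rule above reduces to a single predicate. The following is a sketch under stated assumptions: the function name `should_run_evaluator` and the representation of a candidate as a dict with `risk_level` and `category` keys are illustrative, not taken from the orchestrator's source.

```python
from typing import Optional

# Categories that may skip the evaluator when the candidate is low risk.
SKIPPABLE_CATEGORIES = {"docs", "reference", "guardrail"}


def should_run_evaluator(candidate: dict, task_suite: Optional[str]) -> bool:
    """Return True when the evaluator stage should run for this candidate."""
    if task_suite is None:
        # No --task-suite given: the evaluator stage is never run.
        return False
    low_risk = candidate.get("risk_level") == "low"
    skippable = candidate.get("category") in SKIPPABLE_CATEGORIES
    # Skip only when BOTH conditions hold; everything else is evaluated.
    return not (low_risk and skippable)
```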
Evaluator→Gate Forwarding: if evaluator produces an artifact, its path is forwarded to gate via --evaluation, enabling RegressionGate to check evaluator verdict.
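One way to picture the forwarding step is as conditional argument building. This is only a sketch: `build_gate_args` and `base_args` are hypothetical names, and the only flag taken from the text above is `--evaluation`.

```python
from pathlib import Path
from typing import List, Optional


def build_gate_args(base_args: List[str], evaluation_artifact: Optional[Path]) -> List[str]:
    """Append --evaluation <path> only when the evaluator produced an artifact."""
    args = list(base_args)  # copy so the caller's list is not mutated
    if evaluation_artifact is not None:
        args += ["--evaluation", str(evaluation_artifact)]
    return args
```

When the evaluator was skipped, the gate is called without `--evaluation`, so no evaluator verdict is available to it.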
Baseline Evaluation: when --task-suite is given, orchestrator first runs evaluator in --standalone mode on the current SKILL.md to discover which tasks fail, then injects those failures as --source feedback to the generator.
CLI
# --target (REQUIRED): skill directory or file to improve
# --state-root (REQUIRED): where artifacts are written
# --source: repeatable; memory/feedback/trace files
# --max-retries: default 3; Ralph Wiggum retry attempts
# --task-suite: enables the evaluator stage (real LLM eval)
# --eval-mock: evaluator uses mock execution, no claude CLI
python3 scripts/orchestrate.py \
  --target /path/to/skill \
  --state-root /path/to/state \
  --source feedback.json \
  --max-retries 3 \
  --task-suite tasks.yaml \
  --eval-mock
| Param | Default | When to change |
|---|---|---|
| --target | (required) | Always set: path to the skill dir to improve |
| --state-root | (required) | Always set: persistent state/artifact directory |
| --source | [] | Add feedback.json, memory files, or prior failure traces |
| --max-retries | 3 | Raise to 5 for hard-to-improve skills; lower to 1 for fast iteration |
| --task-suite | None | Provide to enable LLM-based evaluator; omit for docs-only changes |
| --eval-mock | false | Use in CI/testing to skip real claude -p calls |
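The parameters and defaults listed above map naturally onto an `argparse` parser. The sketch below is an assumption about how such a CLI could be declared, not the actual contents of scripts/orchestrate.py; the function name `build_parser` and the help strings are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with the documented defaults."""
    parser = argparse.ArgumentParser(prog="orchestrate.py")
    parser.add_argument("--target", required=True,
                        help="skill directory or file to improve")
    parser.add_argument("--state-root", required=True,
                        help="directory where artifacts are written")
    # action="append" makes the flag repeatable; the default is an empty list.
    parser.add_argument("--source", action="append", default=[],
                        help="memory/feedback/trace file (repeatable)")
    parser.add_argument("--max-retries", type=int, default=3,
                        help="retry attempts")
    parser.add_argument("--task-suite", default=None,
                        help="task suite file; enables the evaluator stage")
    parser.add_argument("--eval-mock", action="store_true",
                        help="use mock execution in the evaluator")
    return parser
```

With this declaration, `--source a.json --source b.json` yields a list of two paths, and omitting `--task-suite` leaves it as `None`, which matches the "evaluate skipped" condition in the pipeline.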
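<example>
Correct usage: run the full improvement pipeline on a skill (including the evaluator)
$ python3 scripts/orchestrate.py --target /path/to/skill --state-root ./state --task-suite tasks.yaml
→ 0. Baseline evaluation: finds 2 failing tasks and injects them into the generator
→ 1. Generate candidates → 2. Multi-reviewer blind review → 3. Task evaluation → 4. Execute changes → 5. 7-layer gate
→ On failure, automatically injects the trace and retries (up to 3 times)
→ stdout: /path/to/state/pipeline-summary.json
</example>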
<anti-example>
Incorrect usage: running the orchestrator when you only want to see scores
$ python3 scripts/orchestrate.py --target /path/to/skill --state-root ./state
→ This actually applies changes! Use improvement-learner's self_improve.py instead.
</anti-example>
Error Handling
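- Each subprocess has a 1200s timeout; a timeout raises RuntimeError
- An evaluator failure does not abort the pipeline (a warning is printed and the run continues), but evaluation_failure_trace is injected into the next round
- When gate returns revert, extract_failure_trace() is called automatically and writes traces/trace-{run_id}.json
- pipeline-summary.json is written at the end to {state-root}/pipeline-summary.json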
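The timeout behavior in the first bullet can be sketched with the standard library. This is an illustration under assumptions, not the orchestrator's code: `run_stage` is a hypothetical helper, and converting `subprocess.TimeoutExpired` into `RuntimeError` is one plausible way to obtain the behavior described.

```python
import subprocess
from typing import List

STAGE_TIMEOUT_SECONDS = 1200  # per-subprocess timeout stated above


def run_stage(cmd: List[str], timeout: float = STAGE_TIMEOUT_SECONDS) -> subprocess.CompletedProcess:
    """Run one pipeline stage as a subprocess, raising RuntimeError on timeout."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        raise RuntimeError(f"stage timed out after {timeout}s: {cmd[0]}") from exc
```

The sketch leaves handling of non-zero exit codes to the caller, since the text above only specifies what happens on timeout.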
Output
The final output is pipeline-summary.json:
{"target": "/path/to/skill", "attempts": 2, "max_retries": 3,
"final_decision": "keep", "final_candidate_id": "cand-01-docs",
"final_artifact_path": "/state/receipts/gate-run001-cand-01.json"}
final_decision values: keep | revert | reject | pending_promote | no_candidates | no_accepted_candidates
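A caller that consumes the summary could load and validate it roughly as follows. This is a sketch: `load_summary` is a hypothetical helper, and rejecting unknown `final_decision` values is a design choice of this example rather than documented behavior.

```python
import json
from pathlib import Path

# The six final_decision values listed above.
KNOWN_DECISIONS = {
    "keep", "revert", "reject", "pending_promote",
    "no_candidates", "no_accepted_candidates",
}


def load_summary(state_root: str) -> dict:
    """Read {state-root}/pipeline-summary.json and check final_decision."""
    path = Path(state_root) / "pipeline-summary.json"
    summary = json.loads(path.read_text(encoding="utf-8"))
    decision = summary.get("final_decision")
    if decision not in KNOWN_DECISIONS:
        raise ValueError(f"unexpected final_decision: {decision!r}")
    return summary
```

In a CI job, for example, the returned `final_decision` could be compared against `keep` to decide whether the run counts as a success.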
Related Skills
- improvement-generator: Produces candidate proposals (stage 1); orchestrator calls propose.py
- improvement-discriminator: Multi-reviewer panel scoring (stage 2); orchestrator calls score.py
- improvement-evaluator: Task suite execution validation (stage 3); called only when --task-suite is provided; baseline failures are injected as --source
- improvement-executor: Applies changes with backup/rollback (stage 4); orchestrator calls execute.py
- improvement-gate: 7-layer quality gate (stage 5); receives the --evaluation artifact when the evaluator ran
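- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install auto-improvement-orchestrator
- Once installed, call the Skill by name or use /auto-improvement-orchestrator to trigger it
- Provide the required inputs described in the Skill's parameter documentation to get structured output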
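What is Improvement Orchestrator?
Use it when you need to run the full "generate → score → evaluate → execute → gate" pipeline in one step, retry automatically after failures, or improve multiple skills in a batch. It is not meant for evaluating skill quality on its own (use improvement-learner) or for manual scoring (use improvement-discriminator). It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 98 downloads to date.
How do I install Improvement Orchestrator?
Run the command "/install auto-improvement-orchestrator" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.
Is Improvement Orchestrator free?
Yes. Improvement Orchestrator is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does Improvement Orchestrator support?
Improvement Orchestrator is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Improvement Orchestrator?
It is developed and maintained by _silhouette (@lanyasheng); the current version is v1.0.0.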