Improvement Orchestrator

Name: Improvement Orchestrator
Author: lanyasheng

Author _silhouette · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install auto-improvement-orchestrator

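Description

Use this skill when you need to run the full generate → score → evaluate → execute → gate pipeline in one step, retry automatically after a failure, or improve several skills in a batch. Do not use it just to evaluate skill quality (use improvement-learner) or to score candidates by hand (use improvement-discriminator).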

Usage (SKILL.md)

Improvement Orchestrator

Coordinates the full improvement pipeline: Generator → Discriminator → Evaluator → Executor → Gate.

When to Use

Run a full improvement cycle on one or more skills
Coordinate the 5-stage pipeline end-to-end (with optional evaluator)
Retry failed improvements with trace-aware feedback (Ralph Wiggum loop)

When NOT to Use

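Only want to check a skill's quality score → use improvement-learner
Only want to score candidates by hand → use improvement-discriminator
Only want to change a single file → use improvement-executor
Only want to look up benchmark data → use benchmark-store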

Pipeline

propose → discriminate → evaluate* → execute → gate (7-layer)
         ↻ Ralph Wiggum: fail → inject trace → retry (max N)
         * evaluate skipped if: no --task-suite, OR low-risk docs/reference/guardrail (adaptive complexity)
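The retry loop in the diagram can be sketched roughly as follows. This is an illustration, not the code in scripts/orchestrate.py: `run_attempt` is a hypothetical callable standing in for one propose → gate pass, the `decision` and `trace` keys are assumed names, and the sketch assumes `max_retries` caps the total number of attempts.

```python
from typing import Callable


def ralph_wiggum_loop(
    run_attempt: Callable[[list], dict],
    sources: list,
    max_retries: int = 3,
) -> dict:
    """Retry a pipeline pass, feeding each failure trace into the next attempt.

    run_attempt is a hypothetical stand-in for one full pipeline pass. It is
    assumed to return {"decision": "keep" or "revert", "trace": <trace or None>}.
    """
    sources = list(sources)  # work on a copy so the caller's list is untouched
    result = {"decision": "revert", "trace": None}
    attempts = 0
    for attempts in range(1, max_retries + 1):
        result = run_attempt(sources)
        if result["decision"] == "keep":
            break
        # Failed attempt: inject its trace so the next attempt can use it.
        if result.get("trace") is not None:
            sources.append(result["trace"])
    return {"attempts": attempts, "final_decision": result["decision"]}
```

Each failed attempt only grows the list of sources, so later attempts see every earlier trace, not just the most recent one.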

Adaptive Complexity Skip: candidates with risk_level=low AND category in (docs, reference, guardrail) skip the evaluator stage entirely. Other categories always run evaluator when --task-suite is provided.
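The skip rule above reduces to a small predicate. The sketch below is an assumption about how it could be written; the function and constant names are hypothetical, and only the conditions themselves come from the rule as stated.

```python
from typing import Optional

# Categories that skip the evaluator when the candidate is low risk.
LOW_RISK_SKIP_CATEGORIES = frozenset({"docs", "reference", "guardrail"})


def should_run_evaluator(task_suite: Optional[str], risk_level: str, category: str) -> bool:
    """Return True when the evaluator stage should run for a candidate."""
    if task_suite is None:
        # No --task-suite was given, so there is nothing to evaluate against.
        return False
    if risk_level == "low" and category in LOW_RISK_SKIP_CATEGORIES:
        # Adaptive complexity skip: low-risk docs/reference/guardrail changes.
        return False
    return True
```

Both conditions must hold for the skip: a low-risk candidate in any other category, or a higher-risk docs candidate, still goes through the evaluator.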

Evaluator→Gate Forwarding: if evaluator produces an artifact, its path is forwarded to gate via --evaluation, enabling RegressionGate to check evaluator verdict.
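A minimal sketch of that forwarding, assuming the gate is launched from an argument list. `build_gate_args` and `base_args` are hypothetical names; only the `--evaluation` flag is taken from the description above.

```python
from typing import Optional


def build_gate_args(base_args: list, evaluation_artifact: Optional[str]) -> list:
    """Append --evaluation <path> only when the evaluator produced an artifact."""
    args = list(base_args)  # copy so the caller's list is not modified
    if evaluation_artifact is not None:
        args += ["--evaluation", evaluation_artifact]
    return args
```

When the evaluator was skipped or produced nothing, the gate arguments are passed through unchanged.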

Baseline Evaluation: when --task-suite is given, orchestrator first runs evaluator in --standalone mode on the current SKILL.md to discover which tasks fail, then injects those failures as --source feedback to the generator.

CLI

# --target      REQUIRED: skill directory or file to improve
# --state-root  REQUIRED: where artifacts are written
# --source      repeatable: memory/feedback/trace files
# --max-retries default 3: Ralph Wiggum retry attempts
# --task-suite  enables evaluator stage (real LLM eval)
# --eval-mock   evaluator uses mock execution, no claude CLI
python3 scripts/orchestrate.py \
  --target /path/to/skill \
  --state-root /path/to/state \
  --source feedback.json \
  --max-retries 3 \
  --task-suite tasks.yaml \
  --eval-mock

Param	Default	When to change
`--target`	(required)	Always set — path to the skill dir to improve
`--state-root`	(required)	Always set — persistent state/artifact directory
`--source`	[]	Add feedback.json, memory files, or prior failure traces
`--max-retries`	3	Raise to 5 for hard-to-improve skills; lower to 1 for fast iteration
`--task-suite`	None	Provide to enable LLM-based evaluator; omit for docs-only changes
`--eval-mock`	false	Use in CI/testing to skip real `claude -p` calls
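For readers who want to see how the defaults in the table fit together, here is one way the flags could be declared with argparse. This is a sketch built from the table only; the actual parser in scripts/orchestrate.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the CLI flags with the defaults listed in the table."""
    parser = argparse.ArgumentParser(prog="orchestrate.py")
    parser.add_argument("--target", required=True,
                        help="skill directory or file to improve")
    parser.add_argument("--state-root", required=True,
                        help="where state and artifacts are written")
    parser.add_argument("--source", action="append", default=[],
                        help="feedback, memory, or trace file (repeatable)")
    parser.add_argument("--max-retries", type=int, default=3,
                        help="retry attempts for the Ralph Wiggum loop")
    parser.add_argument("--task-suite", default=None,
                        help="task suite file; enables the evaluator stage")
    parser.add_argument("--eval-mock", action="store_true",
                        help="use mock execution instead of the claude CLI")
    return parser
```

With `action="append"`, each `--source` occurrence adds one entry, which matches the "repeatable" note in the command above.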

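<example>
Correct usage: run the full improvement pipeline on a skill (with the evaluator)
$ python3 scripts/orchestrate.py --target /path/to/skill --state-root ./state --task-suite tasks.yaml
→ 0. Baseline evaluation: finds 2 failing tasks and injects them into the generator
→ 1. Generate candidates
→ 2. Multi-reviewer blind review
→ 3. Task evaluation
→ 4. Execute changes
→ 5. 7-layer gate
→ On failure, the trace is injected automatically and the run is retried (up to 3 times)
→ stdout: /path/to/state/pipeline-summary.json
</example>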

<anti-example>
Incorrect usage: running the orchestrator when you only want to see a score
$ python3 scripts/orchestrate.py --target /path/to/skill --state-root ./state
→ This actually applies changes! Use self_improve.py from improvement-learner instead.
</anti-example>

Error Handling

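Each subprocess has a 1200 s timeout; a timeout raises RuntimeError
An evaluator failure does not abort the pipeline (a warning is printed and the run continues), but evaluation_failure_trace is injected into the next round
When gate returns revert, extract_failure_trace() is called automatically and writes traces/trace-{run_id}.json
pipeline-summary.json is written to {state-root}/pipeline-summary.json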
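The timeout behaviour can be illustrated with the standard library alone. The wrapper below is a sketch under assumptions: `run_stage` is a hypothetical name, and treating a non-zero exit code as a RuntimeError is an assumption of this sketch, not something stated above.

```python
import subprocess

STAGE_TIMEOUT_SECONDS = 1200  # per-subprocess limit described above


def run_stage(cmd: list, timeout: float = STAGE_TIMEOUT_SECONDS) -> str:
    """Run one pipeline stage and return its stdout.

    A timeout is converted into RuntimeError. Raising on a non-zero exit
    code is an assumption made for this sketch.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"stage timed out after {timeout}s: {cmd}") from exc
    if proc.returncode != 0:
        raise RuntimeError(f"stage exited with {proc.returncode}: {proc.stderr.strip()}")
    return proc.stdout
```

A caller that wants the evaluator to be non-fatal would catch RuntimeError around that one stage, log a warning, and carry the failure forward as a trace.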

Output

The final output is pipeline-summary.json:

{"target": "/path/to/skill", "attempts": 2, "max_retries": 3,
 "final_decision": "keep", "final_candidate_id": "cand-01-docs",
 "final_artifact_path": "/state/receipts/gate-run001-cand-01.json"}

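A calling script might read the summary like this. The helper names are hypothetical; the file location and field names are the ones shown in the sample output above.

```python
import json
from pathlib import Path


def read_summary(state_root: str) -> dict:
    """Load pipeline-summary.json from the state root."""
    path = Path(state_root) / "pipeline-summary.json"
    return json.loads(path.read_text(encoding="utf-8"))


def describe(summary: dict) -> str:
    """Format the final decision, candidate, and attempt count on one line."""
    return (
        f"{summary['final_decision']}: {summary['final_candidate_id']} "
        f"(attempts={summary['attempts']}, max_retries={summary['max_retries']})"
    )
```

For the sample above, `describe` yields a line starting with `keep: cand-01-docs`, which is convenient for CI logs.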
Related Skills

improvement-generator: Produces candidate proposals (stage 1) — orchestrator calls propose.py
improvement-discriminator: Multi-reviewer panel scoring (stage 2) — orchestrator calls score.py
improvement-evaluator: Task suite execution validation (stage 3) — called only when --task-suite provided; baseline failures injected as --source
improvement-executor: Applies changes with backup/rollback (stage 4) — orchestrator calls execute.py
improvement-gate: 7-layer quality gate (stage 5) — receives --evaluation artifact when evaluator ran

Security recommendations

This skill is an on-repo pipeline orchestrator: it will run local scripts (propose/score/evaluate/execute/gate), create state artifacts and backups, and may apply changes to files under the --target you provide. Before running:

1) Inspect the scripts it invokes in your repository (improvement-generator/discriminator/evaluator/executor/gate) to ensure they do only what you expect.
2) Run first with a disposable --state-root and use --eval-mock (avoid real LLM CLI calls) to observe behavior.
3) Back up your target skill, or point --target at a copy, if you are not ready for automatic modifications.
4) Be aware that the evaluator or other invoked scripts may require separate API keys or env vars; the orchestrator itself does not request credentials.

If you want to be extra cautious, run the included tests and review the executor's logic to confirm it only performs allowed actions (append_markdown_section, create_file) for low-risk categories.
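The advice about a disposable state root and a copy of the target can be scripted. The helper below is a hypothetical sketch: it assumes the target is a directory, copies it into a scratch location, and only builds the command, leaving it to you to review and run.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def build_dry_run_command(target_dir: str, task_suite: Optional[str] = None) -> list:
    """Copy the target skill to a scratch dir and build a mock-evaluator command.

    Assumes target_dir is a directory. The command is returned, not executed.
    """
    scratch = Path(tempfile.mkdtemp(prefix="orchestrator-dry-run-"))
    target_copy = scratch / "target"
    shutil.copytree(target_dir, target_copy)  # the original skill stays untouched
    cmd = [
        "python3", "scripts/orchestrate.py",
        "--target", str(target_copy),
        "--state-root", str(scratch / "state"),  # disposable state root
    ]
    if task_suite is not None:
        cmd += ["--task-suite", task_suite]
    cmd.append("--eval-mock")  # avoid real LLM CLI calls
    return cmd
```

After inspecting the returned list, it can be passed to `subprocess.run`, and the scratch directory can be diffed against the original skill to see what the executor changed.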

Functional analysis

Type: OpenClaw Skill
Name: auto-improvement-orchestrator
Version: 1.0.0

The bundle is a legitimate orchestration tool designed to automate the improvement lifecycle of OpenClaw skills. The core logic in `scripts/orchestrate.py` coordinates a five-stage pipeline (Propose, Discriminate, Evaluate, Execute, Gate) by executing local Python scripts via `subprocess.run`. The implementation includes robust state management, automated retries (the 'Ralph Wiggum' loop), and extensive documentation in the `references/` directory detailing safety guardrails and rollbacks. No evidence of data exfiltration, malicious prompt injection, or unauthorized remote execution was found; the tool's behavior is strictly aligned with its stated purpose of skill optimization.

Capability assessment

✓ Purpose & Capability

The name/description (orchestrating a 5‑stage improvement pipeline) matches the actual behavior: the script dispatches Generator→Discriminator→Evaluator→Executor→Gate and writes state/artifacts and backups. All declared requirements (none) are appropriate for a local orchestrator that runs other local scripts.

ℹ Instruction Scope

SKILL.md and scripts explicitly instruct the agent to run local subprocesses, read feedback sources, and apply changes to the target skill (append markdown sections, create files) with backups/rollback. This is expected for an orchestrator, but it means the skill will read/write arbitrary files under the provided --target and --state-root and can forward failure traces into subsequent runs. The orchestrator itself does not call external endpoints, but it invokes other scripts (e.g., evaluator) which may call LLM CLIs or network services — review those scripts before use.

✓ Install Mechanism

Instruction-only (no install spec). The bundle includes orchestration code only; nothing is downloaded or extracted from external URLs. Lowest install risk.

ℹ Credentials

The skill declares no required env vars or credentials, which is coherent. Caveat: the orchestrator spawns other local scripts (generator/discriminator/evaluator/executor/gate) that are expected to live in the repo; those sub-scripts may themselves require API keys or credentials (e.g., for LLMs) even though the orchestrator doesn't declare them. Confirm what the invoked scripts expect before running with real task suites.

✓ Persistence & Privilege

always=false and no special platform privileges. The orchestrator writes persistent artifacts and backups to the user-supplied --state-root (normal for its purpose). It does not modify other skills' configuration beyond running the standard executor workflow for the provided --target, but it will apply changes to the target path (intended behavior).

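How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install auto-improvement-orchestrator
Once installation finishes, call the skill by name or trigger it with /auto-improvement-orchestrator
Provide the required inputs according to the skill's parameter documentation to get structured output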

Version history

v1.0.0

Initial release: closed-loop skill improvement pipeline

Metadata

Slug auto-improvement-orchestrator

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ