lanyasheng

Improvement Orchestrator

Author: _silhouette · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 98 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install auto-improvement-orchestrator
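Description
Use this skill to run the whole generate → score → evaluate → execute → gate pipeline in one command, to retry automatically after a failure, or to improve several skills in a batch. Do not use it just to assess skill quality (use improvement-learner) or to score candidates by hand (use improvement-discriminator).
Usage (SKILL.md)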

Improvement Orchestrator

Coordinates the full improvement pipeline: Generator → Discriminator → Evaluator → Executor → Gate.

When to Use

  • Run a full improvement cycle on one or more skills
  • Coordinate the 5-stage pipeline end-to-end (with optional evaluator)
  • Retry failed improvements with trace-aware feedback (Ralph Wiggum loop)

When NOT to Use

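  • Only want to check a skill's quality score → use improvement-learner
  • Only want to score candidates manually → use improvement-discriminator
  • Only want to change a single file → use improvement-executor
  • Only want to look up benchmark data → use benchmark-store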

Pipeline

propose → discriminate → evaluate* → execute → gate (7-layer)
         ↻ Ralph Wiggum: fail → inject trace → retry (max N)
         * evaluate skipped if: no --task-suite, OR low-risk docs/reference/guardrail (adaptive complexity)
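
The retry loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/orchestrate.py: `run_with_retries` and `attempt_fn` are hypothetical names, the sketch assumes `max_retries` counts total attempts, and it assumes only a `revert` decision triggers another attempt.

```python
from typing import Callable, Optional


def run_with_retries(
    attempt_fn: Callable[[Optional[dict]], dict],
    max_retries: int = 3,
) -> dict:
    """Repeat one pipeline pass until it stops reverting or attempts run out.

    attempt_fn is a hypothetical stand-in for one propose -> gate pass. It
    receives the previous failure trace (or None on the first attempt) and
    returns a dict with a "decision" key and, on failure, a "trace" key.
    """
    trace: Optional[dict] = None
    result: dict = {"decision": "no_candidates"}
    for attempt in range(1, max_retries + 1):
        result = attempt_fn(trace)
        result["attempts"] = attempt
        if result["decision"] != "revert":
            break
        # Carry the failure trace into the next attempt.
        trace = result.get("trace")
    return result


# Usage with a fake pipeline pass that fails once and then succeeds.
calls = []


def fake_attempt(trace: Optional[dict]) -> dict:
    calls.append(trace)
    if trace is None:
        return {"decision": "revert", "trace": {"reason": "regression"}}
    return {"decision": "keep"}


summary = run_with_retries(fake_attempt, max_retries=3)
print(summary)  # {'decision': 'keep', 'attempts': 2}
```

The second call receives the trace from the first failure, which is the "inject trace" step in the diagram.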

Adaptive Complexity Skip: candidates with risk_level=low AND category in (docs, reference, guardrail) skip the evaluator stage entirely. Other categories always run evaluator when --task-suite is provided.
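
The skip rule above reduces to a small predicate. The sketch below assumes a candidate is a plain dict with `risk_level` and `category` keys; the function name and the `prompt` category used in the usage lines are hypothetical.

```python
from typing import Optional

# Categories that may skip the evaluator when the candidate is low risk.
LOW_RISK_CATEGORIES = {"docs", "reference", "guardrail"}


def should_run_evaluator(candidate: dict, task_suite: Optional[str]) -> bool:
    """Return True when the evaluator stage should run for this candidate."""
    if not task_suite:
        # No --task-suite given: the evaluator stage is skipped entirely.
        return False
    low_risk = candidate.get("risk_level") == "low"
    skippable = candidate.get("category") in LOW_RISK_CATEGORIES
    return not (low_risk and skippable)


print(should_run_evaluator({"risk_level": "low", "category": "docs"}, "tasks.yaml"))    # False
print(should_run_evaluator({"risk_level": "low", "category": "prompt"}, "tasks.yaml"))  # True
print(should_run_evaluator({"risk_level": "high", "category": "docs"}, None))           # False
```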

Evaluator→Gate Forwarding: if evaluator produces an artifact, its path is forwarded to gate via --evaluation, enabling RegressionGate to check evaluator verdict.

Baseline Evaluation: when --task-suite is given, orchestrator first runs evaluator in --standalone mode on the current SKILL.md to discover which tasks fail, then injects those failures as --source feedback to the generator.

CLI

# --target and --state-root are required; --source is repeatable.
# --max-retries defaults to 3 (Ralph Wiggum retry attempts).
# --task-suite enables the evaluator stage (real LLM eval).
# --eval-mock makes the evaluator use mock execution, with no claude CLI calls.
python3 scripts/orchestrate.py \
  --target /path/to/skill \
  --state-root /path/to/state \
  --source feedback.json \
  --max-retries 3 \
  --task-suite tasks.yaml \
  --eval-mock
| Param | Default | When to change |
|---|---|---|
| --target | (required) | Always set: path to the skill dir to improve |
| --state-root | (required) | Always set: persistent state/artifact directory |
| --source | [] | Add feedback.json, memory files, or prior failure traces |
| --max-retries | 3 | Raise to 5 for hard-to-improve skills; lower to 1 for fast iteration |
| --task-suite | None | Provide to enable LLM-based evaluator; omit for docs-only changes |
| --eval-mock | false | Use in CI/testing to skip real claude -p calls |
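
For readers who want to see how the defaults in this table behave, here is a minimal argparse sketch that mirrors them. It is an assumption about how such a parser could look, not the actual parser in scripts/orchestrate.py; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose flags and defaults mirror the parameter table."""
    parser = argparse.ArgumentParser(prog="orchestrate.py")
    parser.add_argument("--target", required=True)
    parser.add_argument("--state-root", required=True)
    # Repeatable flag: each --source appends one more path to the list.
    parser.add_argument("--source", action="append", default=[])
    parser.add_argument("--max-retries", type=int, default=3)
    parser.add_argument("--task-suite", default=None)
    parser.add_argument("--eval-mock", action="store_true")
    return parser


args = build_parser().parse_args(
    [
        "--target", "/path/to/skill",
        "--state-root", "./state",
        "--source", "feedback.json",
        "--source", "trace.json",
    ]
)
print(args.source, args.max_retries, args.task_suite, args.eval_mock)
# ['feedback.json', 'trace.json'] 3 None False
```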

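<example>
Correct usage: run the full improvement pipeline on a skill, including the evaluator.
$ python3 scripts/orchestrate.py --target /path/to/skill --state-root ./state --task-suite tasks.yaml
→ 0. Baseline evaluation: finds 2 failing tasks and injects them into the generator
→ 1. Generate candidates
→ 2. Multi-reviewer blind review
→ 3. Task evaluation
→ 4. Execute changes
→ 5. 7-layer gate
→ On failure, inject the trace and retry automatically (up to 3 times)
→ stdout: /path/to/state/pipeline-summary.json
</example>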

<anti-example>
Incorrect usage: running the orchestrator when you only want to see scores.
$ python3 scripts/orchestrate.py --target /path/to/skill --state-root ./state
→ This actually applies changes! Use improvement-learner's self_improve.py instead.
</anti-example>

Error Handling

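  • Each subprocess has a 1200 s timeout; a timeout raises RuntimeError
  • An evaluator failure does not abort the pipeline (a warning is printed and the run continues), but evaluation_failure_trace is injected into the next round
  • When the gate returns revert, extract_failure_trace() is called automatically and writes traces/trace-{run_id}.json
  • The final pipeline-summary.json is written to {state-root}/pipeline-summary.json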
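
The timeout behavior in the first bullet can be illustrated with a small wrapper around `subprocess.run`. This is a sketch under assumptions: `run_stage` is a hypothetical helper, and the real orchestrator may handle exit codes and output differently.

```python
import subprocess
import sys
from typing import Sequence

STAGE_TIMEOUT_SECONDS = 1200  # per-subprocess limit stated above


def run_stage(cmd: Sequence[str], timeout: float = STAGE_TIMEOUT_SECONDS) -> str:
    """Run one stage script and convert a timeout into RuntimeError."""
    try:
        completed = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"stage timed out after {timeout}s: {list(cmd)}") from exc
    return completed.stdout


# Usage: a trivial stand-in command instead of a real stage script.
out = run_stage([sys.executable, "-c", "print('ok')"])
print(out.strip())  # ok
```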

Output

The final output is pipeline-summary.json:

{"target": "/path/to/skill", "attempts": 2, "max_retries": 3,
 "final_decision": "keep", "final_candidate_id": "cand-01-docs",
 "final_artifact_path": "/state/receipts/gate-run001-cand-01.json"}

final_decision values: keep | revert | reject | pending_promote | no_candidates | no_accepted_candidates
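
A caller such as a CI job might read the summary and branch on `final_decision`. The sketch below assumes the file layout shown above; `read_final_decision` is a hypothetical helper, and the usage lines write a sample summary to a temporary directory rather than reading a real run.

```python
import json
import tempfile
from pathlib import Path

# The six values listed for final_decision.
KNOWN_DECISIONS = {
    "keep", "revert", "reject", "pending_promote",
    "no_candidates", "no_accepted_candidates",
}


def read_final_decision(state_root: Path) -> str:
    """Load {state-root}/pipeline-summary.json and return final_decision."""
    summary_path = Path(state_root) / "pipeline-summary.json"
    summary = json.loads(summary_path.read_text(encoding="utf-8"))
    decision = summary["final_decision"]
    if decision not in KNOWN_DECISIONS:
        raise ValueError(f"unexpected final_decision: {decision!r}")
    return decision


# Usage with the sample summary shown above.
with tempfile.TemporaryDirectory() as tmp:
    sample = {
        "target": "/path/to/skill", "attempts": 2, "max_retries": 3,
        "final_decision": "keep", "final_candidate_id": "cand-01-docs",
        "final_artifact_path": "/state/receipts/gate-run001-cand-01.json",
    }
    (Path(tmp) / "pipeline-summary.json").write_text(json.dumps(sample), encoding="utf-8")
    decision = read_final_decision(Path(tmp))

print(decision)  # keep
```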

Related Skills

  • improvement-generator: Produces candidate proposals (stage 1) — orchestrator calls propose.py
  • improvement-discriminator: Multi-reviewer panel scoring (stage 2) — orchestrator calls score.py
  • improvement-evaluator: Task suite execution validation (stage 3) — called only when --task-suite provided; baseline failures injected as --source
  • improvement-executor: Applies changes with backup/rollback (stage 4) — orchestrator calls execute.py
  • improvement-gate: 7-layer quality gate (stage 5) — receives --evaluation artifact when evaluator ran
Security recommendations
This skill is an on‑repo pipeline orchestrator: it will run local scripts (propose/score/evaluate/execute/gate), create state artifacts and backups, and may apply changes to files under the --target you provide. Before running: 1) Inspect the scripts it invokes in your repository (improvement-generator/discriminator/evaluator/executor/gate) to ensure they do only what you expect; 2) Run first with a disposable --state-root and use --eval-mock (avoid real LLM CLI calls) to observe behavior; 3) Backup your target skill or point --target at a copy if you are not ready for automatic modifications; 4) Be aware that the evaluator or other invoked scripts may require separate API keys or env vars — the orchestrator itself does not request credentials. If you want to be extra cautious, run the included tests and review the executor's logic to confirm it only performs allowed actions (append_markdown_section, create_file) for low‑risk categories.
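
Steps 2 and 3 above (disposable state root, mock evaluation, working on a copy) can be combined in a small helper. This is a sketch only: `prepare_trial_run` is a hypothetical name, it handles a directory target but not a single-file target, and it builds the command without executing it.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def prepare_trial_run(target: Path) -> List[str]:
    """Copy the skill into a throwaway workspace and build a mock-eval command."""
    workdir = Path(tempfile.mkdtemp(prefix="orchestrator-trial-"))
    target_copy = workdir / "target"
    shutil.copytree(target, target_copy)  # the original skill stays untouched
    state_root = workdir / "state"
    state_root.mkdir()
    return [
        "python3", "scripts/orchestrate.py",
        "--target", str(target_copy),
        "--state-root", str(state_root),
        "--eval-mock",
    ]


# Usage with a throwaway skill directory.
skill_dir = Path(tempfile.mkdtemp()) / "my-skill"
skill_dir.mkdir()
(skill_dir / "SKILL.md").write_text("# My skill\n", encoding="utf-8")
cmd = prepare_trial_run(skill_dir)
print(" ".join(cmd))
```

Passing the resulting list to `subprocess.run` from the repository root would start the trial; inspect the copied target afterwards to see what the executor changed.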
Functional analysis
Type: OpenClaw Skill · Name: auto-improvement-orchestrator · Version: 1.0.0
The bundle is a legitimate orchestration tool designed to automate the improvement lifecycle of OpenClaw skills. The core logic in `scripts/orchestrate.py` coordinates a five-stage pipeline (Propose, Discriminate, Evaluate, Execute, Gate) by executing local Python scripts via `subprocess.run`. The implementation includes robust state management, automated retries (the 'Ralph Wiggum' loop), and extensive documentation in the `references/` directory detailing safety guardrails and rollbacks. No evidence of data exfiltration, malicious prompt injection, or unauthorized remote execution was found; the tool's behavior is strictly aligned with its stated purpose of skill optimization.
Capability assessment
Purpose & Capability
The name/description (orchestrating a 5‑stage improvement pipeline) matches the actual behavior: the script dispatches Generator→Discriminator→Evaluator→Executor→Gate and writes state/artifacts and backups. All declared requirements (none) are appropriate for a local orchestrator that runs other local scripts.
Instruction Scope
SKILL.md and scripts explicitly instruct the agent to run local subprocesses, read feedback sources, and apply changes to the target skill (append markdown sections, create files) with backups/rollback. This is expected for an orchestrator, but it means the skill will read/write arbitrary files under the provided --target and --state-root and can forward failure traces into subsequent runs. The orchestrator itself does not call external endpoints, but it invokes other scripts (e.g., evaluator) which may call LLM CLIs or network services — review those scripts before use.
Install Mechanism
Instruction-only (no install spec). The bundle includes orchestration code only; nothing is downloaded or extracted from external URLs. Lowest install risk.
Credentials
The skill declares no required env vars or credentials, which is coherent. Caveat: the orchestrator spawns other local scripts (generator/discriminator/evaluator/executor/gate) that are expected to live in the repo; those sub-scripts may themselves require API keys or credentials (e.g., for LLMs) even though the orchestrator doesn't declare them. Confirm what the invoked scripts expect before running with real task suites.
Persistence & Privilege
always=false and no special platform privileges. The orchestrator writes persistent artifacts and backups to the user-supplied --state-root (normal for its purpose). It does not modify other skills' configuration beyond running the standard executor workflow for the provided --target, but it will apply changes to the target path (intended behavior).
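How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install auto-improvement-orchestrator
  3. Once installed, invoke the skill by name or trigger it with /auto-improvement-orchestrator
  4. Provide the inputs described in the skill's parameter reference to get structured output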
Version history
v1.0.0
Initial release: closed-loop skill improvement pipeline
Metadata
Slug: auto-improvement-orchestrator
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
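FAQ

What is Improvement Orchestrator?

Use it to run the whole generate → score → evaluate → execute → gate pipeline in one command, to retry automatically after a failure, or to improve several skills in a batch. Do not use it just to assess skill quality (use improvement-learner) or to score candidates by hand (use improvement-discriminator). It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 98 downloads to date.

How do I install Improvement Orchestrator?

Run the command "/install auto-improvement-orchestrator" in the OpenClaw or Claude Code chat box. It installs in one step with no extra configuration.

Is Improvement Orchestrator free?

Yes. Improvement Orchestrator is completely free under the MIT-0 license and can be downloaded, installed, and used freely.

Which platforms does Improvement Orchestrator support?

Improvement Orchestrator is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Improvement Orchestrator?

It is developed and maintained by _silhouette (@lanyasheng); the current version is v1.0.0.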
