lanyasheng

Auto Improvement Orchestrator Skill

Author: _silhouette · GitHub · v1.0.3 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 118
Favorites: 0
Current installs: 0
Versions: 4
Install in OpenClaw
/install auto-improvement-orchestrator-skill
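Description
A pipeline for automatically evaluating and improving skills: 9-dimension structural scoring (including LLM-as-Judge), 4-role weighting, category modifiers (tool/knowledge/orchestration/rule), Pareto front regression protection (security 2% / efficiency 10% / others 5%), trace-aware failure...
Usage (SKILL.md)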

Auto-Improvement Orchestrator

A complete pipeline from evaluation through improvement to verification, so that a Skill improves automatically.

When to Use

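  • Evaluate a skill's quality (9-dimension scoring + 4-role review)
  • Automatically improve a SKILL.md (generate candidates → score → execute → gate)
  • Improve multiple skills in batch (autoloop runs continuously)
  • Extract user feedback signals from Claude Code session logs
  • Compare a skill's Pareto front before and after improvement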
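The before/after comparison in the last item could be approximated with a per-dimension regression check. The sketch below uses the tolerances quoted in the skill description (security 2%, efficiency 10%, others 5%). Treating them as relative drops against the baseline is an assumption, and the function name `regressions` and the `clarity` dimension are hypothetical. It is a simplified guard, not a full Pareto front computation and not the skill's own code.

```python
from typing import Dict, List

# Tolerances quoted in the skill description. Interpreting them as
# *relative* allowed drops is an assumption of this sketch.
TOLERANCE = {"security": 0.02, "efficiency": 0.10}
DEFAULT_TOLERANCE = 0.05


def regressions(baseline: Dict[str, float], candidate: Dict[str, float]) -> List[str]:
    """Return the dimensions where `candidate` drops below the allowed floor."""
    failed = []
    for dim, base in baseline.items():
        tol = TOLERANCE.get(dim, DEFAULT_TOLERANCE)
        floor = base * (1.0 - tol)          # lowest acceptable score
        if candidate.get(dim, 0.0) < floor:  # a missing dimension counts as 0
            failed.append(dim)
    return failed


baseline = {"security": 0.90, "efficiency": 0.80, "clarity": 0.70}
candidate = {"security": 0.87, "efficiency": 0.75, "clarity": 0.68}
print(regressions(baseline, candidate))  # security fell by more than 2%
```

Under these assumptions a candidate would be accepted only when the returned list is empty.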

When NOT to Use

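  • Manually editing a single SKILL.md → edit the file directly
  • Agent execution reliability → use execution-harness (separate repository)
  • Pure documentation generation → use doc-gen
  • Prompt optimization (token level) → use DSPy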

Quick Start

# Score the skill
python3 skills/improvement-learner/scripts/self_improve.py \
  --skill-path /your/skill --max-iterations 1

# Run 5 rounds of automatic improvement
python3 skills/improvement-learner/scripts/self_improve.py \
  --skill-path /your/skill --max-iterations 5

# Extract feedback from session logs
python3 skills/session-feedback-analyzer/scripts/analyze.py \
  --output feedback.jsonl

Architecture

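The 11 pipeline skills are organized into three layers:

  • Evaluation layer: learner (9-dimension structural scoring), evaluator (execution tests), session-feedback (user feedback)
  • Improvement layer: generator → discriminator → evaluator → executor → gate (6-layer gating)
  • Control layer: autoloop (continuous runs), benchmark-store (Pareto front), execution-harness (separate repository)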
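The improvement layer can be read as one round of propose, score, test, apply, and gate. The following is a minimal in-memory sketch of that control flow, not the skill's actual implementation: `run_round`, `Candidate`, and the four callables are hypothetical names, and the real gate is described as having 6 layers whose individual checks are not documented here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    text: str           # proposed SKILL.md content
    score: float = 0.0  # assigned by the discriminator


def run_round(
    current: str,
    generate: Callable[[str], List[str]],
    discriminate: Callable[[str], float],
    evaluate: Callable[[str], bool],
    gate: Callable[[str, str], bool],
) -> str:
    """Run one improvement round; return the accepted text or `current`."""
    # generator + discriminator: propose candidates and score each one
    candidates = [Candidate(t, discriminate(t)) for t in generate(current)]
    # Try the highest-scoring candidates first.
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if not evaluate(cand.text):   # evaluator: execution tests failed
            continue
        applied = cand.text           # executor: tentatively apply the edit
        if gate(current, applied):    # gate: final accept/reject decision
            return applied
        # Gate rejected the edit: roll back by discarding `applied`.
    return current
```

Passing the stages in as callables keeps the loop testable with stubs before any LLM backend or file write is involved.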

Helper tools: skill-forge (builds skills), skill-distill (merges skills)
Verification targets: prompt-hardening, deslop

Related

Security Recommendations
What to consider before installing or running this skill:
  • Missing declarations: The registry says 'no required env vars / config paths', but the code and docs expect LLM credentials (e.g., ANTHROPIC_API_KEY or a 'claude -p' CLI) and access to your local Claude session logs (~/.claude/projects). Ask the publisher to document required secrets and file access clearly before use.
  • Sensitive reads: The session-feedback analyzer reads ~/.claude/projects JSONL files. Those logs may contain private prompts, code, or data. Do not run this skill on a machine with sensitive Claude sessions unless you audit exactly what it reads and stores.
  • File modifications: The executor can apply edits to SKILL.md files and will create backups/receipts. Review the execute/rollback code and test in a sandbox or a copy of your skills directory to confirm behavior and backup integrity before running on real skill files.
  • LLM/network calls: The evaluator/discriminator call LLM backends (claude CLI or Anthropic API). Evaluate privacy/cost implications and consider using mock backends or offline testing first.
  • Prompt-injection indicator: The SKILL.md contains unicode control characters flagged by a static check. Inspect the SKILL.md for invisible characters and remove or justify them; such characters can cause unexpected parsing when documents are used as prompts.
  • Least privilege: Provide only the minimum credentials and limit filesystem scope (run in a container or on a copy) until you validate the behavior. Prefer read-only dry-run modes and use '--dry-run' where available.
  • Review and test: Because the repository is large and powerful, run the pipeline on non-production examples, read the code paths that perform subprocess calls and file writes (improvement-executor, autoloop-controller, session-feedback-analyzer), and ask the maintainer for an explicit list of env vars, required binaries/CLIs, and config paths.
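The advice to inspect SKILL.md for invisible characters can be followed with a short scan. This is a minimal sketch, assuming that "unicode control characters" means code points in the Unicode categories Cc (control) and Cf (format); which characters the static check actually flagged is not stated. The function name `find_invisible` is hypothetical.

```python
import unicodedata
from typing import List, Tuple

# Ordinary whitespace controls that are expected in a text file.
ALLOWED = {"\r", "\t"}


def find_invisible(text: str) -> List[Tuple[int, int, str]]:
    """Return (line, column, code point) for each control/format character."""
    hits = []
    # Split on "\n" only: str.splitlines() would silently consume some of
    # the very control characters this scan is meant to report.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in ALLOWED:
                continue
            if unicodedata.category(ch) in ("Cc", "Cf"):
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


sample = "ok\nhid\u200bden\u202e\n"
print(find_invisible(sample))
```

To check a real file, pass in its contents, for example `find_invisible(Path("SKILL.md").read_text(encoding="utf-8"))` after importing `Path` from `pathlib`.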
Functional Analysis
Type: OpenClaw Skill
Name: auto-improvement-orchestrator-skill
Version: 1.0.3
The bundle is a highly sophisticated and well-engineered orchestration system designed for the automated evaluation and iterative improvement of AI agent skills. It implements a complex five-stage pipeline (Propose, Discriminate, Evaluate, Execute, and Gate) that includes advanced features such as Pareto front regression protection (lib/pareto.py), multi-role blind panel scoring (score.py), and Karpathy-style self-improvement loops (self_improve.py). While the system uses high-privilege capabilities, including modifying local skill files (execute.py), reading Claude session logs for implicit feedback (analyze.py), and executing AI-generated prompts via the 'claude' CLI (task_runner.py), these behaviors are transparently documented and strictly necessary for the stated goal of skill optimization. The presence of security-conscious design patterns, such as path traversal validation in the PytestJudge (judges.py) and explicit opt-out checks for log analysis, confirms that the bundle is a legitimate developer tool rather than malicious software.
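Path traversal validation of the kind mentioned above usually means resolving a user-supplied path and confirming it stays inside an allowed base directory. The sketch below shows that general technique with `pathlib`; it is not taken from judges.py, and the function name `is_within` is hypothetical.

```python
from pathlib import Path


def is_within(base: Path, target: str) -> bool:
    """Return True if `target`, joined onto `base`, stays inside `base`."""
    base_resolved = base.resolve()
    # resolve() collapses ".." segments and follows symlinks; joining an
    # absolute `target` replaces the base entirely, which is also rejected.
    candidate = (base_resolved / target).resolve()
    return candidate == base_resolved or base_resolved in candidate.parents
```

A caller would reject any test path for which `is_within(skill_dir, user_path)` is false before handing it to a subprocess.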
Capability Tags
crypto · can-make-purchases · requires-oauth-token
Capability Assessment
Purpose & Capability
The skill claims no required env vars or config paths in registry metadata, yet the code and README/SKILL.md clearly expect access to LLM credentials (e.g., ANTHROPIC_API_KEY / claude CLI), local Claude session logs (~/.claude/projects/*.jsonl), and the ability to apply edits to SKILL.md files (executor/rollback). Those capabilities are coherent with an auto-improvement orchestrator in principle, but the metadata omission is a mismatch that could lead to unexpected access requests at runtime.
Instruction Scope
SKILL.md and README instruct the agent to parse ~/.claude/projects JSONL, extract user feedback, run evaluation loops that call 'claude -p' or use ANTHROPIC API, and to apply changes to skills (with backup/rollback). That means the runtime will read user session files, run external LLM calls (network), and write/modify skill files — all beyond a minimal 'lint-only' scope. The SKILL.md is also relatively terse compared to the complex scripts (many CLIs/flags exist in code but are undocumented), increasing risk of unintended actions.
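Before letting any tool parse session logs, it can help to list what is actually there. The following is a small audit sketch that only enumerates `.jsonl` files and their sizes without reading their contents. The function name `list_session_logs` is hypothetical, and the record schema of the log files is deliberately not assumed.

```python
from pathlib import Path
from typing import List, Tuple


def list_session_logs(root: Path) -> List[Tuple[str, int]]:
    """Return (relative path, size in bytes) for every .jsonl file under root."""
    if not root.is_dir():
        return []
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*.jsonl")
        if p.is_file()
    )


if __name__ == "__main__":
    # Path named in the assessment above; adjust if your logs live elsewhere.
    for rel, size in list_session_logs(Path.home() / ".claude" / "projects"):
        print(f"{size:>10}  {rel}")
```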
Install Mechanism
There is no formal install spec (instruction-only at registry level), which means low install risk. However, the bundle includes a large Python codebase, and the README suggests pip-installing pyyaml and pytest. The absence of an explicit install script or declared dependency list in the registry metadata is a documentation gap but not a direct high-risk installer (no external downloads or archives are referenced).
Credentials
Although registry metadata lists no required env vars, multiple code modules and docs reference LLM credentials / CLIs (ANTHROPIC_API_KEY, usage of 'claude -p' and optional mock backends). The skill also expects read access to user session logs (~/.claude/projects) and to write backups/executions directories. Requesting LLM API keys and local session access is proportionate to an evaluator/orchestrator only if explicitly declared and justified; here that mapping is missing, which is a coherence and least-privilege concern.
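Because the registry metadata does not declare these requirements, a quick preflight check can show what the environment would expose before anything is run. This is a sketch under the assumption that the two items named above (the ANTHROPIC_API_KEY variable and a `claude` executable) are the relevant ones; which of them the scripts actually require is not documented, and the function name `preflight` is hypothetical.

```python
import os
import shutil
from typing import Dict, Mapping, Optional


def preflight(env: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Report which LLM access paths are visible in the given environment."""
    env = os.environ if env is None else env
    return {
        # Only checks presence; the key's value is never printed or returned.
        "ANTHROPIC_API_KEY set": bool(env.get("ANTHROPIC_API_KEY")),
        "claude CLI on PATH": shutil.which("claude", path=env.get("PATH")) is not None,
    }


if __name__ == "__main__":
    for name, present in preflight().items():
        print(f"{name}: {'yes' if present else 'no'}")
```

Running it inside the container or sandbox intended for the skill shows whether credentials would leak into that environment unintentionally.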
Persistence & Privilege
The skill does not set always:true and is user-invocable (normal). It has code to modify SKILL.md and write backups/receipts within execution/backups — that is expected for an auto-editing pipeline. This is a meaningful privilege (ability to change local skill files), but it is scoped to its own operation rather than claiming system-wide persistent privileges. Still, because the skill can autonomously apply changes, users should treat it as powerful and run it under controlled conditions.
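One way to run the skill under controlled conditions is to work on a copy of the skills directory and compare file hashes before and after a run. The sketch below assumes nothing about the skill's own backup or receipt format; `snapshot` and `changed` are hypothetical helper names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A typical use would be to copy the directory with `shutil.copytree`, take a snapshot, run the pipeline against the copy, take a second snapshot, and review the output of `changed` before touching the real files.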
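How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install auto-improvement-orchestrator-skill
  3. Once installed, invoke the Skill by name or trigger it with /auto-improvement-orchestrator-skill
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output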
Version History
v1.0.3
v2.0: 9-dim evaluation, category modifiers, enriched docs
v1.0.2
v2.0: 9-dim evaluation, category modifiers, per-dim Pareto tolerances, enriched docs
v1.0.1
v2.0: 9-dim evaluation, category modifiers, per-dim Pareto tolerances, enriched docs
v1.0.0
Major update to skill orchestration and self-improvement pipeline.
  • Introduces structured 9-dimensional scoring and 4-role evaluation (including LLM-as-Judge).
  • Adds correction factors for different categories and Pareto front regression safeguards.
  • Implements trace-aware failure retries and batch/looped skill improvement.
  • Now includes 11 pipeline skills, 2 helper tools, and 2 verification targets.
  • Provides clear separation of use cases and integration points (see execution-harness for agent reliability).
Metadata
Slug: auto-improvement-orchestrator-skill
Version: 1.0.3
License: MIT-0
Total installs: 1
Current installs: 0
Versions: 4
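FAQ

What is Auto Improvement Orchestrator Skill?

A pipeline for automatically evaluating and improving skills: 9-dimension structural scoring (including LLM-as-Judge), 4-role weighting, category modifiers (tool/knowledge/orchestration/rule), Pareto front regression protection (security 2% / efficiency 10% / others 5%), trace-aware failure... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 118 total downloads to date.

How do I install Auto Improvement Orchestrator Skill?

Run the command "/install auto-improvement-orchestrator-skill" in the OpenClaw or Claude Code chat box to install it in one step. The registry metadata lists no additional configuration, though the capability assessment above notes that the code expects LLM credentials such as ANTHROPIC_API_KEY or the claude CLI.

Is Auto Improvement Orchestrator Skill free?

Yes. Auto Improvement Orchestrator Skill is completely free and released under the MIT-0 license; you may download, install, and use it freely.

Which platforms does Auto Improvement Orchestrator Skill support?

Auto Improvement Orchestrator Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Auto Improvement Orchestrator Skill?

It is developed and maintained by _silhouette (@lanyasheng); the current version is v1.0.3.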
