Auto Improvement Orchestrator Skill
Author: _silhouette · GitHub ↗ · v1.0.3 · MIT-0
Total downloads: 118 · Favorites: 0 · Current installs: 0 · Versions: 4
Install in OpenClaw
/install auto-improvement-orchestrator-skill
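Description
Automated skill evaluation and improvement pipeline. 9-dimension structural scoring (including LLM-as-Judge), 4-role weighting, category modifiers (tool/knowledge/orchestration/rule), Pareto front regression protection (security 2% / efficiency 10% / others 5%), trace-aware failure...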
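The per-dimension tolerances quoted in the description (security 2%, efficiency 10%, 5% for everything else) suggest a regression gate along the following lines. This is a minimal sketch and not the skill's actual implementation: it assumes scores are floats where higher is better and that a tolerance is a relative drop against the baseline. The function name `regressions` and the dimension name `clarity` are hypothetical.

```python
# Tolerances are taken from the listing; the data model is an assumption.
TOLERANCES = {"security": 0.02, "efficiency": 0.10}
DEFAULT_TOLERANCE = 0.05


def regressions(baseline: dict[str, float], candidate: dict[str, float]) -> list[str]:
    """Return the dimensions where the candidate drops more than its tolerance allows."""
    failed = []
    for dim, old in baseline.items():
        new = candidate.get(dim, 0.0)  # a missing dimension counts as a score of 0
        allowed = TOLERANCES.get(dim, DEFAULT_TOLERANCE)
        # Assumption: the tolerance is a relative drop against the baseline score.
        if new < old * (1.0 - allowed):
            failed.append(dim)
    return failed


baseline = {"security": 0.90, "efficiency": 0.80, "clarity": 0.70}
candidate = {"security": 0.87, "efficiency": 0.75, "clarity": 0.72}
print(regressions(baseline, candidate))
```

Here the security score falls by more than 2% and is reported, while the efficiency drop stays within its looser 10% budget.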
Usage (SKILL.md)
Auto-Improvement Orchestrator
A complete pipeline from evaluation through improvement to verification, so that skills get better automatically.
When to Use
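- Evaluate a skill's quality (9-dimension scoring + 4-role review)
- Automatically improve SKILL.md (generate candidates → score → execute → gate)
- Batch-improve multiple skills (autoloop runs continuously)
- Extract user feedback signals from Claude Code session logs
- Compare a skill's Pareto front before and after improvement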
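The 4-role review and the category modifiers can be pictured as a weighted average followed by a per-category multiplier. The sketch below only illustrates that idea: the listing does not name the four roles or give any weights or modifier values, so the role names, every number, and the function `panel_score` are placeholders.

```python
# Placeholder weights: the source does not name the roles or their weights.
ROLE_WEIGHTS = {"role_a": 0.4, "role_b": 0.3, "role_c": 0.2, "role_d": 0.1}

# The category names come from the listing; the modifier values are invented.
CATEGORY_MODIFIER = {"tool": 1.0, "knowledge": 0.95, "orchestration": 1.05, "rule": 0.9}


def panel_score(role_scores: dict[str, float], category: str) -> float:
    """Weighted mean of the role scores, scaled by the category modifier."""
    total_weight = sum(ROLE_WEIGHTS[role] for role in role_scores)
    weighted = sum(ROLE_WEIGHTS[role] * score for role, score in role_scores.items())
    return (weighted / total_weight) * CATEGORY_MODIFIER[category]


scores = {"role_a": 0.8, "role_b": 0.6, "role_c": 0.7, "role_d": 0.9}
print(round(panel_score(scores, "tool"), 3))
```

Normalizing by the total weight keeps the result meaningful when only some roles return a score.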
When NOT to Use
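- Manually editing a single SKILL.md → edit the file directly
- Agent execution reliability → use execution-harness (separate repository)
- Pure documentation generation → use doc-gen
- Prompt optimization (token level) → use DSPy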
Quick Start
# Score a skill (single iteration)
python3 skills/improvement-learner/scripts/self_improve.py \
    --skill-path /your/skill --max-iterations 1
# Run 5 rounds of automatic improvement
python3 skills/improvement-learner/scripts/self_improve.py \
    --skill-path /your/skill --max-iterations 5
# Extract feedback from session logs
python3 skills/session-feedback-analyzer/scripts/analyze.py \
    --output feedback.jsonl
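When the scoring command is driven from another script, it helps to build the argument list explicitly and print it before running anything. The sketch below uses only the script path and the two flags shown in the Quick Start; the helper name `build_command` is hypothetical, and the actual run is left commented out.

```python
import shlex
import sys

# Script path and flags are copied from the Quick Start above.
SCRIPT = "skills/improvement-learner/scripts/self_improve.py"


def build_command(skill_path: str, max_iterations: int = 1) -> list[str]:
    """Build the argv list for self_improve.py without running it."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    return [
        sys.executable, SCRIPT,
        "--skill-path", skill_path,
        "--max-iterations", str(max_iterations),
    ]


cmd = build_command("/your/skill", max_iterations=5)
print(shlex.join(cmd))  # inspect the exact command first
# To run it for real: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems when the skill path contains spaces.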
Architecture
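The 11 pipeline skills are organized in three layers:
- Evaluation layer: learner (9-dimension structural scoring), evaluator (execution tests), session-feedback (user feedback)
- Improvement layer: generator → discriminator → evaluator → executor → gate (6-layer gating)
- Control layer: autoloop (continuous runs), benchmark-store (Pareto front), execution-harness (separate repository)
Helper tools: skill-forge (creates skills), skill-distill (merges skills)
Verification targets: prompt-hardening, deslop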
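The improvement layer's generator → discriminator → evaluator → executor → gate chain can be sketched as a sequence of plain callables. This illustrates the control flow only, under the assumption that a rejected candidate leaves the original text untouched; the function `run_pipeline` and the toy stage implementations are hypothetical, and the real stages call LLM backends and write files.

```python
from typing import Callable


def run_pipeline(
    original: str,
    generate: Callable[[str], list[str]],
    discriminate: Callable[[str], float],
    evaluate: Callable[[str], bool],
    gate: Callable[[str, str], bool],
) -> str:
    """Return the improved text, or the original if any stage rejects the change."""
    candidates = generate(original)            # generator: propose edits
    if not candidates:
        return original
    best = max(candidates, key=discriminate)   # discriminator: rank the candidates
    if not evaluate(best):                     # evaluator: execution tests
        return original
    applied = best                             # executor: in-memory stand-in for a file write
    return applied if gate(original, applied) else original  # gate: keep or roll back


result = run_pipeline(
    "Formats code.",
    generate=lambda text: [text + " Use when X.", text.upper()],
    discriminate=len,
    evaluate=lambda text: "Use when" in text,
    gate=lambda old, new: len(new) >= len(old),
)
print(result)
```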
Related
- execution-harness: agent execution reliability (38 patterns × 6 axes)
Security recommendations
What to consider before installing/running this skill:
- Missing declarations: The registry says 'no required env vars / config paths', but the code and docs expect LLM credentials (e.g., ANTHROPIC_API_KEY or a 'claude -p' CLI) and access to your local Claude session logs (~/.claude/projects). Ask the publisher to document required secrets and file access clearly before use.
- Sensitive reads: The session-feedback analyzer reads ~/.claude/projects JSONL files. Those logs may contain private prompts, code, or data. Do not run this skill on a machine with sensitive Claude sessions unless you audit exactly what it reads and stores.
- File modifications: The executor can apply edits to SKILL.md files and will create backups/receipts. Review the execute/rollback code and test in a sandbox or a copy of your skills directory to confirm behavior and backup integrity before running on real skill files.
- LLM/network calls: The evaluator and discriminator call LLM backends (the claude CLI or the Anthropic API). Evaluate the privacy and cost implications, and consider using mock backends or offline testing first.
- Prompt-injection indicator: The SKILL.md contains unicode control characters flagged by a static check. Inspect the SKILL.md for invisible characters and remove or justify them; such characters can cause unexpected parsing when documents are used as prompts.
- Least privilege: Provide only the minimum credentials and limit filesystem scope (run in a container or on a copy) until you validate the behavior. Prefer read-only dry-run modes and use '--dry-run' where available.
- Review and test: Because the repository is large and powerful, run the pipeline on non-production examples, read the code paths that perform subprocess calls and file writes (improvement-executor, autoloop-controller, session-feedback-analyzer), and ask the maintainer for an explicit list of env vars, required binaries/CLIs, and config paths.
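For the prompt-injection item above, one way to inspect a SKILL.md for invisible characters is to scan for Unicode format (Cf) and control (Cc) code points. This is a sketch of such a check and not the static checker that produced the flag; the function name `find_invisible_chars` is hypothetical, and tabs are deliberately exempted.

```python
import unicodedata


def find_invisible_chars(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, character name) for each format or control character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Cf covers zero-width and bidi-override characters; Cc covers control codes.
            if unicodedata.category(ch) in {"Cf", "Cc"} and ch != "\t":
                hits.append((line_no, col, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return hits


sample = "Use this skill\u200b when needed\n\u202eignore previous"
for hit in find_invisible_chars(sample):
    print(hit)
```

To check a real file, read it with `Path("SKILL.md").read_text(encoding="utf-8")` and pass the result to the function.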
Functional analysis
Type: OpenClaw Skill
Name: auto-improvement-orchestrator-skill
Version: 1.0.3
The bundle is a highly sophisticated and well-engineered orchestration system designed for the automated evaluation and iterative improvement of AI agent skills. It implements a complex five-stage pipeline (Propose, Discriminate, Evaluate, Execute, and Gate) that includes advanced features such as Pareto front regression protection (lib/pareto.py), multi-role blind panel scoring (score.py), and Karpathy-style self-improvement loops (self_improve.py). While the system utilizes high-privilege capabilities—including modifying local skill files (execute.py), reading Claude session logs for implicit feedback (analyze.py), and executing AI-generated prompts via the 'claude' CLI (task_runner.py)—these behaviors are transparently documented and strictly necessary for the stated goal of skill optimization. The presence of security-conscious design patterns, such as path traversal validation in the PytestJudge (judges.py) and explicit opt-out checks for log analysis, confirms that the bundle is a legitimate developer tool rather than malicious software.
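Path traversal validation of the kind credited to the PytestJudge usually means resolving a user-supplied path and confirming it still lies under an allowed base directory. The sketch below shows that general technique only and is not the code in judges.py; the function `resolve_inside` is hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (base / user_path).resolve()
    if not target.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path}")
    return target


with tempfile.TemporaryDirectory() as workdir:
    print(resolve_inside(workdir, "tests/test_a.py").name)
    try:
        resolve_inside(workdir, "../outside.py")
    except ValueError as exc:
        print("rejected:", exc)
```

An absolute `user_path` replaces the base when joined, so it is rejected by the same check unless it already points inside the base directory.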
Capability assessment
Purpose & Capability
The skill claims no required env vars or config paths in registry metadata, yet the code and README/SKILL.md clearly expect access to LLM credentials (e.g., ANTHROPIC_API_KEY / claude CLI), local Claude session logs (~/.claude/projects/*.jsonl), and the ability to apply edits to SKILL.md files (executor/rollback). Those capabilities are coherent with an auto-improvement orchestrator in principle, but the metadata omission is a mismatch that could lead to unexpected access requests at runtime.
Instruction Scope
SKILL.md and README instruct the agent to parse ~/.claude/projects JSONL, extract user feedback, run evaluation loops that call 'claude -p' or use ANTHROPIC API, and to apply changes to skills (with backup/rollback). That means the runtime will read user session files, run external LLM calls (network), and write/modify skill files — all beyond a minimal 'lint-only' scope. The SKILL.md is also relatively terse compared to the complex scripts (many CLIs/flags exist in code but are undocumented), increasing risk of unintended actions.
Install Mechanism
There is no formal install spec (instruction-only at registry level), which is low-install risk. However the bundle includes a large Python codebase and README suggests pip-installing pyyaml/pytest. The absence of an explicit install script or declared dependency list in the registry metadata is a documentation gap but not a direct high-risk installer (no external downloads or archives referenced).
Credentials
Although registry metadata lists no required env vars, multiple code modules and docs reference LLM credentials / CLIs (ANTHROPIC_API_KEY, usage of 'claude -p' and optional mock backends). The skill also expects read access to user session logs (~/.claude/projects) and to write backups/executions directories. Requesting LLM API keys and local session access is proportionate to an evaluator/orchestrator only if explicitly declared and justified; here that mapping is missing, which is a coherence and least-privilege concern.
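Because the registry metadata does not declare these requirements, a small preflight check can show what the environment would expose before the pipeline runs. This sketch only looks for the three things named in the review (the ANTHROPIC_API_KEY variable, a claude executable on PATH, and the ~/.claude/projects directory); it reads no file contents, and the function name `preflight` is hypothetical.

```python
import os
import shutil
from collections.abc import Mapping
from pathlib import Path
from typing import Optional


def preflight(env: Optional[Mapping[str, str]] = None) -> dict[str, bool]:
    """Report which credentials and paths mentioned in the review are present."""
    env = os.environ if env is None else env
    return {
        "ANTHROPIC_API_KEY set": bool(env.get("ANTHROPIC_API_KEY")),
        "claude CLI on PATH": shutil.which("claude") is not None,
        # Existence check only; session logs are not opened or read.
        "session logs present": (Path.home() / ".claude" / "projects").is_dir(),
    }


for check, present in preflight().items():
    print(f"{check}: {present}")
```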
Persistence & Privilege
The skill does not set always:true and is user-invocable (normal). It has code to modify SKILL.md and write backups/receipts within execution/backups — that is expected for an auto-editing pipeline. This is a meaningful privilege (ability to change local skill files), but it is scoped to its own operation rather than claiming system-wide persistent privileges. Still, because the skill can autonomously apply changes, users should treat it as powerful and run it under controlled conditions.
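The backup-then-edit behavior described above can be rehearsed on a throwaway copy before trusting it with real skill files. The sketch below shows one simple way to pair an edit with a restorable backup. It is not the executor's actual code, the helper names `apply_with_backup` and `rollback` are hypothetical, and the `.bak` naming is an assumption.

```python
import shutil
import tempfile
from pathlib import Path


def apply_with_backup(skill_file: Path, new_text: str, backup_dir: Path) -> Path:
    """Copy skill_file into backup_dir, then overwrite it; return the backup path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / (skill_file.name + ".bak")
    shutil.copy2(skill_file, backup)  # copy2 also preserves timestamps
    skill_file.write_text(new_text, encoding="utf-8")
    return backup


def rollback(skill_file: Path, backup: Path) -> None:
    """Restore skill_file from its backup copy."""
    shutil.copy2(backup, skill_file)


with tempfile.TemporaryDirectory() as sandbox:
    skill = Path(sandbox) / "SKILL.md"
    skill.write_text("original", encoding="utf-8")
    saved = apply_with_backup(skill, "edited", Path(sandbox) / "backups")
    print(skill.read_text(encoding="utf-8"))
    rollback(skill, saved)
    print(skill.read_text(encoding="utf-8"))
```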
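How to use
- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install auto-improvement-orchestrator-skill
- After installation, call the skill by name or trigger it with /auto-improvement-orchestrator-skill
- Provide the required inputs according to the skill's parameter description to get structured output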
Version history
v1.0.3
v2.0: 9-dim evaluation, category modifiers, enriched docs
v1.0.2
v2.0: 9-dim evaluation, category modifiers, per-dim Pareto tolerances, enriched docs
v1.0.1
v2.0: 9-dim evaluation, category modifiers, per-dim Pareto tolerances, enriched docs
v1.0.0
Major update to skill orchestration and self-improvement pipeline.
- Introduces structured 9-dimensional scoring and 4-role evaluation (including LLM-as-Judge).
- Adds correction factors for different categories and Pareto front regression safeguards.
- Implements trace-aware failure retries and batch/looped skill improvement.
- Now includes 11 pipeline skills, 2 helper tools, and 2 verification targets.
- Provides clear separation of use cases and integration points (see execution-harness for agent reliability).
FAQ
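What is Auto Improvement Orchestrator Skill?
An automated skill evaluation and improvement pipeline. 9-dimension structural scoring (including LLM-as-Judge), 4-role weighting, category modifiers (tool/knowledge/orchestration/rule), Pareto front regression protection (security 2% / efficiency 10% / others 5%), trace-aware failure... It is an AI agent skill plugin for Claude Code / OpenClaw, with 118 downloads to date.
How do I install Auto Improvement Orchestrator Skill?
Run "/install auto-improvement-orchestrator-skill" in the OpenClaw or Claude Code chat box to install it in one step. The registry metadata declares no required configuration, although the security review on this page notes that the code expects LLM credentials (ANTHROPIC_API_KEY or the claude CLI).
Is Auto Improvement Orchestrator Skill free?
Yes. It is released under the MIT-0 license and is free to download, install, and use.
Which platforms does Auto Improvement Orchestrator Skill support?
It is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Auto Improvement Orchestrator Skill?
It is developed and maintained by _silhouette (@lanyasheng); the current version is v1.0.3.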