lanyasheng

Improvement Learner

Author: _silhouette · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 84
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install auto-improvement-learner
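Description
Use this skill when you need to check a skill's quality scores, automatically optimize the structure of SKILL.md, track how evaluation scores change over time, or find out where points were deducted after a low score. It provides a 6-dimension structural evaluation, three-layer HOT/WARM/COLD memory, and a Pareto front. Not for semantic scoring of candidates (use improvement-discriminator) or full-pipeline orchestration (use impr...
Usage (SKILL.md)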

Improvement Learner

Real Karpathy self-improvement loop: evaluate → modify → re-evaluate → keep/revert → repeat.
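
The loop named above can be sketched in a few lines. This is a minimal illustration, not the code in `scripts/self_improve.py`: the function name `improvement_loop` and the three callbacks (`evaluate`, `modify`, `accept`) are hypothetical stand-ins for whatever the real script does at each step.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]


def improvement_loop(
    content: str,
    evaluate: Callable[[str], Scores],
    modify: Callable[[str, Scores], str],
    accept: Callable[[Scores, Scores], bool],
    max_iterations: int,
) -> Tuple[str, Scores, List[str]]:
    """Run evaluate -> modify -> re-evaluate -> keep/revert for a fixed number of rounds.

    All three callbacks are hypothetical; the real skill may structure this differently.
    """
    best_scores = evaluate(content)
    decisions: List[str] = []
    for _ in range(max_iterations):
        candidate = modify(content, best_scores)
        candidate_scores = evaluate(candidate)
        if accept(best_scores, candidate_scores):
            # Keep: the candidate becomes the new baseline.
            content, best_scores = candidate, candidate_scores
            decisions.append("kept")
        else:
            # Revert: discard the candidate and continue from the old baseline.
            decisions.append("reverted")
    return content, best_scores, decisions
```

Passing the acceptance rule in as a callback keeps the loop independent of the scoring policy, so the same skeleton works with a strict no-regression rule or a looser one.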

When to Use

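  • View a skill's quality scores across the 6 dimensions
  • Run the automatic improvement loop (protected by a Pareto front: no dimension may regress)
  • Track the history of a skill's evaluation scores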
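
The "no dimension may regress" rule in the list above can be illustrated with a simple comparison of two score dictionaries. This is a sketch of a no-regression check only, not a full Pareto front and not the contents of `lib.pareto`; the function names are hypothetical, and treating an unchanged score as acceptable is an assumption.

```python
from typing import Dict, List


def regressed_dimensions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> List[str]:
    """Return the dimensions where the candidate scores lower than the baseline.

    A dimension missing from the candidate is treated as 0.0 (an assumption).
    """
    return sorted(
        dim
        for dim, old in baseline.items()
        if candidate.get(dim, 0.0) < old - tolerance
    )


def should_keep(baseline: Dict[str, float], candidate: Dict[str, float]) -> bool:
    """Keep a candidate only if no dimension got worse (ties are accepted here)."""
    return not regressed_dimensions(baseline, candidate)
```

For example, a candidate that raises accuracy from 0.83 to 0.9 but lowers coverage from 1.0 to 0.95 would be rejected under this rule, because coverage regressed.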

When NOT to Use

  • Semantic scoring of improvement candidates → use improvement-discriminator
  • Running the full pipeline (generate → score → gate → execute) → use improvement-orchestrator
  • Editing a single file only → use improvement-executor

6 Evaluation Dimensions

| Dimension | Checks | Pure-text default |
|---|---|---|
| accuracy | 15 items: frontmatter(3), symptom-driven desc, When to Use/Not, code examples, Usage, few-shot, no vague language, min length, Related Skills, Output Artifacts, atomicity | |
| coverage | SKILL.md = 60% base + scripts/references/tests/README bonuses | |
| reliability | pytest pass=1.0, fail=0.5 | 1.0 (pure-text) |
| efficiency | Line count: ≤200=1.0, ≥1200=0.3 | |
| security | No api_key/password/sk- in SKILL.md, no os.system()/exec() | |
| trigger_quality | Description length, triggers field, disambiguation | |
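
Two of these dimensions are mechanical enough to sketch. The code below is illustrative only and is not taken from the skill's scripts. The two efficiency endpoints (≤200 lines scores 1.0, ≥1200 lines scores 0.3) come from the table; the linear interpolation between them is an assumption, as are the exact regular expressions used for the security scan.

```python
import re
from typing import List


def efficiency_score(line_count: int) -> float:
    """Score by line count: 1.0 at <=200 lines, 0.3 at >=1200 lines.

    The linear ramp between the two endpoints is an assumption.
    """
    if line_count <= 200:
        return 1.0
    if line_count >= 1200:
        return 0.3
    fraction = (line_count - 200) / (1200 - 200)
    return 1.0 - fraction * (1.0 - 0.3)


# Hypothetical patterns approximating the checks named in the table.
RISKY_PATTERNS = {
    "api_key": re.compile(r"api_key", re.IGNORECASE),
    "password": re.compile(r"password", re.IGNORECASE),
    "sk- token": re.compile(r"\bsk-[A-Za-z0-9]+"),
    "os.system()": re.compile(r"\bos\.system\s*\("),
    "exec()": re.compile(r"\bexec\s*\("),
}


def security_findings(text: str) -> List[str]:
    """Return the names of the risky patterns found in the text, in table order."""
    return [name for name, pattern in RISKY_PATTERNS.items() if pattern.search(text)]
```

A plain substring scan like this will flag documentation that merely mentions the word "password", so a real checker would likely need an allowlist or context rules.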

Three-Layer Memory

| Layer | Capacity | Behavior |
|---|---|---|
| HOT | ≤100 | Always loaded, frequently accessed patterns |
| WARM | Unlimited | Overflow from HOT, loaded on demand |
| COLD | Archive | >3 months inactive (future) |
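
The HOT/WARM relationship can be modelled as a bounded map that spills into an unbounded one. The class below is a sketch under stated assumptions: the name `TieredMemory` is hypothetical, demoting the least recently used entry is a guess at what "overflow" means, and the COLD layer is left out because the table marks it as future work.

```python
from collections import OrderedDict
from typing import Any, Dict, Optional


class TieredMemory:
    """Two-tier store: a bounded HOT layer that overflows into WARM."""

    def __init__(self, hot_capacity: int = 100) -> None:
        self.hot_capacity = hot_capacity
        self.hot: "OrderedDict[str, Any]" = OrderedDict()  # most recently used last
        self.warm: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        """Insert into HOT; demote the least recently used entries if HOT is full."""
        self.warm.pop(key, None)
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.warm[old_key] = old_value

    def get(self, key: str) -> Optional[Any]:
        """Read a value; a WARM hit is promoted back into HOT (an assumption)."""
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        if key in self.warm:
            value = self.warm.pop(key)
            self.put(key, value)
            return value
        return None
```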

<example>
Correct usage: evaluate a skill's quality
$ python3 scripts/self_improve.py --skill-path /path/to/skill --max-iterations 1
→ Outputs JSON: {"final_scores": {"accuracy": 0.83, "coverage": 1.0, "reliability": 1.0, ...}}
→ An accuracy of 0.83 means SKILL.md is missing some checklist items (such as Output Artifacts or Related Skills)
</example>

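<anti-example>
Misreading: reliability=1.0 on a pure-text skill does not mean the skill is high quality
→ A pure-text skill has no scripts/, so reliability defaults to 1.0 (no code, so nothing to test)
→ The dimensions that carry real signal are accuracy and trigger_quality
</anti-example>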
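
One way to avoid the misreading above when post-processing scores is to drop `reliability` from the ranking whenever the skill has no scripts. The helper below is a hypothetical sketch; `weakest_dimensions` and its `has_scripts` flag are not part of the skill.

```python
from typing import Dict, List


def weakest_dimensions(
    scores: Dict[str, float],
    has_scripts: bool,
    limit: int = 2,
) -> List[str]:
    """Return the lowest-scoring dimensions, lowest first.

    For a pure-text skill (has_scripts=False), reliability is skipped because
    its default of 1.0 carries no information.
    """
    considered = {
        dim: score
        for dim, score in scores.items()
        if has_scripts or dim != "reliability"
    }
    # Sort by score, then by name so that ties are ordered deterministically.
    return sorted(considered, key=lambda dim: (considered[dim], dim))[:limit]
```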

CLI

# Evaluate (no changes, scores only)
python3 scripts/self_improve.py --skill-path /path/to/skill --max-iterations 1

# Self-improvement loop (5 iterations)
python3 scripts/self_improve.py \
  --skill-path /path/to/skill \
  --max-iterations 5 \
  --memory-dir /path/to/memory \
  --state-root /path/to/state

# Track history
python3 scripts/track_progress.py --skill-path /path/to/skill --output progress.json
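
When calling the script from other Python code, it may help to build the argument list in one place. The sketch below uses only the flags shown in the commands above. The wrapper names are hypothetical, and it assumes the script prints a single JSON document to stdout, which should be verified before relying on it.

```python
import json
import subprocess
from typing import List, Optional


def build_command(
    skill_path: str,
    max_iterations: int = 1,
    memory_dir: Optional[str] = None,
    state_root: Optional[str] = None,
) -> List[str]:
    """Assemble the self_improve.py command line from the documented flags."""
    command = [
        "python3",
        "scripts/self_improve.py",
        "--skill-path", skill_path,
        "--max-iterations", str(max_iterations),
    ]
    if memory_dir:
        command += ["--memory-dir", memory_dir]
    if state_root:
        command += ["--state-root", state_root]
    return command


def run_self_improve(skill_path: str, **options) -> dict:
    """Run the script and parse stdout as JSON (assumed output format)."""
    completed = subprocess.run(
        build_command(skill_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a path contains spaces.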

Output Artifacts

| Request | Deliverable |
|---|---|
| Evaluate | JSON with 6-dimension scores (0.0-1.0 each) |
| Self-improve | JSON: iterations, kept/reverted/skipped, final_scores, memory stats |
| Track progress | JSON with historical scores and trend data |
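
The schema of the progress file is not documented here, so the following only illustrates what "trend data" could mean: the change per dimension between the oldest and newest evaluation. The input shape (a list of score dictionaries ordered oldest to newest) and the function name `score_trend` are assumptions, not the format written by `track_progress.py`.

```python
from typing import Dict, List


def score_trend(history: List[Dict[str, float]]) -> Dict[str, float]:
    """Return newest-minus-oldest score per dimension.

    `history` is a hypothetical list of score dicts ordered oldest to newest.
    Dimensions missing from either end are skipped.
    """
    if len(history) < 2:
        return {}
    first, last = history[0], history[-1]
    return {dim: round(last[dim] - first[dim], 4) for dim in first if dim in last}
```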

Related Skills

  • improvement-discriminator: Semantic scoring (LLM judge); learner focuses on structural quality
  • improvement-orchestrator: Full pipeline; learner provides standalone quality scoring used by autoloop-controller and self-improvement loop (not a stage in the orchestrator pipeline)
  • benchmark-store: Pareto front data shared between learner and benchmark-store
Safe usage recommendations
What to consider before installing/running:
  • The skill appears to do what it says (evaluate and auto-improve SKILL.md), but the Python scripts import external modules (lib.common, lib.pareto) that are not included; running may fail or silently import code from an unexpected repo root. Verify those dependencies exist and inspect them.
  • The script will call a local 'claude' CLI via subprocess.run when available. If you have a 'claude' binary configured, skill text may be sent through that client. Treat SKILL.md and any files you point it at as potentially sent to that service. If you don't want that, run with the --mock flag or ensure 'claude' is not on PATH.
  • The tools write memory and report files to directories you specify (memory-dir, output); review and choose those paths to avoid exposing sensitive data.
  • Optional plotting requires matplotlib/numpy; tests expect a Python test runner. Run in a sandbox or isolated environment first to confirm behavior.
  • If you plan to let the agent invoke this autonomously, be aware the ability to call an external LLM client increases blast radius; consider restricting execution or reviewing the code paths that call subprocess.run.
  • To raise confidence to 'benign', provide the missing lib.* implementations or confirm they come from a trusted upstream, and validate that 'claude' usage is acceptable for your environment.
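
The dependency and CLI concerns above can be checked before a run with standard-library calls. This is a hedged preflight sketch, not part of the skill: the function names are hypothetical, and `module_available` only reports whether Python can locate a module, not whether that module is trustworthy.

```python
import importlib.util
import shutil


def claude_cli_present() -> bool:
    """Report whether an executable named 'claude' is on PATH."""
    return shutil.which("claude") is not None


def module_available(name: str) -> bool:
    """Report whether Python can locate the named module.

    Note: for a dotted name such as 'lib.common', find_spec imports the parent
    package as a side effect, so run this only where that is acceptable.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when the parent package is missing or has no usable spec.
        return False
```

A cautious workflow would call `claude_cli_present()` first and, if it returns True, decide deliberately whether skill content may be sent through that client before starting the loop.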
Functional analysis
Type: OpenClaw Skill
Name: auto-improvement-learner
Version: 1.0.0
The bundle implements an automated 'self-improvement loop' for OpenClaw skills, designed to evaluate and optimize skill quality across dimensions like accuracy, reliability, and security. The core logic in `scripts/self_improve.py` uses an LLM-as-judge (via the `claude` CLI) and structural heuristics to score skills, while incorporating defensive checks that specifically penalize hardcoded secrets and dangerous execution patterns (e.g., `os.system`). The tool includes safety mechanisms such as automated backups, a 'Pareto front' logic to prevent performance regressions, and a three-layer memory system for tracking improvement patterns. No malicious intent, data exfiltration, or unauthorized persistence mechanisms were detected.
Capability assessment
Purpose & Capability
The SKILL.md, CLI examples, and Python scripts all implement a self-improvement / evaluation loop as described. However, the scripts import lib.common and lib.pareto from a repo root that is not included in the bundle; those external libraries are required for normal operation but are not declared in the skill metadata. The code also expects a local 'claude' CLI for LLM-based judging (with a regex fallback).
Instruction Scope
Runtime instructions only ask you to run the included scripts (evaluate, self-improve, track progress). The scripts read SKILL.md and reports directories and write memory and report files. They do not instruct access to unrelated system paths or secrets, but they do call an external LLM CLI ('claude') via subprocess, which will send skill content to that client when available.
Install Mechanism
There is no install spec (instruction-only plus included scripts). No remote downloads or archive extraction are present in the bundle itself, reducing installation risk. However, runtime requires Python and optional plotting libs (matplotlib/numpy) that are not declared.
Credentials
The skill declares no required environment variables or credentials. Nevertheless, it invokes an external LLM client ('claude') if present, which is an undocumented runtime dependency and could transmit evaluation content to that service. No secrets are requested by the skill itself.
Persistence & Privilege
always is false and the skill does not request system-wide privileges. It writes memory files to a user-specified memory-dir and report files to output directories; it does not modify other skills or global agent configuration.
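How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install auto-improvement-learner
  3. Once installed, invoke the Skill by name or trigger it with /auto-improvement-learner
  4. Provide the inputs described in the Skill's parameter documentation to get structured output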
Version history
v1.0.0
Initial release: closed-loop skill improvement pipeline
Metadata
Slug: auto-improvement-learner
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 1
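FAQ

What is Improvement Learner?

Use this skill when you need to check a skill's quality scores, automatically optimize the structure of SKILL.md, track how evaluation scores change over time, or find out where points were deducted after a low score. It provides a 6-dimension structural evaluation, three-layer HOT/WARM/COLD memory, and a Pareto front. Not for semantic scoring of candidates (use improvement-discriminator) or full-pipeline orchestration (use impr... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 84 total downloads to date.

How do I install Improvement Learner?

Run the command "/install auto-improvement-learner" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.

Is Improvement Learner free?

Yes. Improvement Learner is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does Improvement Learner support?

Improvement Learner is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Improvement Learner?

It is developed and maintained by _silhouette (@lanyasheng); the current version is v1.0.0.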
