Improvement Learner

Name: Improvement Learner
Author: lanyasheng

Author: _silhouette · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install auto-improvement-learner

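Description

Use this skill when you need to check a skill's quality scores, automatically optimize the structure of SKILL.md, track how evaluation scores change over time, or find out where points were deducted after a low score. It provides a 6-dimension structural evaluation, a HOT/WARM/COLD three-layer memory, and a Pareto front. Do not use it for semantic scoring of candidates (use improvement-discriminator) or for full-pipeline orchestration (use impr...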

Usage Instructions (SKILL.md)

Improvement Learner

Real Karpathy self-improvement loop: evaluate → modify → re-evaluate → keep/revert → repeat.

When to Use

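View a skill's quality scores across the 6 dimensions
Run the automatic improvement loop (protected by a Pareto front: no dimension is allowed to regress)
Track the history of a skill's evaluation scores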
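The "no dimension is allowed to regress" rule can be sketched as a keep/revert decision. This is a minimal illustration, not the logic in scripts/self_improve.py: the helper name `should_keep`, the tolerance `eps`, the requirement that at least one dimension improves, and the sample scores for efficiency, security, and trigger_quality are all assumptions.

```python
from typing import Dict

DIMENSIONS = (
    "accuracy", "coverage", "reliability",
    "efficiency", "security", "trigger_quality",
)

def should_keep(before: Dict[str, float], after: Dict[str, float],
                eps: float = 1e-9) -> bool:
    """Keep a modification only if no dimension regresses and one improves.

    Hypothetical helper: the real script may use a different rule or tolerance.
    """
    no_regression = all(after[d] >= before[d] - eps for d in DIMENSIONS)
    some_gain = any(after[d] > before[d] + eps for d in DIMENSIONS)
    return no_regression and some_gain

# Illustrative scores only.
baseline = {"accuracy": 0.83, "coverage": 1.0, "reliability": 1.0,
            "efficiency": 1.0, "security": 1.0, "trigger_quality": 0.7}
better = dict(baseline, accuracy=0.90)
tradeoff = dict(baseline, accuracy=0.95, trigger_quality=0.6)

print(should_keep(baseline, better))    # True: accuracy rose, nothing fell
print(should_keep(baseline, tradeoff))  # False: trigger_quality regressed
```

Under this rule a change that trades one dimension for another is reverted even when the average score goes up, which is the behavior the Pareto front protection describes.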

When NOT to Use

Semantic scoring of improvement candidates → use improvement-discriminator
Running the full pipeline (generate → score → gate → execute) → use improvement-orchestrator
Changing only a single file → use improvement-executor

6 Evaluation Dimensions

| Dimension | Checks | Pure-text default |
| --- | --- | --- |
| accuracy | 15 items: frontmatter(3), symptom-driven desc, When to Use/Not, code examples, Usage, few-shot, no vague language, min length, Related Skills, Output Artifacts, atomicity | — |
| coverage | SKILL.md = 60% base + scripts/references/tests/README bonuses | — |
| reliability | pytest pass=1.0, fail=0.5 | 1.0 (pure-text) |
| efficiency | Line count: ≤200=1.0, ≥1200=0.3 | — |
| security | No api_key/password/sk- in SKILL.md, no os.system()/exec() | — |
| trigger_quality | Description length, triggers field, disambiguation | — |
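Two of the rows above are mechanical enough to sketch. The example below is an illustration under stated assumptions, not the scoring code shipped with the skill: the table gives only the two endpoints for efficiency, so the linear interpolation between them is assumed, and the regular expressions for the security row are hypothetical approximations of the listed checks.

```python
import re

def efficiency_score(line_count: int) -> float:
    """Map a SKILL.md line count to a score.

    Only the endpoints (<=200 -> 1.0, >=1200 -> 0.3) come from the table;
    the linear interpolation in between is an assumption.
    """
    if line_count <= 200:
        return 1.0
    if line_count >= 1200:
        return 0.3
    fraction = (line_count - 200) / (1200 - 200)
    return 1.0 - fraction * (1.0 - 0.3)

# Illustrative patterns for the security row; the real checks may differ.
RISKY_PATTERNS = {
    "api_key": re.compile(r"api_key", re.IGNORECASE),
    "password": re.compile(r"password", re.IGNORECASE),
    "sk- token": re.compile(r"\bsk-[A-Za-z0-9]+"),
    "os.system()": re.compile(r"\bos\.system\s*\("),
    "exec()": re.compile(r"\bexec\s*\("),
}

def security_findings(text: str) -> list:
    """Return the names of the risky patterns found in the given text."""
    return [name for name, pattern in RISKY_PATTERNS.items()
            if pattern.search(text)]

print(efficiency_score(150))
print(efficiency_score(700))
print(security_findings("run os.system('ls') with api_key=abc"))
```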

Three-Layer Memory

| Layer | Capacity | Behavior |
| --- | --- | --- |
| HOT | ≤100 | Always loaded, frequently accessed patterns |
| WARM | Unlimited | Overflow from HOT, loaded on demand |
| COLD | Archive | >3 months inactive (future) |
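The HOT and WARM rows can be pictured as a bounded store that demotes overflow. The sketch below is hypothetical: the class `LayeredMemory`, the least-recently-used demotion policy, and promotion on access are assumptions, since the table only states the capacity and that WARM receives overflow from HOT. The COLD layer is omitted because the table marks it as future work.

```python
from collections import OrderedDict

class LayeredMemory:
    """HOT holds at most `hot_capacity` entries; overflow is demoted to WARM.

    Demoting the least recently used entry is an assumption.
    """

    def __init__(self, hot_capacity: int = 100):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()   # always loaded
        self.warm = {}             # loaded on demand, unbounded

    def put(self, key: str, pattern: str) -> None:
        self.warm.pop(key, None)          # a key lives in one layer only
        self.hot[key] = pattern
        self.hot.move_to_end(key)         # mark as most recently used
        while len(self.hot) > self.hot_capacity:
            old_key, old_pattern = self.hot.popitem(last=False)
            self.warm[old_key] = old_pattern

    def get(self, key: str):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        if key in self.warm:              # promote on access
            self.put(key, self.warm[key])
            return self.hot[key]
        return None

memory = LayeredMemory(hot_capacity=2)
for name in ("a", "b", "c"):
    memory.put(name, f"pattern-{name}")
print(list(memory.hot), list(memory.warm))  # ['b', 'c'] ['a']
```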

<example>
Correct usage: evaluate a skill's quality
$ python3 scripts/self_improve.py --skill-path /path/to/skill --max-iterations 1
→ Outputs JSON: {"final_scores": {"accuracy": 0.83, "coverage": 1.0, "reliability": 1.0, ...}}
→ An accuracy of 0.83 means SKILL.md is missing some of the checked items (such as Output Artifacts or Related Skills)
</example>

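<anti-example>
Misreading: reliability=1.0 on a pure-text skill does not mean the skill is high quality
→ A pure-text skill has no scripts/, so reliability defaults to 1.0 (with no code, there is nothing to test)
→ The dimensions that actually carry meaning are accuracy and trigger_quality
</anti-example>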

CLI

# Evaluate (no modifications, scores only)
python3 scripts/self_improve.py --skill-path /path/to/skill --max-iterations 1

# Self-improvement loop (5 iterations)
python3 scripts/self_improve.py \
  --skill-path /path/to/skill \
  --max-iterations 5 \
  --memory-dir /path/to/memory \
  --state-root /path/to/state

# Track history
python3 scripts/track_progress.py --skill-path /path/to/skill --output progress.json

Output Artifacts

| Request | Deliverable |
| --- | --- |
| Evaluate | JSON with 6-dimension scores (0.0-1.0 each) |
| Self-improve | JSON: iterations, kept/reverted/skipped, final_scores, memory stats |
| Track progress | JSON with historical scores and trend data |
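To answer "where were points deducted" from an evaluation result, the scores can be filtered and sorted. The sketch below assumes the report carries a top-level "final_scores" object, as in the example output earlier in this document; the helper `weakest_dimensions` and the 0.9 threshold are hypothetical, and the other fields of the report are not modeled.

```python
import json

def weakest_dimensions(report_json: str, threshold: float = 0.9) -> list:
    """Return (dimension, score) pairs below `threshold`, lowest first.

    Assumes a top-level "final_scores" object; the threshold is arbitrary.
    """
    scores = json.loads(report_json)["final_scores"]
    low = [(name, value) for name, value in scores.items() if value < threshold]
    return sorted(low, key=lambda item: item[1])

sample = '{"final_scores": {"accuracy": 0.83, "coverage": 1.0, "reliability": 1.0}}'
print(weakest_dimensions(sample))  # [('accuracy', 0.83)]
```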

Related Skills

improvement-discriminator: Semantic scoring (LLM judge); learner focuses on structural quality
improvement-orchestrator: Full pipeline; learner provides standalone quality scoring used by autoloop-controller and self-improvement loop (not a stage in the orchestrator pipeline)
benchmark-store: Pareto front data shared between learner and benchmark-store

Security Recommendations

What to consider before installing/running:
- The skill appears to do what it says (evaluate and auto-improve SKILL.md), but the Python scripts import external modules (lib.common, lib.pareto) that are not included; running may fail or silently import code from an unexpected repo root. Verify those dependencies exist and inspect them.
- The script will call a local 'claude' CLI via subprocess.run when available. If you have a 'claude' binary configured, skill text may be sent through that client; treat SKILL.md and any files you point it at as potentially sent to that service. If you don't want that, run with the --mock flag or ensure 'claude' is not on PATH.
- The tools write memory and report files to directories you specify (memory-dir, output); review and choose those paths to avoid exposing sensitive data.
- Optional plotting requires matplotlib/numpy; tests expect a Python test runner. Run in a sandbox or isolated environment first to confirm behavior.
- If you plan to let the agent invoke this autonomously, be aware the ability to call an external LLM client increases blast radius; consider restricting execution or reviewing the code paths that call subprocess.run.
- To raise confidence to 'benign', provide the missing lib.* implementations or confirm they come from a trusted upstream, and validate that 'claude' usage is acceptable for your environment.
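The first two points can be checked before running anything. The following preflight sketch is an assumption-laden illustration, not part of the skill: the function `preflight` is hypothetical, and it uses only the standard-library calls `importlib.util.find_spec` and `shutil.which`. Note that `find_spec` on a dotted name imports the parent package (`lib` here) in order to locate the submodule, so run it from the directory you intend to audit.

```python
import importlib.util
import shutil

def preflight(modules=("lib.common", "lib.pareto"), cli="claude") -> dict:
    """Report which imports resolve and whether the LLM CLI is on PATH."""
    found = {}
    for name in modules:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when the parent package (e.g. `lib`) cannot be imported.
            found[name] = False
    return {"modules": found, "cli_path": shutil.which(cli)}

report = preflight()
for name, ok in report["modules"].items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
if report["cli_path"]:
    print(f"claude CLI found at {report['cli_path']}; skill text may be sent through it")
else:
    print("claude CLI not on PATH")
```

If the CLI is found and that is not acceptable, the advice above applies: run with the --mock flag or remove 'claude' from PATH.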

Functional Analysis

Type: OpenClaw Skill
Name: auto-improvement-learner
Version: 1.0.0

The bundle implements an automated 'self-improvement loop' for OpenClaw skills, designed to evaluate and optimize skill quality across dimensions like accuracy, reliability, and security. The core logic in `scripts/self_improve.py` uses an LLM-as-judge (via the `claude` CLI) and structural heuristics to score skills, while incorporating defensive checks that specifically penalize hardcoded secrets and dangerous execution patterns (e.g., `os.system`). The tool includes safety mechanisms such as automated backups, a 'Pareto front' logic to prevent performance regressions, and a three-layer memory system for tracking improvement patterns. No malicious intent, data exfiltration, or unauthorized persistence mechanisms were detected.

Capability Assessment

ℹ Purpose & Capability

The SKILL.md, CLI examples, and Python scripts all implement a self-improvement / evaluation loop as described. However, the scripts import lib.common and lib.pareto from a repo root that is not included in the bundle; those external libraries are required for normal operation but are not declared in the skill metadata. The code also expects a local 'claude' CLI for LLM-based judging (with a regex fallback).

ℹ Instruction Scope

Runtime instructions only ask you to run the included scripts (evaluate, self-improve, track progress). The scripts read SKILL.md and reports directories and write memory and report files. They do not instruct access to unrelated system paths or secrets, but they do call an external LLM CLI ('claude') via subprocess, which will send skill content to that client when available.

✓ Install Mechanism

There is no install spec (instruction-only plus included scripts). No remote downloads or archive extraction are present in the bundle itself, reducing installation risk. However, runtime requires Python and optional plotting libs (matplotlib/numpy) that are not declared.

ℹ Credentials

The skill declares no required environment variables or credentials. Nevertheless, it invokes an external LLM client ('claude') if present, which is an undocumented runtime dependency and could transmit evaluation content to that service. No secrets are requested by the skill itself.

✓ Persistence & Privilege

The `always` setting is false and the skill does not request system-wide privileges. It writes memory files to a user-specified memory-dir and report files to output directories; it does not modify other skills or global agent configuration.

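How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install auto-improvement-learner
Once installation completes, invoke the skill by name or trigger it with /auto-improvement-learner
Provide the required inputs according to the skill's parameter documentation to receive structured output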

Version History

v1.0.0

Initial release: closed-loop skill improvement pipeline

Metadata

Slug auto-improvement-learner

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ