lanyasheng

Improvement Learner

by _silhouette · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
84
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install auto-improvement-learner
Description
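Use when you need to check a skill's quality score, automatically optimize the structure of SKILL.md, track how evaluation scores change over time, or find out where points were deducted after a low score. 6-dimension structural evaluation + HOT/WARM/COLD three-layer memory + Pareto front. Not for semantic scoring of candidates (use improvement-discriminator) or full-pipeline orchestration (use impr...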
README (SKILL.md)

Improvement Learner

Real Karpathy self-improvement loop: evaluate → modify → re-evaluate → keep/revert → repeat.

When to Use

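  • View a skill's quality scores across the 6 dimensions
  • Run the automatic improvement loop (protected by a Pareto front: no dimension is allowed to regress)
  • Track the history of a skill's evaluation scores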
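
The loop and its Pareto protection can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/self_improve.py`: the names `regresses`, `improves`, and `improve_loop` are hypothetical, and the acceptance rule (keep a candidate only if no dimension drops and at least one rises) is an assumption based on the description above.

```python
from typing import Callable, Dict

Scores = Dict[str, float]


def regresses(before: Scores, after: Scores) -> bool:
    """True if any dimension scored lower after the change."""
    return any(after[dim] < before[dim] for dim in before)


def improves(before: Scores, after: Scores) -> bool:
    """True if at least one dimension scored higher after the change."""
    return any(after[dim] > before[dim] for dim in before)


def improve_loop(
    text: str,
    evaluate: Callable[[str], Scores],
    modify: Callable[[str, Scores], str],
    max_iterations: int = 5,
) -> dict:
    """Evaluate, modify, re-evaluate, then keep or revert; repeat."""
    scores = evaluate(text)
    kept = reverted = 0
    for _ in range(max_iterations):
        candidate = modify(text, scores)
        new_scores = evaluate(candidate)
        if not regresses(scores, new_scores) and improves(scores, new_scores):
            # Candidate is at least as good everywhere and better somewhere.
            text, scores = candidate, new_scores
            kept += 1
        else:
            # Discard the candidate; the previous text stays in place.
            reverted += 1
    return {"text": text, "final_scores": scores,
            "kept": kept, "reverted": reverted}
```

With toy `evaluate` and `modify` callables, a candidate that raises one score while lowering another is reverted even if its average is higher, which is the point of checking every dimension rather than a single aggregate.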

When NOT to Use

  • Semantic scoring of improvement candidates → use improvement-discriminator
  • Running the full pipeline (generate → score → gate → execute) → use improvement-orchestrator
  • Editing just a single file → use improvement-executor

6 Evaluation Dimensions

| Dimension | Checks | Pure-text default |
|---|---|---|
| accuracy | 15 items: frontmatter(3), symptom-driven desc, When to Use/Not, code examples, Usage, few-shot, no vague language, min length, Related Skills, Output Artifacts, atomicity | |
| coverage | SKILL.md = 60% base + scripts/references/tests/README bonuses | |
| reliability | pytest pass=1.0, fail=0.5 | 1.0 (pure-text) |
| efficiency | Line count: ≤200=1.0, ≥1200=0.3 | |
| security | No api_key/password/sk- in SKILL.md, no os.system()/exec() | |
| trigger_quality | Description length, triggers field, disambiguation | |
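
Two of the mechanical checks above, efficiency and security, are simple enough to sketch. The table only gives the efficiency endpoints (≤200 lines scores 1.0, ≥1200 lines scores 0.3); the linear interpolation between them is an assumption, as are the function names and the exact regular expressions. The real script may score differently.

```python
import re

# Patterns taken from the table; the exact regexes are assumptions.
SECRET_PATTERNS = [r"api_key", r"password", r"\bsk-"]
DANGEROUS_CALLS = [r"os\.system\(", r"\bexec\("]


def efficiency_score(line_count: int, lo: int = 200, hi: int = 1200,
                     floor: float = 0.3) -> float:
    """1.0 at or below `lo` lines, `floor` at or above `hi` lines.

    The straight line between the two endpoints is an assumption;
    the source only states the endpoints.
    """
    if line_count <= lo:
        return 1.0
    if line_count >= hi:
        return floor
    frac = (line_count - lo) / (hi - lo)
    return 1.0 - frac * (1.0 - floor)


def security_findings(text: str) -> list:
    """Return the patterns that match anywhere in `text`."""
    return [p for p in SECRET_PATTERNS + DANGEROUS_CALLS
            if re.search(p, text)]
```

The `\b` in front of `sk-` is a deliberate choice in this sketch: a bare `sk-` substring match would also flag ordinary words such as "task-list".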

Three-Layer Memory

| Layer | Capacity | Behavior |
|---|---|---|
| HOT | ≤100 | Always loaded, frequently accessed patterns |
| WARM | Unlimited | Overflow from HOT, loaded on demand |
| COLD | Archive | >3 months inactive (future) |
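
The HOT/WARM relationship in the table can be illustrated with a small in-memory model. This is a sketch under assumptions: the class name `TieredMemory` is hypothetical, the eviction order (least recently used entry leaves HOT first) is a guess, and promoting a WARM entry back to HOT on access is one possible reading of "loaded on demand". COLD is marked as future work in the table, so it is left out.

```python
from collections import OrderedDict
from typing import Any, Optional


class TieredMemory:
    """Toy model: bounded HOT layer that overflows into an unbounded WARM layer."""

    def __init__(self, hot_capacity: int = 100) -> None:
        self.hot_capacity = hot_capacity
        self.hot: "OrderedDict[str, Any]" = OrderedDict()  # most recent last
        self.warm: dict = {}

    def put(self, key: str, pattern: Any) -> None:
        """Store a pattern in HOT, spilling the oldest entries into WARM."""
        self.warm.pop(key, None)        # a key lives in one layer only
        self.hot[key] = pattern
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            old_key, old_pattern = self.hot.popitem(last=False)
            self.warm[old_key] = old_pattern

    def get(self, key: str) -> Optional[Any]:
        """Look up a pattern; a WARM hit is promoted back into HOT."""
        if key in self.hot:
            self.hot.move_to_end(key)   # mark as recently used
            return self.hot[key]
        if key in self.warm:
            pattern = self.warm[key]
            self.put(key, pattern)      # assumed on-demand promotion
            return pattern
        return None
```

For example, with `hot_capacity=2`, storing three patterns leaves the first one in WARM, and reading it back moves it into HOT while the least recently used HOT entry spills over.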

<example>
Correct usage: evaluate a skill's quality
$ python3 scripts/self_improve.py --skill-path /path/to/skill --max-iterations 1
→ Outputs JSON: {"final_scores": {"accuracy": 0.83, "coverage": 1.0, "reliability": 1.0, ...}}
→ An accuracy of 0.83 means SKILL.md is missing some of the checked items (such as Output Artifacts or Related Skills)
</example>

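<anti-example>
Misreading: reliability=1.0 on a pure-text skill does not mean the skill is good
→ A pure-text skill has no scripts/, so reliability defaults to 1.0 (no code means nothing to test)
→ The dimensions that carry real signal are accuracy and trigger_quality
</anti-example>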
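
The reading rule from the two examples above (ignore the default reliability score on pure-text skills and look at what is actually below 1.0) could be applied to the JSON output like this. The helper name `weakest_dimensions` and its parameters are hypothetical, and only the `final_scores` key shown in the example is assumed to exist.

```python
import json
from typing import List, Tuple


def weakest_dimensions(report_json: str, pure_text: bool = False,
                       threshold: float = 1.0) -> List[Tuple[str, float]]:
    """List dimensions scoring below `threshold`, lowest first.

    For pure-text skills the reliability score is a default rather than
    a measurement, so it is dropped before ranking.
    """
    scores = json.loads(report_json)["final_scores"]
    if pure_text:
        scores = {dim: s for dim, s in scores.items() if dim != "reliability"}
    below = [(dim, s) for dim, s in scores.items() if s < threshold]
    return sorted(below, key=lambda item: item[1])
```

Called on a report containing only the three scores shown in the example, this returns a single entry for accuracy, which matches the advice to go looking for the missing SKILL.md items.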

CLI

# Evaluate (no changes, scores only)
python3 scripts/self_improve.py --skill-path /path/to/skill --max-iterations 1

# Self-improvement loop (5 iterations)
python3 scripts/self_improve.py \
  --skill-path /path/to/skill \
  --max-iterations 5 \
  --memory-dir /path/to/memory \
  --state-root /path/to/state

# Track history
python3 scripts/track_progress.py --skill-path /path/to/skill --output progress.json
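
If the commands above need to be driven from another Python program, a thin wrapper might look like the sketch below. It uses only the flags shown in this section. The helper names are hypothetical, and the assumption that the script prints its JSON result to stdout is not confirmed by the source.

```python
import json
import subprocess
import sys
from typing import List, Optional


def build_command(skill_path: str, max_iterations: int = 1,
                  memory_dir: Optional[str] = None,
                  state_root: Optional[str] = None) -> List[str]:
    """Assemble the argument list for scripts/self_improve.py."""
    cmd = [sys.executable, "scripts/self_improve.py",
           "--skill-path", str(skill_path),
           "--max-iterations", str(max_iterations)]
    if memory_dir is not None:
        cmd += ["--memory-dir", str(memory_dir)]
    if state_root is not None:
        cmd += ["--state-root", str(state_root)]
    return cmd


def run_and_parse(cmd: List[str]) -> dict:
    """Run the command and parse stdout as JSON (assumed output channel)."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a skill path contains spaces.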

Output Artifacts

| Request | Deliverable |
|---|---|
| Evaluate | JSON with 6-dimension scores (0.0-1.0 each) |
| Self-improve | JSON: iterations, kept/reverted/skipped, final_scores, memory stats |
| Track progress | JSON with historical scores and trend data |

Related Skills

  • improvement-discriminator: Semantic scoring (LLM judge); learner focuses on structural quality
  • improvement-orchestrator: Full pipeline; learner provides standalone quality scoring used by autoloop-controller and self-improvement loop (not a stage in the orchestrator pipeline)
  • benchmark-store: Pareto front data shared between learner and benchmark-store
Usage Guidance
What to consider before installing/running:
  • The skill appears to do what it says (evaluate and auto-improve SKILL.md), but the Python scripts import external modules (lib.common, lib.pareto) that are not included; running may fail or silently import code from an unexpected repo root. Verify those dependencies exist and inspect them.
  • The script will call a local 'claude' CLI via subprocess.run when available. If you have a 'claude' binary configured, skill text may be sent through that client; treat SKILL.md and any files you point it at as potentially sent to that service. If you don't want that, run with the --mock flag or ensure 'claude' is not on PATH.
  • The tools write memory and report files to directories you specify (memory-dir, output); review and choose those paths to avoid exposing sensitive data.
  • Optional plotting requires matplotlib/numpy; tests expect a Python test runner. Run in a sandbox or isolated environment first to confirm behavior.
  • If you plan to let the agent invoke this autonomously, be aware the ability to call an external LLM client increases blast radius; consider restricting execution or reviewing the code paths that call subprocess.run.
  • To raise confidence to 'benign', provide the missing lib.* implementations or confirm they come from a trusted upstream, and validate that 'claude' usage is acceptable for your environment.
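
The first two checks in this guidance (are the lib.* modules resolvable, and is a 'claude' binary on PATH) can be automated before running anything. The function name `preflight` and its return shape are hypothetical; `importlib.util.find_spec` and `shutil.which` are standard-library calls.

```python
import importlib.util
import shutil
from typing import Iterable


def preflight(modules: Iterable[str] = ("lib.common", "lib.pareto")) -> dict:
    """Report unresolved modules and whether a 'claude' CLI is reachable.

    Note: find_spec on a dotted name imports the parent package, so a
    `lib` package found on sys.path will have its __init__ executed.
    Inspect it first if its origin is unknown.
    """
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False  # parent package missing or malformed
        if not found:
            missing.append(name)
    return {
        "missing_modules": missing,
        "claude_on_path": shutil.which("claude") is not None,
    }
```

A non-empty `missing_modules` list suggests the scripts will fail on import, and `claude_on_path` being true means skill text may be passed to that client unless the run is mocked.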
Capability Analysis
Type: OpenClaw Skill
Name: auto-improvement-learner
Version: 1.0.0
The bundle implements an automated 'self-improvement loop' for OpenClaw skills, designed to evaluate and optimize skill quality across dimensions like accuracy, reliability, and security. The core logic in `scripts/self_improve.py` uses an LLM-as-judge (via the `claude` CLI) and structural heuristics to score skills, while incorporating defensive checks that specifically penalize hardcoded secrets and dangerous execution patterns (e.g., `os.system`). The tool includes safety mechanisms such as automated backups, a 'Pareto front' logic to prevent performance regressions, and a three-layer memory system for tracking improvement patterns. No malicious intent, data exfiltration, or unauthorized persistence mechanisms were detected.
Capability Assessment
Purpose & Capability
The SKILL.md, CLI examples, and Python scripts all implement a self-improvement / evaluation loop as described. However, the scripts import lib.common and lib.pareto from a repo root that is not included in the bundle; those external libraries are required for normal operation but are not declared in the skill metadata. The code also expects a local 'claude' CLI for LLM-based judging (with a regex fallback).
Instruction Scope
Runtime instructions only ask you to run the included scripts (evaluate, self-improve, track progress). The scripts read SKILL.md and reports directories and write memory and report files. They do not instruct access to unrelated system paths or secrets, but they do call an external LLM CLI ('claude') via subprocess, which will send skill content to that client when available.
Install Mechanism
There is no install spec (instruction-only plus included scripts). No remote downloads or archive extraction are present in the bundle itself, reducing installation risk. However, runtime requires Python and optional plotting libs (matplotlib/numpy) that are not declared.
Credentials
The skill declares no required environment variables or credentials. Nevertheless, it invokes an external LLM client ('claude') if present, which is an undocumented runtime dependency and could transmit evaluation content to that service. No secrets are requested by the skill itself.
Persistence & Privilege
always is false and the skill does not request system-wide privileges. It writes memory files to a user-specified memory-dir and report files to output directories; it does not modify other skills or global agent configuration.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install auto-improvement-learner
  3. After installation, invoke the skill by name or use /auto-improvement-learner
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release: closed-loop skill improvement pipeline
Metadata
Slug auto-improvement-learner
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Improvement Learner?

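Use when you need to check a skill's quality score, automatically optimize the structure of SKILL.md, track how evaluation scores change over time, or find out where points were deducted after a low score. 6-dimension structural evaluation + HOT/WARM/COLD three-layer memory + Pareto front. Not for semantic scoring of candidates (use improvement-discriminator) or full-pipeline orchestration (use impr... It is an AI Agent Skill for Claude Code / OpenClaw, with 84 downloads so far.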

How do I install Improvement Learner?

Run "/install auto-improvement-learner" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Improvement Learner free?

Yes, Improvement Learner is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Improvement Learner support?

Improvement Learner is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Improvement Learner?

It is built and maintained by _silhouette (@lanyasheng); the current version is v1.0.0.
