
Sharpagent Self Evolving

Author: yezhaowang888-stack · GitHub · v1.0.0 · MIT-0
cross-platform · ⚠ suspicious
Total downloads: 21 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw:
/install sharpagent-self-evolving
Description
SharpAgent Self-Evolving Loop — An automated 'Think→Do→Learn' cycle. Fuses the Self-Improving Agent's reflection mechanism with the autoresearch experimental...
Usage (SKILL.md)

SharpAgent Self-Evolving Loop v1.0.0

Make your agent smarter with every task. The end of one task is the starting point for the next evolution. Fuses the two key discoveries from R2: Self-Improving Agent reflection × autoresearch experiment verification.

Core Philosophy

Most agents finish a task and stop. The next time a similar problem comes up, they start from scratch, with nothing accumulated from earlier work.

SharpAgent's self-evolving loop breaks this cycle:

① Execute task → ② Reflect ("What could be better?")
                   ↓
                 ③ Form improvement hypothesis
                   ↓
                 ④ Run small experiment to verify
                   ↓
                 ⑤ Absorb lesson
                   ↓
                 (back to ②)

Every task is an evolution. It doesn't get more expensive with use — it gets more accurate.

Contract

contract:
  name: sharpagent-self-evolving
  version: "1.0.0"
  category: workflow
  trust_level: verified
  reads:
    - Task
    - LearningEntry
    - FiveFactorResult
  writes:
    - LearningEntry
    - ImprovementHypothesis
  preconditions:
    - "A completed task exists to reflect on"
    - "Access to read task output and logs"
  postconditions:
    - "Reflection produces at least 1 improvement hypothesis"
    - "If hypothesis is verifiable, an experiment is designed"
    - "Experiment outcome is recorded as LearningEntry"
  calibration:
    default_mode: professional
    modes_supported: [professional, deep]
  compliance:
    jurisdiction: global
    safety_level: standard
  lifecycle:
    status: active
    publish_as: SharpAgent

Lifecycle: 4-Phase Evolution Loop

 ┌─────────────────────────────────────────────┐
 │                                              │
 │   [1. REFLECT] → [2. HYPOTHESIZE]           │
 │       ↑                        ↓             │
 │   [4. ABSORB]  ←  [3. EXPERIMENT]           │
 │                                              │
 └─────────────────────────────────────────────┘
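
The four-phase cycle in the diagram can be modeled as a small state machine. The following is a minimal sketch, not part of the skill itself; the names `Phase` and `next_phase` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Phase(Enum):
    """The four phases of the loop, in the order the diagram shows them."""
    REFLECT = 1
    HYPOTHESIZE = 2
    EXPERIMENT = 3
    ABSORB = 4


def next_phase(phase: Phase) -> Phase:
    """Return the phase that follows `phase`, wrapping ABSORB back to REFLECT."""
    members = list(Phase)  # Enum iteration preserves definition order
    return members[(members.index(phase) + 1) % len(members)]
```

Wrapping with a modulo keeps the loop closed, which matches the arrow from ABSORB back to REFLECT in the diagram.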

Phase 1: REFLECT — Analyze

After every task, do a structured reflection.

When:

  • Every task completion (mandatory)
  • Major errors mid-task (force deep mode)
  • Daily summary (optional, merge multiple reflections)

Reflection Framework:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🧬 Task Reflection
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 Task: {task_name}
⏱  Duration: {duration}

🟢 What went right?
- {2-3 specific, quantifiable things}

🟡 What could improve?
- {1-3 things that could be better}

🔴 Clear mistakes?
- {If any: description + root cause + impact}

💡 Lesson learned
- {One-sentence lesson}

🧪 Improvement hypothesis
- {One clear, verifiable hypothesis}
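
One way to fill the template above programmatically is a small data class with a `render` method. This is a sketch under assumptions: the class name `Reflection` and its field names are hypothetical, and showing an empty section as "- (none)" is a choice made here, not something the skill specifies.

```python
from dataclasses import dataclass, field


@dataclass
class Reflection:
    """Hypothetical container mirroring the fields of the reflection template."""
    task_name: str
    duration: str
    went_right: list[str]
    could_improve: list[str]
    lesson: str
    hypothesis: str
    mistakes: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Fill the template; an empty section is rendered as '- (none)'."""
        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items) or "- (none)"

        sections = [
            f"📋 Task: {self.task_name}",
            f"⏱  Duration: {self.duration}",
            "🟢 What went right?\n" + bullets(self.went_right),
            "🟡 What could improve?\n" + bullets(self.could_improve),
            "🔴 Clear mistakes?\n" + bullets(self.mistakes),
            "💡 Lesson learned\n- " + self.lesson,
            "🧪 Improvement hypothesis\n- " + self.hypothesis,
        ]
        return "\n\n".join(sections)
```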

Five-Factor Review Embedding: If the task involves information judgments, run each practice and lesson through the five factors:

🔗 Was my source credible?
🧠 Was my reasoning chain complete?
🌍 Compliance check?
🏳️ Any bias in chosen direction?
🔄 Any other sources to cross-verify?

Phase 2: HYPOTHESIZE — Form Hypothesis

Refine improvement ideas into verifiable hypotheses.

Hypothesis Format:

IF [I change approach] THEN [expected improvement] BECAUSE [reason]
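
A hypothesis in this format can be checked mechanically. The sketch below assumes the three keywords appear in upper case and in this order; `Hypothesis` and `parse_hypothesis` are hypothetical names, not part of the skill.

```python
import re
from typing import NamedTuple, Optional


class Hypothesis(NamedTuple):
    change: str    # what follows IF
    expected: str  # what follows THEN
    reason: str    # what follows BECAUSE


# DOTALL lets a hypothesis span several lines, as in multi-line notes.
_PATTERN = re.compile(r"^IF\s+(.+?)\s+THEN\s+(.+?)\s+BECAUSE\s+(.+)$", re.DOTALL)


def parse_hypothesis(text: str) -> Optional[Hypothesis]:
    """Return the three parts, or None if the text does not follow the format."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return Hypothesis(*(part.strip() for part in match.groups()))
```

A `None` result is a cheap signal that an idea such as "Write better next time" still needs refining before it can be tested.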

Good vs Bad Hypotheses:

Bad | Good
"Write better next time" | "If I plan an outline for 30s before writing, title quality improves 20%"
"Check more sources" | "If I check 2 independent sources before deciding, cross-validation score improves 15%"
"Don't make that error again" | "If I add contract validation before commit, bug rate drops 30%"

Hypothesis Tiers:

Tier | Meaning | Action
🟢 P0 | Critical improvement, fast (<5 min) | Experiment immediately
🟡 P1 | Valuable, moderate effort (<30 min) | Queue for experiment
🔴 P2 | Long-term, significant investment | Record, experiment when possible
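
The time thresholds in the tier table translate directly into a lookup. This sketch assumes the tier is decided by estimated effort alone and ignores the "critical" versus "valuable" judgment, which still needs a human or agent call; `tier_for` and `TIER_ACTIONS` are hypothetical names.

```python
# Actions copied from the tier table.
TIER_ACTIONS = {
    "P0": "Experiment immediately",
    "P1": "Queue for experiment",
    "P2": "Record, experiment when possible",
}


def tier_for(estimated_minutes: float) -> str:
    """Map an effort estimate to a tier using the <5 min and <30 min cutoffs."""
    if estimated_minutes < 5:
        return "P0"
    if estimated_minutes < 30:
        return "P1"
    return "P2"
```

Both cutoffs are strict, so an estimate of exactly 5 minutes falls into P1 and exactly 30 minutes into P2.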

If reflection yields no improvement hypothesis → Check whether there's genuinely no room for improvement. 90% of the time the reflection wasn't honest enough.


Phase 3: EXPERIMENT — Verify

This phase is the core idea borrowed from autoresearch (karpathy/autoresearch ⭐80K).

Don't trust intuition that something is "better" — run a small experiment to prove it.

Experiment Cycle (borrowing autoresearch's 5-minute fixed budget):

experiment:
  budget: 5 min              # Fixed time budget
  hypothesis: "..."          # Hypothesis to verify
  setup:                     # Experiment setup
    - control: old approach
    - treatment: new approach
  measurements:              # Metrics
    - metric_1: "completion time"
    - metric_2: "error rate"
    - metric_3: "quality score"
  result:                    # Fill after experiment
    - metric_1: old=12s new=8s ✅
    - metric_2: old=3% new=1% ✅
    - metric_3: old=7/10 new=8.5/10 ✅
  verdict:                   # Conclusion
    - hypothesis_supported: true/false
    - adopt: yes/no/partial
    - notes: ""
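
The step from measurements to the `adopt` field can be sketched as follows. The skill does not define how `yes/no/partial` is derived, so the rule used here (all metrics improved, none improved, or some improved) is an assumption, as are the names `Metric` and `verdict`.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    control: float      # value measured with the old approach
    treatment: float    # value measured with the new approach
    lower_is_better: bool = True

    def improved(self) -> bool:
        """True if the treatment beats the control in the metric's direction."""
        if self.lower_is_better:
            return self.treatment < self.control
        return self.treatment > self.control


def verdict(metrics: list[Metric]) -> str:
    """Assumed rule: 'yes' if every metric improved, 'no' if none, else 'partial'."""
    if not metrics:
        raise ValueError("at least one measured metric is required")
    wins = sum(m.improved() for m in metrics)
    if wins == len(metrics):
        return "yes"
    if wins == 0:
        return "no"
    return "partial"


# The sample values from the template above.
metrics = [
    Metric("completion time (s)", control=12, treatment=8),
    Metric("error rate (%)", control=3, treatment=1),
    Metric("quality score (/10)", control=7, treatment=8.5, lower_is_better=False),
]
print(verdict(metrics))  # prints "yes"
```

Refusing an empty metric list keeps the verdict tied to recorded numbers rather than impressions.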

Experiment Types:

Type | Description | Budget
A/B comparison | Run old vs new, compare results | 5 min
Ablation | Remove one step to see impact | 5 min
Boundary test | Test stability under edge conditions | 3 min
Cross-verification | Different sources/methods for consistency | 5 min

Experiment Discipline:

  1. Write hypothesis before experiment (prevents post-hoc rationalization)
  2. Control variables — change one thing at a time
  3. Record data, not feelings
  4. Failed experiments are still learning

Phase 4: ABSORB — Archive

Record the result regardless of success or failure. This is the fuel for evolution.

Archive as LearningEntry:

{
  "type": "LearningEntry",
  "category": "evolution",
  "task_ref": "xxx",
  "source": "self-evolving-loop",
  "lesson": "Planning outline first improved title quality 20%",
  "evidence": "A/B experiment: control=7/10, treatment=8.5/10, n=5",
  "adopted": true,
  "applied_count": 0,
  "created_at": "2026-05-11T06:05:00Z",
  "expiry": null
}
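
A record with this shape could be built with a small helper. This is a sketch only: `make_learning_entry` is a hypothetical function, the skill does not say where or how entries are stored, and the `now` parameter exists purely to make the timestamp testable.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def make_learning_entry(
    task_ref: str,
    category: str,
    lesson: str,
    evidence: str,
    adopted: bool,
    now: Optional[datetime] = None,
) -> dict:
    """Build a dict with the same keys as the LearningEntry example.

    `now` must be timezone-aware if given; it is converted to UTC.
    """
    created = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "type": "LearningEntry",
        "category": category,
        "task_ref": task_ref,
        "source": "self-evolving-loop",
        "lesson": lesson,
        "evidence": evidence,
        "adopted": adopted,
        "applied_count": 0,   # a new entry has not been applied yet
        "created_at": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "expiry": None,       # serialized as JSON null
    }


entry = make_learning_entry(
    task_ref="xxx",
    category="evolution",
    lesson="Planning outline first improved title quality 20%",
    evidence="A/B experiment: control=7/10, treatment=8.5/10, n=5",
    adopted=True,
)
print(json.dumps(entry, indent=2))
```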

Category Tags:

Category | Meaning | Action
coding-pattern | Code pattern improvement | Auto-apply on next coding task
info-source | Information source improvement | Update monitor source priority
workflow | Workflow optimization | Update engineering lifecycle gates
tool-usage | Tool usage skill | Efficiency sequence
domain-knowledge | Domain knowledge accumulation | Long-term memory

Auto-Propagation:

  • If coding-pattern → write to ~/.agent-templates/
  • If info-source → update monitor config
  • If workflow → check if engineering lifecycle needs update
  • If lesson verified ≥3 times → promote to verified-best-practice
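
The propagation rules above amount to a lookup by category plus a promotion threshold. In the sketch below the function only returns descriptions of the actions and performs none of them; `propagation_actions` and the `verified_count` argument are hypothetical, since the LearningEntry example has no field that counts verifications.

```python
# Category-specific actions, copied from the auto-propagation list.
PROPAGATION = {
    "coding-pattern": "write to ~/.agent-templates/",
    "info-source": "update monitor config",
    "workflow": "check if engineering lifecycle needs update",
}

PROMOTION_THRESHOLD = 3  # "verified ≥3 times"


def propagation_actions(category: str, verified_count: int) -> list[str]:
    """List the actions a lesson triggers; nothing is executed here."""
    actions = []
    if category in PROPAGATION:
        actions.append(PROPAGATION[category])
    if verified_count >= PROMOTION_THRESHOLD:
        actions.append("promote to verified-best-practice")
    return actions
```

Returning descriptions instead of acting leaves room for a review step before anything is written to disk or to a monitor configuration.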

Full Cycle Example

Task: Analyze an AI paper

① Reflection
✅ Good: Structured extraction of method/results/limitations
🟡 Improve: Abstract always too long, user loses patience
🔴 Error: Forgot to check arXiv for updated version
💡: 150-char abstracts are read more often than 300-char ones

② Hypothesis
IF I limit the abstract to 150 chars THEN user read rate improves 30%
BECAUSE last analysis (300 chars) was only read halfway

③ Experiment
A/B: Same paper, 300-char version vs 150-char version
Result: 150-char version fully read, 300-char interrupted
Conclusion: ✅ Hypothesis supported, adopt

④ Absorb
Record as workflow lesson, update monitor output template

Edge Cases

Situation | Action
Task execution failed | Force deep reflection mode, focus on root cause
3 consecutive experiment failures | Question hypothesis itself, check experiment design
Tiny task (rename variable) | Skip loop, but log if recurring error pattern
Multiple reflections same day | Merge into daily evolution summary
Hypothesis too abstract | Break into verifiable sub-hypotheses
User says "no reflection needed" | Skip but log to preference profile
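
Three of the edge cases above decide whether and how deeply to reflect. The sketch below picks a mode from them; the precedence (user opt-out first, then failure, then tiny task) is an assumption the skill does not state, and `reflection_mode` is a hypothetical name. The `professional` and `deep` values come from the contract's calibration section.

```python
def reflection_mode(task_failed: bool, tiny_task: bool, user_opted_out: bool) -> str:
    """Choose 'skip', 'deep', or 'professional' for the reflection phase.

    Assumed precedence: an explicit user opt-out wins, a failed task forces
    deep mode, a tiny task is skipped, everything else uses the default mode.
    """
    if user_opted_out:
        return "skip"          # still log the preference, per the table
    if task_failed:
        return "deep"
    if tiny_task:
        return "skip"          # still log recurring error patterns
    return "professional"
```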

Quality Gates

Check | Requirement | Fail action
Reflection output | At least 1 improvement hypothesis | Reflect again
Hypothesis verifiable | Has clear A/B or ablation plan | Require refinement
Experiment has data | Numbers not "feelings" | Retest or mark unverifiable
Absorb archived | Experiment result saved as LearningEntry | Force archive
Self-reference | Don't repeat same hypothesis weekly | Mark as duplicate
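
The gates above can be expressed as checks over a record of one loop iteration. This is a sketch under assumptions: the record is a plain dict, every key name (`hypotheses`, `experiment_plan`, `measurements`, `archived`, `recent_hypotheses`) is hypothetical, and `recent_hypotheses` is assumed to hold the hypotheses already tried in the past week.

```python
from numbers import Real


def failed_gates(cycle: dict) -> list[tuple[str, str]]:
    """Return (gate, fail action) pairs for every gate the cycle does not pass."""
    failures = []
    hypotheses = cycle.get("hypotheses", [])
    if not hypotheses:
        failures.append(("Reflection output", "Reflect again"))
    if cycle.get("experiment_plan") not in {"a/b", "ablation"}:
        failures.append(("Hypothesis verifiable", "Require refinement"))
    measurements = cycle.get("measurements", {})
    numeric = all(
        isinstance(v, Real) and not isinstance(v, bool)  # bool is an int subclass
        for v in measurements.values()
    )
    if not measurements or not numeric:
        failures.append(("Experiment has data", "Retest or mark unverifiable"))
    if not cycle.get("archived", False):
        failures.append(("Absorb archived", "Force archive"))
    seen = set(cycle.get("recent_hypotheses", []))
    if any(h in seen for h in hypotheses):
        failures.append(("Self-reference", "Mark as duplicate"))
    return failures
```

An empty result means the iteration passed every gate; otherwise each pair names the gate and the fail action listed for it.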

Integration Points

Five-Factor Review

  • Phase 1 reflection judgments run through five factors
  • Learning entries carry FiveFactorResult as provenance

Engineering Lifecycle

  • Phase 2 hypothesis = engineering improvement proposal
  • Phase 3 experiment = verification phase
  • Successful experiments auto-update lifecycle best practices

Intelligence Monitor

  • Source evolution: sources found unreliable during reflection are automatically down-ranked in the monitor

Version History

  • v1.0.0 — Initial release. 4-phase self-evolving loop: Reflect → Hypothesize → Experiment → Absorb.

SharpAgent · MIT-0 · 2026-05-11

Security guidance
Install this only if you intentionally want a persistent self-improvement workflow. Before using it, decide where learning records will be stored, how to remove them, whether sensitive tasks should be excluded, and whether the reflection loop should require explicit approval after each task.
Capability assessment
Purpose & Capability
The self-evolving reflection loop is coherent with the stated purpose, and the sensitive behavior is mostly disclosed as part of the skill's main function.
Instruction Scope
The instructions make reflection mandatory after every task and force deeper reflection on errors, which can extend the agent's work beyond the user's original task unless explicitly controlled.
Install Mechanism
No install spec, code files, required binaries, environment variables, or credentials are present; the risk is instruction-level rather than executable-code-level.
Credentials
The skill asks for access to task outputs and logs and writes learning records, but the visible artifacts do not define storage location, retention, exclusions for sensitive data, or user review.
Persistence & Privilege
The skill is designed to archive lessons and reuse them across tasks, creating persistent agent state without clear boundaries, opt-out, or deletion controls in the provided artifact.
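How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install sharpagent-self-evolving
  3. Once installed, invoke the Skill by name or trigger it with /sharpagent-self-evolving
  4. Provide the inputs described in the Skill's parameter notes to get structured output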
Version history
v1.0.0
  • Initial release of SharpAgent Self-Evolving Loop v1.0.0.
  • Introduces an automated "Think→Do→Learn" cycle: after each task, reflect, generate improvement hypotheses, run quick experiments, and archive results as structured learning.
  • Integrates self-improving agent reflection with experimental validation from the autoresearch methodology.
  • All experiments and lessons are stored as LearningEntry records for ongoing agent evolution.
  • Outlines a clear contract, phases, templates, and auto-propagation of validated improvements for smarter workflows over time.
Metadata
Slug: sharpagent-self-evolving
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
FAQ

What is Sharpagent Self Evolving?

SharpAgent Self-Evolving Loop: An automated 'Think→Do→Learn' cycle. Fuses the Self-Improving Agent's reflection mechanism with the autoresearch experimental... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 21 total downloads to date.

How do I install Sharpagent Self Evolving?

Run the command "/install sharpagent-self-evolving" in the OpenClaw or Claude Code chat box for a one-step install. No additional configuration is needed.

Is Sharpagent Self Evolving free?

Yes. Sharpagent Self Evolving is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does Sharpagent Self Evolving support?

Sharpagent Self Evolving is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Sharpagent Self Evolving?

It is developed and maintained by yezhaowang888-stack (@yezhaowang888-stack). The current version is v1.0.0.
