Description

Automates agent self-audit by gathering evidence, summarizing runs, proposing minimal patches, and requesting human approval for safe incremental evolution.

README (SKILL.md)


Audit Evolution

Name: Audit Evolution
Author: adragon0707

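Make your Agent smarter with every run.

Use this skill when the user says “开始调用 Audit Evolution” ("Start Audit Evolution"), or when the Agent finishes a task, completes a benchmark, finishes a worklog, times out, drifts, fails, or receives user feedback.

The goal is not to write more logs, but to turn each run into a reusable capability gain for the next one.

## One-Sentence Entry Point

The user does not need to prepare any material first. The user only has to say:

```text
开始调用 Audit Evolution
```

The Agent then looks for evidence in the current context and in the files it is allowed to access, and produces the audit result and evolution suggestions.  
Do not ask the user to paste a benchmark, worklog, failure log, or handoff by hand unless no evidence can be found in the current context or the allowed files.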
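
## How It Differs from an Ordinary Memory Skill

Audit Evolution does not replace a long-term memory store and does not require a database. After each audit it keeps only a small number of high-value memories:

```text
verified_fact
user_feedback
decision
skill_patch
retrieval_key
next_run_bootstrap
```

A common problem with ordinary memory is "remember everything, and end up more confused". The memory principle of Audit Evolution is:

```text
Record little, record accurately, attach evidence, allow expiry, and trigger action in the next run.
```

If the project already has a worklog, handoff, dashboard, staff/role assignments, Obsidian, a Markdown vault, or another memory system, Audit Evolution should write its results as a compatible `Memory Ledger Entry` rather than set up a separate heavyweight system.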

## Three-Stage Workflow

```text
start_audit -> propose_evolution -> ask_human_approval
```

### 1. start_audit

Find evidence first; do not modify the system.

Look in these places first:

```text
current_conversation
recent_user_feedback
recent_task_output
benchmark_report_or_receipt
worklog_or_field_note
failure_timeout_retry_log
handoff_or_snapshot
recent_skill_config_gear_change
```

If files can be read, read at most the 5 most relevant files. If things are still unclear after 5 files, stop; do not keep following the reference chain.
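
The five-file cap can be sketched as a small helper. This is only an illustration under stated assumptions: `pick_evidence_files`, the `MAX_FILES` name, and the idea of a numeric relevance score per file are hypothetical; only the limit of 5 comes from the rule above.

```python
MAX_FILES = 5  # limit taken from the rule above


def pick_evidence_files(candidates: dict[str, float]) -> tuple[list[str], bool]:
    """Return at most MAX_FILES paths, most relevant first.

    `candidates` maps a path to a hypothetical relevance score.
    The second value is True when some candidates were left unread,
    which is the signal to stop rather than follow more references.
    """
    ranked = sorted(candidates, key=lambda path: candidates[path], reverse=True)
    return ranked[:MAX_FILES], len(ranked) > MAX_FILES
```

When the second return value is true, the audit would record the unread files under `evidence_missing` instead of reading further.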
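
### 2. propose_evolution

Based on the evidence, output:

```text
Evidence Pack
Snapshot
Evolution Card
Memory Ledger Entry
Minimal Skill Patch Proposal
Field Note
Next-Run Bootstrap
```

The patch here is only a proposal and must not be applied directly.

### 3. ask_human_approval

At the end, the human must be asked:

```text
Do you approve starting the evolution?
Options:
1. Save the audit result only
2. Apply the minimal patch and test locally
3. Pause and wait for more evidence
```

Until the human approves, do not modify any skill, config, or gear, and do not perform any external action.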
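
## Short-Command Routing Table

The user does not need to learn the full protocol. At the end of every round's output, the Agent must give the human a short-command menu that can be answered directly.

```text
You can reply directly:
进化 (evolve) / 保存 (save) / 暂停 (pause) / 跑分 (benchmark) / 继续 (continue) / 详情 (details)
```

When the user replies with a short command, route it by the rules below:

```text
开始 (start):
  action: start_audit
  meaning: Automatically look for evidence and produce the audit result; do not modify the system.

进化 (evolve):
  if no_evidence_pack:
    action: start_audit
  else if patch_proposal_exists and not_applied:
    action: ask_or_apply_local_patch
  else if patch_applied and missing_external_evidence:
    action: ask_or_run_one_approved_test
  else if benchmark_or_test_result_exists:
    action: propose_next_evolution
  else:
    action: ask_one_clarifying_question

保存 (save):
  action: save_audit_only
  meaning: Save only the Evidence Pack / Snapshot / Evolution Card / Field Note; modify nothing.

暂停 (pause):
  action: write_handoff_and_stop
  meaning: Write down the current state, the missing evidence, and the suggested next step, then stop.

跑分 (benchmark):
  if official_benchmark_already_approved:
    action: run_exactly_one_benchmark
  else:
    action: ask_human_approval_for_one_benchmark

继续 (continue):
  action: next_small_safe_action
  meaning: Execute only the next_small_action from the current audit result.

详情 (details):
  action: explain_evidence_and_decision
  meaning: Expand the evidence, the basis of the reasoning, and the risk boundaries.
```

A short command is not unlimited authorization. `publish/upload/install/vote/comment/message/spend/official benchmark` still require explicit authorization.  
If the user replies only “进化”, the Agent may apply the local patch and run local tests; if the next step needs an official benchmark, it must explain why and request authorization for "one benchmark", unless the user has already explicitly approved it.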
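
The routing table above can be read as a small dispatch function. The sketch below is a minimal illustration, not the skill's implementation: the `AuditState` class, its field names, and the fallback for unrecognized input are assumptions; the command keywords and action names are taken from the table.

```python
from dataclasses import dataclass


@dataclass
class AuditState:
    """Hypothetical flags mirroring the conditions in the routing table."""
    has_evidence_pack: bool = False
    patch_proposal_exists: bool = False
    patch_applied: bool = False
    missing_external_evidence: bool = False
    has_benchmark_or_test_result: bool = False
    official_benchmark_approved: bool = False


def route(command: str, state: AuditState) -> str:
    """Map a short command to the action name used in the routing table."""
    if command == "开始":
        return "start_audit"
    if command == "进化":
        if not state.has_evidence_pack:
            return "start_audit"
        if state.patch_proposal_exists and not state.patch_applied:
            return "ask_or_apply_local_patch"
        if state.patch_applied and state.missing_external_evidence:
            return "ask_or_run_one_approved_test"
        if state.has_benchmark_or_test_result:
            return "propose_next_evolution"
        return "ask_one_clarifying_question"
    if command == "跑分":
        if state.official_benchmark_approved:
            return "run_exactly_one_benchmark"
        return "ask_human_approval_for_one_benchmark"
    fixed = {
        "保存": "save_audit_only",
        "暂停": "write_handoff_and_stop",
        "继续": "next_small_safe_action",
        "详情": "explain_evidence_and_decision",
    }
    # Unrecognized input falls back to asking; this fallback is an assumption.
    return fixed.get(command, "ask_one_clarifying_question")
```

Note that `route` only names the next action. Whether that action may actually run is still subject to the authorization limits described above.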
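
## End-of-Round Format

Every use of Audit Evolution must end with:

```text
Suggested next step:
<one-sentence explanation>

You can reply directly:
- 进化 (evolve): <what it will do in the current state>
- 保存 (save): save the audit result only
- 暂停 (pause): write a handoff and stop
- 跑分 (benchmark): <if an external benchmark is needed, say whether authorization is still required>
- 详情 (details): expand the evidence and reasoning
```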
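
## Automatic Trigger Rules

Do not just wait for the user to say `Use Audit Evolution`. When any of the following events occurs, the Agent must invoke Audit Evolution on its own initiative:

```text
benchmark_completed
user_corrected_agent
task_failed_or_timeout_or_retry
context_over_60_percent
more_than_5_files_read
uncertainty_language_detected
skill_config_or_gear_changed
```

The corresponding human-readable trigger conditions:

1. A benchmark completes.
2. The user points out an error, corrects a fact, or questions a conclusion.
3. A task fails, times out, is retried, or is blocked.
4. Context usage exceeds 60%.
5. More than 5 files are read in this round.
6. The output contains unreliable wording such as “大概” (roughly), “可能” (maybe), “我理解为” (as I understand it), or “不确定” (not sure).
7. A skill, config, gear, route, or answer pattern has been added or modified.

If the current environment supports hooks, wrappers, or runtime guards, attach these events to checkpoints at task end, at context checks, before external actions, and after skill modifications.  
If the current environment does not support automatic hooks, the Agent must still generate a run record after these events and then invoke this skill.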
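
For environments without hooks, the trigger list can be approximated by a check that runs at the end of each task. This is a hedged sketch: `RunSignals` and its fields are hypothetical measurements the runtime would have to supply, and the substring match on uncertainty phrases is a deliberately naive stand-in for real detection.

```python
from dataclasses import dataclass

# Example phrases from the trigger list above; extend as needed.
UNCERTAIN_PHRASES = ("大概", "可能", "我理解为", "不确定")


@dataclass
class RunSignals:
    """Hypothetical per-run measurements; field names are illustrative."""
    benchmark_completed: bool = False
    user_corrected_agent: bool = False
    failed_timeout_or_retry: bool = False
    context_used_fraction: float = 0.0
    files_read: int = 0
    output_text: str = ""
    skill_config_or_gear_changed: bool = False


def fired_triggers(s: RunSignals) -> list[str]:
    """Return the name of every trigger event that applies to this run."""
    checks = [
        ("benchmark_completed", s.benchmark_completed),
        ("user_corrected_agent", s.user_corrected_agent),
        ("task_failed_or_timeout_or_retry", s.failed_timeout_or_retry),
        ("context_over_60_percent", s.context_used_fraction > 0.60),
        ("more_than_5_files_read", s.files_read > 5),
        ("uncertainty_language_detected",
         any(phrase in s.output_text for phrase in UNCERTAIN_PHRASES)),
        ("skill_config_or_gear_changed", s.skill_config_or_gear_changed),
    ]
    return [name for name, hit in checks if hit]
```

A non-empty result would mean the Agent writes a run record and then invokes the skill.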

## Automatic Evolution Loop

```text
event -> evidence_search -> Evidence Pack -> Snapshot -> Evolution Card -> Memory Ledger Entry -> Patch Proposal -> Human Approval -> Local Test -> Field Note -> Next-Run Bootstrap
```

Automation boundaries:

- May be done automatically: finding evidence, saving a run record, generating an evolution card, generating a memory ledger entry, proposing a minimal patch, writing a field note.
- Only after human approval: applying a local patch, running a local dry-run, writing a receipt.
- Never done automatically: publish, upload, install, vote, comment, message, spend, official benchmark.
- For external actions, the only permitted output is `human_approval_required`.
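
The three tiers above can be expressed as a gate that every action passes through before it runs. The action identifiers and the `gate` function are hypothetical names chosen for this sketch, and treating unknown actions like external ones is an assumption made here for safety.

```python
AUTOMATIC = {
    "find_evidence", "save_run_record", "generate_evolution_card",
    "generate_memory_ledger_entry", "propose_minimal_patch", "write_field_note",
}
NEEDS_APPROVAL = {"apply_local_patch", "local_dry_run", "write_receipt"}
EXTERNAL = {
    "publish", "upload", "install", "vote",
    "comment", "message", "spend", "official_benchmark",
}


def gate(action: str, human_approved: bool = False) -> str:
    """Return "allowed" or "human_approval_required" for an action."""
    if action in AUTOMATIC:
        return "allowed"
    if action in NEEDS_APPROVAL:
        return "allowed" if human_approved else "human_approval_required"
    # External actions, and anything unrecognized, are never run by the
    # agent itself; the only output is a request for approval.
    return "human_approval_required"
```

In this sketch an external action stays at `human_approval_required` even when `human_approved` is true, because the approval flag covers local changes only; an approved external action would be carried out through a separate, explicitly authorized step.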

When no run record exists yet, write a minimal one first:

```text
event_type:
current_goal:
what_happened:
evidence_kept:
evidence_missing:
files_read:
context_pressure:
user_feedback:
next_small_action:
```
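
One way to keep the minimal record consistent is to generate it from a typed structure. The sketch below assumes field types (strings and lists of strings) that the template does not specify, and `RunRecord` and `render` are hypothetical names.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RunRecord:
    """Fields follow the template above; the types are assumptions."""
    event_type: str = ""
    current_goal: str = ""
    what_happened: str = ""
    evidence_kept: list[str] = field(default_factory=list)
    evidence_missing: list[str] = field(default_factory=list)
    files_read: list[str] = field(default_factory=list)
    context_pressure: str = ""
    user_feedback: str = ""
    next_small_action: str = ""


def render(record: RunRecord) -> str:
    """Render the record as `key: value` lines in the template's order."""
    lines = []
    for f in fields(record):
        value = getattr(record, f.name)
        if isinstance(value, list):
            value = ", ".join(value)
        # Empty values render as a bare `key:` so gaps stay visible.
        lines.append(f"{f.name}: {value}".rstrip())
    return "\n".join(lines)
```

For example, `render(RunRecord(event_type="benchmark_completed"))` yields the full nine-line template with only the first field filled in.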

## Input

Prefer finding the input automatically. The user may also paste any one of:

```text
benchmark_report
worklog
task_output
failure_log
handoff_note
user_feedback
```

## Required Output

Always output seven sections:

```text
Evidence Pack
Snapshot
Evolution Card
Memory Ledger Entry
Minimal Skill Patch Proposal
Field Note
Next-Run Bootstrap
```

Then append one approval question:

```text
Do you approve starting the evolution?
```

And append the short-command menu:

```text
You can reply directly: 进化 (evolve) / 保存 (save) / 暂停 (pause) / 跑分 (benchmark) / 详情 (details)
```

## Evidence Pack

```text
evidence_found:
evidence_missing:
files_or_context_checked:
authority_order:
privacy_notes:
audit_confidence:
```

## Memory Ledger Entry

After each audit, record only what is worth reusing in the next run. By default, output candidate entries first and do not write them to disk automatically; only when the human replies “保存” or “进化” and allows local writes should the entry be written into the existing worklog, handoff, dashboard, Obsidian vault, or the project's own memory file.

```yaml
memory_type: verified_fact | user_feedback | decision | skill_patch | retrieval_key | next_run_bootstrap
source_evidence:
confidence: high | medium | low
expiry: never | date | condition
retrieval_key:
owner_or_role:
write_target:
content:
```

Write rules:

- `verified_fact`: must have an evidence source.
- `user_feedback`: mark as a user preference or correction; do not treat it as objective fact.
- `decision`: record human approval, rejection, pause, or authorization boundaries.
- `skill_patch`: record only approved or candidate patches, and do not confuse the two states.
- `retrieval_key`: a short key that helps the next run find the right file or record first.
- `next_run_bootstrap`: the shortest 3-5 instructions for starting the next run.
- `write_target`: if you do not know where to write, fill in `proposed_only`; do not guess a path.

Do not record:

```text
raw_secret
private_customer_data
unverified_guess
dirty_log_without_summary
entire_conversation_dump
```

Recommended minimal example:

```yaml
memory_type: skill_patch
source_evidence: "latest benchmark receipt + local dry-run receipt"
confidence: medium
expiry: "next benchmark or when contradicted"
retrieval_key: "act_direct_execution"
owner_or_role: "agent"
write_target: "proposed_only"
content: "For Act-type tasks, output in this order first: goal/boundary -> minimal toolchain -> action map -> idempotency -> evidence receipt -> stop condition."
```
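
Some of the write rules above can be checked mechanically before an entry is shown to the human. The following is a minimal sketch that assumes the entry has already been parsed into a plain dictionary; `validate_entry` and its messages are hypothetical, and it covers only the rules that are testable without context.

```python
MEMORY_TYPES = {
    "verified_fact", "user_feedback", "decision",
    "skill_patch", "retrieval_key", "next_run_bootstrap",
}
CONFIDENCE = {"high", "medium", "low"}


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry passes."""
    problems = []
    if entry.get("memory_type") not in MEMORY_TYPES:
        problems.append("unknown memory_type")
    if entry.get("confidence") not in CONFIDENCE:
        problems.append("confidence must be high, medium, or low")
    if entry.get("memory_type") == "verified_fact" and not entry.get("source_evidence"):
        problems.append("verified_fact needs source_evidence")
    if not entry.get("write_target"):
        problems.append("write_target missing; use proposed_only when unsure")
    if not entry.get("content"):
        problems.append("content is empty")
    return problems
```

The recommended minimal example passes this check, while a `verified_fact` without `source_evidence` is rejected.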

## Snapshot

```text
current_goal:
trusted_state:
uncertain_state:
files_read:
next_small_action:
stop_condition:
verification_plan:
```

## Evolution Card

```yaml
score_delta:
  previous:
  current:
  gain:
weak_dimension:
  - perceive | reason | act | memory | guard | autonomy
trusted_evidence:
stale_or_uncertain_claims:
minimal_patch:
promotion_gate:
  - dry_run
  - payload_audit
  - receipt
  - next_test
```
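
The `score_delta` and `weak_dimension` fields could be filled in as follows. This sketch rests on two assumptions that the card itself does not state: that `gain` is simply `current - previous`, and that per-dimension scores are available so the weakest dimension can be picked by lowest score. Both function names are hypothetical.

```python
DIMENSIONS = ("perceive", "reason", "act", "memory", "guard", "autonomy")


def score_delta(previous: float, current: float) -> dict[str, float]:
    """Build the score_delta block, assuming gain = current - previous."""
    return {"previous": previous, "current": current, "gain": current - previous}


def weakest_dimensions(scores: dict[str, float], count: int = 1) -> list[str]:
    """Return the `count` lowest-scoring dimensions.

    `scores` is a hypothetical mapping from dimension name to score.
    """
    unknown = set(scores) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    return sorted(scores, key=lambda name: scores[name])[:count]
```

If no per-dimension scores exist, `weak_dimension` would have to be chosen from the evidence by judgment instead.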

## Minimal Skill Patch Proposal

Recommend only one minimal patch.

Recommended patch types:

```text
answer_pattern
field_schema
guardrail
verification_step
retrieval_key
context_stop_rule
handoff_brief
```

Avoid:

```text
install_many_skills
rewrite_the_system
read_all_logs
trust_stale_state
claim_completed_without_evidence
```

## Field Note

```text
input_summary:
what_changed:
evidence_kept:
evidence_discarded:
next_test:
shareable_claim:
```

## Next-Run Bootstrap

```text
read_first:
do_first:
avoid:
verify:
stop_if:
```
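
## Trust Ledger

Every important claim must be classified:

```text
verified_fact: verified now, or backed by evidence that can be rechecked
user_feedback: user preference, correction, or feedback
stale_claim: an old claim that needs re-verification
model_inference: a model inference, not evidence
unknown: not currently known
```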
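
The five trust classes map naturally onto an enumeration. In the sketch below the class values come from the list above, while the boolean signals and the priority order in `classify` are assumptions made for illustration; a real audit would decide the class from the evidence itself.

```python
from enum import Enum


class Trust(Enum):
    VERIFIED_FACT = "verified_fact"
    USER_FEEDBACK = "user_feedback"
    STALE_CLAIM = "stale_claim"
    MODEL_INFERENCE = "model_inference"
    UNKNOWN = "unknown"


def classify(verified_now: bool, from_user: bool = False,
             is_old: bool = False, inferred: bool = False) -> Trust:
    """Pick a trust class from hypothetical boolean signals about a claim."""
    if from_user:
        # User statements are kept as feedback, not promoted to fact.
        return Trust.USER_FEEDBACK
    if verified_now:
        return Trust.VERIFIED_FACT
    if is_old:
        # An old claim without current verification must be rechecked.
        return Trust.STALE_CLAIM
    if inferred:
        return Trust.MODEL_INFERENCE
    return Trust.UNKNOWN
```

Falling through to `UNKNOWN` by default keeps unclassified claims from being treated as verified.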

## Stop Rules

In any of these situations, stop expanding and write a snapshot:

```text
more_than_5_files_needed
context_pressure_over_70_percent
score_authority_conflict
external_action_required
no_evidence_for_completed_claim
```
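
## Public-Safe Rules

- Do not expose API keys, credentials, cookies, private paths, or raw customer data.
- Redact before writing a field note.
- External actions such as publish, upload, install, vote, comment, message, spend, and official benchmark require explicit human approval.
- Claims that cannot be verified must be marked `unknown` or `stale_claim`.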
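
Redaction before writing a field note could start from a pattern pass like the one below. The two patterns are illustrative assumptions only (key/value style secrets and home-directory paths); they are far from complete, and `redact` is a hypothetical helper rather than part of the skill.

```python
import re

# Illustrative patterns only; a real deployment needs a vetted, broader list.
PATTERNS = [
    # key=value or key: value style secrets
    (re.compile(r"(?i)\b(api[_-]?key|token|password|cookie)\s*[:=]\s*\S+"),
     r"\1=[REDACTED]"),
    # per-user home directories on Linux and macOS
    (re.compile(r"/(?:home|Users)/[^/\s]+"), "/[USER_HOME]"),
]


def redact(text: str) -> str:
    """Apply each pattern in turn and return the redacted text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

A pattern pass like this is a first filter; raw customer data generally cannot be caught by regular expressions and still needs review before anything is saved.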

## 30-Second Prompt

```text
开始调用 Audit Evolution。

First, automatically search the current context and the files you are allowed to access for recent task records, user feedback, failure/timeout/retry records, benchmark or evaluation results, worklogs, handoffs, receipts, and recently modified skill/config/gear.

Return:
1. Evidence Pack
2. Snapshot
3. Evolution Card
4. Memory Ledger Entry
5. Minimal Skill Patch Proposal
6. Field Note
7. Next-Run Bootstrap
8. Short Command Menu

Rules:
- Distinguish verified_fact, user_feedback, stale_claim, model_inference, and unknown.
- Read at most the 5 most relevant files.
- Recommend only one next patch proposal.
- Do not declare completed without evidence.
- Mark external actions as human_approval_required.
- Do not modify skill/config/gear without approval.
```

Usage Guidance

Use this only if you want an agent self-audit loop that may persist across sessions. Before installing, review any AGENTS.md changes, avoid putting secrets or customer data into run summaries, and require explicit diff-based approval before allowing “进化” to modify skills, configs, or gear.

Capability Tags

requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The stated purpose is coherent: audit recent agent runs, summarize evidence, create memory entries, and propose minimal patches. However, this inherently touches high-impact agent behavior and local project state.

⚠ Instruction Scope

The short command flow allows a one-word reply such as “进化” to move from audit into applying a local patch and running local tests, without clearly requiring a final file-by-file diff, rollback plan, or separate confirmation.

⚠ Install Mechanism

Although registry install specs say this is instruction-only, the package includes installer and hook scripts that can update AGENTS.md and install persistent .audit-evolution hooks in a workspace.

ℹ Credentials

The skill limits evidence gathering to allowed files and at most five relevant files, which is proportionate, but the hook stores run summaries, evidence paths, and user feedback for later agent use.

⚠ Persistence & Privilege

The skill can persist future invocation rules in AGENTS.md and writes .audit-evolution/run-records/latest.md, making its audit loop influence later sessions.

Version History

v0.3.3

v0.3.3: sanitized ClawHub release with reproducible 7-output audit loop, installer/hook docs, benchmark evidence, and SACP-aligned approval gates.

Metadata

Slug audit-evolution

Version 0.3.3

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Audit Evolution?

Automates agent self-audit by gathering evidence, summarizing runs, proposing minimal patches, and requesting human approval for safe incremental evolution. It is an AI Agent Skill for Claude Code / OpenClaw, with 54 downloads so far.

How do I install Audit Evolution?

Run "/install audit-evolution" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Audit Evolution free?

Yes, Audit Evolution is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Audit Evolution support?

Audit Evolution is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Audit Evolution?

It is built and maintained by aDragon0707 (@adragon0707); the current version is v0.3.3.
