pskoett

learning-aggregator

by pskoett · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 69 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install learning-aggregator
Description
[Beta] Cross-session analysis of accumulated .learnings/ files. Reads all entries, groups by pattern_key, computes recurrence across sessions, and outputs ra...
README (SKILL.md)

Learning Aggregator

Reads accumulated .learnings/ files across all sessions, finds patterns, and produces a ranked list of promotion candidates. This is the outer loop's inspect step.

Without this skill, .learnings/ is a write-only log. Patterns accumulate but nobody synthesizes them. The same gap resurfaces two weeks later because no one looked.

When to Use

  • Weekly cadence — scheduled or manual, review accumulated learnings
  • Before major tasks — check if the task area has known patterns
  • After a burst of sessions — consolidate findings from a sprint or incident
  • When self-improvement flags promotion_ready — verify the flag with full context

What It Produces

A gap report — a ranked list of patterns that have crossed (or are approaching) the promotion threshold, with evidence and recommended actions.

Step 1: Read All Learning Files

Read these files in .learnings/:

| File | Contains |
| --- | --- |
| LEARNINGS.md | Corrections, knowledge gaps, best practices, recurring patterns |
| ERRORS.md | Command failures, API errors, exceptions |
| FEATURE_REQUESTS.md | Missing capabilities |

Parse each entry's metadata:

  • Pattern-Key — the stable deduplication key
  • Recurrence-Count — how many times this pattern has been seen
  • First-Seen / Last-Seen — date range
  • Priority — low / medium / high / critical
  • Status — pending / promotion_ready / promoted / dismissed
  • Area — frontend / backend / infra / tests / docs / config
  • Related Files — which parts of the codebase are affected
  • Source — conversation / error / user_feedback / simplify-and-harden
  • Tags — free-form labels
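The field list above can be turned into a small parser. This is a minimal sketch, not the skill's own code: the on-disk entry layout is not specified here, so it assumes each field sits on its own `Key: value` line (optionally as a list item with markdown bold), and `parse_entry` and `KNOWN_FIELDS` are hypothetical names.

```python
from datetime import date

# Field names come from the list above; the "Key: value" line shape is an assumption.
KNOWN_FIELDS = {
    "pattern_key", "recurrence_count", "first_seen", "last_seen", "priority",
    "status", "area", "related_files", "source", "tags",
}

def parse_entry(text: str) -> dict:
    """Extract the known metadata fields from one entry's text."""
    fields = {}
    for raw in text.splitlines():
        # Drop list markers and markdown bold: "- **Area**: backend" -> "Area: backend".
        line = raw.strip().lstrip("-").strip().replace("**", "")
        key, sep, value = line.partition(":")
        name = key.strip().lower().replace("-", "_").replace(" ", "_")
        if sep and value.strip() and name in KNOWN_FIELDS:
            fields[name] = value.strip()
    # Typed conversions; a missing Recurrence-Count is treated as one sighting.
    fields["recurrence_count"] = int(fields.get("recurrence_count", 1))
    for name in ("first_seen", "last_seen"):
        if name in fields:
            fields[name] = date.fromisoformat(fields[name])
    for name in ("related_files", "tags"):
        fields[name] = [v.strip() for v in fields.get(name, "").split(",") if v.strip()]
    return fields
```

Unknown keys are ignored rather than guessed at, which keeps the parser conservative when an entry contains free-form text with colons.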

Step 2: Group and Aggregate

Group entries by Pattern-Key. For each group:

  1. Sum recurrences across all entries with the same key
  2. Count distinct tasks — how many different sessions/tasks encountered this
  3. Compute time window — days between First-Seen and Last-Seen
  4. Collect all related files — union of all entries' file references
  5. Take highest priority across entries in the group
  6. Collect evidence — the Summary and Details from each entry
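The six aggregation steps can be sketched as a single pass over parsed entries. The dictionary shape and the `task_id` field are assumptions (the metadata list names no task identifier), and the group window is read as the span from the earliest First-Seen to the latest Last-Seen.

```python
from collections import defaultdict
from datetime import date

PRIORITY_ORDER = ["low", "medium", "high", "critical"]

def aggregate(entries):
    """Group entries by pattern_key and compute the per-group figures."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["pattern_key"]].append(entry)

    report = {}
    for key, items in groups.items():
        first = min(e["first_seen"] for e in items)
        last = max(e["last_seen"] for e in items)
        report[key] = {
            "recurrence": sum(e["recurrence_count"] for e in items),        # step 1
            "distinct_tasks": len({e["task_id"] for e in items}),           # step 2 (task_id is hypothetical)
            "window_days": (last - first).days,                             # step 3
            "related_files": sorted({f for e in items for f in e["related_files"]}),  # step 4
            "priority": max((e["priority"] for e in items), key=PRIORITY_ORDER.index),  # step 5
            "evidence": [e["summary"] for e in items],                      # step 6
        }
    return report
```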

For entries without a Pattern-Key, use conservative grouping only:

  • Exact match: Same Area AND at least 2 identical Tags
  • File overlap: Same Related Files path (exact path match, not substring)
  • Do NOT fuzzy-match on Summary text — false groupings are worse than ungrouped entries

Flag ungrouped entries separately with a recommendation to assign a Pattern-Key. Ungrouped entries are common and expected — they may be one-off issues or genuinely novel problems.
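The conservative rules for entries without a Pattern-Key reduce to a pairwise check. In this sketch, "at least 2 identical Tags" is read as "the two entries share at least two tags" and "same Related Files path" as "at least one identical path"; both readings are assumptions, and `can_group` is a hypothetical helper.

```python
def can_group(a: dict, b: dict) -> bool:
    """Return True only when two keyless entries satisfy a conservative rule."""
    shared_tags = set(a["tags"]) & set(b["tags"])
    # Rule 1: same Area and at least two tags in common.
    if a["area"] == b["area"] and len(shared_tags) >= 2:
        return True
    # Rule 2: an identical path in both entries (set intersection is exact, never substring).
    return bool(set(a["related_files"]) & set(b["related_files"]))
```

Summary text is deliberately not consulted, so two entries that merely sound alike stay ungrouped.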

Step 3: Rank and Classify

Promotion Threshold

A pattern group is promotion-ready when all of the following hold:

  • Recurrence-Count >= 3 across the group
  • Seen in >= 2 distinct tasks
  • Within a 30-day window

Approaching Threshold

An entry is approaching when:

  • Recurrence-Count >= 2 or
  • Priority: high/critical with any recurrence
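Both thresholds can be expressed as one classification function. This sketch reads the three promotion conditions as jointly required and the two approaching conditions as alternatives; the input shape and the `below_threshold` label are assumptions. The `min_recurrence` parameter mirrors the --min-recurrence filter.

```python
def classify(group: dict, min_recurrence: int = 3) -> str:
    """Classify an aggregated pattern group against the thresholds."""
    promotion_ready = (
        group["recurrence"] >= min_recurrence
        and group["distinct_tasks"] >= 2
        and group["window_days"] <= 30
    )
    if promotion_ready:
        return "promotion_ready"
    # Any logged entry has at least one recurrence, so high/critical priority alone qualifies.
    if group["recurrence"] >= 2 or group["priority"] in ("high", "critical"):
        return "approaching"
    return "below_threshold"  # hypothetical label for everything else
```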

Classification

For each promotion candidate, classify the gap type:

| Gap Type | Signal | Fix Target |
| --- | --- | --- |
| Knowledge gap | Agent didn't know X | Update project instruction files (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md) |
| Tool gap | Agent improvised around missing capability | Add or update MCP tool / script |
| Skill gap | Same behavior pattern keeps failing | Create or update a skill (use /skill-creator, validate with quick_validate.py, register skill-check eval) |
| Ambiguity | Conflicting interpretations of spec/prompt | Tighten instructions or add examples |
| Reasoning failure | Agent had the knowledge but reasoned wrong | Add explicit decision rules or constraints |

Step 4: Produce Gap Report

Output a structured report:

## Learning Aggregator: Gap Report

**Scan date:** YYYY-MM-DD
**Period:** [since date] to [now]
**Entries scanned:** N
**Patterns found:** N
**Promotion-ready:** N
**Approaching threshold:** N

### Promotion-Ready Patterns

#### 1. [Pattern-Key] — [Summary]

- **Recurrence:** N times across M tasks
- **Window:** First-Seen → Last-Seen
- **Priority:** high
- **Gap type:** knowledge gap
- **Area:** backend
- **Related files:** path/to/file.ext
- **Evidence:**
  - [LRN-YYYYMMDD-001] Summary of first occurrence
  - [LRN-YYYYMMDD-002] Summary of second occurrence
  - [ERR-YYYYMMDD-001] Summary of related error
- **Recommended action:** Add rule to project instruction files (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md): "[concise prevention rule]"
- **Eval candidate:** Yes — [description of what to test]

#### 2. ...

### Approaching Threshold

#### 1. [Pattern-Key] — [Summary]
- **Recurrence:** 2 times across 1 task
- **Needs:** 1 more recurrence or 1 more distinct task
- ...

### Ungrouped Entries (no Pattern-Key)

- [LRN-YYYYMMDD-005] "Summary" — needs pattern_key assignment
- ...

### Dismissed / Stale

- Entries with Last-Seen > 90 days ago and Status: pending → recommend dismissal

Step 5: Handoff

The gap report feeds into:

  1. harness-updater agent — takes promotion-ready patterns and applies them to project instruction files (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md)
  2. eval-creator skill — takes eval candidates and creates permanent test cases
  3. Human review — for patterns classified as "reasoning failure" or "ambiguity" (these need human judgment)

Filtering

  • --since YYYY-MM-DD — only scan entries after this date
  • --min-recurrence N — raise the promotion threshold
  • --area AREA — filter to a specific area (frontend, backend, etc.)
  • --deep — also analyze session traces via Entire (see Session Trace Analysis below)
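If the filters were handled by a script rather than interpreted by the agent, the four flags could be parsed as below. This is a sketch under that assumption: the skill is instruction-only, `build_parser` is a hypothetical name, and the default of 3 for --min-recurrence is taken from the promotion threshold. Only the YYYY-MM-DD form of --since is handled; the relative form (7d) that appears in the manual --deep invocation is not.

```python
import argparse
from datetime import date

def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four documented filter flags."""
    parser = argparse.ArgumentParser(prog="learning-aggregator")
    parser.add_argument("--since", type=date.fromisoformat, default=None,
                        help="only scan entries after this date (YYYY-MM-DD)")
    parser.add_argument("--min-recurrence", type=int, default=3,
                        help="raise the promotion threshold")
    parser.add_argument("--area", default=None,
                        help="filter to a specific area, e.g. frontend or backend")
    parser.add_argument("--deep", action="store_true",
                        help="also analyze session traces")
    return parser
```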

Session Trace Analysis

The outer loop reads from two complementary sources:

| Source | What it is | Cadence | Cost |
| --- | --- | --- | --- |
| .learnings/ | Explicit entries written by self-improvement during sessions. Agent's own reflections: corrections, knowledge gaps, recurring patterns it noticed. | Every session (hot path) | Near-zero |
| Session traces | Full session transcripts captured by Entire: prompts, tool calls, outputs, files modified, token usage, checkpoints. | Weekly or on-demand (cold path) | Expensive; only run at cadence |

The default mode reads .learnings/ and produces a gap report from what the agent explicitly logged. The --deep mode also analyzes session traces and merges findings from both sources.

Why both sources matter

.learnings/ captures what the agent noticed and chose to log — a curated subset. Session traces capture everything that happened, including patterns the agent worked around, retried, or never recognized as failures.

Examples of patterns visible in traces but absent from .learnings/:

  • Retry loops: The same tool call repeated 3+ times with small variations. The agent eventually got it right but never logged the initial failures.
  • Silent user corrections: The user said "no, that's wrong" mid-flow. The agent corrected course but didn't log the misunderstanding.
  • Worked-around test failures: A test failed, the agent changed approach, the new approach passed, the original failure was forgotten.
  • Context handoff causes: Which drift signals actually triggered handoffs, not just that handoffs happened.
  • Token/time anomalies: Sessions with disproportionate cost vs output — a signal of inefficiency the agent is unaware of.

These patterns are high-value for the outer loop because the agent can't self-report them. Session traces are the only source.

When to trigger --deep mode

Trace analysis is not per-session. It's cadenced:

  • Weekly scheduled (recommended minimum): after a sprint or burst of sessions
  • Post-incident: when something went wrong and you want to understand why
  • Pre-promotion: before committing a pattern to project instruction files, verify it actually recurs in real sessions
  • Manual invocation: /learning-aggregator --deep --since 7d

Running trace analysis per-session would burn tokens without producing new signal — cross-session patterns only emerge over multiple sessions.

Reading traces with Entire

When --deep is requested, the skill uses the entire CLI to query shadow branch data:

# Check availability
entire --version

# List recent checkpoints as JSON (id, date, session_id, message, tool_use_id)
entire rewind --list

# Read a checkpoint's full transcript
entire explain --checkpoint <id> --full --no-pager

# Or raw JSONL
entire explain --checkpoint <id> --raw-transcript --no-pager

# Filter to one session
entire explain --session <session-id-prefix>

# Generate AI summary (expensive, use sparingly)
entire explain --checkpoint <id> --generate

If entire is not installed or the current repo doesn't have Entire enabled, --deep falls back to .learnings/-only mode and reports the limitation in the gap report.
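The fallback can be sketched as an availability probe followed by a mode decision. Only `entire --version` from the commands above is invoked; checking whether the current repo has Entire enabled is not covered, and the function names and the wording of the limitation note are hypothetical.

```python
import shutil
import subprocess

def trace_source_available(binary: str = "entire") -> bool:
    """Return True if the CLI is on PATH and answers `--version` successfully."""
    if shutil.which(binary) is None:
        return False
    try:
        result = subprocess.run([binary, "--version"],
                                capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

def choose_mode(deep_requested: bool, binary: str = "entire"):
    """Return (mode, limitation_note); the note is meant for the gap report."""
    if deep_requested and trace_source_available(binary):
        return "deep", None
    if deep_requested:
        return ("learnings-only",
                "Trace source unavailable; --deep fell back to .learnings/-only mode.")
    return "learnings-only", None
```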

What to extract from a trace

For each checkpoint within the time window, parse the raw transcript and look for:

  1. Tool call repetition: same tool + similar args > 3 times → likely a retry loop. Pattern-key: retry-loop.<tool>
  2. User correction markers: user messages containing "no", "wrong", "actually", "instead" immediately after an agent action → Pattern-key: correction.<area>
  3. Error patterns in tool output: matches against the same regex set as error-detector.sh (error, failed, Traceback, etc.) → Pattern-key: error.<category>
  4. Handoff triggers: context-surfing exit events and which drift signals fired → Pattern-key: drift.<signal>
  5. Approach changes: agent switching strategy mid-task without explicit pivot → Pattern-key: approach-switch.<domain>
  6. Token anomalies: sessions with token count > 2x the median for similar task types → Pattern-key: cost.<task-type>

Each finding is normalized to the same taxonomy as self-improvement (harden.input_validation, simplify.dead_code, etc.) where possible.
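Two of the six signals, retry loops and token anomalies, are simple enough to sketch. The raw transcript schema is not documented here, so the example assumes tool calls have already been reduced to dictionaries with `tool` and `args` strings. "Similar args" is approximated with a `difflib` similarity ratio of 0.8, which is an arbitrary choice, and only consecutive calls are compared.

```python
from difflib import SequenceMatcher
from statistics import median

def find_retry_loops(calls, min_repeats=4, similarity=0.8):
    """Return retry-loop pattern keys for runs of consecutive similar tool calls.

    min_repeats=4 encodes "more than 3 times".
    """
    keys, run = [], 1
    for prev, cur in zip(calls, calls[1:]):
        similar = (
            prev["tool"] == cur["tool"]
            and SequenceMatcher(None, prev["args"], cur["args"]).ratio() >= similarity
        )
        run = run + 1 if similar else 1
        if run == min_repeats:  # report each loop once, when it crosses the limit
            keys.append(f"retry-loop.{cur['tool']}")
    return keys

def token_anomalies(token_counts):
    """Return session ids whose token count exceeds 2x the median.

    token_counts maps session id -> tokens for sessions of a similar task type.
    """
    if not token_counts:
        return []
    cutoff = 2 * median(token_counts.values())
    return sorted(sid for sid, tokens in token_counts.items() if tokens > cutoff)
```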

How the two sources merge in the gap report

When --deep runs, each pattern in the gap report gets a sources field:

promotion_ready:
  - pattern_key: "harden.input_validation"
    recurrence_count: 5
    sources:
      - .learnings/LEARNINGS.md (3 entries)
      - entire:traces (5 occurrences across 4 sessions)
    confidence: high  # appears in both sources
    evidence:
      - "LRN-20260401-001: Missing bounds check on pagination"
      - "entire:1ca16f9b: Retry loop on /api/search — pageSize rejected 4 times"
      - "entire:8bf2e4cd: User correction 'validate before DB query'"
    entire_checkpoints:
      - 1ca16f9bb3801ee2a02f2384f31355a54b81ea00
      - 8bf2e4cd63d01040b38df07c43f73e0f15d05ac9

A pattern in both sources is higher confidence than one from either alone. A pattern only in .learnings/ might be over-logged by a diligent agent. A pattern only in traces might be noise. The overlap is where the signal is strongest.
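The merge rule can be sketched as a union over pattern keys with a confidence label per key. Only the "high" label for overlap comes from the example above; the `single-source` label, the function name, and the count-per-key input shape are assumptions.

```python
def merge_sources(learnings_counts: dict, trace_counts: dict) -> dict:
    """Merge per-pattern counts from both sources and label confidence."""
    merged = {}
    for key in sorted(set(learnings_counts) | set(trace_counts)):
        sources = []
        if key in learnings_counts:
            sources.append(f".learnings/ ({learnings_counts[key]} entries)")
        if key in trace_counts:
            sources.append(f"entire:traces ({trace_counts[key]} occurrences)")
        merged[key] = {
            "sources": sources,
            # Overlap is the strongest signal; anything else gets a hypothetical label.
            "confidence": "high" if len(sources) == 2 else "single-source",
        }
    return merged
```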

Trace source compatibility

The default implementation targets Entire (v0.5.4+) via the entire rewind --list and entire explain commands. The concept is source-agnostic — any session capture tool that exposes:

  • A list of recent checkpoints (with id, timestamp, session id)
  • The ability to read a checkpoint's transcript
  • Timestamps for cadence filtering

...can serve as a trace source. Adapters for other capture tools can be added in scripts/ or via gh-aw mcp-scripts.
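One way to picture such an adapter is a small interface covering the three capabilities. Everything named here (`Checkpoint`, `TraceSource`, `transcripts_since`, `InMemorySource`) is hypothetical and not part of Entire or gh-aw; it only illustrates the shape an adapter might take.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Protocol

@dataclass(frozen=True)
class Checkpoint:
    id: str
    timestamp: datetime
    session_id: str

class TraceSource(Protocol):
    """Hypothetical adapter interface: list checkpoints, read a transcript."""
    def list_checkpoints(self) -> Iterable[Checkpoint]: ...
    def read_transcript(self, checkpoint_id: str) -> str: ...

def transcripts_since(source: TraceSource, since: datetime) -> dict:
    """Read transcripts only for checkpoints at or after `since` (cadence filtering)."""
    return {
        cp.id: source.read_transcript(cp.id)
        for cp in source.list_checkpoints()
        if cp.timestamp >= since
    }

class InMemorySource:
    """Toy source for illustration; a real adapter would wrap a capture tool."""
    def __init__(self, data):
        self._data = data  # maps Checkpoint -> transcript text

    def list_checkpoints(self):
        return list(self._data)

    def read_transcript(self, checkpoint_id):
        return next(text for cp, text in self._data.items() if cp.id == checkpoint_id)
```

Because `TraceSource` is a structural protocol, `InMemorySource` satisfies it without inheriting from it, so an adapter for another tool needs only the two methods.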

Persistence

Reads .learnings/ from the working directory. This is the only persistence mode — the skill does not integrate with external memory backends in interactive sessions. For CI-side durable storage across workflow runs, see learning-aggregator-ci, which can optionally back its state with gh-aw's repo-memory (git-branch persistence). The resulting branch is a normal git branch and can be fetched locally if desired, but the interactive skill itself only reads local files.

Tracker-id in gap reports

Each promotion candidate in the gap report includes a tracker field set to the pattern-key. This tracker propagates through the full chain: harness-updater embeds it as a comment in project instruction files, eval-creator references it in eval cases. To audit the full lifecycle of a pattern, search for tracker:[pattern-key] across the repo and GitHub.
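The local half of that audit can be sketched as a file scan. The exact comment format that harness-updater writes is not shown here, so the example assumes the literal text `tracker:` followed by the pattern-key appears somewhere on a line; the suffix filter and `find_tracker` are hypothetical, and searching GitHub is out of scope.

```python
from pathlib import Path

def find_tracker(root, pattern_key, suffixes=(".md", ".yml", ".yaml")):
    """Return (relative path, line number) pairs where the tracker marker appears."""
    needle = f"tracker:{pattern_key}"
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not (path.is_file() and path.suffix in suffixes):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```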

What This Skill Does NOT Do

  • Does not modify .learnings/ files (read-only analysis)
  • Does not apply promotions (that's harness-updater)
  • Does not create evals (that's eval-creator)
  • Does not fix code or run tests
  • Does not replace human judgment for ambiguous patterns
  • Does not run --deep trace analysis per-session — only on cadence or explicit invocation
  • Does not require Entire — falls back to .learnings/-only mode when trace source is unavailable
Usage Guidance
This skill appears to do what it says, but check a few things before enabling it:
  1. Confirm the agent runtime will be granted read access only to the intended .learnings/ location and any project files you permit.
  2. Review .learnings/ for sensitive content before aggregation (these logs can contain secrets or private data).
  3. If the suggested handoff agents (harness-updater, eval-creator) exist in your environment, ensure they require human review or have restricted write permissions so automated edits to CLAUDE.md / AGENTS.md cannot happen without oversight.
  4. If you want stronger auditability, ask the skill author to declare required config paths in metadata so you can explicitly approve filesystem access.
Capability Analysis
Type: OpenClaw Skill
Name: learning-aggregator
Version: 1.0.0
The learning-aggregator skill (SKILL.md) aggregates performance data from .learnings/ files and session traces to identify recurring patterns and knowledge gaps. It utilizes high-risk capabilities, specifically the execution of shell commands via the 'entire' CLI to retrieve and analyze session transcripts. While these functions are aligned with the stated goal of identifying improvements, the broad access to session history and the use of external CLI tools warrant a suspicious classification under the provided criteria. No evidence of intentional malicious behavior, such as data exfiltration or unauthorized persistence, was identified.
Capability Assessment
Purpose & Capability
Name/description (aggregate .learnings/ into promotion candidates) aligns with the runtime instructions. The skill only describes reading learning logs, grouping, ranking, and producing reports; it does not request unrelated binaries, cloud credentials, or external services.
Instruction Scope
SKILL.md explicitly instructs the agent to read all files under .learnings/ and to reference project instruction files (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md). That behaviour is expected for this skill, but the skill declares no required config paths — confirm your agent runtime will grant exactly the intended file-read access and no broader filesystem privileges. The instructions do not direct data to external endpoints.
Install Mechanism
No install spec and no code files — instruction-only. This minimizes risk because nothing is downloaded or written by an installer.
Credentials
The skill declares no environment variables, credentials, or config paths. That matches the described behavior (local aggregation of learning files).
Persistence & Privilege
always:false (default) and autonomous invocation is allowed (normal). The SKILL.md describes handoff to other agents (harness-updater, eval-creator) that may apply changes to project instruction files — verify those downstream agents and any automatic workflows are trusted and have appropriate write permissions and human review gates, otherwise the chain could cause unintended edits.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install learning-aggregator
  3. After installation, invoke the skill by name or use /learning-aggregator
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
learning-aggregator 1.0.0
  • Initial beta release.
  • Aggregates `.learnings/` files across sessions to identify recurring patterns and gaps.
  • Groups entries by pattern key, computes recurrence, and ranks promotion candidates.
  • Outputs structured gap reports with actionable recommendations.
  • Supports manual and scheduled runs, with filtering options by date, area, and recurrence.
  • Includes optional deep analysis by integrating raw session trace data.
Metadata
Slug learning-aggregator
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is learning-aggregator?

[Beta] Cross-session analysis of accumulated .learnings/ files. Reads all entries, groups by pattern_key, computes recurrence across sessions, and outputs ra... It is an AI Agent Skill for Claude Code / OpenClaw, with 69 downloads so far.

How do I install learning-aggregator?

Run "/install learning-aggregator" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is learning-aggregator free?

Yes, learning-aggregator is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does learning-aggregator support?

learning-aggregator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created learning-aggregator?

It is built and maintained by pskoett (@pskoett); the current version is v1.0.0.
