COE Root Cause

Name: COE Root Cause
Author: ghitafilali

by ghitafilali · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install coe-root-cause

Description

Run a Correction of Error root-cause analysis for recurring failures, false success, missed work, data loss, and brittle automation.

README (SKILL.md)

COE Root Cause

Use when the user asks for a COE, Correction of Error, postmortem, root-cause analysis, "why did this recur", "what was missed", or "do not let this happen again".

The job is to explain the mechanism that allowed the failure, fix the mechanism where possible, and prove the same class of failure is harder to repeat.

Library Fit

Use this skill for a formal post-failure Correction of Error: a recurring failure, false success, missed work, data loss, brittle automation, or user-visible miss that needs a written record with impact, timeline, root cause, corrective actions, and verification.

Adjacent skills keep their narrower jobs:

Debugging or investigation skills handle active bugs before the failure mechanism is understood.
Review skills handle pre-landing diff or PR risk.
Retrospective skills summarize engineering trends over a time window.
Skill-creation skills turn a proven workflow or corrective action into a durable skill, script, test, or guardrail.

Rules

Classify the failure before rerunning or changing anything.
Do not stop at symptoms like "timeout", "model failed", "tool failed", or "human error".
Preserve concrete evidence: logs, command output, diffs, tests, screenshots, report paths, source references, or exact user-visible behavior.
Redact secrets, tokens, personally identifying information, customer data, and private workspace details. Prefer source references or short excerpts over raw dumps, especially in public artifacts.
Ask before public, destructive, expensive, or externally visible actions.
Keep private workspace, customer, or user details out of public artifacts unless the user explicitly approves disclosure.
If the user asked only for a report or analysis, propose corrective actions instead of applying code or workflow changes.
Every corrective action needs verification evidence. If it cannot be verified, rewrite it.
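The redaction rule above can be sketched as a small helper. This is a minimal illustration, not part of the skill: the patterns, `redact`, and `excerpt` are hypothetical names, and real workspaces need patterns tuned to their own secret and identifier formats.

```python
import re

# Hypothetical patterns for illustration only; they will not catch every
# secret or identifier format.
REDACTION_PATTERNS = [
    # key=value or key: value pairs whose key name suggests a credential
    (re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\s*[=:]\s*\S+"),
     r"\1=[REDACTED]"),
    # email addresses as a simple stand-in for personally identifying data
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
     "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Replace likely secrets and email addresses with placeholders."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


def excerpt(text: str, max_lines: int = 5) -> str:
    """Return a short redacted excerpt instead of a raw dump."""
    lines = redact(text).splitlines()
    kept = lines[:max_lines]
    if len(lines) > max_lines:
        kept.append(f"[... {len(lines) - max_lines} more lines omitted]")
    return "\n".join(kept)
```

Pattern matching is a backstop, so a human review is still needed before anything is published.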

Failure Classification

Classify the failure before changing anything. Name the primary failure mode:

Required work failed visibly: command, job, test, or pipeline failed and the required work did not complete.
Required work silently skipped or falsely succeeded: the system reported done while required work was missing.
Required work completed incompletely or incorrectly: an artifact exists but is partial, stale, under-extracted, or wrong enough to matter.
User-visible response missed the expectation: the answer omitted a request, misrouted the work, or gave inaccurate status.
Optional diagnostic failed only: a non-required search, probe, or log lookup failed while required work is independently verified.

Then identify evidence-backed contributing conditions:

timeout, rate limit, or transient provider failure
missing file, schema drift, or dependency drift
model configuration, policy, or routing mismatch
source availability or extractor failure
brittle command, parser, query, or ad hoc script
unclear ownership, interface, or skill instruction
absent verification, closeout, or blocked-state gate
other or unknown, with the evidence still missing

If an optional diagnostic failure hides whether required work happened, reclassify it as false success, incomplete work, or visible failure. Do not let "optional" obscure the primary task.
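As a sketch, the five failure modes and the reclassification rule can be written down as an enum plus one function. The names and the three boolean inputs are assumptions made for illustration, and the order of the checks is one possible mapping, not something the skill prescribes.

```python
from enum import Enum


class FailureMode(Enum):
    VISIBLE_FAILURE = "required work failed visibly"
    FALSE_SUCCESS = "required work silently skipped or falsely succeeded"
    INCOMPLETE_OR_INCORRECT = "required work completed incompletely or incorrectly"
    MISSED_EXPECTATION = "user-visible response missed the expectation"
    OPTIONAL_DIAGNOSTIC_ONLY = "optional diagnostic failed only"


def reclassify(
    mode: FailureMode,
    *,
    required_work_verified: bool,
    reported_success: bool,
    artifact_exists: bool,
) -> FailureMode:
    """Refuse the 'optional' label unless required work is independently verified."""
    if mode is not FailureMode.OPTIONAL_DIAGNOSTIC_ONLY or required_work_verified:
        return mode
    # Assumed mapping: an unverified artifact counts as incomplete work,
    # a success report without an artifact counts as false success,
    # and anything else counts as a visible failure.
    if artifact_exists:
        return FailureMode.INCOMPLETE_OR_INCORRECT
    if reported_success:
        return FailureMode.FALSE_SUCCESS
    return FailureMode.VISIBLE_FAILURE
```

The point of the function is the first check: "optional diagnostic failed only" survives only when the required work has its own verification.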

Evidence Packet

Collect the smallest packet that explains the failure:

user request or expectation
promised behavior
actual behavior
first bad observable result
affected scope
relevant logs, reports, code paths, and tests
existing guardrail that should have caught it

State uncertainty plainly. Do not bury the answer in unrelated logs.
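One way to keep the packet small and make gaps explicit is a plain record with a `missing()` check. This is a hedged sketch: the class and field names are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class EvidencePacket:
    """Hypothetical container; one field per item in the evidence list."""
    user_request: str = ""
    promised_behavior: str = ""
    actual_behavior: str = ""
    first_bad_result: str = ""
    affected_scope: str = ""
    references: list[str] = field(default_factory=list)  # logs, reports, code paths, tests
    expected_guardrail: str = ""

    def missing(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

Listing `missing()` in the report is one way to state uncertainty plainly instead of padding the packet with unrelated logs.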

Analysis Loop

Build a short timeline with timestamps or ordered events.
Run at least 5 Whys.
Continue past 5 if the answer is still a symptom, vague human explanation, or unverifiable guess.
Separate proximate cause from root cause.
Name the missing guardrail, unclear interface, unsafe default, or unchecked assumption that let the issue recur or become user-visible.

Bad root causes:

"the agent forgot"
"the model made a mistake"
"we should be more careful"
"the command failed"
"the user did not specify enough"

Good root causes identify a durable fix: a test, validator, workflow gate, ownership boundary, safer default, clearer skill instruction, or explicit blocked-state receipt.
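The checks in this loop can be partly automated. The sketch below is an assumption-laden illustration: `review_whys` and the phrase list are hypothetical, and substring matching only catches the obvious bad root causes listed above.

```python
# Hypothetical phrase list drawn from the bad root causes and symptoms
# named in this skill; extend it for your own context.
BLAME_PHRASES = (
    "forgot",
    "made a mistake",
    "more careful",
    "command failed",
    "did not specify",
    "human error",
)


def review_whys(whys: list[str], root_cause: str) -> list[str]:
    """Return a list of problems; an empty list means the basic checks pass."""
    problems = []
    if len(whys) < 5:
        problems.append(f"only {len(whys)} whys recorded; run at least 5")
    lowered = root_cause.lower()
    for phrase in BLAME_PHRASES:
        if phrase in lowered:
            problems.append(f"root cause reads as symptom or blame: '{phrase}'")
    return problems
```

A clean result does not prove the root cause is good. It only rules out the most common weak statements, so the durable-fix test still applies.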

Corrective Actions

For each action, include:

owner or owning surface
exact change
status: done, planned, blocked, or rejected
verification evidence
expected future detection signal

Prefer class-level safeguards over one-off cleanup.
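A minimal sketch of a corrective-action record that enforces the fields above, assuming hypothetical names (`CorrectiveAction`, `problems`) chosen for illustration:

```python
from dataclasses import dataclass

VALID_STATUSES = {"done", "planned", "blocked", "rejected"}


@dataclass
class CorrectiveAction:
    owner: str             # owner or owning surface
    change: str            # exact change
    status: str            # one of VALID_STATUSES
    verification: str      # verification evidence
    detection_signal: str  # expected future detection signal

    def problems(self) -> list[str]:
        """Return reasons this action is not yet acceptable."""
        found = []
        if self.status not in VALID_STATUSES:
            found.append(f"unknown status: {self.status!r}")
        if not self.verification.strip():
            found.append("missing verification evidence; rewrite the action so it can be verified")
        if not self.detection_signal.strip():
            found.append("missing expected future detection signal")
        return found
```

An action with a non-empty `problems()` list should be rewritten before the COE is closed.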

Verification Gate

Before saying the COE is complete, run the smallest credible verification:

targeted regression test
static validation for generated docs or frontmatter
dry run against the failed case
closeout checklist mapping each user request to evidence
local AI/code review for nontrivial diffs

If a gate cannot run, say why and what evidence substitutes for it.
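The gate rule can be sketched as a small runner that either executes a check or records why it did not run and what evidence substitutes for it. `GateResult` and `run_gate` are hypothetical names, and the check is any zero-argument callable that returns a truthy value on success.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GateResult:
    name: str
    ran: bool
    passed: Optional[bool]  # None when the gate did not run
    note: str


def run_gate(
    name: str,
    check: Optional[Callable[[], bool]] = None,
    reason_skipped: str = "",
    substitute: str = "",
) -> GateResult:
    """Run a verification gate, or record why it could not run."""
    if check is None:
        # A skipped gate must say why and name the substitute evidence.
        if not reason_skipped or not substitute:
            raise ValueError(f"gate '{name}' skipped without a reason and substitute evidence")
        note = f"not run: {reason_skipped}; substitute evidence: {substitute}"
        return GateResult(name, False, None, note)
    passed = bool(check())
    return GateResult(name, True, passed, "passed" if passed else "failed")
```

For example, `run_gate("dry run", reason_skipped="no sandbox available", substitute="manual diff of output")` yields a result that can go straight into the Verification section of the report.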

Report Template

# COE: <failure name>

Date: <date>
Status: done | planned | blocked
Severity: low | medium | high

## Summary

One short paragraph: what failed, why it mattered, and what changed.

## Impact

- Who or what was affected
- What was wrong or missing
- What was not affected

## Timeline

- <time/order>: <event>

## Failure Classification

Failure mode: <primary failure mode from Failure Classification and why>
Contributing conditions: <supported conditions, or unknown with missing evidence>

## Evidence

- <source or command>: <what it proves>

## Root Cause

### 5+ Whys

1. Why? ...

### Root Cause Statement

<mechanism, not blame>

## Corrective Actions

| Action | Status                        | Verification |
| ------ | ----------------------------- | ------------ |
| ...    | done/planned/blocked/rejected | ...          |

## Verification

- <gate>: <result>

## Residual Risk

<what could still fail and how it will be noticed>

Closeout

Lead with the root cause and verified fix. Keep the user-facing summary short. If anything remains open, say exactly what evidence is still missing.

Usage Guidance

Safe to install for guided incident and root-cause analysis. Generated COE reports may reference sensitive failure evidence, so review and redact logs, screenshots, customer details, secrets, and private workspace information before sharing them publicly.

Capability Assessment

✓ Purpose & Capability

The instructions fit the stated COE root-cause purpose: classify failures, collect evidence, identify causes, propose corrective actions, and verify outcomes.

✓ Instruction Scope

The skill explicitly limits risky actions by requiring approval before public, destructive, expensive, or externally visible actions, and it instructs redaction of secrets, PII, customer data, and private workspace details.

✓ Install Mechanism

The reviewed artifact is a markdown SKILL.md file only; no scripts, dependencies, install hooks, network setup, or executable payloads are present.

ℹ Credentials

It may guide an agent to inspect logs, diffs, tests, screenshots, and reports, but that access is expected for post-failure analysis and is paired with privacy-preserving guidance.

✓ Persistence & Privilege

No persistence, background workers, privilege escalation, credential use, or automatic configuration changes are requested.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install coe-root-cause
After installation, invoke the skill by name or use /coe-root-cause
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release

Metadata

Slug coe-root-cause

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is COE Root Cause?

COE Root Cause is an AI agent skill for Claude Code / OpenClaw that runs a Correction of Error root-cause analysis for recurring failures, false success, missed work, data loss, and brittle automation. It has 38 downloads so far.

How do I install COE Root Cause?

Run "/install coe-root-cause" in the OpenClaw or Claude Code chat to install it in one step; no extra setup is required.

Is COE Root Cause free?

Yes, COE Root Cause is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does COE Root Cause support?

COE Root Cause is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created COE Root Cause?

It is built and maintained by ghitafilali (@ghitafilali); the current version is v1.0.0.
