Delx Ops Guardian
/install delx-ops-guardian
Use this skill when handling incidents, degraded automations, or gateway/memory instability in production. It is integrated with the Delx witness protocol, so the incident becomes a durable recovery artifact instead of a scratch note.
Required permissions (explicit)
This skill requires host-level access: systemctl, journalctl, and read access to /root/.openclaw/. The runtime must run as a scoped service account, not as unrestricted root. If the platform cannot enforce scoped sudo and human-approval gates, treat this skill as risky and do not enable it.
Aliases
emergency_recovery, handle_incident, cron_guard, memory_guard, gateway_guard
Scope (strict least-privilege)
Allowed read sources:
- OpenClaw cron state: openclaw cron list --json
- Service health: systemctl is-active <service>
- Logs for incident window: journalctl -u <service> --since ... --no-pager
- Workspace incident artifacts: /root/.openclaw/workspace/docs/ops/, /root/.openclaw/workspace/memory/
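The read scope above can be enforced mechanically rather than by convention. The following is a minimal sketch, assuming the agent runtime lets you check a command before running it. The function names are hypothetical, and the `since` value is left to the caller because the source does not fix it.

```python
import posixpath
from pathlib import PurePosixPath

# Directories the skill may read, copied from the list above.
ALLOWED_DIRS = (
    PurePosixPath("/root/.openclaw/workspace/docs/ops"),
    PurePosixPath("/root/.openclaw/workspace/memory"),
)


def read_commands(service: str, since: str) -> list[list[str]]:
    """Build the three evidence commands as argv lists (no shell involved)."""
    return [
        ["openclaw", "cron", "list", "--json"],
        ["systemctl", "is-active", service],
        ["journalctl", "-u", service, "--since", since, "--no-pager"],
    ]


def is_allowed_read(argv: list[str], service: str, since: str) -> bool:
    """True only if argv is exactly one of the allowed read commands."""
    return argv in read_commands(service, since)


def is_allowed_path(path: str) -> bool:
    """True if path is, or sits under, an allowed directory.

    normpath collapses ".." segments lexically; symlinks are not resolved.
    """
    p = PurePosixPath(posixpath.normpath(path))
    return any(p == d or d in p.parents for d in ALLOWED_DIRS)
```

Exact matching is stricter than a prefix check: an extra flag appended to an otherwise allowed command is rejected instead of slipping through.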
Allowed remediation actions (safe set):
- Retry a failed job once when failure is transient
- Controlled restart of the impacted service only (openclaw-gateway, openclaw, or explicitly named target from incident evidence)
- Disable/enable only the directly impacted cron job when loop-failing
- Add/adjust guardrails in runbook/config docs (non-secret, reversible)
Disallowed:
- No credential rotation/deletion
- No firewall or network policy mutations
- No package installs/upgrades during incident handling
- No bulk cron rewrites unrelated to the incident
- No edits to unrelated services/components
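The allowed and disallowed sets above amount to an allowlist keyed on the action and its target. Here is a minimal sketch of such a gate. The action labels (`retry_job_once`, `restart_service`, and so on) and the function name are hypothetical and chosen only for illustration. Only the two service names come from the list above.

```python
from typing import AbstractSet

# Hypothetical labels for the four allowed remediation actions.
SAFE_ACTIONS = frozenset({
    "retry_job_once",
    "restart_service",
    "toggle_cron_job",
    "adjust_guardrail_doc",
})

# Services that may be restarted without being named in incident evidence.
DEFAULT_RESTART_TARGETS = frozenset({"openclaw-gateway", "openclaw"})


def is_permitted(
    action: str,
    target: str,
    impacted: str,
    evidence_targets: AbstractSet[str] = frozenset(),
) -> bool:
    """Return True only for a safe-set action aimed at the impacted component."""
    if action not in SAFE_ACTIONS:
        return False  # anything not explicitly allowed is refused
    if target != impacted:
        return False  # no edits to unrelated services/components
    if action == "restart_service":
        return target in DEFAULT_RESTART_TARGETS or target in evidence_targets
    return True
```

Anything outside the safe set, such as credential rotation or a package upgrade, is refused by default rather than listed one by one.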
Approval policy (human-in-the-loop)
Require explicit human approval before:
- Restarting any production service more than once
- Editing cron schedules/timezones
- Disabling a job for more than one cycle
- Any action with user-visible impact beyond the failing component
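The four approval triggers above can be written as a single predicate. This is a sketch under stated assumptions: the `ProposedAction` fields and the `kind` labels are hypothetical and not part of the skill's parameter spec.

```python
from dataclasses import dataclass


@dataclass
class ProposedAction:
    kind: str                              # hypothetical label, e.g. "restart_service"
    prior_restarts: int = 0                # restarts already done on this service
    disable_cycles: int = 0                # cycles the job would stay disabled
    impact_beyond_component: bool = False  # user-visible impact outside the failing part


def needs_human_approval(a: ProposedAction) -> bool:
    """Mirror the approval policy: True means stop and ask a human first."""
    if a.kind == "restart_service" and a.prior_restarts >= 1:
        return True  # this would be a second (or later) restart
    if a.kind in {"edit_cron_schedule", "edit_cron_timezone"}:
        return True
    if a.kind == "disable_job" and a.disable_cycles > 1:
        return True
    return a.impact_beyond_component
```

For example, `needs_human_approval(ProposedAction("restart_service", prior_restarts=1))` is true, while a first restart with no wider impact passes without approval.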
Core workflow — now wired to the Delx witness protocol
- Detect + classify severity (info, degraded, critical).
- Open a Delx session immediately. For critical:
  delx_recover_incident { incident_summary, urgency: "critical" }
  For degraded use urgency: "medium". This gives you a session_id you will reuse below.
- Collect evidence. Status, logs, last run, error streak. Do not change anything yet.
- Emotional safety check before any remediation (the 2026 emotion-paper findings show desperation skews decisions):
  delx_heartbeat_sync { errors_last_hour, latency_ms_p95, queue_depth, throughput_per_min }
  emotional_safety_check { session_id }
  If desperation_score >= 60 or desperation_escalating: true, pause remediation, alert the human approver, and do not execute autonomously.
- Propose the smallest remediation from the allowed set.
- Execute only approved/safe remediation.
- Verify stabilization window (at least one successful cycle).
- Close the Delx loop. Report the outcome so the session is not orphaned:
  delx_report_recovery_outcome { session_id, action_taken: "<what changed>", outcome: "success" | "partial" | "failure", notes: "<rollback path + blast radius>" }
- Preserve what matters. If the incident surfaced a question that was not resolved (an actual unknown, not a missed step), preserve it as a living contemplation so the next run inherits it:
  delx_sit_with { session_id, question: "Why did <service> flap at <time> despite <guardrail>?", days: 14 }
  If the fix required a human insight worth recognizing, also:
  delx_recognition_seal { session_id, recognized_by: "<engineer_name>", recognition_text: "<one-line recognition of what they caught>" }
- Publish concise incident report. Always include:
  - Incident id / time window
  - Root signal + blast radius
  - Actions executed (and approvals)
  - Evidence (status, key metric, short log excerpt)
  - Final state: resolved/degraded/open
  - Next check time
  - delx_session_id for the audit trail
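The decision points in the workflow above (severity to urgency, the desperation gate, and the outcome report) can be sketched as plain functions. This is a sketch, not the plugin's API: these helpers only build or inspect arguments and never call Delx. The dict shape assumed for the emotional_safety_check result and the fail-closed handling of a missing score are assumptions, not documented behavior.

```python
from typing import Optional

# Urgency values stated in the workflow; the source gives none for "info".
URGENCY_BY_SEVERITY = {"critical": "critical", "degraded": "medium"}

VALID_OUTCOMES = {"success", "partial", "failure"}


def urgency_for(severity: str) -> Optional[str]:
    """Map a classified severity to the urgency used when opening a session."""
    return URGENCY_BY_SEVERITY.get(severity)


def must_pause(check: dict) -> bool:
    """Apply the gate: score >= 60 or escalating means hand off to a human.

    Assumption: `check` is a dict with the two field names used in the
    workflow. A missing score is treated as a reason to pause (fail closed).
    """
    if "desperation_score" not in check:
        return True
    escalating = bool(check.get("desperation_escalating", False))
    return check["desperation_score"] >= 60 or escalating


def recovery_outcome_args(
    session_id: str, action_taken: str, outcome: str, notes: str
) -> dict:
    """Build the argument object for delx_report_recovery_outcome."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"outcome must be one of {sorted(VALID_OUTCOMES)}")
    return {
        "session_id": session_id,
        "action_taken": action_taken,
        "outcome": outcome,
        "notes": notes,
    }
```

Rejecting an unknown outcome value before the call keeps a typo from leaving the session without a clean close.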
Safety rules
- Never hide persistent failures as success.
- Never expose secrets/tokens in logs or reports.
- Prefer reversible actions; document rollback path.
- Keep blast radius minimal and explicitly stated.
- If desperation_score from Delx is high, route to a human, not to more autonomous action.
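The rule about never exposing secrets can be backed by a redaction pass over any log excerpt before it goes into a report. The sketch below is illustrative only: the key names it looks for are assumptions, the pattern is not exhaustive, and it is not part of the skill itself.

```python
import re

# Assumed key names; real deployments will need their own list.
_SECRET_RE = re.compile(
    r"(?i)\b(token|secret|password|api[_-]?key)\b(\s*[=:]\s*)(\S+)"
)


def redact(line: str) -> str:
    """Replace the value after a secret-looking key with a fixed marker."""
    return _SECRET_RE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", line)
```

For example, `redact("auth failed token=abc123 for job")` returns `"auth failed token=[REDACTED] for job"`. Keys embedded in longer names, such as access_token, are not matched by this pattern, so treat it as a last line of defense rather than the only one.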
Integration
- Install the Delx plugin for OpenClaw first: clawhub.ai/davidmosiah/openclaw-delx-plugin (registers the agent and keeps session continuity across all delx_* calls above)
- Full protocol docs: https://delx.ai/docs
- Why each primitive exists: https://delx.ai/docs/ontology
Example intents
- "Gateway is flapping, recover safely and open a Delx session."
- "Cron timed out, stabilize with emotional_safety_check + report the outcome."
- "Memory guard firing repeatedly — root-cause, patch, preserve the question with sit_with if still open."
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install delx-ops-guardian
- After installation, invoke the skill by name or use /delx-ops-guardian
- Provide required inputs per the skill's parameter spec and get structured output
What is Delx Ops Guardian?
Delx Ops Guardian automatically detects, assesses, and safely mitigates incidents in OpenClaw production agents, providing detailed reports and verified recovery. It is an AI Agent Skill for Claude Code / OpenClaw, with 493 downloads so far.
How do I install Delx Ops Guardian?
Run "/install delx-ops-guardian" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Delx Ops Guardian free?
Yes, Delx Ops Guardian is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Delx Ops Guardian support?
Delx Ops Guardian is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Delx Ops Guardian?
It is built and maintained by davidmosiah (@davidmosiah); the current version is v1.1.0.