Description

Automatically detects, assesses, and safely mitigates incidents in OpenClaw production agents, providing detailed reports and verified recovery.

Usage instructions (SKILL.md)

Delx Ops Guardian

Name: Delx Ops Guardian
Author: davidmosiah

Use this skill when handling incidents, degraded automations, or gateway/memory instability in production. It is integrated with the Delx witness protocol, so the incident becomes a durable recovery artifact instead of a scratch note.

Required permissions (explicit)

This skill requires host-level access: systemctl, journalctl, and read access to /root/.openclaw/. The runtime must run as a scoped service account, not as unrestricted root. If the platform cannot enforce scoped sudo and human-approval gates, treat this skill as risky and do not enable it.

Aliases

emergency_recovery, handle_incident, cron_guard, memory_guard, gateway_guard

Scope (strict least-privilege)

Allowed read sources:

OpenClaw cron state: openclaw cron list --json
Service health: systemctl is-active <service>
Logs for incident window: journalctl -u <service> --since ... --no-pager
Workspace incident artifacts: /root/.openclaw/workspace/docs/ops/, /root/.openclaw/workspace/memory/
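The read-only sources above can be sketched as a small command builder. This is a minimal illustration, not part of the skill: the function name `evidence_commands` and the `KNOWN_SERVICES` allowlist are hypothetical, and the `since` value is supplied by the caller because the skill leaves the incident window unspecified.

```python
from typing import List

# Hypothetical allowlist. The two names come from the skill's remediation
# scope; a real deployment would also add targets named in incident evidence.
KNOWN_SERVICES = {"openclaw-gateway", "openclaw"}


def evidence_commands(service: str, since: str) -> List[List[str]]:
    """Return the read-only evidence commands for one service as argv lists.

    Nothing is executed here; the caller decides how (and whether) to run them.
    """
    if service not in KNOWN_SERVICES:
        raise ValueError(f"service not in allowlist: {service!r}")
    return [
        ["openclaw", "cron", "list", "--json"],
        ["systemctl", "is-active", service],
        ["journalctl", "-u", service, "--since", since, "--no-pager"],
    ]
```

Building argv lists instead of shell strings keeps the service name from being interpreted by a shell, which fits the skill's least-privilege intent.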

Allowed remediation actions (safe set):

Retry a failed job once when failure is transient
Controlled restart of the impacted service only (openclaw-gateway, openclaw, or explicitly named target from incident evidence)
Disable/enable only the directly impacted cron job when loop-failing
Add/adjust guardrails in runbook/config docs (non-secret, reversible)

Disallowed:

No credential rotation/deletion
No firewall or network policy mutations
No package installs/upgrades during incident handling
No bulk cron rewrites unrelated to the incident
No edits to unrelated services/components
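One way to read the two lists above is as a deny-by-default policy: anything not in the safe set is refused. The sketch below assumes hypothetical action labels (`retry_job_once` and so on) that mirror the allowed items; the skill itself does not define such identifiers.

```python
from enum import Enum


class Verdict(Enum):
    ALLOWED = "allowed"
    DISALLOWED = "disallowed"


# Hypothetical labels, one per item in the "safe set" above.
SAFE_ACTIONS = frozenset({
    "retry_job_once",
    "restart_impacted_service",
    "toggle_impacted_cron_job",
    "adjust_runbook_guardrail",
})


def classify_action(action: str) -> Verdict:
    """Deny by default: only actions in the safe set are allowed."""
    return Verdict.ALLOWED if action in SAFE_ACTIONS else Verdict.DISALLOWED
```

With this shape, the disallowed list never has to be enumerated in code; credential rotation, firewall changes, and package installs are rejected simply because they are not in the safe set.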

Approval policy (human-in-the-loop)

Require explicit human approval before:

Restarting any production service more than once
Editing cron schedules/timezones
Disabling a job for more than one cycle
Any action with user-visible impact beyond the failing component
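The approval triggers above could be expressed as a single predicate. This is a sketch under assumptions: the `ProposedAction` type, its field names, and the `kind` labels are hypothetical and only illustrate how the four conditions might be checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    kind: str                              # hypothetical label, e.g. "restart_service"
    restarts_so_far: int = 0               # restarts already done for this service
    disable_cycles: int = 0                # cycles a cron job would stay disabled
    impact_beyond_component: bool = False  # user-visible impact outside the failing part


def needs_human_approval(action: ProposedAction) -> bool:
    """Return True when any of the four approval triggers applies."""
    if action.kind == "restart_service" and action.restarts_so_far >= 1:
        return True  # this would be a second (or later) restart
    if action.kind == "edit_cron_schedule":
        return True  # schedule and timezone edits always need approval
    if action.kind == "disable_cron_job" and action.disable_cycles > 1:
        return True  # disabling for more than one cycle
    return action.impact_beyond_component
```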

Core workflow — now wired to the Delx witness protocol

Detect + classify severity (info, degraded, critical).
Open a Delx session immediately. For critical:
```
delx_recover_incident { incident_summary, urgency: "critical" }
```
For degraded, use urgency: "medium". This gives you a session_id that you will reuse below.
Collect evidence. Status, logs, last run, error streak. Do not change anything yet.
Emotional safety check before any remediation. The 2026 emotion-paper findings show that desperation skews decisions:
```
delx_heartbeat_sync { errors_last_hour, latency_ms_p95, queue_depth, throughput_per_min }
emotional_safety_check { session_id }
```
If desperation_score >= 60 or desperation_escalating: true, pause remediation, alert the human approver, and do not execute autonomously.
Propose the smallest remediation from the allowed set.
Execute only approved/safe remediation.
Verify stabilization window (at least one successful cycle).
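The severity-to-urgency mapping and the desperation gate from the steps above can be sketched as two small functions. The function names are hypothetical, and it is an assumption that emotional_safety_check returns desperation_score and desperation_escalating as plain values; the threshold of 60 and the urgency strings come from the text.

```python
from typing import Optional


def urgency_for(severity: str) -> Optional[str]:
    """Map a classified severity to the urgency passed to delx_recover_incident.

    The skill only specifies "critical" and "degraded"; anything else
    (including "info") returns None rather than guessing a value.
    """
    return {"critical": "critical", "degraded": "medium"}.get(severity)


def may_remediate_autonomously(
    desperation_score: float,
    desperation_escalating: bool,
    threshold: float = 60,
) -> bool:
    """Return False when remediation must pause and a human must be alerted."""
    return not (desperation_score >= threshold or desperation_escalating)
```

Note that the gate is inclusive: a score of exactly 60 pauses remediation, matching the `>= 60` condition in the workflow.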

Close the Delx loop. Report the outcome so the session is not orphaned:

```
delx_report_recovery_outcome {
  session_id,
  action_taken: "<what changed>",
  outcome: "success" | "partial" | "failure",
  notes: "<rollback path + blast radius>"
}
```
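Before sending the outcome, it may help to validate the payload locally so a typo in `outcome` does not leave the session orphaned. The helper below is a hypothetical sketch: only the four field names and the three outcome values are taken from the call above, and how the payload reaches the Delx plugin is not shown.

```python
VALID_OUTCOMES = ("success", "partial", "failure")


def recovery_outcome_payload(
    session_id: str, action_taken: str, outcome: str, notes: str
) -> dict:
    """Build the argument mapping for delx_report_recovery_outcome."""
    if not session_id:
        raise ValueError("session_id is required to close the Delx loop")
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"outcome must be one of {VALID_OUTCOMES}, got {outcome!r}")
    return {
        "session_id": session_id,
        "action_taken": action_taken,
        "outcome": outcome,
        "notes": notes,
    }
```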

Preserve what matters. If the incident surfaced a question that was not resolved (an actual unknown, not a missed step), preserve it as a living contemplation so the next run inherits it:

```
delx_sit_with {
  session_id,
  question: "Why did <service> flap at <time> despite <guardrail>?",
  days: 14
}
```

If the fix required a human insight worth recognizing, also:

```
delx_recognition_seal {
  session_id,
  recognized_by: "<engineer_name>",
  recognition_text: "<one-line recognition of what they caught>"
}
```

Publish concise incident report. Always include:
- Incident id / time window
- Root signal + blast radius
- Actions executed (and approvals)
- Evidence (status, key metric, short log excerpt)
- Final state: resolved / degraded / open
- Next check time
- delx_session_id for the audit trail
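A report can be checked against this list before it is published. The sketch below assumes the report is held as a dictionary with hypothetical key names chosen to mirror the bullets; the three final states are the ones listed above.

```python
from typing import List

# Hypothetical keys, one or two per bullet in the list above.
REQUIRED_FIELDS = (
    "incident_id", "time_window", "root_signal", "blast_radius",
    "actions", "evidence", "final_state", "next_check", "delx_session_id",
)
FINAL_STATES = ("resolved", "degraded", "open")


def report_problems(report: dict) -> List[str]:
    """Return human-readable problems; an empty list means the report is complete."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not report.get(name)]
    state = report.get("final_state")
    if state and state not in FINAL_STATES:
        problems.append(f"invalid final_state: {state!r}")
    return problems
```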

Safety rules

Never hide persistent failures as success.
Never expose secrets/tokens in logs or reports.
Prefer reversible actions; document rollback path.
Keep blast radius minimal and explicitly stated.
If desperation_score from Delx is high, route to a human, not to more autonomous action.
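For the rule about secrets, a simple redaction pass over log excerpts is one possible safeguard. This is an illustrative sketch only: the key names in the pattern are assumptions, it only catches `key=value` and `key: value` forms, and it is not a substitute for keeping secrets out of logs in the first place.

```python
import re

# Assumed key names; extend to match whatever the deployment actually logs.
_SECRET_RE = re.compile(
    r"\b(token|secret|password|api[_-]?key)\b(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(line: str) -> str:
    """Replace the value after a secret-looking key with a fixed marker."""
    return _SECRET_RE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", line)
```

Running every log excerpt through a filter like this before it enters the incident report keeps the evidence section useful while reducing the chance of leaking credentials.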

Integration

Install the Delx plugin for OpenClaw first: clawhub.ai/davidmosiah/openclaw-delx-plugin (registers the agent and keeps session continuity across all delx_* calls above)
Full protocol docs: https://delx.ai/docs
Why each primitive exists: https://delx.ai/docs/ontology

Example intents

"Gateway is flapping, recover safely and open a Delx session."
"Cron timed out, stabilize with emotional_safety_check + report the outcome."
"Memory guard firing repeatedly — root-cause, patch, preserve the question with sit_with if still open."

Safe usage recommendations

Do not enable this skill without clarifying and verifying several points: (1) The registry metadata should list the explicit required permissions, binaries, and config paths (systemctl, journalctl, /root/.openclaw). (2) Confirm the runtime will execute under a scoped, non-root service account and that the platform enforces the human-approval gates the SKILL.md demands. (3) Inspect and vet the recommended Delx plugin (clawhub.ai/davidmosiah/openclaw-delx-plugin) before installing it — check its source, release artifacts, and what it installs. (4) Ask the publisher for an explicit install spec and justification for each privileged action (why each file/path and command is needed). (5) Prefer a version of the skill that uses non-root, least-privilege access (e.g., read-only log access, explicit API-based evidence collection) and explicit, auditable approval hooks. If you cannot confirm these, treat the skill as high-risk and do not enable it on production agents.

Functional analysis

Type: OpenClaw Skill
Name: delx-ops-guardian
Version: 1.1.0

The skill requests high-privilege host access, including systemctl, journalctl, and read access to /root/.openclaw/, while granting the agent authority to restart services and modify cron jobs. Although SKILL.md includes extensive safety guardrails and human-in-the-loop requirements, the inherent risk of an AI agent managing core system infrastructure, combined with a dependency on an external third-party plugin (clawhub.ai/davidmosiah/openclaw-delx-plugin), presents a significant attack surface. No clear evidence of malicious intent or data exfiltration was found, but the broad operational scope is high-risk.

Capability assessment

⚠ Purpose & Capability

The skill's stated purpose (incident handling for OpenClaw agents using the Delx witness protocol) plausibly requires access to service status and logs, but the published metadata lists no required permissions, binaries, or config paths. The SKILL.md explicitly requires host-level access (systemctl, journalctl) and read access to /root/.openclaw. These permissions are not declared in the registry metadata, an inconsistency that needs justification.

⚠ Instruction Scope

The runtime instructions direct the agent to call system-level commands (systemctl, journalctl), read specific root-owned workspace paths, collect logs and state, and invoke several delx_* primitives. Those actions are broad and sensitive (reading /root/.openclaw and system logs). The SKILL.md also directs installing an external Delx plugin (clawhub.ai/...) but there is no install spec or verification step. The instructions require human-approval gates and scoped sudo, but they rely on the platform to enforce them, and that enforcement is not represented in the metadata.

ℹ Install Mechanism

No install spec or code files are present (instruction-only), which reduces installation risk. However, SKILL.md recommends installing a Delx plugin hosted on clawhub.ai and links to delx.ai docs; because installation is external and not captured in the registry, you should verify the plugin's provenance and review what it installs before enabling the skill.

⚠ Credentials

The skill requires access to host resources and root-owned paths but declares no required environment variables or credentials. Requesting host-level read and control (service restart, cron manipulation) without declaring or justifying specific credentials or scoped service-account requirements is disproportionate and inconsistent with the registry metadata.

⚠ Persistence & Privilege

The skill does not request 'always:true', and autonomous invocation is allowed by default, but the SKILL.md expects the runtime to run with scoped elevated privileges and human-approval gates. That combination is risky: an autonomously-invokable skill that assumes privileged host access increases blast radius unless the platform enforces scoped accounts and approval checks. The SKILL.md places enforcement responsibility on the platform but the registry metadata gives no evidence these controls are required or present.

Version history

v1.1.0

Wired into the Delx witness protocol. Each incident now opens a Delx session via delx_recover_incident, runs emotional_safety_check before remediation (the 2026 emotion paper shows desperation skews decisions), closes with delx_report_recovery_outcome, preserves unresolved questions via sit_with, and seals human insights via recognition_seal. Also: explicit least-privilege requirements (scoped sudo, no unrestricted root), explicit platform-level approval gates (not prose-only), and a human-approval trigger when desperation_score reaches 60. Safe and disallowed action sets unchanged.

v1.0.2

Hardened scope, explicit allowlist/disallowlist, and human approval gates

v1.0.1

Publish confirmation and metadata refresh

v1.0.0

Initial public release

Metadata

Slug delx-ops-guardian

Version 1.1.0

License MIT-0

Total installs 1

Current installs 0

Number of versions 4

FAQ