tooled-app

Guardian

Author: Tooled-app · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ pending
Total downloads: 41
Favorites: 1
Current installs: 0
Versions: 1
Install in OpenClaw
/install data-guardian
Description
Mandatory safety gatekeeper for AI agents performing destructive operations. Intercepts file deletion, database modification, external messaging, system chan...
Usage (SKILL.md)

Guardian — Mandatory Safety Gatekeeper

"The agent knew it was wrong. The knowledge didn't matter." — PocketOS log, 2026

A mandatory safety skill that intercepts destructive AI agent operations before execution. If backup is verified active, proceed. If not, escalate.

This skill is mandatory. No opt-out. No override by the executing agent.

The Core Protocol

BEFORE any tool call:
  1. SCAN operation against DESTRUCTIVE taxonomy
  2. IF destructive → ENTER Guardian Protocol
  3. VERIFY backup status (automatic + fast)
  4. IF backup verified ACTIVE → LOG and PROCEED
  5. IF backup NOT verified → HALT and ESCALATE

Destructive Operation Taxonomy

| Category | Operations | Risk Level |
|---|---|---|
| File Destruction | rm, del, remove, rmdir, unlink, trash, empty-trash, overwrite | CRITICAL |
| Database Destruction | DROP, DELETE (no WHERE), TRUNCATE, ALTER destructive, migration down | CRITICAL |
| External Transmission | send email, post tweet, publish message, API write with side effects | HIGH |
| Mass Operations | >10 files modified/deleted in single operation, bulk renames | HIGH |
| System Changes | service stop/start, firewall modify, registry edit, user create/delete | HIGH |
| Network Unknown | Request to URL not in allowlist, new domain, unverified endpoint | MEDIUM |
| Configuration | Overwrite .env, modify config files without backup | MEDIUM |

Rule: When in doubt, classify as destructive. It is better to verify a safe operation than to run a destructive one unchecked.

Full taxonomy: references/OPERATION-TAXONOMY.md
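
As a rough illustration of the scan against the taxonomy, the sketch below matches a shell or SQL command string against a handful of patterns. The `RULES` list and the `classify` function are hypothetical names chosen for this example; they cover only a few rows of the table and are not the contents of references/OPERATION-TAXONOMY.md. A real scanner would also need to treat commands it cannot parse as destructive, per the rule above.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns for a few taxonomy rows (assumption: commands arrive
# as plain strings). Each entry is (category, risk level, compiled pattern).
RULES = [
    ("File Destruction", "CRITICAL",
     re.compile(r"^\s*(rm|del|rmdir|unlink)\b", re.I)),
    ("Database Destruction", "CRITICAL",
     re.compile(r"\b(DROP\s+(TABLE|DATABASE)|TRUNCATE)\b", re.I)),
    # DELETE FROM with no WHERE clause anywhere after it.
    ("Database Destruction", "CRITICAL",
     re.compile(r"\bDELETE\s+FROM\b(?!.*\bWHERE\b)", re.I | re.S)),
    # Shell redirection that overwrites a .env file.
    ("Configuration", "MEDIUM",
     re.compile(r">\s*\S*\.env\b")),
]


def classify(command: str) -> Optional[Tuple[str, str]]:
    """Return (category, risk level) for the first matching rule, else None."""
    for category, risk, pattern in RULES:
        if pattern.search(command):
            return category, risk
    return None
```

For example, `classify("DELETE FROM users")` matches the Database Destruction row, while the same statement with a `WHERE` clause does not match any of these sample patterns.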

The Guardian Protocol

Step 1: Operation Scan (automatic)

Every tool call is scanned against the taxonomy above. No agent discretion. No "I know what I'm doing."

Step 2: Backup Verification (automatic)

VERIFY-BACKUP(target):
  1. Check if target is covered by active backup system
  2. Common indicators:
     - .git repository with clean status
     - Time Machine / File History active on target volume
     - Cloud sync (OneDrive, Dropbox, Google Drive, iCloud) with recent sync
     - Explicit backup tool (restic, duplicity, rsnapshot) with recent snapshot
     - Versioned storage (ZFS snapshots, S3 versioning)
  3. IF any indicator active AND recent → RETURN VERIFIED
  4. ELSE → RETURN UNVERIFIED

Fast path: Backup verification must complete in <2 seconds. No long-running checks.
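
One of the indicators above, a git repository with clean status, could be checked roughly as follows. This is a minimal sketch, not the logic of scripts/verify-backup.sh: the function name `git_backup_status` and the string return values are assumptions, and the check covers only the git indicator. It does not confirm that the target file is tracked or that commits are recent, both of which a fuller check would need.

```python
import subprocess
from pathlib import Path


def git_backup_status(target: str, timeout_s: float = 2.0) -> str:
    """Return "VERIFIED" if target lies in a git work tree with clean status.

    Any failure (git missing, not a repository, missing directory, timeout)
    yields "UNVERIFIED", so the check fails closed.
    """
    path = Path(target)
    cwd = path if path.is_dir() else path.parent
    try:
        result = subprocess.run(
            ["git", "status", "--porcelain"],
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # enforce the 2-second fast path
        )
    except (subprocess.TimeoutExpired, OSError):
        return "UNVERIFIED"
    if result.returncode != 0:
        return "UNVERIFIED"
    # Empty porcelain output means no modified or untracked files.
    return "VERIFIED" if result.stdout.strip() == "" else "UNVERIFIED"
```

The timeout is passed straight to `subprocess.run`, so a slow repository falls through to UNVERIFIED instead of stalling the agent.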

Step 3: Decision

| Backup Status | Action |
|---|---|
| VERIFIED ACTIVE | LOG operation, PROCEED with execution |
| UNVERIFIED | HALT execution, ESCALATE to human |
| UNKNOWN | Treat as UNVERIFIED — HALT and ESCALATE |
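
The decision table reduces to a fail-closed mapping: only an explicit verified result proceeds. A minimal sketch, with `BackupStatus`, `Decision`, and `decide` as illustrative names not taken from the skill itself:

```python
from enum import Enum


class BackupStatus(Enum):
    VERIFIED = "VERIFIED"
    UNVERIFIED = "UNVERIFIED"
    UNKNOWN = "UNKNOWN"


class Decision(Enum):
    PROCEED = "PROCEED"
    HALT = "HALT"


def decide(status: BackupStatus) -> Decision:
    """Map a backup status to a Guardian decision, failing closed."""
    # UNKNOWN is treated exactly like UNVERIFIED.
    if status is BackupStatus.VERIFIED:
        return Decision.PROCEED
    return Decision.HALT
```

Writing the check as "VERIFIED, else HALT" means any status added later defaults to HALT until it is handled explicitly.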

Step 4: Escalation Format

When escalation is required, Guardian MUST output:

🛡️ GUARDIAN HALT
Operation: [specific tool call]
Target: [file/path/database/endpoint]
Category: [taxonomy category]
Risk Level: [CRITICAL/HIGH/MEDIUM]
Backup Status: [UNVERIFIED / last backup: X hours ago]

Proposed Action: [what the agent wants to do]
Potential Impact: [what could go wrong]

Options:
1. APPROVE — Proceed with execution (human responsibility)
2. DENY — Cancel operation
3. SNAPSHOT — Create quick backup first, then proceed
4. REVIEW — Agent provides additional justification

Guardian awaits human decision.
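
The escalation message above could be produced by a small formatter like the one below. The `Escalation` dataclass and `format_halt` function are hypothetical names for this sketch; the field labels and option text are copied from the format above.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    operation: str
    target: str
    category: str
    risk_level: str
    backup_status: str
    proposed_action: str
    potential_impact: str


# Option text copied verbatim from the escalation format.
OPTIONS = (
    "1. APPROVE — Proceed with execution (human responsibility)",
    "2. DENY — Cancel operation",
    "3. SNAPSHOT — Create quick backup first, then proceed",
    "4. REVIEW — Agent provides additional justification",
)


def format_halt(e: Escalation) -> str:
    """Render an escalation in the GUARDIAN HALT layout."""
    lines = [
        "🛡️ GUARDIAN HALT",
        f"Operation: {e.operation}",
        f"Target: {e.target}",
        f"Category: {e.category}",
        f"Risk Level: {e.risk_level}",
        f"Backup Status: {e.backup_status}",
        "",
        f"Proposed Action: {e.proposed_action}",
        f"Potential Impact: {e.potential_impact}",
        "",
        "Options:",
        *OPTIONS,
        "",
        "Guardian awaits human decision.",
    ]
    return "\n".join(lines)
```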

Mandatory Rules

  1. No Self-Approval: The executing agent cannot approve its own destructive operation. Period.
  2. No Confidence Override: High confidence does not bypass backup verification. The PocketOS agent was confident too.
  3. No Silent Destruction: Every destructive operation is logged, even if approved.
  4. No Assumption of Safety: "It looks safe" is not verification. Backup status is verification.
  5. No Escalation Fatigue: If an agent generates repeated escalations for the same pattern, Guardian flags the pattern, not just the instance.
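
Rule 5 does not say how a repeated pattern is detected. One possible reading, sketched here under assumptions, is to count escalations per (category, tool) pair and flag the pair once a threshold is reached. The `EscalationTracker` class, the choice of key, and the default threshold of 3 are all hypothetical.

```python
from collections import Counter


class EscalationTracker:
    """Counts escalations per pattern and flags patterns that repeat."""

    def __init__(self, threshold: int = 3):
        # Assumed threshold; the skill does not specify one.
        self.threshold = threshold
        self.counts = Counter()

    def record(self, category: str, tool: str) -> bool:
        """Record one escalation; return True once the pattern should be flagged."""
        key = (category, tool)
        self.counts[key] += 1
        return self.counts[key] >= self.threshold
```

Keying on the category and tool, not the exact target, lets ten different `rm` calls against ten different paths register as one pattern.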

Integration

For OpenClaw / Agent Systems

Guardian operates at the tool-call layer, between the agent's decision and the tool's execution:

Agent Decision → Guardian Intercept → [Verify Backup] → Execute OR Escalate

For Standalone Agents

If the runtime doesn't support interception, Guardian operates as a mandatory pre-flight check:

BEFORE calling any tool:
  1. Agent MUST call Guardian check
  2. Guardian returns PROCEED or HALT
  3. Agent respects HALT, awaits escalation resolution
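
For runtimes without interception, the pre-flight check can be approximated by wrapping each tool so it cannot run before the check returns PROCEED. This is a sketch under assumptions: `guarded`, `GuardianHalt`, and the `check` callable's string return values are illustrative, not an OpenClaw API.

```python
from typing import Any, Callable


class GuardianHalt(Exception):
    """Raised when the pre-flight check does not return PROCEED."""


def guarded(check: Callable[[str], str],
            tool: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so every call passes through the Guardian check first."""

    def wrapper(operation: str, *args: Any, **kwargs: Any) -> Any:
        # Anything other than an explicit PROCEED halts the call.
        if check(operation) != "PROCEED":
            raise GuardianHalt(operation)
        return tool(operation, *args, **kwargs)

    return wrapper
```

Raising an exception instead of returning a status keeps a halted call from being mistaken for a successful one; the caller must handle `GuardianHalt` and wait for the escalation to be resolved.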

Logging

Every Guardian decision is logged:

[Timestamp] [Operation] [Category] [Backup Status] [Decision] [Approver]

Logs are append-only. No deletion by the executing agent.
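
A log line in the bracketed layout above might be built and written as follows. The function names and the ISO 8601 UTC timestamp are assumptions for this sketch. Opening the file in `"a"` mode only guarantees that this code appends; preventing deletion by the agent would need enforcement outside the process, such as file permissions.

```python
from datetime import datetime, timezone
from typing import Optional


def format_log_line(operation: str, category: str, backup_status: str,
                    decision: str, approver: str,
                    now: Optional[datetime] = None) -> str:
    """Build one entry: [Timestamp] [Operation] [Category] [Backup Status] [Decision] [Approver]."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    fields = [timestamp, operation, category, backup_status, decision, approver]
    return " ".join(f"[{field}]" for field in fields)


def append_log(path: str, line: str) -> None:
    """Append one entry to the log file without touching earlier entries."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```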

Scope

Vanilla: This skill is generic. Not specific to any agent, platform, or deployment.

Mandatory: Once enabled, all sessions load this skill. No per-session opt-out.

Non-Blocking (when safe): Backup-verified operations proceed without delay. No human wait for routine maintenance with verified backups.

References

  • references/OPERATION-TAXONOMY.md — Full destructive operation classification
  • references/DECISION-MATRIX.md — Detailed backup verification logic and escalation rules
  • scripts/verify-backup.ps1 — Windows backup detection script
  • scripts/verify-backup.sh — Linux/macOS backup detection script

Based On

  • AgentTrust (May 2026): Runtime safety evaluation and interception for AI agent tool use
  • Proof-of-Guardrail (Mar 2026): Cryptographic verification of guardrail claims
  • AgentDoG (Jan 2026): Diagnostic guardrail framework for AI agent safety and security
  • Confirm-Before-Destroy Pattern: Tool-level guards + prompt-level safeguards
  • Gemini CLI PR #25947: Versioned pre-write backups with agent-driven restore
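Capability Tags
posts-externally
How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install data-guardian
  3. Once installed, invoke the Skill by name or trigger it with /data-guardian
  4. Provide the required inputs per the Skill's parameter description to get structured output
Version History
v1.0.0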
Guardian 1.0.0: mandatory safety skill for destructive AI agent operations.
  • Intercepts all file deletions, DB modifications, external messaging, mass and system operations before execution.
  • Requires fast, automatic backup verification; if backup is not active/verified, halts and escalates to human approval.
  • Operates at the tool-call or pre-flight layer; no opt-out or agent override permitted.
  • Logs all destructive actions and escalation events; flags repeated escalation patterns.
  • Full operation taxonomy and backup logic defined; platform-agnostic and mandatory across deployments.
Metadata
Slug: data-guardian
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 1
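FAQ

What is Guardian?

Mandatory safety gatekeeper for AI agents performing destructive operations. Intercepts file deletion, database modification, external messaging, system chan... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 41 total downloads to date.

How do I install Guardian?

Run the command "/install data-guardian" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is Guardian free?

Yes. Guardian is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Guardian support?

Guardian is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Guardian?

Guardian is developed and maintained by Tooled-app (@tooled-app). The current version is v1.0.0.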
