AI Guardrails (Deep Workflow)
Guardrails turn product and legal policy into enforced behavior: blocking, rewriting, logging, and human review—with attention to false positives and latency.
When to Offer This Workflow
Trigger conditions:
- Launching consumer-facing LLM features
- Jailbreak attempts, policy violations, or PII leakage risks
- Region-specific compliance (minors, regulated advice)
Initial offer:
Use six stages: (1) policy scope, (2) threat model, (3) controls stack, (4) implementation patterns, (5) monitoring & review, (6) iteration & appeals. Confirm latency budget and jurisdictions.
Stage 1: Policy Scope
Goal: Define prohibited categories (hate, sexual content, violence, self-harm, malware instructions, etc.) and required disclaimers for sensitive domains (medical, legal).
Exit condition: Policy document owned by legal/product; escalation path for gray areas.
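One way to make the policy scope enforceable is to keep it as data next to the policy document. The sketch below is illustrative only: the `PolicyCategory` type, the category entries, the owners, and the disclaimer text are all hypothetical and would come from the legal/product-owned policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyCategory:
    name: str
    owner: str            # team accountable for this category (assumed field)
    action: str           # "block" for prohibited, "disclaimer" for sensitive domains
    disclaimer: str = ""  # text attached to answers in sensitive domains


# Hypothetical entries; the real list belongs to the policy document.
POLICY = {
    "malware_instructions": PolicyCategory("malware_instructions", "legal", "block"),
    "medical": PolicyCategory(
        "medical", "product", "disclaimer", "This is not medical advice."
    ),
}


def resolve(category: str) -> Optional[PolicyCategory]:
    """Look up a category; None means a gray area that should be escalated."""
    return POLICY.get(category)
```

A `None` result is the hook for the escalation path: nothing is decided silently for categories the policy does not cover.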
Stage 2: Threat Model
Goal: Identify adversaries (prompt injection, data exfiltration, tool abuse) and assets (user data, system prompts, connectors).
Stage 3: Controls Stack
Goal: Layer defenses: input screening, model safety APIs, output classifiers, tool sandboxing, allowlists for tools and URLs.
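The layering can be sketched as a set of independent checks whose failures are collected rather than short-circuited. Everything here is an assumption for illustration: the tool names, the allowed host, and the two regexes are naive placeholders standing in for real input screening and output classification, and model safety APIs and tool sandboxing are not shown.

```python
import re
from urllib.parse import urlparse

ALLOWED_TOOLS = {"search", "calculator"}  # hypothetical tool names
ALLOWED_HOSTS = {"docs.example.com"}      # hypothetical host allowlist

# Naive placeholder patterns; a real deployment would use trained classifiers.
INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def screen_input(text: str) -> bool:
    """Input layer: pass unless an obvious injection phrase appears."""
    return INJECTION.search(text) is None


def tool_allowed(tool: str) -> bool:
    """Tool layer: only allowlisted tools may be called."""
    return tool in ALLOWED_TOOLS


def url_allowed(url: str) -> bool:
    """URL layer: HTTPS only, and only allowlisted hosts."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS


def classify_output(text: str) -> bool:
    """Output layer: pass unless the text contains an email-like string."""
    return EMAIL.search(text) is None


def run_layers(user_text: str, tool: str, url: str, model_output: str) -> list:
    """Run every layer and return the names of the layers that failed."""
    checks = [
        ("input_screen", screen_input(user_text)),
        ("tool_allowlist", tool_allowed(tool)),
        ("url_allowlist", url_allowed(url)),
        ("output_classifier", classify_output(model_output)),
    ]
    return [name for name, passed in checks if not passed]
```

Returning every failed layer, instead of stopping at the first, keeps telemetry complete and shows which layer is carrying the load.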
Stage 4: Implementation Patterns
Goal: Structured refusal messages; telemetry on every block; distinguish block vs rewrite vs warn; avoid silent failures.
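A minimal sketch of these patterns, assuming a hypothetical `Decision` record produced by an upstream classifier. The field names, the response shape, and the logger name are illustrative, not a prescribed schema.

```python
import json
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Optional

logger = logging.getLogger("guardrails")  # hypothetical logger name


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    REWRITE = "rewrite"
    BLOCK = "block"


@dataclass
class Decision:
    action: Action
    category: str
    user_message: str          # shown to the user on block or warn
    policy_version: str = "v1"


def apply_decision(decision: Decision, model_output: str,
                   rewritten: Optional[str] = None) -> dict:
    """Turn a decision into a structured response and always emit telemetry."""
    event = {
        "action": decision.action.value,
        "category": decision.category,
        "policy_version": decision.policy_version,
    }
    logger.info("guardrail_decision %s", json.dumps(event))

    if decision.action is Action.BLOCK:
        return {"status": "refused", "category": decision.category,
                "message": decision.user_message}
    if decision.action is Action.REWRITE:
        if rewritten is None:
            # Fail loudly instead of silently passing the original text through.
            raise ValueError("rewrite decision requires rewritten text")
        return {"status": "rewritten", "content": rewritten}
    if decision.action is Action.WARN:
        return {"status": "ok", "content": model_output,
                "warning": decision.user_message}
    return {"status": "ok", "content": model_output}
```

Keeping block, rewrite, and warn as distinct statuses lets the client render each one differently and lets dashboards count them separately.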
Stage 5: Monitoring & Review
Goal: Sample borderline cases for human review; dashboards on block rates by category; abuse spike alerts.
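The monitoring goals can be sketched over a list of logged events. The event fields (`category`, `action`, `score`) and the thresholds below are assumptions for illustration; real values depend on the classifiers and traffic involved.

```python
import random
from collections import Counter


def block_rates(events):
    """Fraction of events blocked, per category."""
    totals, blocks = Counter(), Counter()
    for e in events:
        totals[e["category"]] += 1
        if e["action"] == "block":
            blocks[e["category"]] += 1
    return {c: blocks[c] / totals[c] for c in totals}


def is_spike(current_rate, baseline_rate, factor=2.0, min_rate=0.05):
    """Alert when the rate is non-trivial and well above its baseline."""
    return current_rate >= min_rate and current_rate > factor * baseline_rate


def sample_borderline(events, low=0.4, high=0.6, k=10, seed=0):
    """Pick up to k events whose classifier score sits near the decision boundary."""
    borderline = [e for e in events if low <= e.get("score", 0.0) <= high]
    rng = random.Random(seed)  # seeded so a review batch is reproducible
    return rng.sample(borderline, min(k, len(borderline)))
```

The `min_rate` floor keeps very rare categories from triggering alerts on a handful of events.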
Stage 6: Iteration & Appeals
Goal: User appeals path where appropriate; version policy changes; measure false positives by locale and use case.
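False positives by locale and use case can be estimated from blocks that went through appeal or human review. This sketch assumes each reviewed block carries hypothetical `locale`, `use_case`, and `overturned` fields. The result is the share of reviewed blocks that were overturned, which only approximates the true false-positive rate, because unreviewed blocks are not counted.

```python
from collections import defaultdict


def overturn_rates(reviewed_blocks):
    """Share of reviewed blocks overturned, keyed by (locale, use_case)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [overturned, total]
    for b in reviewed_blocks:
        key = (b["locale"], b["use_case"])
        counts[key][1] += 1
        if b["overturned"]:
            counts[key][0] += 1
    return {key: overturned / total for key, (overturned, total) in counts.items()}
```

Tracking the rate per policy version as well would show whether a policy change moved it, if the version is logged with each block.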
Final Review Checklist
- Policy categories and owners defined
- Threat model aligned with product
- Layered controls with clear responsibilities
- Telemetry and review for edge cases
- Appeals and iteration process where applicable
Tips for Effective Guidance
- Defense in depth—no single classifier is sufficient.
- Pair this workflow with content moderation for user-generated content (UGC) and with tool-calling controls for agent safety.
Handling Deviations
- Enterprise internal bots: emphasize data-leak prevention and connector scope over public “safety” categories alone.
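Installation
- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install guard
- Once installation finishes, call the Skill by name or use /guard to trigger it.
- Provide the inputs described in the Skill's parameter documentation to get structured output.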
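What is Guard?
Deep AI safety guardrails workflow: policy definition, input/output filtering, monitoring, escalation, and false-positive handling. Use when reducing harmful... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 251 downloads to date.
How do I install Guard?
Run the command "/install guard" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.
Is Guard free?
Yes. Guard is completely free under the MIT-0 license and can be downloaded, installed, and used freely.
Which platforms does Guard support?
Guard is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Guard?
Guard is developed and maintained by clawkk (@clawkk). The current version is v1.0.0.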