AI Guardrails (Deep Workflow)
Guardrails turn product and legal policy into enforced behavior: blocking, rewriting, logging, and human review—with attention to false positives and latency.
When to Offer This Workflow
Trigger conditions:
- Launching consumer-facing LLM features
- Jailbreak attempts, policy violations, or PII leakage risks
- Region-specific compliance (minors, regulated advice)
Initial offer:
Use six stages: (1) policy scope, (2) threat model, (3) controls stack, (4) implementation patterns, (5) monitoring & review, and (6) iteration & appeals. Confirm the latency budget and jurisdictions.
Stage 1: Policy Scope
Goal: Define prohibited categories (hate, sexual content, violence, self-harm, malware instructions, etc.) and required disclaimers for sensitive domains (medical, legal).
Exit condition: Policy document owned by legal/product; escalation path for gray areas.
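One way to make the policy document enforceable is to mirror it as data that the rest of the stack can look up. The sketch below is illustrative only: the category names come from the goal above, while `PolicyCategory`, the owner assignments, the disclaimer wording, and the escalation address are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyCategory:
    name: str
    owner: str            # team accountable for this category (assumed values)
    action: str           # "block" or "disclaimer"
    disclaimer: str = ""  # text attached to answers in sensitive domains


# Hypothetical policy table; a real one would be derived from the policy document.
POLICY = {
    c.name: c
    for c in [
        PolicyCategory("hate", owner="legal", action="block"),
        PolicyCategory("malware_instructions", owner="legal", action="block"),
        PolicyCategory(
            "medical",
            owner="product",
            action="disclaimer",
            disclaimer="This is general information, not medical advice.",
        ),
    ]
}

# Placeholder escalation path for gray areas.
ESCALATION_CONTACT = "policy-review@example.com"


def resolve(category: str) -> Optional[PolicyCategory]:
    """Return the policy entry, or None when the category is a gray area."""
    return POLICY.get(category)
```

A `None` result is the signal to route the case to the escalation path rather than guess at an action.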
Stage 2: Threat Model
Goal: Identify adversaries (prompt injection, data exfiltration, tool abuse) and assets (user data, system prompts, connectors).
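The threat model can be kept as a small table so that gaps are easy to spot. This is a minimal sketch: the adversaries and assets are the ones listed above, but the pairing between them is an assumption made for illustration, and `uncovered_assets` is a hypothetical helper.

```python
# Assets named in the goal above.
ASSETS = ["user data", "system prompts", "connectors"]

# Assumed mapping of each adversary technique to the assets it targets.
THREATS = [
    {"adversary": "prompt injection", "assets": ["system prompts", "connectors"]},
    {"adversary": "data exfiltration", "assets": ["user data"]},
    {"adversary": "tool abuse", "assets": ["connectors"]},
]


def uncovered_assets(threats, assets):
    """List assets that no threat entry mentions, in their original order."""
    covered = {asset for threat in threats for asset in threat["assets"]}
    return [asset for asset in assets if asset not in covered]
```

An empty result from `uncovered_assets(THREATS, ASSETS)` means every asset has at least one threat considered against it; it does not mean the model is complete.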
Stage 3: Controls Stack
Goal: Layer defenses: input screening, model safety APIs, output classifiers, tool sandboxing, allowlists for tools and URLs.
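The layering can be sketched as a chain of independent checks, where the first failing layer is reported by name. Everything here is a simplified assumption: the substring checks stand in for real classifiers or model safety APIs, and the tool and host allowlists are hypothetical.

```python
from urllib.parse import urlparse

ALLOWED_TOOLS = {"search", "calculator"}  # hypothetical tool allowlist
ALLOWED_HOSTS = {"docs.example.com"}      # hypothetical URL allowlist


def screen_input(text: str) -> bool:
    """Toy input screen; a real one would call a classifier."""
    return "ignore previous instructions" not in text.lower()


def tool_allowed(tool: str, url=None) -> bool:
    """Allow only listed tools, and only listed hosts when a URL is given."""
    if tool not in ALLOWED_TOOLS:
        return False
    if url is not None:
        return urlparse(url).hostname in ALLOWED_HOSTS
    return True


def classify_output(text: str) -> bool:
    """Toy output check for one kind of secret leakage."""
    return "BEGIN PRIVATE KEY" not in text


def run_controls(user_text, model_output, tool_calls):
    """Return the name of the first layer that rejects, or None if all pass."""
    if not screen_input(user_text):
        return "input_screening"
    for tool, url in tool_calls:
        if not tool_allowed(tool, url):
            return "tool_allowlist"
    if not classify_output(model_output):
        return "output_classifier"
    return None
```

Returning the layer name keeps responsibilities clear: each rejection can be attributed to exactly one control.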
Stage 4: Implementation Patterns
Goal: Structured refusal messages; telemetry on every block; distinguish block vs rewrite vs warn; avoid silent failures.
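A possible shape for these patterns is an explicit decision type plus a wrapper that logs every outcome and fails closed. The thresholds, message texts, and the `classify` callable (assumed to return a `(category, severity)` pair) are hypothetical.

```python
import json
import logging
from dataclasses import dataclass
from enum import Enum

logger = logging.getLogger("guardrails")


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    REWRITE = "rewrite"
    BLOCK = "block"


@dataclass(frozen=True)
class Decision:
    action: Action
    category: str
    user_message: str  # structured refusal or notice shown to the user


def decide(category: str, severity: float) -> Decision:
    """Map a severity score to an action; thresholds are illustrative."""
    if severity >= 0.9:
        return Decision(Action.BLOCK, category, "I can't help with that request.")
    if severity >= 0.6:
        return Decision(Action.REWRITE, category, "Part of this response was removed.")
    if severity >= 0.3:
        return Decision(Action.WARN, category, "Please treat this content with care.")
    return Decision(Action.ALLOW, category, "")


def guard(text: str, classify) -> Decision:
    """Classify, decide, and emit telemetry; a classifier error blocks instead of passing."""
    try:
        category, severity = classify(text)
        decision = decide(category, severity)
    except Exception:
        logger.exception("classifier failed")
        decision = Decision(Action.BLOCK, "classifier_error", "Something went wrong. Please try again.")
    logger.info(json.dumps({"action": decision.action.value, "category": decision.category}))
    return decision
```

Blocking on classifier errors is one design choice; a product with a tight latency budget might prefer a different fallback, but it should still be logged rather than silent.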
Stage 5: Monitoring & Review
Goal: Sample borderline cases for human review; dashboards on block rates by category; abuse spike alerts.
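The monitoring signals can be computed from plain event records. This sketch assumes events arrive as `(category, blocked)` pairs and that classifier scores lie in [0, 1]; the spike factor, floor, and borderline band are hypothetical defaults.

```python
from collections import Counter


def block_rates(events):
    """Compute the block rate per category from (category, blocked) pairs."""
    totals, blocks = Counter(), Counter()
    for category, blocked in events:
        totals[category] += 1
        blocks[category] += int(blocked)
    return {category: blocks[category] / totals[category] for category in totals}


def is_spike(current_rate, baseline_rate, factor=3.0, floor=0.01):
    """Flag a rate well above baseline; the floor avoids alerts on tiny baselines."""
    return current_rate > max(baseline_rate * factor, floor)


def borderline(scored_items, low=0.4, high=0.6):
    """Select items whose score falls in the uncertain band for human review."""
    return [item for item, score in scored_items if low <= score <= high]
```

For example, `block_rates([("hate", True), ("hate", False)])` gives a 0.5 block rate for `hate`, which a dashboard could then compare against its baseline with `is_spike`.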
Stage 6: Iteration & Appeals
Goal: User appeals path where appropriate; version policy changes; measure false positives by locale and use case.
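Appeal outcomes can serve as a rough false-positive signal when grouped by locale and policy version. In this sketch the record fields (`locale`, `policy_version`, `overturned`) are assumed, and the result is the share of appealed blocks that reviewers overturned, which is a proxy rather than a true false-positive rate because most blocks are never appealed.

```python
from collections import defaultdict


def overturn_rate(appeals):
    """Return overturned/total appeals keyed by (locale, policy_version)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [overturned, total]
    for appeal in appeals:
        key = (appeal["locale"], appeal["policy_version"])
        counts[key][0] += int(appeal["overturned"])
        counts[key][1] += 1
    return {key: overturned / total for key, (overturned, total) in counts.items()}


# Hypothetical appeal records.
SAMPLE_APPEALS = [
    {"locale": "de-DE", "policy_version": "v2", "overturned": True},
    {"locale": "de-DE", "policy_version": "v2", "overturned": False},
    {"locale": "en-US", "policy_version": "v2", "overturned": False},
]
```

Keying on the policy version makes it possible to compare rates before and after a policy change.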
Final Review Checklist
- Policy categories and owners defined
- Threat model aligned with product
- Layered controls with clear responsibilities
- Telemetry and review for edge cases
- Appeals and iteration process where applicable
Tips for Effective Guidance
- Defense in depth—no single classifier is sufficient.
- Pair with content moderation for user-generated content (UGC) and with tool-calling safeguards for agent safety.
Handling Deviations
- Enterprise internal bots: emphasize data-leak prevention and connector scope over public “safety” categories alone.
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install guard
- After installation, invoke the skill by name or use /guard
- Provide required inputs per the skill's parameter spec and get structured output
What is Guard?
Deep AI safety guardrails workflow—policy definition, input/output filtering, monitoring, escalation, and false-positive handling. Use when reducing harmful... It is an AI Agent Skill for Claude Code / OpenClaw, with 251 downloads so far.
How do I install Guard?
Run "/install guard" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Guard free?
Yes, Guard is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Guard support?
Guard is cross-platform and runs anywhere OpenClaw or Claude Code is available.
Who created Guard?
It is built and maintained by clawkk (@clawkk); the current version is v1.0.0.