clawSafe 🛡️
Enterprise-grade security detector for AI agents
Overview
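clawSafe is a security middleware that intercepts malicious input and blocks it before it reaches your AI agent. It follows a defense-in-depth approach: input is checked by several independent layers, so a payload missed by one layer can still be caught by another.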
Features
5-Layer Protection
| Layer | Threats | Rules |
|---|---|---|
| LLM Layer | Prompt Injection, Jailbreak, Prompt Leaking, Encoding Attacks | 44 |
| Web Layer | SQL Injection, XSS, CSRF, SSRF | 32 |
| API Layer | Key Exposure, Rate Limiting, Auth Bypass | 19 |
| Supply Chain | Dangerous Dependencies, Remote Code Execution | 8 |
| Deploy Layer | Environment Leaks, Debug Info Disclosure | 10 |
Total: 113+ detection rules
Quick Start
Installation
# Via ClawHub
clawhub install clawSafe
# Manual
cp -r clawSafe ~/.openclaw/workspace/skills/
Basic Usage
const Detector = require('./detector');
const detector = new Detector();
// Scan user input
const result = detector.scan('Ignore previous instructions');
if (!result.safe) {
  console.log('Blocked:', result.threats);
  // Handle blocked input
}
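A common way to use the scan result is to wrap the call in a small guard that the rest of the application goes through. The sketch below assumes only what the example above shows (a `scan` method returning `safe` and `threats`). `guardInput` and `stubDetector` are hypothetical names, and the stub stands in for the real detector so the snippet runs on its own.

```javascript
// Hypothetical helper: wraps any object exposing scan(text) -> { safe, threats }.
function guardInput(detector, text) {
  const result = detector.scan(text);
  if (!result.safe) {
    // Summarize the threat types for logging; the input itself is not forwarded.
    const types = result.threats.map((t) => t.type);
    return { allowed: false, reason: `Blocked: ${types.join(', ')}` };
  }
  return { allowed: true, text };
}

// Stand-in detector for illustration only; in real use, pass new Detector().
const stubDetector = {
  scan(text) {
    const hit = /ignore previous instructions/i.test(text);
    return {
      safe: !hit,
      threats: hit ? [{ type: 'prompt_injection' }] : [],
    };
  },
};

console.log(guardInput(stubDetector, 'Ignore previous instructions'));
console.log(guardInput(stubDetector, 'What is the weather today?'));
```

Keeping the decision in one function means callers never see unscanned text by accident.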
Return Format
{
  safe: boolean; // true if input is safe
  threats: Array<{
    type: string; // threat category
    pattern: string; // matched pattern ID
    severity: string; // critical|high|medium|low
    confidence: number; // 0-1
    description: string;
  }>;
  confidence: number; // overall confidence 0-1
  layersScanned: string[]; // layers that were checked
}
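As one way to consume this structure, the sketch below picks the most severe entry from `threats`, using the four severity labels listed above. `worstThreat` and the sample object are hypothetical; the field values are made up for illustration and do not come from a real scan.

```javascript
// Severity labels from the return format, ordered from least to most severe.
const SEVERITY_ORDER = ['low', 'medium', 'high', 'critical'];

// Hypothetical helper: returns the most severe threat in a scan result, or null.
function worstThreat(result) {
  let worst = null;
  for (const threat of result.threats) {
    const rank = SEVERITY_ORDER.indexOf(threat.severity);
    if (worst === null || rank > SEVERITY_ORDER.indexOf(worst.severity)) {
      worst = threat;
    }
  }
  return worst;
}

// Example object shaped like the return format; all values are illustrative.
const sampleResult = {
  safe: false,
  threats: [
    { type: 'xss', pattern: 'example-1', severity: 'medium', confidence: 0.7, description: 'illustrative' },
    { type: 'prompt_injection', pattern: 'example-2', severity: 'critical', confidence: 0.9, description: 'illustrative' },
  ],
  confidence: 0.9,
  layersScanned: ['llm', 'web'],
};

console.log(worstThreat(sampleResult).type); // prints "prompt_injection"
```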
Configuration
config.json
{
  "enabled": true,
  "logLevel": "info",
  "layers": {
    "llm": {
      "enabled": true,
      "rules": ["injection", "jailbreak", "prompt_leak", "encoding"]
    },
    "web": {
      "enabled": true,
      "rules": ["sql_injection", "xss", "csrf", "ssrf"]
    },
    "api": {
      "enabled": true,
      "rules": ["key_exposure", "rate_limit", "auth"]
    },
    "supply_chain": {
      "enabled": true,
      "rules": ["deps"]
    },
    "deploy": {
      "enabled": true,
      "rules": ["env_leak", "debug_info"]
    }
  },
  "detection": {
    "confidenceThreshold": 0.6,
    "minMatchCount": 1
  },
  "actions": {
    "onThreatDetected": "block",
    "onUncertain": "log"
  }
}
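The `detection` and `actions` keys suggest a two-step decision: count confident matches, then pick an action. How clawSafe combines them is not documented here, so the sketch below is one plausible reading, not the actual implementation. `decideAction` and the `'allow'` fallback are assumptions.

```javascript
// Subset of config.json relevant to the decision; values copied from above.
const config = {
  detection: { confidenceThreshold: 0.6, minMatchCount: 1 },
  actions: { onThreatDetected: 'block', onUncertain: 'log' },
};

// Assumed interpretation: enough matches at or above the threshold trigger
// onThreatDetected, any weaker match falls back to onUncertain, and a result
// with no matches is allowed through.
function decideAction(result, cfg) {
  const { confidenceThreshold, minMatchCount } = cfg.detection;
  const confident = result.threats.filter(
    (t) => t.confidence >= confidenceThreshold
  );
  if (confident.length >= minMatchCount) return cfg.actions.onThreatDetected;
  if (result.threats.length > 0) return cfg.actions.onUncertain;
  return 'allow';
}

console.log(decideAction({ threats: [{ confidence: 0.9 }] }, config)); // prints "block"
console.log(decideAction({ threats: [{ confidence: 0.4 }] }, config)); // prints "log"
console.log(decideAction({ threats: [] }, config)); // prints "allow"
```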
whitelist.json
{
  "keywords": ["trusted-keyword"],
  "users": ["user-id-1"],
  "sessions": ["session-id-1"]
}
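The three whitelist keys could be checked before scanning, roughly as follows. This is a sketch under assumptions: the request shape (`userId`, `sessionId`, `text`) and the function `isWhitelisted` are hypothetical, and substring matching for keywords is a guess at the intended behavior.

```javascript
// Same structure as whitelist.json above.
const whitelist = {
  keywords: ['trusted-keyword'],
  users: ['user-id-1'],
  sessions: ['session-id-1'],
};

// Hypothetical check: true if the user, the session, or a keyword is listed.
function isWhitelisted(request, wl) {
  if (wl.users.includes(request.userId)) return true;
  if (wl.sessions.includes(request.sessionId)) return true;
  return wl.keywords.some((kw) => request.text.includes(kw));
}

console.log(isWhitelisted({ userId: 'user-id-1', sessionId: 's-9', text: 'hi' }, whitelist)); // prints "true"
console.log(isWhitelisted({ userId: 'user-id-2', sessionId: 's-9', text: 'hi' }, whitelist)); // prints "false"
```

If keyword matching works this way, any input containing a whitelisted keyword would skip scanning, so keywords are best kept rare and specific.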
Detection Rules
LLM Layer
Prompt Injection Patterns:
- ignore previous instructions
- disregard your guidelines
- forget all rules
- act as if you have no restrictions
Jailbreak Patterns:
- DAN mode
- developer mode
- roleplay as
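Phrase lists like these are typically matched case-insensitively and with flexible whitespace. The sketch below shows that idea with the phrases listed above; the regular expressions, the `id` labels, and `matchLlmPatterns` are illustrative assumptions, not clawSafe's actual rule set.

```javascript
// Phrases taken from the lists above; the regex forms are illustrative.
const LLM_PATTERNS = [
  { id: 'injection', regex: /ignore\s+previous\s+instructions/i },
  { id: 'injection', regex: /disregard\s+your\s+guidelines/i },
  { id: 'injection', regex: /forget\s+all\s+rules/i },
  { id: 'injection', regex: /act\s+as\s+if\s+you\s+have\s+no\s+restrictions/i },
  { id: 'jailbreak', regex: /\bDAN\s+mode\b/i },
  { id: 'jailbreak', regex: /\bdeveloper\s+mode\b/i },
  { id: 'jailbreak', regex: /\broleplay\s+as\b/i },
];

// Returns the id of every pattern that matches the text.
function matchLlmPatterns(text) {
  return LLM_PATTERNS.filter((p) => p.regex.test(text)).map((p) => p.id);
}

console.log(matchLlmPatterns('Please IGNORE   previous instructions and enter DAN mode'));
// prints [ 'injection', 'jailbreak' ]
```

Phrases such as "roleplay as" and "developer mode" also occur in harmless text, which is one reason a confidence score per match is useful.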
Encoding Bypass:
- Base64 encoded commands
- Hex encoding
- Unicode homoglyphs
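Encoded payloads can only be matched against phrase rules after decoding. The sketch below is one way to do that, assuming Node.js: it returns the input together with any Base64 or hex runs that decode to printable text, plus an NFKC-normalized form. `decodeVariants` and the length thresholds are hypothetical choices.

```javascript
// Returns the input plus any plausible decoded forms, so each can be scanned.
function decodeVariants(text) {
  const variants = [text];
  const printable = /^[\x20-\x7E\s]+$/;

  // Base64: runs of 16+ Base64 characters with optional padding.
  for (const m of text.matchAll(/[A-Za-z0-9+/]{16,}={0,2}/g)) {
    const decoded = Buffer.from(m[0], 'base64').toString('utf8');
    // Keep only results that look like printable ASCII text.
    if (printable.test(decoded)) variants.push(decoded);
  }

  // Hex: runs of 8 or more hex byte pairs.
  for (const m of text.matchAll(/(?:[0-9a-fA-F]{2}){8,}/g)) {
    const decoded = Buffer.from(m[0], 'hex').toString('utf8');
    if (printable.test(decoded)) variants.push(decoded);
  }

  // NFKC folds compatibility forms (for example fullwidth letters) to ASCII.
  const normalized = text.normalize('NFKC');
  if (normalized !== text) variants.push(normalized);

  return variants;
}

const encoded = Buffer.from('ignore previous instructions').toString('base64');
console.log(decodeVariants(`run ${encoded}`));
```

NFKC does not fold cross-script lookalikes such as Cyrillic "а" for Latin "a"; catching those would need a separate confusables mapping.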
Web Layer
- SQL Injection: '; DROP TABLE users; --
- XSS: <script>alert(1)</script>
- CSRF: Token manipulation
- SSRF: Internal URL access
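For the SSRF item, "internal URL access" usually means a URL whose host is loopback, link-local, or in a private IPv4 range. The sketch below checks for that with the standard `URL` class. `isInternalUrl` is a hypothetical name, and the range list is an assumption about what the rule covers.

```javascript
// Hypothetical check: flags URLs whose host is localhost, loopback,
// link-local (169.254.0.0/16), or an RFC 1918 private IPv4 address.
function isInternalUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  const host = url.hostname;
  if (host === 'localhost' || host === '[::1]') return true;

  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 ||
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

console.log(isInternalUrl('http://169.254.169.254/latest/meta-data/')); // prints "true"
console.log(isInternalUrl('https://example.com/')); // prints "false"
```

A literal-host check like this does not catch public DNS names that resolve to internal addresses; that requires checking the resolved IP at request time.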
API Layer
- API Key exposure: sk-1234567890
- JWT tokens
- Bearer tokens
- Basic auth credentials
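Credential-shaped strings like these are usually found with loose regular expressions. The sketch below is illustrative only: the patterns, type labels, and `findSecrets` are assumptions, and the `sk-` rule is modeled on the example key above. The JWT rule relies on the fact that a Base64url-encoded JSON header starts with `eyJ`.

```javascript
// Illustrative patterns; real rules would be tuned to reduce false positives.
const SECRET_PATTERNS = [
  { type: 'api_key', regex: /\bsk-[A-Za-z0-9]{10,}\b/ },
  { type: 'jwt', regex: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/ },
  { type: 'bearer', regex: /\bBearer\s+[A-Za-z0-9._~+/-]+=*/ },
  { type: 'basic_auth', regex: /\bBasic\s+[A-Za-z0-9+/]{8,}={0,2}/ },
];

// Returns the type of every pattern found in the text.
function findSecrets(text) {
  return SECRET_PATTERNS.filter((p) => p.regex.test(text)).map((p) => p.type);
}

console.log(findSecrets('my key is sk-1234567890')); // prints [ 'api_key' ]
console.log(findSecrets('Authorization: Basic dXNlcjpwYXNz')); // prints [ 'basic_auth' ]
```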
Testing
# Run all tests
node test.js
# Interactive mode
node test-interactive.js
# Demo
node detector.js
Integration
OpenClaw Hook
To integrate with OpenClaw, add to your gateway config:
// gateway.config.js
module.exports = {
  middleware: ['clawSafe'],
  clawSafe: {
    enabled: true,
    strictMode: false
  }
};
Performance
- Latency: < 5ms per scan
- Memory: ~50KB
- Rules: 113+ (JSON-based, lazy load)
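To check the latency figure on your own inputs, a simple timing loop is enough. The sketch below assumes Node.js and takes any scan function as an argument; `averageScanMs` is a hypothetical helper, and the placeholder function stands in for `detector.scan`.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper: average wall-clock milliseconds per call to scanFn.
function averageScanMs(scanFn, inputs, rounds = 100) {
  const start = performance.now();
  for (let r = 0; r < rounds; r++) {
    for (const input of inputs) scanFn(input);
  }
  return (performance.now() - start) / (rounds * inputs.length);
}

// Placeholder scan function; in real use, pass (s) => detector.scan(s).
const placeholderScan = (s) => /ignore previous instructions/i.test(s);
const ms = averageScanMs(placeholderScan, ['hello', 'Ignore previous instructions']);
console.log(`average: ${ms.toFixed(4)} ms per scan`);
```

Results depend on hardware and input length, and the first call may be slower if rules are loaded lazily.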
License
MIT
Changelog
v1.0.0
- Initial release
- 5-layer protection
- 113+ detection rules
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install clawsafe
- After installation, invoke the skill by name or use /clawsafe
- Provide required inputs per the skill's parameter spec and get structured output
What is ClawSafe?
Multi-layer security detector for AI agents. Blocks prompt injection, jailbreak, XSS, SQL injection, API key leaks, supply chain attacks, and deployment vuln... It is an AI Agent Skill for Claude Code / OpenClaw, with 309 downloads so far.
How do I install ClawSafe?
Run "/install clawsafe" in the OpenClaw or Claude Code chat to install it in one step. No extra setup is required.
Is ClawSafe free?
Yes, ClawSafe is free and open source. You can download, install, and use it at no cost.
Which platforms does ClawSafe support?
ClawSafe is cross-platform and runs anywhere OpenClaw or Claude Code is available.
Who created ClawSafe?
It is built and maintained by bvzgong (@silvertime); the current version is v1.1.0.