Description

Install, configure, and manage the AI-Warden prompt injection protection plugin for OpenClaw. Publisher: AI-Warden (ai-warden.io). Source: github.com/ai-ward...

Instructions (SKILL.md)

AI-Warden Setup

Name: AI-Warden — Prompt Injection Protection
Author: ai-warden

Prompt injection protection for OpenClaw agents. 5 security shields + contamination lockdown.

Publisher: AI-Warden
Source: github.com/ai-warden/openclaw-plugin
NPM: openclaw-ai-warden
Compatibility: OpenClaw 2026.3.14+ and 4.x

Requirements

OpenClaw 2026.3.14+ (including 4.x)
Node.js 18+
npm (bundled with Node.js)

Installation — FOLLOW THESE STEPS IN ORDER

Each step is a separate command. Run them one at a time and verify output before proceeding.

Step 0: Back up config

cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak && echo "BACKUP OK"

Verify: Output is BACKUP OK.
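The echo only confirms that cp exited successfully. As a further check, the following minimal sketch compares the backup with the original byte for byte and confirms the original parses as JSON. The helper name `backupMatches` is hypothetical and not part of OpenClaw or AI-Warden.

```javascript
const fs = require('fs');

// Hypothetical helper: returns true when the backup is byte-identical to the
// original config. Throws if the original is not valid JSON.
function backupMatches(originalPath, backupPath) {
  const original = fs.readFileSync(originalPath);
  const backup = fs.readFileSync(backupPath);
  JSON.parse(original.toString('utf8')); // validates the original config
  return original.equals(backup);
}

// Usage after Step 0 (guarded so a missing file does not throw):
const cfgPath = `${process.env.HOME}/.openclaw/openclaw.json`;
if (fs.existsSync(cfgPath) && fs.existsSync(cfgPath + '.bak')) {
  console.log(backupMatches(cfgPath, cfgPath + '.bak') ? 'BACKUP VERIFIED' : 'BACKUP DIFFERS');
}
```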

Step 1: Create extension directory

mkdir -p ~/.openclaw/extensions/ai-warden

echo '{"private":true}' > ~/.openclaw/extensions/ai-warden/package.json

Verify: No errors.

Step 2: Install npm package

cd ~/.openclaw/extensions/ai-warden && npm install openclaw-ai-warden@2.4.0

Verify: No errors and no audit warnings. Inspect the installed package before proceeding:

ls node_modules/openclaw-ai-warden/

cat node_modules/openclaw-ai-warden/package.json | grep -E '"name"|"version"'

Confirm the package name is openclaw-ai-warden and version is 2.4.0.
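The name and version check can also be scripted instead of read by eye. This is a minimal sketch run from the extension directory; the helper name `checkPackage` is hypothetical, and the expected values are the ones stated above.

```javascript
const fs = require('fs');

// Hypothetical helper: compares the name and version fields of a parsed
// package.json against expected values and returns a list of mismatches.
function checkPackage(pkg, expectedName, expectedVersion) {
  const problems = [];
  if (pkg.name !== expectedName) problems.push(`name is ${pkg.name}, expected ${expectedName}`);
  if (pkg.version !== expectedVersion) problems.push(`version is ${pkg.version}, expected ${expectedVersion}`);
  return problems;
}

const pkgPath = 'node_modules/openclaw-ai-warden/package.json';
if (fs.existsSync(pkgPath)) {
  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
  const problems = checkPackage(pkg, 'openclaw-ai-warden', '2.4.0');
  console.log(problems.length === 0 ? 'PACKAGE OK' : problems.join('\n'));
}
```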

Provenance check — verify the package matches the upstream source:

npm info openclaw-ai-warden repository.url

Expected: https://github.com/ai-warden/openclaw-plugin

npm info openclaw-ai-warden dist.shasum

Compare the shasum with what npm installed:

cat node_modules/openclaw-ai-warden/package.json | grep _shasum
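Some npm versions may not record a `_shasum` field in the installed package.json, in which case the grep above prints nothing. A hedged alternative is to fetch the tarball separately (for example with `npm pack`), compute its SHA-1 digest, and compare that with the dist.shasum value. The sketch below assumes the tarball is named `openclaw-ai-warden-2.4.0.tgz` and sits in the current directory; the helper name `sha1Hex` is hypothetical.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical helper: hex-encoded SHA-1 digest of a buffer or string.
// npm's dist.shasum is a SHA-1 hex digest of the package tarball.
function sha1Hex(data) {
  return crypto.createHash('sha1').update(data).digest('hex');
}

// Assumed tarball filename; adjust if yours differs.
const tarball = 'openclaw-ai-warden-2.4.0.tgz';
if (fs.existsSync(tarball)) {
  console.log(sha1Hex(fs.readFileSync(tarball)));
}
```

The printed digest should equal the value returned by npm info openclaw-ai-warden dist.shasum.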

Step 3: Copy plugin files to extension root

OpenClaw loads plugins from the extension directory root, not from node_modules.

cd ~/.openclaw/extensions/ai-warden

cp node_modules/openclaw-ai-warden/index.ts .

cp node_modules/openclaw-ai-warden/openclaw.plugin.json .

cp -r node_modules/openclaw-ai-warden/src .

grep VERSION index.ts | head -1

Verify: Output shows const VERSION = followed by the version number.
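A quick way to confirm that everything from Step 3 landed in the extension root is to list what is missing. This is a minimal sketch; the helper name `missingPluginFiles` is hypothetical, and the required entries are the three items copied above.

```javascript
const fs = require('fs');
const path = require('path');

// Files and directories that Step 3 copies into the extension root.
const REQUIRED = ['index.ts', 'openclaw.plugin.json', 'src'];

// Hypothetical helper: returns the required entries that are absent from dir.
function missingPluginFiles(dir) {
  return REQUIRED.filter((name) => !fs.existsSync(path.join(dir, name)));
}

const extDir = path.join(process.env.HOME || '', '.openclaw/extensions/ai-warden');
const missing = missingPluginFiles(extDir);
console.log(missing.length === 0 ? 'FILES OK' : `MISSING: ${missing.join(', ')}`);
```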

Step 4: Configure OpenClaw

This patches openclaw.json to register the plugin. It preserves all existing config (channels, model, gateway settings).

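node -e "
const fs = require('fs');
// Path to the OpenClaw config backed up in Step 0.
const p = process.env.HOME + '/.openclaw/openclaw.json';
const cfg = JSON.parse(fs.readFileSync(p, 'utf8'));
// Create the plugins sections only if absent, so existing settings survive.
// The || form avoids the negation operator, which interactive bash can
// misread as history expansion inside double quotes.
cfg.plugins = cfg.plugins || {};
cfg.plugins.enabled = true;
cfg.plugins.entries = cfg.plugins.entries || {};
// Register (or overwrite) only the ai-warden entry.
cfg.plugins.entries['ai-warden'] = {
  enabled: true,
  config: {
    layers: { content: 'block', channel: 'warn', preLlm: 'off', toolArgs: 'block', subagents: 'block', output: 'off' },
    sensitivity: 'balanced'
  }
};
// Write the config back with two-space indentation.
fs.writeFileSync(p, JSON.stringify(cfg, null, 2));
console.log('CONFIG OK');
"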

Verify: Output is CONFIG OK.

Note: This registers the plugin via plugins.entries only. If you use plugins.allow in your config to restrict which plugins can load, you must add "ai-warden" to that list yourself. If you don't use plugins.allow, no action is needed — the plugin loads automatically from plugins.entries.
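The allowlist rule in the note can be sketched as a small function that touches plugins.allow only when it already exists. This assumes plugins.allow is an array of plugin names; the helper name `ensureAllowed` is hypothetical, and the in-memory objects stand in for a parsed openclaw.json.

```javascript
// Hypothetical helper: adds a plugin name to plugins.allow only when an
// allowlist is already in use. Assumes plugins.allow is an array of names.
function ensureAllowed(cfg, name) {
  const allow = cfg.plugins && cfg.plugins.allow;
  if (!Array.isArray(allow)) return false; // no allowlist in use: nothing to do
  if (!allow.includes(name)) allow.push(name);
  return true;
}

const withAllowlist = { plugins: { allow: ['telegram'] } };
ensureAllowed(withAllowlist, 'ai-warden');
console.log(withAllowlist.plugins.allow); // [ 'telegram', 'ai-warden' ]

const withoutAllowlist = { plugins: {} };
console.log(ensureAllowed(withoutAllowlist, 'ai-warden')); // false
```

Leaving a config without an allowlist untouched matters here: creating plugins.allow where none existed would start restricting other plugins.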

Step 5: Add API key (optional)

For online detection (98.9% accuracy vs ~60% offline), add your API key.

Option A — Environment variable (recommended, key not stored in config file):

Set AI_WARDEN_API_KEY in your shell profile or systemd service:

# For systemd (e.g., OpenClaw gateway service):
# Add to your service override: Environment=AI_WARDEN_API_KEY=your_key_here

# For shell:
export AI_WARDEN_API_KEY=your_key_here

Option B — Config file (simpler, key stored in openclaw.json):

node -e "
const fs = require('fs');
const p = process.env.HOME + '/.openclaw/openclaw.json';
const cfg = JSON.parse(fs.readFileSync(p, 'utf8'));
cfg.plugins.entries['ai-warden'].config.apiKey = 'YOUR_API_KEY_HERE';
fs.writeFileSync(p, JSON.stringify(cfg, null, 2));
// Restrict file permissions (config contains API key)
fs.chmodSync(p, 0o600);
console.log('API KEY ADDED (file permissions set to 600)');
"

Replace YOUR_API_KEY_HERE with your actual key from ai-warden.io/signup.

Verify: Output is API KEY ADDED (file permissions set to 600).
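If the key is stored in the config file, the permission change can be confirmed afterwards. This is a minimal sketch for POSIX systems; the helper name `isOwnerOnly` is hypothetical.

```javascript
const fs = require('fs');

// Hypothetical helper: true when no group or other permission bits are set,
// which is the case for mode 600. Meaningful on POSIX systems only.
function isOwnerOnly(filePath) {
  const mode = fs.statSync(filePath).mode & 0o777;
  return (mode & 0o077) === 0;
}

const cfgPath = `${process.env.HOME}/.openclaw/openclaw.json`;
if (fs.existsSync(cfgPath)) {
  console.log(isOwnerOnly(cfgPath) ? 'PERMISSIONS OK' : 'PERMISSIONS TOO OPEN');
}
```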

Step 6: Restart gateway

openclaw gateway restart

Step 7: Verify installation

After the restart, check the logs or send the /warden command. Expected output:

🛡️ AI-Warden v2.4.0 ready (mode: api|offline, layers: X/6)

mode: api = online detection (98.9% accuracy)
mode: offline = local-only detection (~60% accuracy)
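To estimate what the X in layers: X/6 should read for a given config, one plausible rule is to count every layer whose mode is not off. That rule is an assumption, since this guide does not state how the plugin counts, and the helper name `countActiveLayers` is hypothetical. The sketch below applies it to the layer modes written in Step 4.

```javascript
// Layer modes as written by Step 4 of this guide.
const layers = { content: 'block', channel: 'warn', preLlm: 'off', toolArgs: 'block', subagents: 'block', output: 'off' };

// Hypothetical helper: counts layers whose mode is anything other than 'off'.
// Assumption: the plugin may count active layers differently.
function countActiveLayers(layerModes) {
  return Object.values(layerModes).filter((mode) => mode !== 'off').length;
}

console.log(`layers: ${countActiveLayers(layers)}/${Object.keys(layers).length}`); // layers: 4/6
```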

If something breaks, restore config:

cp ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json && openclaw gateway restart

DO NOT

Do NOT use edit tool on openclaw.json — JSON whitespace matching is fragile
Do NOT use config.patch with nested objects — it often fails with format errors
Do NOT skip the cp step — OpenClaw loads from the extension directory, not node_modules
Do NOT restart multiple times — wait at least 15 seconds between restarts
If you use plugins.allow, remember to add "ai-warden" to the list — otherwise the plugin won't load

Updating

cd ~/.openclaw/extensions/ai-warden

npm install openclaw-ai-warden@2.4.0

cp node_modules/openclaw-ai-warden/index.ts .

cp -r node_modules/openclaw-ai-warden/src .

openclaw gateway restart

Security Shields

Shield	Protects against	Default	Mechanism
File Shield 🔴	Poisoned files & web pages	`block`	Scans tool results, injects warning, triggers contamination lockdown on CRITICAL
Chat Shield 🔴	Injections in user messages	`warn`	Scans inbound messages, warns LLM
System Shield ⬛	Full context manipulation	`off`	Scans all messages (expensive, use sparingly)
Tool Shield 🔴	Malicious tool arguments	`block`	Blocks tool execution if arguments contain injection
Agent Shield 🔴	Sub-agent attack chains	`block`	Scans task text of spawned sub-agents

Contamination Lockdown

When File Shield detects a CRITICAL threat (score >500), the session is flagged as contaminated. All dangerous tools (exec, write, edit, message, sessions_send, sessions_spawn, tts) are blocked for the rest of the session. This prevents attack payloads from executing even if the injection bypasses the LLM warning.
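The lockdown rule described above can be sketched as a small state machine. This is an illustration under stated assumptions, not the plugin's actual code: the `Session` class and its method names are hypothetical, while the threshold and the tool list come from the paragraph above.

```javascript
// Tools listed as dangerous in the description above.
const DANGEROUS_TOOLS = new Set(['exec', 'write', 'edit', 'message', 'sessions_send', 'sessions_spawn', 'tts']);
const CRITICAL_THRESHOLD = 500; // CRITICAL means score > 500

// Hypothetical session tracker; not the plugin's real data structure.
class Session {
  constructor() {
    this.contaminated = false;
  }

  // Record a File Shield scan score. Once set, the flag is never cleared.
  recordFileScan(score) {
    if (score > CRITICAL_THRESHOLD) this.contaminated = true;
  }

  // A tool is blocked only when the session is contaminated and the tool is dangerous.
  isToolAllowed(tool) {
    return !(this.contaminated && DANGEROUS_TOOLS.has(tool));
  }
}

const session = new Session();
session.recordFileScan(500);
console.log(session.isToolAllowed('exec')); // true (500 is not above the threshold)
session.recordFileScan(501);
console.log(session.isToolAllowed('exec')); // false
console.log(session.isToolAllowed('read')); // true ('read' is a made-up non-dangerous tool)
```

Keeping the flag sticky for the rest of the session is what stops a payload that slipped past the LLM warning from running later in the same session.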

Runtime Commands

/warden                      → status overview with all shields
/warden stats                → scan/block counts
/warden shield file block    → set File Shield to block mode
/warden shield chat warn     → set Chat Shield to warn mode
/warden reset                → reset statistics
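The command grammar above is small enough to sketch as a parser. This is purely illustrative and not the plugin's implementation: `parseWardenCommand` is a hypothetical name, the shield names system, tool, and agent are assumed by analogy with file and chat, and the mode list is taken from the defaults in the shields table.

```javascript
// Shield names: 'file' and 'chat' appear above; the others are assumed by analogy.
const SHIELDS = ['file', 'chat', 'system', 'tool', 'agent'];
const MODES = ['block', 'warn', 'off'];

// Hypothetical parser: returns a description of the command, or null if it
// does not match one of the forms listed above.
function parseWardenCommand(text) {
  const parts = text.trim().split(/\s+/);
  if (parts[0] !== '/warden') return null;
  if (parts.length === 1) return { action: 'status' };
  if (parts.length === 2 && (parts[1] === 'stats' || parts[1] === 'reset')) {
    return { action: parts[1] };
  }
  if (parts.length === 4 && parts[1] === 'shield' && SHIELDS.includes(parts[2]) && MODES.includes(parts[3])) {
    return { action: 'shield', shield: parts[2], mode: parts[3] };
  }
  return null;
}

console.log(parseWardenCommand('/warden shield file block'));
console.log(parseWardenCommand('/warden stats'));
```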

Detection Modes

Mode	Accuracy	Latency	Cost
Offline (no key)	~60%	<1ms	Free
API (Smart Cascade)	98.9%	~3ms avg	Free tier: 5K calls/month

Get API key: ai-warden.io/signup

Troubleshooting

"plugin not found": openclaw.plugin.json missing from extension dir. Re-run Step 3.
Channels not loading after install: If you use plugins.allow, ensure all your channel plugins (e.g. telegram) are also listed there alongside ai-warden.
False positives on user messages: Set Chat Shield to warn (default) instead of block.
File Shield detects but doesn't block: API key required for reliable blocking (98.9% vs 60%).
Config errors after install: Restore backup: cp ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json
Bot won't start: Check journalctl -u openclaw-gateway -n 20 for actual error.
Workspace files flagged: Plugin auto-whitelists .openclaw/workspace/ and .openclaw/agents/ paths.

Security Recommendations

This SKILL.md is coherent for installing an OpenClaw plugin, but take these precautions before running it: (1) Verify the package and repository yourself—visit the GitHub repo (https://github.com/ai-warden/openclaw-plugin) and the npm page to ensure publisher legitimacy; the registry metadata in the skill omitted a homepage which is worth confirming. (2) Inspect the installed package in node_modules (and any install scripts) before copying files into ~/.openclaw/extensions; npm install can run arbitrary code. (3) Prefer supplying the API key via an environment variable (recommended) rather than embedding it in openclaw.json; if you do store it in the config, keep file permissions restrictive as suggested. (4) Keep the backup created by Step 0 and test in a staging agent if possible. (5) If you use a plugin allowlist (plugins.allow), add 'ai-warden' deliberately rather than relying on auto‑enable. These steps reduce the normal risks associated with installing third‑party plugins. If you want higher assurance, request the upstream package source code and checksum from the publisher and review it before install.

Functional Analysis

Type: OpenClaw Skill
Name: ai-warden-setup
Version: 1.4.1

The skill bundle automates the installation of a third-party security plugin, which involves high-risk operations: executing shell commands, installing an external npm package (openclaw-ai-warden), and programmatically modifying the core 'openclaw.json' configuration file. SKILL.md includes security-conscious instructions such as configuration backups and integrity checks (shasum verification). Even so, the broad permissions and the supply-chain risk from external dependencies are significant risky capabilities, although there is no clear malicious intent.

Capability Assessment

✓ Purpose & Capability

The name/description (install and manage an AI‑Warden plugin) matches the actions in SKILL.md: creating an extension directory, npm installing openclaw-ai-warden, copying plugin files into the extensions root, and patching ~/.openclaw/openclaw.json to register the plugin. The optional AI_WARDEN_API_KEY is appropriate for an online detection service.

ℹ Instruction Scope

Instructions are explicit and limited to plugin installation and configuration: they read and write ~/.openclaw/openclaw.json, write into ~/.openclaw/extensions/ai-warden/, run npm install, and optionally add an API key either as an env var or in the config file. This is expected for a plugin installer, but it does grant the installation the ability to download and place executable plugin code and to persist a secret in your config file (Option B). The SKILL.md does include safety steps (backup, package provenance checks), which is good practice.

ℹ Install Mechanism

There is no automated install spec in the registry; the SKILL.md instructs a manual npm install from the public npm registry. Using npm is a common and reasonably traceable method, but npm packages can run install scripts and may contain malicious code. The instructions recommend verifying repository URL and dist.shasum via npm info, which helps but does not eliminate risk. No arbitrary URL downloads or URL shorteners are used.

✓ Credentials

No credentials are required by default. The optional AI_WARDEN_API_KEY is proportional to the advertised online-detection feature. The skill explicitly offers both env var storage (recommended) and storing the key in openclaw.json (with a chmod 600 suggestion). Storing secrets in the config is convenient but increases exposure; the skill documents this trade-off.

ℹ Persistence & Privilege

The skill modifies the agent's ~/.openclaw/openclaw.json to register and enable the plugin so the plugin will persist and be loaded automatically. This is expected behavior for installing a plugin. Because the plugin code will be placed under ~/.openclaw/extensions, it becomes a persistent component that the agent may invoke autonomously (the platform default). This persistence is appropriate for the stated purpose but increases the importance of verifying the plugin's provenance.

Version History

v1.4.1

**Skill metadata and install instructions updated for improved security and clarity.**
- Added explicit package versioning (`openclaw-ai-warden@2.4.0`) and NPM provenance/integrity verification steps.
- Installation steps now require version checks and shasum verification for supply chain security.
- Install manifest specifies required Node.js version, NPM, and optional `AI_WARDEN_API_KEY` environment variable.
- Clarified instructions regarding the use of `plugins.allow` in configuration.
- Updated all install and update commands to reference the new required package version.
- No changes to runtime logic; documentation improvements and best practices only.

v1.4.0

**Major update: Installation and configuration steps have been restructured for clarity, compatibility, and enhanced security.**
- Step-by-step install process is now split into discrete, verifiable commands to simplify troubleshooting and reduce install errors.
- Now compatible with OpenClaw 4.x; added instructions and warnings for 4.x-specific plugin loading behavior.
- Supports API key via environment variable (recommended) or config file, with improved security around file permissions.
- "plugins.allow" is no longer set by default due to compatibility issues; warning added not to use it on OpenClaw 4.x.
- Enhanced documentation for each step, verification points, and troubleshooting tips.

v1.3.0

- Adds publisher, source, and NPM package information to documentation.
- Requires config backup before install and adds restore instructions for failures.
- Updates install command to fetch openclaw-ai-warden@2.4.0 and expects version 2.4.0 on verification.
- Lowers File Shield’s CRITICAL threat score threshold for contamination lockdown to >500.
- Clarifies security shield descriptions, default modes, and warning mechanisms.
- Updates troubleshooting steps and emphasizes using the config backup for recovery.

v1.2.0

Major update: simplified installation and a new contamination lockdown.
- Installation process is now a single, self-verifying exec command for reliability.
- New: "Contamination Lockdown" automatically blocks all dangerous tools in a session after critical detection.
- Simplified configuration with idempotent node patch commands (safely preserves existing settings).
- Default Chat Shield mode is now "warn" (reduces false positives on user messages).
- Documentation is shorter, with specific do's and don'ts for config/update/restart.
- Troubleshooting and verification steps clarified for easier setup.

v1.1.0

- Added homepage, source, and credentials metadata for ClawHub security compliance
- Removed curl | bash pipe-to-shell pattern from installer
- Fixed default shield actions: warn → block (warn was removed in v2.1.0)
- Removed pii: mask from defaults (Output Shield not yet active)
- Added explanation for TypeScript file copy step
- Updated pricing to beta rates (50% off)

v1.0.0

- Initial release of AI-Warden setup skill for OpenClaw
- 5 security shields: File, Chat, System, Tool, Agent, all configurable via chat
- Offline mode (free, ~60% accuracy) and API mode (98.9% accuracy, <200ms)
- One-command install, runtime control via /warden commands
- Beta pricing: Free 5K calls/mo, Starter €9, Growth €45, Enterprise €299
- Compatible with OpenClaw 2026.3.14+ and Node.js 18+
- Open source: plugin + detection engine on GitHub

Metadata

Slug ai-warden-setup

Version 1.4.1

License MIT-0

Total installs 0

Current installs 0

Number of versions 6

FAQ

What is AI-Warden — Prompt Injection Protection?

Install, configure, and manage the AI-Warden prompt injection protection plugin for OpenClaw. Publisher: AI-Warden (ai-warden.io). Source: github.com/ai-ward... It is an AI Agent Skill plugin for Claude Code / OpenClaw and currently has 161 cumulative downloads.

How do I install AI-Warden — Prompt Injection Protection?

Run the command "/install ai-warden-setup" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is AI-Warden — Prompt Injection Protection free?

Yes. AI-Warden — Prompt Injection Protection is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does AI-Warden — Prompt Injection Protection support?

AI-Warden — Prompt Injection Protection is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed AI-Warden — Prompt Injection Protection?

It is developed and maintained by ai-warden (@ai-warden). The current version is v1.4.1.
