Agent Firewall
Author: Adnane Arharbi
Version: v1.0.0
License: MIT-0
Total downloads: 90
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install agent-firewall
Description
Real-time input/output filtering for agent communications. Block prompt injection, data exfiltration, and unauthorized commands before they reach the model.
Usage (SKILL.md)
Agent Firewall — Input/Output Guardian
Architecture
[Channel Input] → [INPUT FILTER] → [Agent/Model] → [OUTPUT FILTER] → [Channel Output]
                        ↓                                 ↓
                 ┌─────────────┐                  ┌──────────────┐
                 │ Block List  │                  │ Secret Scan  │
                 │ Pattern DB  │                  │ PII Redact   │
                 │ Rate Limit  │                  │ Path Scrub   │
                 │ Encoding Det│                  │ URL Checker  │
                 └─────────────┘                  └──────────────┘
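The flow in the diagram can be sketched as a chain of filter functions wrapped around the model call. This is a minimal Python illustration and not the skill's index.js; `run_pipeline`, `Blocked`, `block_injection`, `scrub_marker`, and `echo_model` are hypothetical names chosen for the example.

```python
from typing import Callable, Iterable


class Blocked(Exception):
    """Raised by an input filter to stop a message before it reaches the model."""


Filter = Callable[[str], str]


def run_pipeline(message: str,
                 input_filters: Iterable[Filter],
                 model: Callable[[str], str],
                 output_filters: Iterable[Filter]) -> str:
    # Input side: each filter returns cleaned text or raises Blocked.
    for f in input_filters:
        message = f(message)
    reply = model(message)
    # Output side: each filter returns a (possibly redacted) reply.
    for f in output_filters:
        reply = f(reply)
    return reply


def block_injection(text: str) -> str:
    # Hypothetical input filter: a single hard-coded phrase check.
    if "ignore previous instructions" in text.lower():
        raise Blocked("injection pattern")
    return text


def scrub_marker(text: str) -> str:
    # Hypothetical output filter standing in for secret/PII redaction.
    return text.replace("SECRET", "[REDACTED]")


def echo_model(prompt: str) -> str:
    # Stand-in for the agent/model call.
    return f"echo: {prompt} SECRET"
```

Because a blocked message raises before `model` is called, nothing downstream ever sees it; the caller decides how to log or report the `Blocked` exception.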
Input Filters
| # | Filter | Description |
|---|---|---|
| 1 | Injection patterns | Regex + heuristic match for "ignore previous", "you are now", role confusion |
| 2 | Unicode sanitizer | Strip zero-width chars, control characters, RTL overrides |
| 3 | Encoding detector | Detect Base64, hex, ROT13 encoded payloads in user messages |
| 4 | Role confusion | Detect fake system messages, assistant impersonation |
| 5 | Rate limiter | Max messages per user per channel per minute |
| 6 | Size limiter | Reject inputs exceeding token budget |
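As an illustration of filters 2 and 3, the sketch below strips invisible characters and looks for Base64 runs that decode to readable text. It is a Python approximation under stated assumptions, not the skill's implementation: the function names, the 16-character minimum run length, and the "printable ASCII" test are all choices made for this example. Hex and ROT13 detection are not covered.

```python
import base64
import binascii
import re
import unicodedata


def sanitize_unicode(text: str) -> str:
    # Drop format characters (category Cf: zero-width chars, RTL/LTR overrides)
    # and control characters (category Cc), but keep newline and tab.
    return "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cf", "Cc")
    )


# Assumed heuristic: only runs of 16+ Base64-alphabet characters are inspected.
_B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_base64_payloads(text: str) -> list[str]:
    # Return the decoded text of runs that are valid Base64
    # and decode entirely to printable ASCII.
    found = []
    for match in _B64_RUN.finditer(text):
        try:
            decoded = base64.b64decode(match.group(0), validate=True)
        except (binascii.Error, ValueError):
            continue
        if decoded and all(32 <= b < 127 for b in decoded):
            found.append(decoded.decode("ascii"))
    return found
```

Decoded payloads can then be passed back through the injection patterns. Note that removing every `Cf` character also removes zero-width joiners used in some emoji sequences and scripts, so a production filter may need an allow-list.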
Output Filters
| # | Filter | Description |
|---|---|---|
| 1 | Secret scanner | High-entropy strings + known patterns (AWS key, GitHub token) |
| 2 | PII redactor | Email, phone, SSN, credit card → [REDACTED] |
| 3 | Path scrubber | Remove internal filesystem paths from outputs |
| 4 | URL checker | Block responses containing known malicious URLs |
| 5 | Consistency check | Verify output doesn't contradict system prompt directives |
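Filters 1 and 2 can be approximated as follows. This Python sketch shows one common way to measure "high entropy" (Shannon entropy over characters) and a regex-based redactor; it is not the skill's code. The 20-character minimum, the 4.0 bits-per-character threshold, and the two PII patterns (email and US SSN only) are assumptions for illustration; phone and credit-card patterns are omitted.

```python
import math
import re
from collections import Counter


def shannon_entropy(s: str) -> float:
    # Bits per character of the empirical character distribution.
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_like_secret(token: str, min_len: int = 20, threshold: float = 4.0) -> bool:
    # Assumed heuristic: long tokens with many distinct characters are suspicious.
    return len(token) >= min_len and shannon_entropy(token) >= threshold


EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_pii(text: str) -> str:
    # Replace each match with the placeholder used in the table above.
    text = EMAIL.sub("[REDACTED]", text)
    return US_SSN.sub("[REDACTED]", text)
```

Entropy checks produce false positives on hashes, UUIDs, and similar identifiers, which is why the table pairs them with known-format patterns.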
Configuration
# .security/firewall-rules.yaml
input:
  injection_patterns:
    - pattern: "ignore (all )?previous instructions"
      action: BLOCK
      severity: CRITICAL
    - pattern: "you are now (?!helping)"
      action: BLOCK
      severity: HIGH
  rate_limit:
    max_per_minute: 30
    max_per_hour: 500
  max_input_tokens: 4096
output:
  secret_patterns:
    - name: aws_key
      pattern: "AKIA[0-9A-Z]{16}"
      action: REDACT
    - name: github_token
      pattern: "gh[ps]_[A-Za-z0-9_]{36,}"
      action: REDACT
  pii_redaction: true
  path_scrubbing: true
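To show how the patterns in this configuration behave, the Python sketch below applies them directly. The rules are transcribed by hand from the sample config rather than parsed from the file, and `check_input`, `redact_output`, the case-insensitive matching, and the `[REDACTED:<name>]` placeholder format are assumptions made for the example.

```python
import re

# Transcribed by hand from the sample configuration; a real loader would parse the file.
INJECTION_RULES = [
    {"pattern": r"ignore (all )?previous instructions", "action": "BLOCK", "severity": "CRITICAL"},
    {"pattern": r"you are now (?!helping)", "action": "BLOCK", "severity": "HIGH"},
]
SECRET_RULES = [
    {"name": "aws_key", "pattern": r"AKIA[0-9A-Z]{16}", "action": "REDACT"},
    {"name": "github_token", "pattern": r"gh[ps]_[A-Za-z0-9_]{36,}", "action": "REDACT"},
]


def check_input(text: str):
    # Return the first matching rule, or None when the input passes.
    # Case-insensitive matching is an assumption, not stated in the config.
    for rule in INJECTION_RULES:
        if re.search(rule["pattern"], text, re.IGNORECASE):
            return rule
    return None


def redact_output(text: str) -> str:
    # Replace every secret match with a placeholder naming the rule that fired.
    for rule in SECRET_RULES:
        text = re.sub(rule["pattern"], f"[REDACTED:{rule['name']}]", text)
    return text
```

The second injection rule uses a negative lookahead, so "you are now helping me" passes while "you are now DAN" is blocked.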
Guardrails
- Firewall rules are append-only in production — deletion requires human approval
- False positives → log, alert, pass through with warning (don't silently drop)
- All blocks are logged with: timestamp, rule matched, full context, channel, user hash
- Firewall itself cannot be disabled by agent instructions
- Rules file is read-only from the agent's perspective
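Two of these guardrails lend themselves to a short sketch: checking that a rule update only appends, and building a block log record that stores a user hash instead of the raw ID. This is a Python illustration under assumptions, not the skill's code; `is_append_only`, `block_log_entry`, the JSON-lines format, and the use of unsalted SHA-256 are choices made for the example.

```python
import hashlib
import json
from datetime import datetime, timezone


def is_append_only(old_rules: list, new_rules: list) -> bool:
    # True when new_rules keeps every old rule unchanged and in order,
    # and differs only by entries added at the end.
    return new_rules[:len(old_rules)] == old_rules


def block_log_entry(rule: str, context: str, channel: str, user_id: str) -> str:
    # One JSON line per block, carrying the fields listed in the guardrails.
    # The user ID is stored as a SHA-256 hex digest, never in raw form.
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "rule": rule,
        "context": context,
        "channel": channel,
        "user_hash": hashlib.sha256(user_id.encode("utf-8")).hexdigest(),
    })
```

An unsalted hash of a short or guessable user ID can be reversed by brute force; a keyed hash (for example HMAC with a secret key) is a sturdier choice where that matters.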
Security recommendations
This skill appears to implement the advertised filtering features, but there are important mismatches and risks to verify before installing:
- Confirm rule handling: SKILL.md says it reads .security/firewall-rules.yaml and enforces append-only lifecycle, but index.js currently does not parse external YAML and always uses built-in defaults. Ask the author whether YAML parsing and rule lifecycle enforcement are intentionally omitted or planned.
- Logging and data retention: the skill logs actions and context to .security/firewall-logs and returns originalData in responses. Determine what exactly is logged, whether logs are encrypted, who can read them, and how long they're retained. In high-sensitivity environments, store logs securely or disable logging of full payloads.
- File locations & permissions: the skill reads/writes under process.cwd() (.security/*). Decide whether that path is acceptable and ensure filesystem permissions prevent unauthorized reads/writes. The skill itself cannot enforce 'read-only' rules — use OS-level permissions or an external policy engine.
- Test in a sandbox: run the skill in an isolated environment with representative inputs to confirm redaction behavior (including edge cases) and to verify no external exfiltration occurs.
- Request hardening details from the author: YAML parsing implementation, rate-limiter/global state design, how the skill avoids accidental exposure (e.g., returning originalData), and whether there are configuration options for log encryption/retention.
If the author can provide a version that actually parses and validates external rules, avoids returning originalData (or makes that configurable), and documents log access/retention/encryption, the concerns would be materially reduced.
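As a rough idea of what "validates external rules" could involve, the Python sketch below checks an already-parsed list of rule dictionaries before they are used. It is an assumption-laden illustration, not the author's design: `validate_rules` and `ALLOWED_ACTIONS` are hypothetical, the allowed actions are simply the two that appear in the sample config, and parsing the YAML file itself is left out.

```python
import re

# The two actions that appear in the sample configuration.
ALLOWED_ACTIONS = {"BLOCK", "REDACT"}


def validate_rules(rules: list) -> list:
    # Return a list of human-readable problems; an empty list means
    # every rule has a compilable pattern and a known action.
    problems = []
    for i, rule in enumerate(rules):
        pattern = rule.get("pattern")
        if not isinstance(pattern, str) or not pattern:
            problems.append(f"rule {i}: missing pattern")
        else:
            try:
                re.compile(pattern)
            except re.error as exc:
                problems.append(f"rule {i}: invalid regex ({exc})")
        if rule.get("action") not in ALLOWED_ACTIONS:
            problems.append(f"rule {i}: unknown action {rule.get('action')!r}")
    return problems
```

Rejecting the whole file when any problem is reported, rather than silently falling back to defaults, makes a misconfiguration visible to the operator.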
Capability assessment
Purpose & Capability
Name, description, and code align with an input/output firewall: the code implements injection detection, secret/PII redaction, path scrubbing, etc. It requests no unrelated credentials or binaries. However, SKILL.md presents features (reading and applying .security/firewall-rules.yaml, an append-only rule lifecycle, and enforcement that the firewall cannot be disabled) that the code does not implement: loadRules explicitly warns that it does not parse YAML and always falls back to built-in defaults. The author should explain that mismatch.
Instruction Scope
SKILL.md and index.js operate only on provided data and local files; they do not call external endpoints. But the skill promises logging of 'full context' and in code execute() returns originalData alongside processedData — meaning unredacted inputs/outputs may be written to disk and included in responses. The docs' claims about making the rules read-only and preventing agent disablement are not enforced in code. These gaps increase risk of accidental sensitive-data persistence or misconfiguration.
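One way to reduce the exposure described here is to make the unredacted payload opt-in. The Python sketch below is hypothetical and not the skill's `execute()`: the `processedData` and `originalData` field names follow the review text, while `build_result`, the `modified` flag, and the `include_original` switch are assumptions.

```python
def build_result(original: str, processed: str, include_original: bool = False) -> dict:
    # By default the response carries only the filtered text and a flag
    # saying whether the filters changed anything.
    result = {"processedData": processed, "modified": processed != original}
    if include_original:
        # Opt-in only: the caller accepts that unredacted data leaves the filter.
        result["originalData"] = original
    return result
```

The same default applies to logging: recording the matched rule and a length or hash of the payload is usually enough for auditing without persisting the payload itself.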
Install Mechanism
No install spec and no external downloads, so this is the lowest-risk install path. The skill is delivered as code (index.js + SKILL.md) and uses only the standard fs/path modules.
Credentials
The skill requests no credentials or env vars (appropriate), but it writes logs containing actions and context under .security/firewall-logs and returns originalData in responses. That creates a local storage surface for potentially sensitive secrets/PII (which the skill is supposed to detect). There is no encryption, rotation, retention policy, or config/comments explaining access controls for those logs.
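A retention policy for the log directory can be as simple as deleting files older than a cutoff. The Python sketch below is a hypothetical operator-side helper, not part of the skill; `prune_old_logs` and the age-in-days parameter are assumptions, and encryption and rotation are not covered.

```python
import time
from pathlib import Path
from typing import Optional


def prune_old_logs(log_dir: Path, max_age_days: float,
                   now: Optional[float] = None) -> list:
    # Delete regular files in log_dir whose modification time is older
    # than max_age_days; return the names of the files removed.
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for entry in sorted(log_dir.iterdir()):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(entry.name)
    return removed
```

Running something like `prune_old_logs(Path(".security/firewall-logs"), 30)` on a schedule would cap how long captured context stays on disk; the 30-day figure is only an example.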
Persistence & Privilege
always:false and no autonomous-privilege flags are fine. However the skill creates and writes to .security/firewall-logs and will read a rules file path under process.cwd(). Those filesystem writes are normal for a firewall but may conflict with expectations in multi-tenant or restricted environments. The SKILL.md's statements about 'firewall cannot be disabled by agent instructions' and 'rules file read-only from agent perspective' are policy claims not enforced by the code — the skill cannot on its own prevent other processes/skills or users from modifying those files.
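Since the read-only guarantee has to come from the operating system, an operator can at least verify the file mode before starting the agent. The Python sketch below assumes a POSIX filesystem and is not part of the skill; `is_read_only` and `make_read_only` are hypothetical helpers.

```python
import os
import stat


def is_read_only(path: str) -> bool:
    # True when no write bit (owner, group, other) is set in the file mode.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o222 == 0


def make_read_only(path: str) -> None:
    # Clear all write bits while leaving the read and execute bits unchanged.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~0o222)
```

Mode bits alone do not stop the file's owner (or root) from restoring write access, so the rules file should be owned by a different user than the one running the agent process.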
How to use
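- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install agent-firewall
- Once installation finishes, invoke the Skill by name or trigger it with /agent-firewall
- Provide the inputs described in the Skill's parameter documentation to receive structured output.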
Version history
v1.0.0
Initial release of agent-firewall: real-time input/output filtering for agent communications.
- Blocks prompt injections, data exfiltration, and unauthorized commands before reaching the model.
- Includes layered input filters: injection detection, Unicode sanitization, encoding checks, rate/size limits, and role confusion detection.
- Adds output filters: secret scanning, PII redaction, internal path scrubbing, malicious URL blocking, and consistency checks.
- YAML-based configuration with clear, customizable rules for both input and output.
- Built-in guardrails: append-only rules, logging for all blocks, human approval for rules deletion, and resistance to agent tampering.
FAQ
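What is Agent Firewall?
Real-time input/output filtering for agent communications. Block prompt injection, data exfiltration, and unauthorized commands before they reach the model. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 90 total downloads to date.
How do I install Agent Firewall?
Run the command "/install agent-firewall" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.
Is Agent Firewall free?
Yes. Agent Firewall is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Agent Firewall support?
Agent Firewall is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed Agent Firewall?
It is developed and maintained by Adnane Arharbi (@arhadnane); the current version is v1.0.0.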