Agent Firewall

Name: Agent Firewall
Author: arhadnane

By Adnane Arharbi · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install agent-firewall

Description

Real-time input/output filtering for agent communications. Block prompt injection, data exfiltration, and unauthorized commands before they reach the model.

Usage (SKILL.md)

Agent Firewall — Input/Output Guardian

Architecture

[Channel Input] → [INPUT FILTER] → [Agent/Model] → [OUTPUT FILTER] → [Channel Output]
                        ↓                                  ↓
                  ┌─────────────┐                  ┌──────────────┐
                  │ Block List  │                  │ Secret Scan  │
                  │ Pattern DB  │                  │ PII Redact   │
                  │ Rate Limit  │                  │ Path Scrub   │
                  │ Encoding Det│                  │ URL Checker  │
                  └─────────────┘                  └──────────────┘

Input Filters

#	Filter	Description
1	Injection patterns	Regex + heuristic match for "ignore previous", "you are now", role confusion
2	Unicode sanitizer	Strip zero-width chars, control characters, RTL overrides
3	Encoding detector	Detect Base64, hex, ROT13 encoded payloads in user messages
4	Role confusion	Detect fake system messages, assistant impersonation
5	Rate limiter	Max messages per user per channel per minute
6	Size limiter	Reject inputs exceeding token budget
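
As a rough illustration of filters 2 and 3, the Python sketch below strips invisible and bidirectional-control characters and looks for Base64 runs that decode to readable text. It is not the skill's code: the function names, the character set, and the 16-character minimum run length are all assumptions chosen for the example.

```python
import base64
import binascii
import re
import unicodedata

# Zero-width and bidirectional control characters to strip.
# This set is an illustrative assumption, not the skill's actual list.
INVISIBLE = {
    "\u200b", "\u200c", "\u200d", "\u2060", "\ufeff",  # zero-width
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # bidi embeddings/overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # bidi isolates
}


def sanitize_unicode(text: str) -> str:
    """Drop invisible/bidi characters and control characters other than newline and tab."""
    kept = []
    for ch in text:
        if ch in INVISIBLE:
            continue
        if unicodedata.category(ch) == "Cc" and ch not in "\n\t":
            continue
        kept.append(ch)
    return "".join(kept)


# Runs of 16+ Base64 alphabet characters, optionally padded (threshold is an assumption).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_base64_payloads(text: str) -> list[str]:
    """Return decoded text for Base64-looking runs that decode to printable ASCII."""
    found = []
    for match in BASE64_RUN.finditer(text):
        candidate = match.group(0)
        if len(candidate) % 4 != 0:
            continue
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("ascii")
        except (binascii.Error, UnicodeDecodeError):
            continue
        if decoded.isprintable():
            found.append(decoded)
    return found
```

A decoded payload would normally be passed back through the injection-pattern filter, since encoding is mainly useful to an attacker as a way around plain-text regex matching.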

Output Filters

#	Filter	Description
1	Secret scanner	High-entropy strings + known patterns (AWS key, GitHub token)
2	PII redactor	Email, phone, SSN, credit card → `[REDACTED]`
3	Path scrubber	Remove internal filesystem paths from outputs
4	URL checker	Block responses containing known malicious URLs
5	Consistency check	Verify output doesn't contradict system prompt directives
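
The secret scanner and PII redactor can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the entropy threshold (4.0 bits per character), the minimum length (20), and the two PII regexes are hypothetical values picked for the example, and the skill's real patterns may differ.

```python
import math
import re
from collections import Counter


def shannon_entropy(token: str) -> float:
    """Bits per character of the token's empirical character distribution."""
    counts = Counter(token)
    total = len(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str, min_len: int = 20, threshold: float = 4.0) -> bool:
    """Flag long tokens whose characters are close to uniformly distributed."""
    return len(token) >= min_len and shannon_entropy(token) >= threshold


# Hypothetical PII patterns: email addresses and dashed US SSNs only.
PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
]


def redact_pii(text: str) -> str:
    """Replace every PII match with the [REDACTED] marker used in the table above."""
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Entropy checks alone tend to flag hashes, UUIDs, and minified identifiers, which is presumably why the table pairs them with known-format patterns such as AWS keys and GitHub tokens.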

Configuration

# .security/firewall-rules.yaml
input:
  injection_patterns:
    - pattern: "ignore (all )?previous instructions"
      action: BLOCK
      severity: CRITICAL
    - pattern: "you are now (?!helping)"
      action: BLOCK
      severity: HIGH
  rate_limit:
    max_per_minute: 30
    max_per_hour: 500
  max_input_tokens: 4096

output:
  secret_patterns:
    - name: aws_key
      pattern: "AKIA[0-9A-Z]{16}"
      action: REDACT
    - name: github_token
      pattern: "gh[ps]_[A-Za-z0-9_]{36,}"
      action: REDACT
  pii_redaction: true
  path_scrubbing: true
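
To show how rules like these might be evaluated, here is a small Python sketch. The regexes are copied from the configuration above; everything else is an assumption: the rules are hard-coded as a dict rather than parsed from YAML, matching is case-insensitive for input patterns, and `check_input` and `redact_output` are hypothetical names.

```python
import re

# Mirrors the patterns in the configuration above. Hard-coded here so the
# sketch does not assume any particular YAML parser.
RULES = {
    "input": {
        "injection_patterns": [
            {"pattern": r"ignore (all )?previous instructions",
             "action": "BLOCK", "severity": "CRITICAL"},
            {"pattern": r"you are now (?!helping)",
             "action": "BLOCK", "severity": "HIGH"},
        ],
    },
    "output": {
        "secret_patterns": [
            {"name": "aws_key", "pattern": r"AKIA[0-9A-Z]{16}", "action": "REDACT"},
            {"name": "github_token", "pattern": r"gh[ps]_[A-Za-z0-9_]{36,}", "action": "REDACT"},
        ],
    },
}


def check_input(text, rules=RULES):
    """Return the first matching injection rule, or None if the input passes."""
    for rule in rules["input"]["injection_patterns"]:
        # Case-insensitive matching is an assumption; the config does not specify it.
        if re.search(rule["pattern"], text, re.IGNORECASE):
            return rule
    return None


def redact_output(text, rules=RULES):
    """Replace every configured secret pattern with a [REDACTED] marker."""
    for rule in rules["output"]["secret_patterns"]:
        if rule["action"] == "REDACT":
            text = re.sub(rule["pattern"], "[REDACTED]", text)
    return text
```

The negative lookahead in the second pattern means "you are now helping me" passes while "you are now DAN" is blocked. That exemption is narrow, so benign phrasings such as "you are now ready" would still be blocked under this rule as written.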

Guardrails

Firewall rules are append-only in production — deletion requires human approval
False positives → log, alert, pass through with warning (don't silently drop)
All blocks are logged with: timestamp, rule matched, full context, channel, user hash
Firewall itself cannot be disabled by agent instructions
Rules file is read-only from the agent's perspective
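
The logging guardrail lists five fields per block event. One way to produce such a record is sketched below in Python. The field names, the salted SHA-256 user hash, and the JSON-lines layout are assumptions made for illustration; the skill's actual log format is not documented here.

```python
import hashlib
import json
from datetime import datetime, timezone


def user_hash(user_id: str, salt: str) -> str:
    """Salted SHA-256 so the raw user ID never appears in the log."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()


def block_record(rule: str, context: str, channel: str, user_id: str, salt: str) -> str:
    """Serialize one block event as a single JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "rule": rule,
        "context": context,
        "channel": channel,
        "user_hash": user_hash(user_id, salt),
    }
    return json.dumps(entry, sort_keys=True)


def append_record(path: str, line: str) -> None:
    """Append one record; mode "a" only ever adds to the end of the file."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
```

Because "full context" may contain the very secrets or PII the firewall is meant to catch, a deployment might prefer to pass an already-redacted string as `context`.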

Security recommendations

This skill appears to implement the advertised filtering features, but there are important mismatches and risks to verify before installing:

- Confirm rule handling: SKILL.md says it reads .security/firewall-rules.yaml and enforces an append-only lifecycle, but index.js currently does not parse external YAML and always uses built-in defaults. Ask the author whether YAML parsing and rule lifecycle enforcement are intentionally omitted or planned.
- Logging and data retention: the skill logs actions and context to .security/firewall-logs and returns originalData in responses. Determine what exactly is logged, whether logs are encrypted, who can read them, and how long they're retained. In high-sensitivity environments, store logs securely or disable logging of full payloads.
- File locations & permissions: the skill reads/writes under process.cwd() (.security/*). Decide whether that path is acceptable and ensure filesystem permissions prevent unauthorized reads/writes. The skill itself cannot enforce 'read-only' rules; use OS-level permissions or an external policy engine.
- Test in a sandbox: run the skill in an isolated environment with representative inputs to confirm redaction behavior (including edge cases) and to verify no external exfiltration occurs.
- Request hardening details from the author: YAML parsing implementation, rate-limiter/global state design, how the skill avoids accidental exposure (e.g., returning originalData), and whether there are configuration options for log encryption/retention.

If the author can provide a version that actually parses and validates external rules, avoids returning originalData (or makes that configurable), and documents log access/retention/encryption, the concerns would be materially reduced.

Capability assessment

ℹ Purpose & Capability

Name, description, and code align with an input/output firewall: the code implements injection detection, secret/PII redaction, path scrubbing, etc. It requests no unrelated credentials or binaries. However, the SKILL.md presents features (reading and applying .security/firewall-rules.yaml, an append-only rule lifecycle, enforcement that the firewall cannot be disabled) that the code does not actually implement: loadRules explicitly warns that it does not parse YAML and always falls back to built-in defaults. That mismatch should be explained.

⚠ Instruction Scope

SKILL.md and index.js operate only on provided data and local files; they do not call external endpoints. But the skill promises logging of 'full context', and in the code execute() returns originalData alongside processedData, meaning unredacted inputs/outputs may be written to disk and included in responses. The docs' claims about making the rules read-only and preventing agent disablement are not enforced in code. These gaps increase the risk of accidental sensitive-data persistence or misconfiguration.

✓ Install Mechanism

No install spec and no external downloads, which is the lowest-risk install. The skill is delivered as code (index.js + SKILL.md) and only uses the standard fs/path modules.

⚠ Credentials

The skill requests no credentials or env vars (appropriate), but it writes logs containing actions and context under .security/firewall-logs and returns originalData in responses. That creates a local storage surface for potentially sensitive secrets/PII (which the skill is supposed to detect). There is no encryption, rotation, retention policy, or config/comments explaining access controls for those logs.

⚠ Persistence & Privilege

always:false and no autonomous-privilege flags are fine. However, the skill creates and writes to .security/firewall-logs and will read a rules file path under process.cwd(). Those filesystem writes are normal for a firewall but may conflict with expectations in multi-tenant or restricted environments. The SKILL.md's statements about 'firewall cannot be disabled by agent instructions' and 'rules file read-only from agent perspective' are policy claims not enforced by the code: the skill cannot on its own prevent other processes/skills or users from modifying those files.
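
Since the read-only claim has to be enforced outside the skill, one option is to clear the write bits on the rules file at the OS level. The Python sketch below shows the idea on POSIX systems; the helper names are hypothetical, and it only helps if the agent process runs as a different user from the file's owner, because an owner can always restore write permission.

```python
import os
import stat

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path: str) -> None:
    """Clear every write bit on the file, leaving the other permission bits unchanged."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def is_writable_by_anyone(path: str) -> bool:
    """True if the owner, group, or others still hold a write bit."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & WRITE_BITS)
```

A startup check could call `is_writable_by_anyone` on the rules file and refuse to run, or at least log a warning, when it returns True. Directory permissions matter as well, since a writable parent directory allows the file to be replaced outright.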

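How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install agent-firewall
Once installation finishes, invoke the Skill by name or trigger it with /agent-firewall
Provide the inputs described in the Skill's parameter documentation to get structured output.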

Version history

v1.0.0

Initial release of agent-firewall: real-time input/output filtering for agent communications.

- Blocks prompt injections, data exfiltration, and unauthorized commands before reaching the model.
- Includes layered input filters: injection detection, Unicode sanitization, encoding checks, rate/size limits, and role confusion detection.
- Adds output filters: secret scanning, PII redaction, internal path scrubbing, malicious URL blocking, and consistency checks.
- YAML-based configuration with clear, customizable rules for both input and output.
- Built-in guardrails: append-only rules, logging for all blocks, human approval for rules deletion, and resistance to agent tampering.

Metadata

Slug agent-firewall

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Versions 1

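FAQ

What is Agent Firewall?

Real-time input/output filtering for agent communications. Block prompt injection, data exfiltration, and unauthorized commands before they reach the model. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 90 cumulative downloads to date.

How do I install Agent Firewall?

Run the command "/install agent-firewall" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Agent Firewall free?

Yes. Agent Firewall is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Agent Firewall support?

Agent Firewall is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Agent Firewall?

It is developed and maintained by Adnane Arharbi (@arhadnane); the current version is v1.0.0.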