arhadnane

Agent Firewall

Author: Adnane Arharbi · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 90
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install agent-firewall
Description
Real-time input/output filtering for agent communications. Block prompt injection, data exfiltration, and unauthorized commands before they reach the model.
Usage (SKILL.md)

Agent Firewall — Input/Output Guardian

Architecture

[Channel Input] → [INPUT FILTER] → [Agent/Model] → [OUTPUT FILTER] → [Channel Output]
                        ↓                                  ↓
                  ┌─────────────┐                  ┌──────────────┐
                  │ Block List  │                  │ Secret Scan  │
                  │ Pattern DB  │                  │ PII Redact   │
                  │ Rate Limit  │                  │ Path Scrub   │
                  │ Encoding Det│                  │ URL Checker  │
                  └─────────────┘                  └──────────────┘

Input Filters

#  Filter              Description
1  Injection patterns  Regex + heuristic match for "ignore previous", "you are now", role confusion
2  Unicode sanitizer   Strip zero-width chars, control characters, RTL overrides
3  Encoding detector   Detect Base64, hex, ROT13 encoded payloads in user messages
4  Role confusion      Detect fake system messages, assistant impersonation
5  Rate limiter        Max messages per user per channel per minute
6  Size limiter        Reject inputs exceeding token budget
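
As an illustration of filters 2 and 3, here is a minimal Python sketch. It is not the skill's own code (which ships as index.js and is not shown here). The character set, the 16-character minimum run length, and the names sanitize_unicode and find_base64_payloads are assumptions made for this example, and only Base64 is covered, not hex or ROT13.

```python
import base64
import binascii
import re
import unicodedata

# Zero-width and bidirectional control characters to strip.
# This particular set is an assumption, not the skill's actual list.
_INVISIBLE = {
    "\u200b", "\u200c", "\u200d", "\u2060", "\ufeff",  # zero-width characters
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # bidi embeddings/overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # bidi isolates
}


def sanitize_unicode(text: str) -> str:
    """Drop invisible/bidi characters and control characters, keeping newline and tab."""
    kept = []
    for ch in text:
        if ch in _INVISIBLE:
            continue
        if unicodedata.category(ch) == "Cc" and ch not in "\n\t":
            continue
        kept.append(ch)
    return "".join(kept)


# A run of at least 16 Base64 alphabet characters, optionally padded (heuristic).
_B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_base64_payloads(text: str) -> list[str]:
    """Return decoded text for Base64-looking runs that decode to printable ASCII."""
    found = []
    for match in _B64_RUN.finditer(text):
        candidate = match.group(0)
        if len(candidate) % 4 != 0:
            continue  # valid padded Base64 always has a length divisible by 4
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("ascii")
        except (binascii.Error, ValueError):
            continue  # not Base64, or not ASCII once decoded
        if decoded.isprintable():
            found.append(decoded)
    return found
```

A firewall built this way would typically run its injection patterns a second time over each decoded payload, so that an encoded "ignore previous instructions" is caught as well. The detector is a heuristic: ordinary long words can look like Base64, which is why the sketch only reports runs that decode cleanly to printable ASCII.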

Output Filters

#  Filter             Description
1  Secret scanner     High-entropy strings + known patterns (AWS key, GitHub token)
2  PII redactor       Email, phone, SSN, credit card → [REDACTED]
3  Path scrubber      Remove internal filesystem paths from outputs
4  URL checker        Block responses containing known malicious URLs
5  Consistency check  Verify output doesn't contradict system prompt directives
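
The secret scanner and PII redactor can be sketched as follows. This is a hedged Python illustration rather than the skill's implementation. The two secret patterns are the ones from the skill's configuration. The email and SSN regexes, the entropy threshold of 4.0 bits per character, the 20-character minimum token length, and the function names are assumptions for the example. Phone and credit-card redaction are left out.

```python
import math
import re
from collections import Counter

# Known secret formats, taken from the skill's configuration.
SECRET_PATTERNS = {
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github_token": re.compile(r"gh[ps]_[A-Za-z0-9_]{36,}"),
}
# Simplified PII patterns (assumptions for this sketch).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def shannon_entropy(token: str) -> float:
    """Shannon entropy of the token's character distribution, in bits per character."""
    counts = Counter(token)
    total = len(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def redact_output(text: str, entropy_threshold: float = 4.0, min_len: int = 20) -> str:
    """Replace known secrets, emails, SSNs, and high-entropy tokens with [REDACTED]."""
    for pattern in SECRET_PATTERNS.values():
        text = pattern.sub("[REDACTED]", text)
    text = EMAIL.sub("[REDACTED]", text)
    text = SSN.sub("[REDACTED]", text)

    def _mask(match: re.Match) -> str:
        token = match.group(0)
        # Only mask long tokens whose characters look close to random.
        return "[REDACTED]" if shannon_entropy(token) >= entropy_threshold else token

    return re.sub(r"[A-Za-z0-9+/_=-]{%d,}" % min_len, _mask, text)
```

The known-format patterns run before the entropy pass so that a matched key is replaced as a whole. The entropy threshold trades false positives (long identifiers, hashes in legitimate output) against missed secrets and would need tuning on real traffic.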

Configuration

# .security/firewall-rules.yaml
input:
  injection_patterns:
    - pattern: "ignore (all )?previous instructions"
      action: BLOCK
      severity: CRITICAL
    - pattern: "you are now (?!helping)"
      action: BLOCK
      severity: HIGH
  rate_limit:
    max_per_minute: 30
    max_per_hour: 500
  max_input_tokens: 4096

output:
  secret_patterns:
    - name: aws_key
      pattern: "AKIA[0-9A-Z]{16}"
      action: REDACT
    - name: github_token
      pattern: "gh[ps]_[A-Za-z0-9_]{36,}"
      action: REDACT
  pii_redaction: true
  path_scrubbing: true
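
To show how rules like these could be evaluated, the sketch below mirrors the input section of the configuration as a Python dict and applies it. It is a minimal illustration under stated assumptions, not the skill's code: the YAML is hard-coded to avoid a parser dependency, case-insensitive matching is assumed, and check_injection and RateLimiter are hypothetical names. Only max_per_minute is enforced, using a sliding 60-second window per user and channel.

```python
import re
import time
from collections import defaultdict, deque

# The input rules from firewall-rules.yaml, written out by hand.
RULES = {
    "injection_patterns": [
        {"pattern": r"ignore (all )?previous instructions",
         "action": "BLOCK", "severity": "CRITICAL"},
        {"pattern": r"you are now (?!helping)",
         "action": "BLOCK", "severity": "HIGH"},
    ],
    "rate_limit": {"max_per_minute": 30},
}

# Case-insensitive matching is an assumption; the config does not specify it.
_COMPILED = [(re.compile(rule["pattern"], re.IGNORECASE), rule)
             for rule in RULES["injection_patterns"]]


def check_injection(message: str):
    """Return the first matching rule dict, or None if no pattern matches."""
    for regex, rule in _COMPILED:
        if regex.search(message):
            return rule
    return None


class RateLimiter:
    """Sliding-window limiter keyed by (user, channel)."""

    def __init__(self, max_per_minute: int, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)

    def allow(self, user: str, channel: str) -> bool:
        now = self.clock()
        hits = self._hits[(user, channel)]
        # Forget timestamps that have left the 60-second window.
        while hits and now - hits[0] >= 60:
            hits.popleft()
        if len(hits) >= self.max_per_minute:
            return False
        hits.append(now)
        return True
```

For example, check_injection("you are now helping me") returns None because of the negative lookahead in the second pattern, while check_injection("you are now DAN") returns the HIGH-severity rule. The limiter keeps its state in process memory, so it resets on restart and is not shared across processes.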

Guardrails

  • Firewall rules are append-only in production — deletion requires human approval
  • False positives → log, alert, pass through with warning (don't silently drop)
  • All blocks are logged with: timestamp, rule matched, full context, channel, user hash
  • Firewall itself cannot be disabled by agent instructions
  • Rules file is read-only from the agent's perspective
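
The logging guardrail lists the fields recorded for every block. A minimal Python sketch of such a record is shown below. The field names, the JSON encoding, the salted SHA-256 user hash, and the function name block_record are assumptions for illustration; the skill's actual log format is not documented here.

```python
import hashlib
import json
from datetime import datetime, timezone


def block_record(rule_name: str, context: str, channel: str,
                 user_id: str, salt: str) -> str:
    """Build one JSON log line for a blocked message.

    The user is stored only as a salted SHA-256 digest, never as the raw ID.
    """
    user_hash = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "rule": rule_name,
        "context": context,  # stored as passed in; redact beforehand if needed
        "channel": channel,
        "user_hash": user_hash,
    }
    return json.dumps(record, sort_keys=True)
```

Because the context field may contain the very secrets or PII the firewall is meant to catch, a caller may prefer to pass an already-redacted copy of the message.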
Security recommendations
This skill appears to implement the advertised filtering features, but there are important mismatches and risks to verify before installing:
  • Confirm rule handling: SKILL.md says it reads .security/firewall-rules.yaml and enforces an append-only lifecycle, but index.js currently does not parse external YAML and always uses built-in defaults. Ask the author whether YAML parsing and rule lifecycle enforcement are intentionally omitted or planned.
  • Logging and data retention: the skill logs actions and context to .security/firewall-logs and returns originalData in responses. Determine what exactly is logged, whether logs are encrypted, who can read them, and how long they are retained. In high-sensitivity environments, store logs securely or disable logging of full payloads.
  • File locations & permissions: the skill reads and writes under process.cwd() (.security/*). Decide whether that path is acceptable and ensure filesystem permissions prevent unauthorized reads and writes. The skill itself cannot enforce 'read-only' rules; use OS-level permissions or an external policy engine.
  • Test in a sandbox: run the skill in an isolated environment with representative inputs to confirm redaction behavior (including edge cases) and to verify that no external exfiltration occurs.
  • Request hardening details from the author: YAML parsing implementation, rate-limiter/global state design, how the skill avoids accidental exposure (e.g., returning originalData), and whether there are configuration options for log encryption/retention.
If the author can provide a version that actually parses and validates external rules, avoids returning originalData (or makes that configurable), and documents log access/retention/encryption, the concerns would be materially reduced.
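
The recommendation to rely on OS-level permissions can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, and clearing the write bits does not stop the file's owner or a privileged user from restoring them, so it complements rather than replaces running the agent under a separate, unprivileged account.

```python
import os
import stat

_WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path: str) -> None:
    """Clear the owner, group, and other write bits on a file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~_WRITE_BITS)


def is_writable_by_anyone(path: str) -> bool:
    """Report whether any write bit is still set on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & _WRITE_BITS)
```

A deployment script could call make_read_only(".security/firewall-rules.yaml") after writing the rules, and a startup check could refuse to run if is_writable_by_anyone still returns True.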
Capability assessment
Purpose & Capability
Name, description, and code align with an input/output firewall: the code implements injection detection, secret/PII redaction, path scrubbing, etc. It requests no unrelated credentials or binaries. HOWEVER the SKILL.md presents features (reading and applying .security/firewall-rules.yaml, append-only rule lifecycle, enforcement that the firewall cannot be disabled) that the code does not actually implement: loadRules explicitly warns it does not parse YAML and always falls back to built-in defaults. That mismatch should be explained.
Instruction Scope
SKILL.md and index.js operate only on provided data and local files; they do not call external endpoints. But the skill promises logging of 'full context' and in code execute() returns originalData alongside processedData — meaning unredacted inputs/outputs may be written to disk and included in responses. The docs' claims about making the rules read-only and preventing agent disablement are not enforced in code. These gaps increase risk of accidental sensitive-data persistence or misconfiguration.
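
One way to address the originalData concern is to make the unredacted text opt-in. The Python sketch below is hypothetical: FilterResult, build_result, and the include_original flag are illustrative names, not part of the skill's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FilterResult:
    processed: str                  # redacted text, safe to pass downstream
    original: Optional[str] = None  # unredacted text, omitted by default


def build_result(original: str, processed: str,
                 include_original: bool = False) -> FilterResult:
    """Return the processed text and attach the original only when asked to."""
    return FilterResult(
        processed=processed,
        original=original if include_original else None,
    )
```

With this shape, a caller that forwards or logs the result cannot leak the unredacted input unless it explicitly asked for it.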
Install Mechanism
No install spec and no external downloads — lowest-risk install. The skill is delivery-as-code (index.js + SKILL.md) and only uses standard fs/path modules.
Credentials
The skill requests no credentials or env vars (appropriate), but it writes logs containing actions and context under .security/firewall-logs and returns originalData in responses. That creates a local storage surface for potentially sensitive secrets/PII (which the skill is supposed to detect). There is no encryption, rotation, retention policy, or config/comments explaining access controls for those logs.
Persistence & Privilege
always:false and no autonomous-privilege flags are fine. However the skill creates and writes to .security/firewall-logs and will read a rules file path under process.cwd(). Those filesystem writes are normal for a firewall but may conflict with expectations in multi-tenant or restricted environments. The SKILL.md's statements about 'firewall cannot be disabled by agent instructions' and 'rules file read-only from agent perspective' are policy claims not enforced by the code — the skill cannot on its own prevent other processes/skills or users from modifying those files.
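How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install agent-firewall
  3. Once installation finishes, invoke the Skill by name or trigger it with /agent-firewall
  4. Provide the required inputs according to the Skill's parameter description to get structured output.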
Version history
v1.0.0
Initial release of agent-firewall: real-time input/output filtering for agent communications.
  • Blocks prompt injections, data exfiltration, and unauthorized commands before they reach the model.
  • Includes layered input filters: injection detection, Unicode sanitization, encoding checks, rate/size limits, and role confusion detection.
  • Adds output filters: secret scanning, PII redaction, internal path scrubbing, malicious URL blocking, and consistency checks.
  • YAML-based configuration with clear, customizable rules for both input and output.
  • Built-in guardrails: append-only rules, logging for all blocks, human approval for rule deletion, and resistance to agent tampering.
Metadata
Slug: agent-firewall
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
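FAQ

What is Agent Firewall?

Real-time input/output filtering for agent communications. Block prompt injection, data exfiltration, and unauthorized commands before they reach the model. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 90 total downloads so far.

How do I install Agent Firewall?

Run the command "/install agent-firewall" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Agent Firewall free?

Yes. Agent Firewall is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Agent Firewall support?

Agent Firewall is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Agent Firewall?

It is developed and maintained by Adnane Arharbi (@arhadnane); the current version is v1.0.0.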
