adrianteng

Prompt Injection Defense

Author: AdrianTeng · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 126
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install prompt-injection-defense
Description
Harden agent sessions against prompt injection from untrusted content. Use when the agent reads web search results, emails, downloaded files, PDFs, or any ex...
Safe-use recommendations
This skill appears to do what it says: tag untrusted outputs, scan them for prompt-injection patterns, and quarantine or accept content before writing to memory. Before installing, consider: (1) set OPENCLAW_WORKSPACE explicitly if you don't want files in your home directory; review filesystem permissions on that workspace. (2) Do not allow the agent to construct shell commands from untrusted input and then pass them to tag-untrusted.sh (that script will execute whatever command you give it). (3) Regularly review the quarantine directory for false positives and for any sensitive data captured there. (4) Treat the scanner as a defense-in-depth tool — it can miss sophisticated attacks; combine with read-only API permissions and human review for risky actions. If you want higher assurance, audit the scripts locally and run them in a sandboxed environment first.
Functional analysis
Type: OpenClaw Skill
Name: prompt-injection-defense
Version: 1.0.0

The bundle is a defensive security toolkit designed to protect OpenClaw agents from prompt injection attacks. It implements a multi-layered defense strategy including content tagging (scripts/tag-untrusted.sh), heuristic-based scanning for adversarial patterns (scripts/scan-content.py), and a gated memory-writing pipeline (scripts/safe-memory-write.sh) that quarantines suspicious input. The instructions in SKILL.md and the logic in the scripts are consistently aligned with the stated purpose of hardening the agent against external adversarial content.
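To make the idea of heuristic scanning concrete, here is a minimal Python sketch. It is not the code in scripts/scan-content.py, which is not reproduced on this page. The pattern names and regular expressions are hypothetical examples chosen for illustration only.

```python
import re

# Hypothetical patterns for illustration only; the skill's real rule set
# is not shown on this page and may differ substantially.
PATTERNS = {
    "override_instructions": re.compile(
        r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.IGNORECASE
    ),
    "role_reassignment": re.compile(r"you\s+are\s+now\s+", re.IGNORECASE),
    "prompt_exfiltration": re.compile(
        r"(reveal|print|show)\s+(your\s+)?system\s+prompt", re.IGNORECASE
    ),
}


def scan(text: str) -> list[str]:
    """Return the names of all patterns that match the given text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]
```

A caller could treat a non-empty result as a signal to quarantine the content rather than store it. As the recommendations above note, a pattern list like this can miss sophisticated attacks, so it should be treated as one layer among several.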
Capability assessment
Purpose & Capability
Name/description match the provided assets: SKILL.md documents tagging, scanning, memory guardrails and canaries; scripts implement scanning (scan-content.py), safe memory writes (safe-memory-write.sh), and tagging (tag-untrusted.sh). No unrelated credentials, binaries, or install steps are requested.
Instruction Scope
Runtime instructions are focused on scanning/tagging/quarantine. tag-untrusted.sh runs an arbitrary command and echoes its output wrapped in tags — this is expected for capturing tool output, but be careful: do not pass untrusted user-supplied strings as executable commands (that would execute them). The SKILL.md itself contains the injection phrases the scanner looks for (hence pre-scan hits); this is expected because the doc teaches detection rules.
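The warning about executable commands can be illustrated with a small Python sketch. It assumes a hypothetical `<untrusted>` tag format and a hypothetical helper name `tag_untrusted`; it is not the logic of tag-untrusted.sh. The point it shows is that the command is passed as a fixed argument list and never through a shell, so shell metacharacters inside arguments are not interpreted.

```python
import subprocess
import sys


def tag_untrusted(argv: list[str], source: str = "tool") -> str:
    """Run a fixed argument vector (no shell) and wrap its stdout in tags.

    The tag name and the `source` attribute are assumptions for this sketch.
    """
    # Passing a list with the default shell=False means characters such as
    # ';' or '|' inside an argument are treated as plain data.
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    body = result.stdout if result.stdout.endswith("\n") else result.stdout + "\n"
    return f'<untrusted source="{source}">\n{body}</untrusted>'


# Usage: the program being run is chosen by the operator, not by untrusted input.
print(tag_untrusted([sys.executable, "-c", "print('search result text')"], "demo"))
```

The argument list itself should still come from trusted code. Avoiding the shell only prevents metacharacter injection; it does not make it safe to run a program whose name was taken from untrusted content.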
Install Mechanism
Instruction-only with small local scripts; no download/install mechanism, package managers, or network fetches embedded in the install. Low installation risk.
Credentials
The skill requests no credentials or required env vars. Scripts write to a workspace path (OPENCLAW_WORKSPACE or default $HOME/.openclaw/workspace) and create memory/quarantine files there — this is consistent with purpose but means the skill will create persistent files on the user's filesystem and may store sanitized or quarantined copies of untrusted content (which could include secrets if such content contained them).
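The workspace behavior described above can be sketched in Python as follows. The environment variable and default path come from the description; the `memory` and `quarantine` subdirectory names and both function names are assumptions made for this sketch, not the layout used by safe-memory-write.sh.

```python
import os
from pathlib import Path


def workspace_dir() -> Path:
    """Use OPENCLAW_WORKSPACE if set, otherwise $HOME/.openclaw/workspace."""
    override = os.environ.get("OPENCLAW_WORKSPACE")
    return Path(override) if override else Path.home() / ".openclaw" / "workspace"


def gated_write(text: str, name: str, suspicious: bool) -> Path:
    """Write flagged content to quarantine and clean content to memory.

    `name` is assumed to be a plain file name supplied by trusted code.
    """
    subdir = "quarantine" if suspicious else "memory"  # hypothetical layout
    target_dir = workspace_dir() / subdir
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_text(text, encoding="utf-8")
    return target
```

Because either directory may end up holding secrets that appeared in untrusted content, it is worth checking the permissions on the resolved workspace path before first use.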
Persistence & Privilege
always:false (not force-installed) and user-invocable:true. The skill writes its own memory/quarantine files (expected). It does not modify other skills or request elevated system privileges.
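How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install prompt-injection-defense
  3. Once installation finishes, invoke the Skill by name or trigger it with /prompt-injection-defense
  4. Provide the required inputs according to the Skill's parameter documentation to get structured output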
Version history
v1.0.0
Initial release focused on agent prompt injection defense.
- Adds layered defense scripts: content tagging, scanning, memory write guardrails, and canary pattern detection.
- New scripts for tagging untrusted input, scanning for attack patterns, and safely writing to memory.
- Includes comprehensive checklist, hardening rules for agents, and practical usage examples.
- Provides reference detection patterns and strong usage guidance for handling any untrusted external content.
- Replaces the earlier prompt skill with a security-focused module.
Metadata
Slug: prompt-injection-defense
Version: 1.0.0
License: MIT-0
Total installs: 1
Current installs: 1
Number of versions: 1
FAQ

What is Prompt Injection Defense?

Harden agent sessions against prompt injection from untrusted content. Use when the agent reads web search results, emails, downloaded files, PDFs, or any ex... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 126 total downloads to date.

How do I install Prompt Injection Defense?

Run the command "/install prompt-injection-defense" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Prompt Injection Defense free?

Yes. Prompt Injection Defense is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Prompt Injection Defense support?

Prompt Injection Defense is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Prompt Injection Defense?

It is developed and maintained by AdrianTeng (@adrianteng); the current version is v1.0.0.
