Prompt Guard

Name: Prompt Guard
Author: seojoonkim
Version: v3.6.2

cross-platform ⚠ suspicious

Total downloads: 12563

Current installs: 117

Versions: 17

Install in OpenClaw

/install prompt-guard

Description

650+ pattern AI agent security defense covering prompt injection, supply chain injection, memory poisoning, action gate bypass, unicode steganography, cascad...

Safe-use recommendations

Install only if you are comfortable configuring it explicitly. For sensitive or offline environments, set PG_API_ENABLED=false and disable HiveFence auto-reporting before first use, and consider turning off message-content logging. Treat python3 -m prompt_guard.audit --fix as an administrative host-modification command, not a normal prompt scanner.
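As a minimal sketch of the first recommendation, the environment variable can be set from Python before the library is imported or initialized. Only the name `PG_API_ENABLED` comes from the text above. The helper `harden_environment` is hypothetical, and whether the package reads the variable at import time or later is an assumption. The HiveFence and message-logging switches are not named in this listing, so they are not shown here.

```python
import os


def harden_environment(env=None):
    """Set PG_API_ENABLED=false in the given mapping (defaults to os.environ).

    Hypothetical helper: only the variable name comes from the listing.
    """
    target = os.environ if env is None else env
    target["PG_API_ENABLED"] = "false"
    return target


if __name__ == "__main__":
    # Run this before prompt_guard is imported (assumed to matter).
    harden_environment()
    print("PG_API_ENABLED =", os.environ["PG_API_ENABLED"])
```

Setting the variable in the shell or service definition that launches the agent would have the same effect and avoids any dependence on import order.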

Functional analysis

Type: OpenClaw Skill
Name: prompt-guard
Version: 3.6.2

The prompt-guard skill bundle is a comprehensive security library for AI agents, providing over 650 detection patterns for prompt injection, data exfiltration, and tool abuse. It features a multi-layered defense-in-depth architecture including text normalization, multi-encoding decoders (decoder.py), and an enterprise-grade DLP system (output.py) for redacting sensitive credentials. The bundle includes an optional API client (api_client.py) and a distributed threat intelligence client (hivefence.py) that report anonymized threat metadata (hashes) to external endpoints (pg-secure-api.vercel.app and hivefence-api.seojoon-kim.workers.dev). A system audit utility (audit.py) is also provided to check for common security misconfigurations. The code is well-structured, includes extensive regression tests, and demonstrates clear intent to protect agents rather than attack them.
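The layering described above (normalize, decode, then match) can be sketched as follows. This illustrates the general technique and is not the bundle's code: the function names, the single regex, and the base64-only decoder are all assumptions made for the example, whereas the real package is described as shipping 650+ patterns and several decoders.

```python
import base64
import binascii
import re
import unicodedata

# Zero-width and BOM characters sometimes used to hide or split keywords.
_INVISIBLE = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")
# One illustrative pattern only (assumption, not taken from the package).
_INJECTION = re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE)
# Candidate base64 runs: 16+ alphabet characters with optional padding.
_BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def normalize(text):
    """Fold compatibility forms (e.g. fullwidth letters) and drop invisible characters."""
    return _INVISIBLE.sub("", unicodedata.normalize("NFKC", text))


def decoded_variants(text):
    """Yield the text itself plus any base64 runs that decode to valid UTF-8."""
    yield text
    for run in _BASE64_RUN.findall(text):
        try:
            yield base64.b64decode(run, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 or not text; ignore this run


def looks_like_injection(text):
    """Return True if the pattern matches the normalized text or any decoded variant."""
    return any(_INJECTION.search(v) for v in decoded_variants(normalize(text)))
```

Running the matcher over every decoded variant, rather than over the raw input alone, is what lets a single pattern catch the same phrase when it arrives fullwidth, split by zero-width characters, or base64-encoded.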

Capability assessment

⚠ Purpose & Capability

The core prompt-injection and DLP scanning behavior is purpose-aligned, but the package also initializes a remote pattern API by default, auto-reports HIGH+ detections to HiveFence by default, and includes Clawdbot host-audit utilities that inspect local and system configuration.

⚠ Instruction Scope

Documentation is inconsistent: prominent text claims 100% offline or optional/off-by-default API behavior, while SKILL.md, ARCHITECTURE.md, config.example.yaml, and engine.py show API-enabled-by-default behavior with a built-in beta key. HiveFence auto-reporting is also less clearly disclosed than the API reporting path.

✓ Install Mechanism

No install-time execution hooks or destructive installation behavior were found. The Python package uses normal metadata, a CLI entry point, and a small dependency set.

⚠ Credentials

Default runtime behavior can contact external endpoints for pattern fetches and threat reports. The package can also write security logs, optionally include message previews, cache HiveFence data under ~/.clawdbot, and read ~/.clawdbot plus /etc/ssh/sshd_config through the audit utility.

⚠ Persistence & Privilege

Persistence is mostly limited to logs and caches, but the audit module has a user-invoked --fix path that changes filesystem permissions with os.chmod and no per-change confirmation after the flag is supplied.
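Before running a command that changes permissions, it can help to see which paths currently have loose modes. The sketch below is a read-only check under stated assumptions: `loosely_permissioned` is a hypothetical helper, not part of prompt-guard, and `~/.clawdbot` is used only because this listing names it. Which paths `--fix` actually modifies is not stated here.

```python
import os
import stat


def loosely_permissioned(paths):
    """Return (path, octal mode) pairs for existing paths with any group/other bits set.

    Read-only: this only calls os.stat and never changes anything.
    """
    findings = []
    for path in paths:
        expanded = os.path.expanduser(path)
        try:
            mode = stat.S_IMODE(os.stat(expanded).st_mode)
        except OSError:
            continue  # missing or inaccessible paths are skipped
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            findings.append((expanded, oct(mode)))
    return findings


if __name__ == "__main__":
    # Path taken from the listing; adjust for the host being reviewed.
    for path, mode in loosely_permissioned(["~/.clawdbot"]):
        print(f"{path}: {mode}")
```

Reviewing this output first gives an operator the per-path visibility that the `--fix` flow, as described above, does not prompt for.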

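How to use

Make sure OpenClaw is installed (locally or via Docker).
In the chat box, enter the install command: /install prompt-guard
Once installation finishes, invoke the Skill by name or trigger it with /prompt-guard.
Provide the inputs described in the Skill's parameter notes to get structured output.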

Version history

v3.6.2

No code or documentation changes detected in this release.
- Version number updated from 3.6.0 to 3.6.2.
- No functional or documentation changes present.

v3.6.1

No changes in this release.
- Version number updated only.
- No file changes detected.
- No new features, fixes, or updates introduced.

v3.6.0

**v3.6.0 expands coverage to 650+ patterns with new ClawSecurity-aligned detections and attack categories.**
- Added 50+ new detection patterns, including ClawHavoc supply chain signatures, cloud credentials exfiltration, and code exfiltration defense.
- Introduced detection for multi-turn manipulation, authority escalation (e.g., emergency override, sudo grant), and PII output such as SSN and credit cards.
- Enhanced protection against config drift, large data dumps, SQL injection via tool parameters, and path traversal attacks.
- High and medium tiers expanded with new checks for financial data, cross-session attacks, and tool/agent parameter abuse.
- Pattern set updated: now covers prompt injection, supply chain threats, memory poisoning, unicode steganography, cascade amplification, and more across 12 security categories and 10 languages.

v3.5.0

v3.4.0/3.5.0: Typo-based evasion detection + TieredPatternLoader fix (PR #10 by @matthew-a-gordon). 14 new regression tests. Drop-in LLM prompt injection defense.

v3.4.0

v3.4.0: AI Recommendation Poisoning, Calendar Injection, PAP Social Engineering

v3.3.0

**v3.3.0 adds optional API support with early-access and premium pattern tiers.**
- Introduced API client for early-access and premium pattern updates (optional; uses built-in beta key, can be disabled).
- Bundled pattern set expanded to 577+; now includes advanced "skill weaponization" patterns for deeper threat coverage.
- Switchable between fully offline (no API requests) and API-enhanced detection via config or environment variable.
- Improved documentation: clarified API usage, pattern tiers, config, and CLI; security categories and feature set updated.
- New and updated tests for typo evasion and API behavior.
- Internal code updates in engine, scanner, and patterns to support API integration and expanded tier logic.

v3.1.0

Token Optimization: 70% reduction via tiered loading, 90% cache savings, SKILL.md 65% smaller

v2.6.1

prompt-guard v2.6.1 Changelog
- Updated changelog and documentation.
- Minor adjustments in scripts/detect.py (details not specified).

v2.5.3

**prompt-guard v2.6.0 – Major update: HiveFence distributed threat intelligence and new real-world defenses**
- Integrated with HiveFence: agents share and receive new attack patterns via collective defense network.
- New CLI tools for reporting, voting, and syncing threat patterns with HiveFence.
- Additional defenses against social engineering, including single-approval expansion, credential path harvest, and security bypass coaching.
- Owner-only restrictions now enforced for sensitive commands in DMs as well as group chats.
- ARCHITECTURE.md added; major documentation updates.

v2.5.2

Moltbook attack collection: BRC-20 JSON injection, guardrail bypass, agent sovereignty manipulation, CALL TO ACTION detection

v2.5.1

prompt-guard v2.5.1
- Added critical detection for LLM system prompt mimicry (e.g. fake Claude/Anthropic/GPT/LLM tokens, tags, and famous jailbreak markers).
- Blocks attacks attempting to poison session context via `<claude_*>`, `<|im_start|>`, `[INST]`, `GODMODE`, `DAN`, `JAILBREAK`, leetspeak variants, and similar prompts.
- Expanded detection coverage for real-world prompt injection and context poisoning exploits.
- New documentation: Added SECURITY.md and blog post explaining new defenses.
- Updated and reorganized SKILL.md for improved clarity.
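A rough sketch of how markers like those listed in the v2.5.1 notes could be matched with regular expressions is shown below. The patterns and the function name `find_mimicry_markers` are assumptions written for illustration, not the package's actual rules, and leetspeak variants are not covered.

```python
import re

# Structural markers, matched case-insensitively.
_TAGS = re.compile(
    r"<\s*/?\s*claude_\w*[^>]*>"  # <claude_*> style tags, opening or closing
    r"|<\|im_start\|>"            # chat-template role delimiter
    r"|\[/?INST\]",               # instruction markers
    re.IGNORECASE,
)
# Keywords, matched in upper case only so the name "Dan" is not flagged.
_KEYWORDS = re.compile(r"\b(?:GODMODE|DAN|JAILBREAK)\b")


def find_mimicry_markers(text):
    """Return matched markers: structural tags first, then keywords."""
    tags = [m.group(0) for m in _TAGS.finditer(text)]
    keywords = [m.group(0) for m in _KEYWORDS.finditer(text)]
    return tags + keywords
```

Keeping the keyword pattern case-sensitive is a deliberate trade-off in this sketch: it avoids flagging ordinary names at the cost of missing lower-case variants, which a fuller rule set would need to handle separately.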

v2.3.0

Fix: clarify loopback vs webhook mode

v2.2.1

v2.2.1: Enhanced README with threat scenarios, changelog, version badges

v2.2.0

v2.2: Secret protection (blocks token/config requests in EN/KO/JA/ZH), security audit script, infrastructure hardening guide, SSH/gateway/browser security checks

v2.1.0

v2.1: Full English documentation, improved config examples, comprehensive testing guide

v2.0.0

v2.0: Multi-language support (KO/JA/ZH), severity scoring, homoglyph detection, rate limiting, security log analyzer, configurable sensitivity

v1.0.0

Initial release: prompt injection defense for group chats

Metadata

Slug: prompt-guard

Version: 3.6.2

License: not specified

Total installs: 420

Current installs: 117

Version count: 17

FAQ

What is Prompt Guard?

650+ pattern AI agent security defense covering prompt injection, supply chain injection, memory poisoning, action gate bypass, unicode steganography, cascad... It is an AI agent Skill plugin for Claude Code / OpenClaw, with 12563 total downloads to date.

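How do I install Prompt Guard?

Run the command "/install prompt-guard" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, though the safe-use recommendations above advise reviewing the API and reporting settings before first use.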

Is Prompt Guard free?

Yes. Prompt Guard is completely free (free and open source) to download, install, and use.

Which platforms does Prompt Guard support?

Prompt Guard is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Prompt Guard?

It is developed and maintained by seojoonkim (@seojoonkim). The current version is v3.6.2.