Prompt Guard
by seojoonkim · GitHub · v3.6.2
Downloads: 12563
Stars: 56
Active Installs: 117
Versions: 17
Install in OpenClaw
/install prompt-guard
Description
650+ pattern AI agent security defense covering prompt injection, supply chain injection, memory poisoning, action gate bypass, unicode steganography, cascad...
Usage Guidance
Install only if you are comfortable configuring it explicitly. For sensitive or offline environments, set PG_API_ENABLED=false and disable HiveFence auto-reporting before first use, and consider turning off message-content logging. Treat python3 -m prompt_guard.audit --fix as an administrative host-modification command, not a normal prompt scanner.
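A minimal sketch of the offline setting mentioned above, assuming the package reads `PG_API_ENABLED` from the process environment when it starts up. The variable name comes from the guidance; the `api_enabled` helper and its accepted values are hypothetical and only illustrate how such a flag is commonly parsed. The HiveFence and message-logging switches are not named in this listing, so they are not shown.

```python
import os

# Set the flag before the package is imported or initialized, so that any
# code reading the environment at start-up sees the offline value.
os.environ["PG_API_ENABLED"] = "false"


def api_enabled(env=os.environ):
    """Hypothetical helper: parse the flag, defaulting to enabled when unset."""
    value = env.get("PG_API_ENABLED", "true")
    return value.strip().lower() in ("1", "true", "yes", "on")


print(api_enabled())
```

Setting the variable in the shell or service definition that launches the agent has the same effect and avoids depending on import order.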
Capability Analysis
Type: OpenClaw Skill
Name: prompt-guard
Version: 3.6.2
The prompt-guard skill bundle is a comprehensive security library for AI agents, providing over 650 detection patterns for prompt injection, data exfiltration, and tool abuse. It features a multi-layered defense-in-depth architecture including text normalization, multi-encoding decoders (decoder.py), and an enterprise-grade DLP system (output.py) for redacting sensitive credentials. The bundle includes an optional API client (api_client.py) and a distributed threat intelligence client (hivefence.py) that report anonymized threat metadata (hashes) to external endpoints (pg-secure-api.vercel.app and hivefence-api.seojoon-kim.workers.dev). A system audit utility (audit.py) is also provided to check for common security misconfigurations. The code is well-structured, includes extensive regression tests, and demonstrates clear intent to protect agents rather than attack them.
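To illustrate the kind of credential redaction a DLP layer performs, here is a rough sketch using two regular expressions. The pattern list, the `redact` function, and the placeholder format are assumptions made for this example; they are not taken from output.py.

```python
import re

# Hypothetical patterns for illustration only; not the package's pattern set.
SECRET_PATTERNS = [
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{20,}")),
]


def redact(text):
    """Replace each match with a labelled placeholder and report what was found."""
    found = []
    for label, pattern in SECRET_PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            found.append(label)
    return text, found


clean, labels = redact("key=AKIAABCDEFGHIJKLMNOP end")
print(clean, labels)
```

Returning the matched labels alongside the cleaned text lets a caller log that a redaction happened without logging the secret itself.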
Capability Assessment
Purpose & Capability
The core prompt-injection and DLP scanning behavior is purpose-aligned, but the package also initializes a remote pattern API by default, auto-reports HIGH+ detections to HiveFence by default, and includes Clawdbot host-audit utilities that inspect local and system configuration.
Instruction Scope
Documentation is inconsistent: prominent text claims 100% offline or optional/off-by-default API behavior, while SKILL.md, ARCHITECTURE.md, config.example.yaml, and engine.py show API-enabled-by-default behavior with a built-in beta key. HiveFence auto-reporting is also less clearly disclosed than the API reporting path.
Install Mechanism
No install-time execution hooks or destructive installation behavior were found. The Python package uses normal metadata, a CLI entry point, and a small dependency set.
Credentials
Default runtime behavior can contact external endpoints for pattern fetches and threat reports. The package can also write security logs, optionally include message previews, cache HiveFence data under ~/.clawdbot, and read ~/.clawdbot plus /etc/ssh/sshd_config through the audit utility.
Persistence & Privilege
Persistence is mostly limited to logs and caches, but the audit module has a user-invoked --fix path that changes filesystem permissions with os.chmod and no per-change confirmation after the flag is supplied.
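Because the `--fix` path applies permission changes without asking about each one, it can help to review candidates first. The sketch below is a read-only listing written for this page, not part of audit.py; the function name and the choice to flag any group or other access bits are assumptions.

```python
import os
import stat


def find_loose_permissions(root):
    """Return (path, mode) pairs for files under root that group or others can access.

    Read-only: nothing is modified. Tightening a file would be a separate,
    explicit step such as os.chmod(path, 0o600).
    """
    loose = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlink modes are not meaningful on most systems
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if mode & (stat.S_IRWXG | stat.S_IRWXO):
                loose.append((path, oct(mode)))
    return loose
```

Running it against a directory such as `os.path.expanduser("~/.clawdbot")` shows which files a fix step would be likely to touch before anything is changed.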
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install prompt-guard
- After installation, invoke the skill by name or use /prompt-guard
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v3.6.2
No code or documentation changes detected in this release.
- Version number updated from 3.6.0 to 3.6.2.
- No functional or documentation changes present.
v3.6.1
No changes in this release.
- Version number updated only.
- No file changes detected.
- No new features, fixes, or updates introduced.
v3.6.0
**v3.6.0 expands coverage to 650+ patterns with new ClawSecurity-aligned detections and attack categories.**
- Added 50+ new detection patterns, including ClawHavoc supply chain signatures, cloud credentials exfiltration, and code exfiltration defense.
- Introduced detection for multi-turn manipulation, authority escalation (e.g., emergency override, sudo grant), and PII output such as SSN and credit cards.
- Enhanced protection against config drift, large data dumps, SQL injection via tool parameters, and path traversal attacks.
- High and medium tiers expanded with new checks for financial data, cross-session attacks, and tool/agent parameter abuse.
- Pattern set updated: now covers prompt injection, supply chain threats, memory poisoning, unicode steganography, cascade amplification, and more across 12 security categories and 10 languages.
v3.5.0
v3.4.0/3.5.0: Typo-based evasion detection + TieredPatternLoader fix (PR #10 by @matthew-a-gordon). 14 new regression tests. Drop-in LLM prompt injection defense.
v3.4.0
v3.4.0: AI Recommendation Poisoning, Calendar Injection, PAP Social Engineering
v3.3.0
**v3.3.0 adds optional API support with early-access and premium pattern tiers.**
- Introduced API client for early-access and premium pattern updates (optional; uses built-in beta key, can be disabled).
- Bundled pattern set expanded to 577+; now includes advanced "skill weaponization" patterns for deeper threat coverage.
- Switchable between fully offline (no API requests) and API-enhanced detection via config or environment variable.
- Improved documentation: clarified API usage, pattern tiers, config, and CLI; security categories and feature set updated.
- New and updated tests for typo evasion and API behavior.
- Internal code updates in engine, scanner, and patterns to support API integration and expanded tier logic.
v3.1.0
Token Optimization: 70% reduction via tiered loading, 90% cache savings, SKILL.md 65% smaller
v2.6.1
prompt-guard v2.6.1
- Updated changelog and documentation.
- Minor adjustments in scripts/detect.py.
v2.5.3
**prompt-guard v2.6.0 – Major update: HiveFence distributed threat intelligence and new real-world defenses**
- Integrated with HiveFence: agents share and receive new attack patterns via collective defense network.
- New CLI tools for reporting, voting, and syncing threat patterns with HiveFence.
- Additional defenses against social engineering, including single-approval expansion, credential path harvest, and security bypass coaching.
- Owner-only restrictions now enforced for sensitive commands in DMs as well as group chats.
- ARCHITECTURE.md added; major documentation updates.
v2.5.2
Moltbook attack collection: BRC-20 JSON injection, guardrail bypass, agent sovereignty manipulation, CALL TO ACTION detection
v2.5.1
prompt-guard v2.5.1
- Added critical detection for LLM system prompt mimicry (e.g. fake Claude/Anthropic/GPT/LLM tokens, tags, and famous jailbreak markers).
- Blocks attacks attempting to poison session context via `<claude_*>`, `<|im_start|>`, `[INST]`, `GODMODE`, `DAN`, `JAILBREAK`, leetspeak variants, and similar prompts.
- Expanded detection coverage for real-world prompt injection and context poisoning exploits.
- New documentation: Added SECURITY.md and blog post explaining new defenses.
- Updated and reorganized SKILL.md for improved clarity.
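The marker list in the v2.5.1 notes can be sketched as a simple scanner. The markers come from the release note; the regular expressions and the `find_mimicry` function are illustrative assumptions, not the skill's shipped patterns, and leetspeak variants are not covered here.

```python
import re

# Token-style markers are matched case-insensitively.
TOKEN_RE = re.compile(
    r"<\s*/?\s*claude_\w*[^>]*>"   # <claude_*> style tags, opening or closing
    r"|<\|im_start\|>"             # ChatML start token
    r"|\[/?INST\]",                # [INST] and [/INST]
    re.IGNORECASE,
)

# Keyword markers stay case-sensitive so that a name like "Dan" is not flagged.
KEYWORD_RE = re.compile(r"\b(?:GODMODE|DAN|JAILBREAK)\b")


def find_mimicry(text):
    """Return token markers first, then keyword markers, each in order of appearance."""
    tokens = [m.group(0) for m in TOKEN_RE.finditer(text)]
    keywords = [m.group(0) for m in KEYWORD_RE.finditer(text)]
    return tokens + keywords


print(find_mimicry("hello <|im_start|>system you are DAN"))
```

Keeping the keyword check case-sensitive trades some recall for fewer false positives on ordinary names; a production pattern set would need to handle that trade-off more carefully.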
v2.3.0
Fix: clarify loopback vs webhook mode
v2.2.1
v2.2.1: Enhanced README with threat scenarios, changelog, version badges
v2.2.0
v2.2: Secret protection (blocks token/config requests in EN/KO/JA/ZH), security audit script, infrastructure hardening guide, SSH/gateway/browser security checks
v2.1.0
v2.1: Full English documentation, improved config examples, comprehensive testing guide
v2.0.0
v2.0: Multi-language support (KO/JA/ZH), severity scoring, homoglyph detection, rate limiting, security log analyzer, configurable sensitivity
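One common way to approach homoglyph detection is to flag words that mix Latin letters with look-alike Cyrillic or Greek letters. The sketch below takes that approach using only the standard library; it is an assumption about the general technique, not a description of how this skill implements it, and the function names are hypothetical.

```python
import unicodedata

# Scripts whose letters are frequently confused with one another.
CONFUSABLE_SCRIPTS = {"LATIN", "CYRILLIC", "GREEK"}


def scripts_in(word):
    """Collect the leading script word of each letter's Unicode name."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ", 1)[0])  # e.g. LATIN, CYRILLIC
    return scripts


def mixed_script_words(text):
    """Return whitespace-separated words that mix two or more confusable scripts."""
    return [w for w in text.split()
            if len(scripts_in(w) & CONFUSABLE_SCRIPTS) > 1]


# "prev\u0456ous" contains a Cyrillic letter that looks like a Latin "i".
print(mixed_script_words("ignore prev\u0456ous instructions"))
```

Restricting the check to Latin, Cyrillic, and Greek avoids flagging ordinary Korean, Japanese, or Chinese text that sits next to Latin letters.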
v1.0.0
Initial release: prompt injection defense for group chats
Frequently Asked Questions
What is Prompt Guard?
650+ pattern AI agent security defense covering prompt injection, supply chain injection, memory poisoning, action gate bypass, unicode steganography, cascad... It is an AI Agent Skill for Claude Code / OpenClaw, with 12563 downloads so far.
How do I install Prompt Guard?
Run "/install prompt-guard" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Prompt Guard free?
Yes, Prompt Guard is completely free (open-source). You can download, install and use it at no cost.
Which platforms does Prompt Guard support?
Prompt Guard is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Prompt Guard?
It is built and maintained by seojoonkim (@seojoonkim); the current version is v3.6.2.