
Anti-Injection-Skill

Name: Anti-Injection-Skill
Author: georges91560

by Wesley Armando · GitHub ↗ · v2.0.3

cross-platform ⚠ suspicious

Downloads: 10256

Install in OpenClaw

/install security-sentinel-skill

Description

Detect prompt injection, jailbreak, role-hijack, and system extraction attempts. Applies multi-layer defense with semantic analysis and penalty scoring.
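The penalty scoring mentioned above can be illustrated with a minimal sketch. It is not the skill's actual code: the patterns, weights, thresholds, and the `PenaltyTracker` name are all hypothetical, and only the four mode names (Normal, Warning, Alert, Lockdown) come from the v1.0.0 release notes.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns and weights, chosen only for illustration.
PATTERNS = [
    (re.compile(r"ignore (all )?previous instructions", re.I), 40),
    (re.compile(r"reveal (your )?system prompt", re.I), 30),
]

# Hypothetical thresholds: (minimum penalty, mode), checked highest first.
MODES = [(80, "Lockdown"), (50, "Alert"), (20, "Warning"), (0, "Normal")]


@dataclass
class PenaltyTracker:
    penalty: int = 0
    recovery: int = 5  # points forgiven after each clean message

    def check(self, text: str) -> str:
        """Score one message, update the running penalty, return the mode."""
        hit = sum(weight for pattern, weight in PATTERNS if pattern.search(text))
        if hit:
            self.penalty += hit
        else:
            # Automatic recovery: clean input slowly lowers the penalty.
            self.penalty = max(0, self.penalty - self.recovery)
        return self.mode()

    def mode(self) -> str:
        return next(name for floor, name in MODES if self.penalty >= floor)
```

Under these assumed weights, a single matched pattern moves the tracker to Warning, and repeated matches escalate it toward Lockdown until enough clean messages bring the penalty back down.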

Usage Guidance

Install only after reviewing the artifacts and configuration. Prefer ClawHub/manual installation over running install.sh, avoid mutable GitHub-main downloads, keep API/translation/webhook/threat-feed features disabled unless explicitly needed, redact or limit audit logs, and require human approval for lockdown, tool disabling, or any sensitive-file monitoring.
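One way to avoid mutable GitHub-main downloads is to fetch a file from a URL that references a specific commit and to check it against a known SHA-256 digest before use. The sketch below assumes the reviewer has already recorded the expected digest; `fetch_pinned` and `verify_sha256` are hypothetical helper names, not part of the skill.

```python
import hashlib
import urllib.request


def verify_sha256(data: bytes, expected_sha256: str) -> None:
    """Raise ValueError unless data hashes to the expected hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        raise ValueError(f"checksum mismatch: got {actual}")


def fetch_pinned(url: str, expected_sha256: str) -> bytes:
    """Download url and return its bytes only if the digest matches.

    The caller is assumed to pass a URL that names a commit, not a branch.
    """
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    verify_sha256(data, expected_sha256)
    return data
```

A file that fails the check is never returned to the caller, so a changed upstream artifact is rejected rather than silently installed.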

Capability Analysis

Type: OpenClaw Skill
Name: security-sentinel-skill
Version: 2.0.3

The OpenClaw AgentSkills skill bundle, 'security-sentinel', is a defensive security tool designed to detect and block various AI agent attacks, including prompt injection, jailbreaks, system prompt extraction, credential theft, and data exfiltration. All files, including the core SKILL.md and numerous reference markdown files (e.g., advanced-jailbreak-techniques.md, credential-exfiltration-defense.md), consistently describe malicious patterns and behaviors as *threats to be detected and blocked*, not as actions to be performed by the skill itself.

The install.sh script is a standard installer for such a tool, downloading files from a specified GitHub repository and installing legitimate Python dependencies for NLP tasks. The SECURITY.md file explicitly clarifies that the skill's patterns for sensitive paths (e.g., ~/.aws/credentials) are for *detection* purposes only, and the skill itself never accesses these paths.

There is no evidence of intentional harmful behavior by the skill, such as unauthorized data exfiltration, persistence, or remote control; its 'prompt injection' instructions are defensive, guiding the agent to prioritize security checks.

Capability Assessment

⚠ Purpose & Capability

The core purpose is coherent with prompt-injection and jailbreak defense, and most offensive strings are presented as patterns to detect. However, the artifacts also broaden into credential theft response, sensitive file monitoring, command-history analysis, tool lockdown, external translation, Telegram/webhook alerting, and remote threat-feed updates, which exceeds the narrower manifest summary.

⚠ Instruction Scope

The skill instructs agents to run before every user input, tool output, planning step, and tool execution, to wrap tools, sanitize outputs, log broadly, and enter lockdown. That authority is security-related but high-impact and not tightly scoped to user-directed or reversible controls.
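A tighter scope for high-impact controls could be enforced with an approval gate. The following is a sketch under stated assumptions: the action names in `HIGH_IMPACT`, the `gated` helper, and the `approve` callback are hypothetical and do not describe the skill's own interface.

```python
from typing import Callable

# Hypothetical names for actions that should never run unattended.
HIGH_IMPACT = {"lockdown", "disable_tool"}


def gated(action: str, run: Callable[[], str],
          approve: Callable[[str], bool]) -> str:
    """Run a high-impact action only if the approver callback agrees."""
    if action in HIGH_IMPACT and not approve(action):
        return f"{action}: denied, awaiting human approval"
    return run()
```

Low-impact actions pass straight through, while lockdown or tool disabling waits on whatever approval channel the operator wires into `approve`.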

⚠ Install Mechanism

The optional installer downloads mutable files from GitHub main, writes into /workspace, and installs Python packages, falling back to --break-system-packages if the normal install fails. Its uninstall path recursively deletes an environment-controlled INSTALL_DIR without confirmation or path validation.
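The uninstall risk can be reduced by validating the target path and requiring explicit confirmation before any recursive delete. This is a minimal sketch, not the installer's code; `safe_remove`, `allowed_root`, and `confirm` are hypothetical names introduced for illustration.

```python
import shutil
from pathlib import Path


def safe_remove(install_dir: str, allowed_root: str,
                confirm: bool = False) -> Path:
    """Delete install_dir only if it resolves strictly inside allowed_root."""
    root = Path(allowed_root).resolve()
    target = Path(install_dir).resolve()
    # Reject the root itself and anything that is not a descendant of it.
    if target == root or root not in target.parents:
        raise ValueError(f"refusing to delete {target}: outside {root}")
    if not confirm:
        raise PermissionError(f"confirmation required to delete {target}")
    shutil.rmtree(target)
    return target
```

Resolving both paths first means an INSTALL_DIR containing `..` segments or symlinks is checked against where it actually points, not against its spelling.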

⚠ Credentials

Local analysis is claimed as the default, but the documentation and examples include API embeddings, googletrans translation of raw text, Telegram alerts, optional webhooks, persistent audit logs, and threat-feed synchronization. These are plausible for a security product but need clearer data-flow disclosure and opt-in boundaries.
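An opt-in boundary of the kind requested here could be expressed as a configuration object in which every network-facing feature defaults to off. The field names below are hypothetical labels for the features listed above, not the skill's real configuration keys.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NetworkFeatures:
    # Every feature that sends data off the machine is disabled by default.
    api_embeddings: bool = False
    translation: bool = False
    telegram_alerts: bool = False
    webhooks: bool = False
    threat_feed_sync: bool = False


def enabled(cfg: NetworkFeatures) -> list[str]:
    """List the features an operator has explicitly switched on."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name)]
```

Printing `enabled(cfg)` at startup would give a reviewer a one-line disclosure of which data flows are active.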

⚠ Persistence & Privilege

There is no evidence of stealth persistence or a backdoor, but the skill recommends persistent AUDIT.md and metrics logging, broad monitoring, sensitive-file access callbacks, and emergency disabling of shell-like tools. Those controls are powerful enough to require careful review.
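Persistent audit logs are safer when entries are redacted and length-limited before they are written. The sketch below assumes two illustrative secret shapes; the `redact` helper and its patterns are hypothetical, and a real deployment would need a broader, reviewed pattern set.

```python
import re

# Hypothetical secret shapes, for illustration only.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)(password|token|api[_-]?key)\s*[=:]\s*\S+"),
]


def redact(line: str, max_len: int = 200) -> str:
    """Mask known secret shapes, then truncate the entry to max_len."""
    for pattern in SECRET_PATTERNS:
        line = pattern.sub("[REDACTED]", line)
    return line[:max_len]
```

Redacting before truncating avoids cutting a secret in half and leaving a fragment that no longer matches any pattern.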

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install security-sentinel-skill
After installation, invoke the skill by name or use /security-sentinel-skill
Provide required inputs per the skill's parameter spec and get structured output

Version History

v2.0.3

Security Sentinel 2.0.3 Changelog
- Added CONFIGURATION.md with setup and usage information.
- Added SECURITY.md to document security policies and vulnerability disclosure procedures.

v2.0.2

v2.0.2 · 18/02/2026

Major upgrade introducing advanced jailbreak detection:
- Added support for detecting advanced jailbreak techniques and multi-stage attacks (see `advanced-jailbreak-techniques.md`).
- Expanded defense against role-play, emotional manipulation, semantic paraphrasing, poetry/creative format attacks, many-shot jailbreaking, adversarial suffixes, and more.
- Security model enhanced to cover RAG poisoning, credential theft, data exfiltration, and indirect injections via various input modalities.
- Updated SKILL.md with all new detection vectors, improved technical detail, and broader threat coverage.
- Version bumped to 2.0.2.

v2.0.1

security-sentinel-skill 2.0.1 changelog:
- Added comprehensive documentation of advanced jailbreak techniques as a new file: `advanced-jailbreak-techniques-v2.md`.
- No changes to code or detection logic; update is documentation-only.

v2.0.0

Major upgrade introducing advanced jailbreak detection:
- Added support for detecting advanced jailbreak techniques and multi-stage attacks (see `advanced-jailbreak-techniques.md`).
- Expanded defense against role-play, emotional manipulation, semantic paraphrasing, poetry/creative format attacks, many-shot jailbreaking, adversarial suffixes, and more.
- Security model enhanced to cover RAG poisoning, credential theft, data exfiltration, and indirect injections via various input modalities.
- Updated SKILL.md with all new detection vectors, improved technical detail, and broader threat coverage.
- Version bumped to 2.0.0.

v1.1.1

No changes detected in this version. All files and documentation remain identical to the previous release.

v1.1.0

v1.1.0 - Advanced Threats Update

NEW:
- 350 new patterns covering 2024-2026 threats
- Indirect injection (emails, webpages, documents)
- Memory persistence (spAIware, time-shifted attacks)
- Credential theft (ClawHavoc, Atomic Stealer)
- 3 new reference files with detailed documentation

COVERAGE:
- 98% → 98.5% of documented threats
- 347 → 697 core patterns
- 3,850+ total patterns

Based on real-world ClawHavoc campaign ($2.4M stolen).

v1.0.0

Initial release of Security Sentinel: Multi-layer defense for detecting and blocking prompt injection, jailbreaks, and system extraction.
- Detects prompt injection, jailbreak, system prompt extraction, role hijacking, and configuration dump attempts.
- Applies blacklist pattern matching, semantic analysis (intent classification), and evasion tactic detection.
- Introduces a penalty scoring system with automatic recovery and escalating defense modes (Normal, Warning, Alert, Lockdown).
- Supports pre-tool-execution input validation and post-output sanitization to prevent leaks.
- Outputs structured allow/block decisions with detailed reasoning and audit log integration.
- Provides Telegram alert integration for critical security events.

Metadata

Slug: security-sentinel-skill
Version: 2.0.3
License: not specified
All-time Installs: 387
Active Installs: 21
Total Versions: 7

Frequently Asked Questions

What is Anti-Injection-Skill?

Detect prompt injection, jailbreak, role-hijack, and system extraction attempts. Applies multi-layer defense with semantic analysis and penalty scoring. It is an AI Agent Skill for Claude Code / OpenClaw, with 10256 downloads so far.

How do I install Anti-Injection-Skill?

Run "/install security-sentinel-skill" in the OpenClaw or Claude Code chat to install it in one step; no extra setup is required.

Is Anti-Injection-Skill free?

Yes, Anti-Injection-Skill is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Anti-Injection-Skill support?

Anti-Injection-Skill is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Anti-Injection-Skill?

It is built and maintained by Wesley Armando (@georges91560); the current version is v2.0.3.
