Anti-Injection-Skill
Author: Wesley Armando · v2.0.3
Total downloads: 10256
Favorites: 10
Current installs: 21
Versions: 7
Install in OpenClaw
/install security-sentinel-skill
Description
Detect prompt injection, jailbreak, role-hijack, and system extraction attempts. Applies multi-layer defense with semantic analysis and penalty scoring.
Safe usage recommendations
Install only after reviewing the artifacts and configuration. Prefer ClawHub/manual installation over running install.sh, avoid mutable GitHub-main downloads, keep API/translation/webhook/threat-feed features disabled unless explicitly needed, redact or limit audit logs, and require human approval for lockdown, tool disabling, or any sensitive-file monitoring.
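One way to act on the "redact or limit audit logs" advice is to scrub each entry before it is written. The sketch below is illustrative only: the `redact` helper, its patterns, and the 200-character cap are assumptions, not part of the skill's own rule set.

```python
import re

# Hypothetical patterns for illustration; not taken from the skill.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+"),
]


def redact(text: str, max_len: int = 200) -> str:
    """Mask secret-looking substrings, then cap the entry length."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    if len(text) > max_len:
        text = text[:max_len] + "...[truncated]"
    return text


if __name__ == "__main__":
    print(redact("user sent password=hunter2 in a prompt"))
```

Pattern-based redaction is best-effort: it only catches secrets that match a known shape, so limiting what gets logged in the first place remains the stronger control.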
Functional analysis
Type: OpenClaw Skill
Name: security-sentinel-skill
Version: 2.0.3
The OpenClaw AgentSkills skill bundle, 'security-sentinel', is a defensive security tool designed to detect and block various AI agent attacks, including prompt injection, jailbreaks, system prompt extraction, credential theft, and data exfiltration. All files, including the core SKILL.md and numerous reference markdown files (e.g., advanced-jailbreak-techniques.md, credential-exfiltration-defense.md), consistently describe malicious patterns and behaviors as *threats to be detected and blocked*, not as actions to be performed by the skill itself. The install.sh script is a standard installer for such a tool, downloading files from a specified GitHub repository and installing legitimate Python dependencies for NLP tasks. The SECURITY.md file explicitly clarifies that the skill's patterns for sensitive paths (e.g., ~/.aws/credentials) are for *detection* purposes only, and the skill itself never accesses these paths. There is no evidence of intentional harmful behavior by the skill, such as unauthorized data exfiltration, persistence, or remote control; its 'prompt injection' instructions are defensive, guiding the agent to prioritize security checks.
Capability assessment
Purpose & Capability
The core purpose is coherent with prompt-injection and jailbreak defense, and most offensive strings are presented as patterns to detect. However the artifacts also broaden into credential theft response, sensitive file monitoring, command-history analysis, tool lockdown, external translation, Telegram/webhook alerting, and remote threat-feed updates, which exceeds the narrower manifest summary.
Instruction Scope
The skill instructs agents to run before every user input, tool output, planning step, and tool execution, to wrap tools, sanitize outputs, log broadly, and enter lockdown. That authority is security-related but high-impact and not tightly scoped to user-directed or reversible controls.
Install Mechanism
The optional installer downloads mutable files from GitHub main, writes into /workspace, installs Python packages including with a --break-system-packages fallback, and has an uninstall path that recursively deletes an environment-controlled INSTALL_DIR without confirmation or path validation.
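The uninstall risk described above (a recursive delete of an environment-controlled INSTALL_DIR) can be reduced by validating the path and requiring confirmation first. The following is a minimal Python sketch of that idea, not the skill's installer: `ALLOWED_ROOT`, `validate_install_dir`, and `uninstall` are hypothetical names chosen for illustration.

```python
import shutil
from pathlib import Path

# Hypothetical root under which skill directories are expected to live.
ALLOWED_ROOT = Path("/workspace/skills")


def validate_install_dir(raw: str, allowed_root: Path = ALLOWED_ROOT) -> Path:
    """Return the resolved path only if it lies strictly inside allowed_root."""
    if not raw:
        raise ValueError("INSTALL_DIR is empty")
    target = Path(raw).resolve()  # collapses ".." and follows symlinks
    root = allowed_root.resolve()
    if target == root or root not in target.parents:
        raise ValueError(f"refusing to delete {target}: not inside {root}")
    return target


def uninstall(raw: str, confirm: bool = False,
              allowed_root: Path = ALLOWED_ROOT) -> bool:
    """Delete the install directory only after validation and confirmation."""
    target = validate_install_dir(raw, allowed_root)
    if not confirm:
        return False  # caller must opt in explicitly
    shutil.rmtree(target)
    return True
```

With this shape, an empty value, `/`, the root itself, or a `..` traversal out of the root all raise `ValueError` before anything is removed.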
Credentials
Local analysis is claimed as the default, but the documentation and examples include API embeddings, googletrans translation of raw text, Telegram alerts, optional webhooks, persistent audit logs, and threat-feed synchronization. These are plausible for a security product but need clearer data-flow disclosure and opt-in boundaries.
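An opt-in boundary of the kind called for here could be expressed as a configuration object whose outbound features all default to off. This is a sketch under assumptions: the `OutboundFeatures` class and its field names are hypothetical and only mirror the features listed above; they are not the skill's actual configuration keys.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class OutboundFeatures:
    """Hypothetical switches for features that send data off the machine."""
    api_embeddings: bool = False
    translation: bool = False
    telegram_alerts: bool = False
    webhooks: bool = False
    threat_feed_sync: bool = False


def enabled_flows(cfg: OutboundFeatures) -> List[str]:
    """List the outbound features that are switched on, for disclosure."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name)]


if __name__ == "__main__":
    print(enabled_flows(OutboundFeatures()))               # local-only default
    print(enabled_flows(OutboundFeatures(webhooks=True)))  # one explicit opt-in
```

Printing or logging the result of `enabled_flows` at startup gives a reviewer a single place to see which data leaves the local environment.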
Persistence & Privilege
There is no evidence of stealth persistence or a backdoor, but the skill recommends persistent AUDIT.md and metrics logging, broad monitoring, sensitive-file access callbacks, and emergency disabling of shell-like tools. Those controls are powerful enough to require careful review.
How to use
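- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: `/install security-sentinel-skill`
- Once installation completes, invoke the Skill by name or trigger it with `/security-sentinel-skill`.
- Provide the inputs described in the Skill's parameter documentation to get structured output.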
Version history
v2.0.3
Security Sentinel 2.0.3 Changelog
- Added CONFIGURATION.md with setup and usage information.
- Added SECURITY.md to document security policies and vulnerability disclosure procedures.
v2.0.2 · 18/02/2026
**Major upgrade introducing advanced jailbreak detection:**
- Added support for detecting advanced jailbreak techniques and multi-stage attacks (see `advanced-jailbreak-techniques.md`).
- Expanded defense against role-play, emotional manipulation, semantic paraphrasing, poetry/creative format attacks, many-shot jailbreaking, adversarial suffixes, and more.
- Security model enhanced to cover RAG poisoning, credential theft, data exfiltration, and indirect injections via various input modalities.
- Updated SKILL.md with all new detection vectors, improved technical detail, and broader threat coverage.
- Version bumped to 2.0.2.
v2.0.1
security-sentinel-skill 2.0.1 changelog:
- Added comprehensive documentation of advanced jailbreak techniques as a new file: `advanced-jailbreak-techniques-v2.md`.
- No changes to code or detection logic; update is documentation-only.
v2.0.0
**Major upgrade introducing advanced jailbreak detection:**
- Added support for detecting advanced jailbreak techniques and multi-stage attacks (see `advanced-jailbreak-techniques.md`).
- Expanded defense against role-play, emotional manipulation, semantic paraphrasing, poetry/creative format attacks, many-shot jailbreaking, adversarial suffixes, and more.
- Security model enhanced to cover RAG poisoning, credential theft, data exfiltration, and indirect injections via various input modalities.
- Updated SKILL.md with all new detection vectors, improved technical detail, and broader threat coverage.
- Version bumped to 2.0.0.
v1.1.1
No changes detected in this version.
- Version 1.1.1 contains no updates; all files and documentation remain identical to the previous release.
v1.1.0
v1.1.0 - Advanced Threats Update
NEW:
- 350 new patterns covering 2024-2026 threats
- Indirect injection (emails, webpages, documents)
- Memory persistence (spAIware, time-shifted attacks)
- Credential theft (ClawHavoc, Atomic Stealer)
- 3 new reference files with detailed documentation
COVERAGE:
- 98% → 98.5% of documented threats
- 347 → 697 core patterns
- 3,850+ total patterns
Based on real-world ClawHavoc campaign ($2.4M stolen).
v1.0.0
Initial release of Security Sentinel: Multi-layer defense for detecting and blocking prompt injection, jailbreaks, and system extraction.
- Detects prompt injection, jailbreak, system prompt extraction, role hijacking, and configuration dump attempts.
- Applies blacklist pattern matching, semantic analysis (intent classification), and evasion tactic detection.
- Introduces a penalty scoring system with automatic recovery and escalating defense modes (Normal, Warning, Alert, Lockdown).
- Supports pre-tool-execution input validation and post-output sanitization to prevent leaks.
- Outputs structured allow/block decisions with detailed reasoning and audit log integration.
- Provides Telegram alert integration for critical security events.
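The v1.0.0 notes above describe blacklist matching feeding a penalty score with automatic recovery and four escalating modes. A minimal sketch of how such a scheme might fit together follows; the patterns, weights, thresholds, and recovery rate are all assumptions made for illustration, since the listing does not state the skill's actual values.

```python
import re
from dataclasses import dataclass

# Hypothetical blacklist entries with hypothetical penalty weights.
BLACKLIST = [
    (re.compile(r"(?i)ignore (all )?previous instructions"), 30),
    (re.compile(r"(?i)reveal (your )?system prompt"), 40),
]

# Hypothetical score floors for each mode, highest first.
MODES = [(80, "Lockdown"), (50, "Alert"), (20, "Warning"), (0, "Normal")]


@dataclass
class PenaltyTracker:
    score: int = 0
    recovery: int = 10  # points removed for each clean input

    def mode(self) -> str:
        """Return the first mode whose floor the current score reaches."""
        return next(name for floor, name in MODES if self.score >= floor)

    def check(self, text: str) -> dict:
        """Score one input and return a structured allow/block decision."""
        hits = [(p.pattern, w) for p, w in BLACKLIST if p.search(text)]
        if hits:
            self.score += sum(w for _, w in hits)
        else:
            self.score = max(0, self.score - self.recovery)
        mode = self.mode()
        blocked = bool(hits) or mode == "Lockdown"
        return {
            "decision": "block" if blocked else "allow",
            "mode": mode,
            "score": self.score,
            "matched": [pattern for pattern, _ in hits],
        }
```

In this sketch a clean input lowers the score by a fixed amount, so a session drifts back toward Normal over time, while Lockdown blocks everything until the score recovers below its floor.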
FAQ
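What is Anti-Injection-Skill?
Anti-Injection-Skill detects prompt injection, jailbreak, role-hijack, and system extraction attempts, applying multi-layer defense with semantic analysis and penalty scoring. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 10256 total downloads to date.
How do I install Anti-Injection-Skill?
Run the command `/install security-sentinel-skill` in the OpenClaw or Claude Code chat box for one-step installation; no additional configuration is required.
Is Anti-Injection-Skill free?
Yes. Anti-Injection-Skill is completely free (free and open source) and can be freely downloaded, installed, and used.
Which platforms does Anti-Injection-Skill support?
Anti-Injection-Skill is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Anti-Injection-Skill?
It is developed and maintained by Wesley Armando (@georges91560); the current version is v2.0.3.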