Prompt Shield Publish

Name: Prompt Shield Publish
Author: stlas

功能描述

Prompt Injection Firewall for AI agents. 113 detection patterns, 14 threat categories, zero dependencies. Protects against fake authority, command injection, memory poisoning, skill malware, crypto spam, and more. Hash-chain tamper-proof whitelist with mandatory peer review. Claude Code hook integration.

安全使用建议

This package appears to implement a local prompt-scanner and hook for Claude, but exercise caution before enabling it: - Provenance: the registry lists the source as 'unknown' even though SKILL.md references a GitHub repo. Verify the upstream project and author (download from a trusted repo or vendor) before installing. - Dependencies: despite the claim of "zero dependencies," shield.py requires Python3 and PyYAML. Install those from trusted package sources and inspect installed packages. - Whitelist / peer review: the hash-chain whitelist exists, but the "peer review" protections are enforced only via names in the YAML (strings). There is no cryptographic identity or external approval service in the shipped code, so a local attacker or misconfigured workflow could add approvals or edit the file. Do not enable the whitelist or give it trust until you understand the approval workflow. - File writes: the tool writes whitelist.yaml and whitelist-audit.log in its directory. Consider file permissions, location (use an isolated path), and backups before enabling. - Hook integration: adding the hook requires editing ~/.claude/settings.json; verify what your Claude client does with hook exit codes. Test the scanner locally first (dry-run on benign data) to evaluate false positives and behaviour. - Review for network behavior: the included files show no explicit network calls, but you should search the full shield.py (especially truncated parts) for any HTTP/exec calls before trusting it in production. If you want to proceed: run the code in an isolated environment, inspect the full shield.py for external calls, install PyYAML from a trusted source, and only enable the Claude hook after confirming behaviour and whitelist governance. If you want me to, I can re-scan the remaining truncated parts of shield.py for network or obfuscated behavior (upload the full file text).

功能分析

Type: OpenClaw Skill Name: prompt-shield Version: 3.0.6 This skill bundle, 'prompt-shield', is a security tool designed to act as a Prompt Injection Firewall for AI agents. All files (SKILL.md, prompt-shield-hook.sh, shield.py, patterns.yaml, whitelist.yaml, SCORING.md) consistently describe and implement a defensive mechanism. The `shield.py` script contains logic for pattern matching, heuristic scoring, and a hash-chain-based whitelist to detect and mitigate various attack types (e.g., command injection, reverse shells, data exfiltration attempts, memory poisoning). The `patterns.yaml` file defines the signatures of these malicious activities that the tool *detects*, not performs. The `prompt-shield-hook.sh` integrates this scanner into an agent's input pipeline to block or warn about malicious input. There is no evidence of intentional harmful behavior, data exfiltration, unauthorized execution, or prompt injection *by* this skill against the agent. The design explicitly includes safeguards against false positives, reinforcing its defensive intent.

能力评估

ℹ Purpose & Capability

The files (shield.py, patterns.yaml, whitelist.yaml, hook) implement a local prompt-scanner/whitelist as described. However the top-level description claims "zero dependencies" while SKILL.md and shield.py require Python3 + PyYAML. SKILL.md references a GitHub repo (https://github.com/stlas/PromptShield) but the skill source is marked 'unknown' in registry metadata — this mismatch reduces provenance/trust. Overall the required artifacts (pattern DB, CLI, Claude hook) are coherent with the stated purpose, but the provenance and dependency claim are inconsistent.

ℹ Instruction Scope

Runtime instructions and the provided hook are narrowly scoped: they read input (stdin/JSON) and run local pattern scans, then exit with codes/messages to let Claude accept/warn/block. The skill does not request or read arbitrary system environment variables. It does read and write local files (whitelist.yaml, whitelist-audit.log) in its directory. SKILL.md promotes integrating the hook into ~/.claude/settings.json (requires user edit). Documentation mentions external mechanisms (e.g., SYNAPSE peer approval) and 'peer review' processes that are not implemented or cryptographically enforced in the supplied code — the approval model is just string entries in the YAML, not authenticated peers. That inconsistency could give a false sense of protection.

ℹ Install Mechanism

There is no install spec (instruction-only) but implementation files are included. This means installing is a manual file placement and running shield.py locally. There are no network downloads or packaged installers in the manifest, which lowers supply-chain risk, but the code requires PyYAML (pip). The absence of an automated install step is coherent but the skill's description saying 'zero dependencies' contradicts the actual dependency on PyYAML.

✓ Credentials

The skill asks for no environment variables, no credentials, and no config paths outside its own directory. It writes its own whitelist and audit log files locally. That level of access is proportionate for a local prompt-scanner.

✓ Persistence & Privilege

always:false (default) and disable-model-invocation:false are set — normal for skills. The skill does not request system-wide privileges or modify other skills. It does create/modify whitelist.yaml and whitelist-audit.log in its own directory if used; enabling the hook requires editing the user's Claude settings.json (a manual step by the user).

版本历史

v3.0.6

Audit fixes: version consistency, dependencies corrected (PyYAML), test_shield.py removed from package, SKILL.md metadata updated

v3.0.5

Fix OpenClaw scanner crash: Removed inline fallback patterns from shield.py. All patterns now exclusively loaded from patterns.yaml. QA passed: 29/29 core tests, 73/135 GUARDIAN tests, no regression.

v3.0.4

Fix metadata structure: moved bins from openclaw to requires section for OpenClaw compatibility.

v3.0.3

Clean release: Removed test_shield.py with embedded attack samples to pass security scanners. Runtime files only.

v3.0.2

Initial ClawHub release. 113 patterns, 14 categories, hash-chain whitelist v2, Claude Code hook, zero dependencies.

元数据

Slug prompt-shield

版本 3.0.6

许可证 —

累计安装 2

当前安装数 2

历史版本数 5

常见问题

Prompt Shield Publish 是什么？

Prompt Injection Firewall for AI agents. 113 detection patterns, 14 threat categories, zero dependencies. Protects against fake authority, command injection, memory poisoning, skill malware, crypto spam, and more. Hash-chain tamper-proof whitelist with mandatory peer review. Claude Code hook integration. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件，目前累计下载 1207 次。

如何安装 Prompt Shield Publish？

在 OpenClaw 或 Claude Code 对话框中运行命令「/install prompt-shield」即可一键安装，无需额外配置。

Prompt Shield Publish 是免费的吗？

是的，Prompt Shield Publish 完全免费（开源免费），可自由下载、安装和使用。

Prompt Shield Publish 支持哪些平台？

Prompt Shield Publish 跨平台运行，可在任意部署了 OpenClaw / Claude Code 的环境中使用（cross-platform）。

谁开发了 Prompt Shield Publish？

由 stlas（@stlas）开发并维护，当前版本 v3.0.6。

Prompt Shield Publish 是什么？

如何安装 Prompt Shield Publish？

Prompt Shield Publish 是免费的吗？

Prompt Shield Publish 支持哪些平台？

谁开发了 Prompt Shield Publish？

💬 留言讨论