Description

Self-evolving security system for agent skills enforcing risk assessment, audit logging, tiered approvals, and continuous rule updates on all skill commands.

README (SKILL.md)

Little Steve Agent Guard

Name: Little Steve Agent Guard
Author: echoofzion

A self-evolving security system for agent skills. Wraps all skill command execution with risk assessment, audit logging, tiered approval, and continuous rule learning.

Dependencies

jq (required) — install via brew install jq or apt install jq
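
A preflight check along the following lines can confirm that jq is on PATH before the guard is used. This is a minimal sketch; the `require_cmd` helper is hypothetical and is not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the skill).
# require_cmd NAME: succeeds if NAME is found on PATH, otherwise
# prints a message to stderr and returns 1.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing required dependency: $1" >&2
    return 1
  fi
}
```

Calling `require_cmd jq || exit 1` at the top of a setup script stops early with a clear message instead of failing later inside the guard.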

Filesystem Scope

This is a cross-skill security guard. By design, it needs read access to other skills' directories to:

guard-exec.sh: read target scripts for static risk analysis before execution
capability-diff.sh: compare a skill's SKILL.md declarations against its actual scripts

The guard does not write to other skills' directories. All writes (audit logs, rules) stay within its own reports/ and rules/ directories.
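
The write-scope claim above can be spot-checked with a path containment test. The sketch below is illustrative only: `is_under` is a hypothetical helper, and nothing here describes how the guard itself validates paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the skill).
# is_under BASE PATH: succeeds if the directory containing PATH
# resolves to BASE or to a directory below BASE.
# Both directories must exist. The final path component is not
# resolved, so a symlinked file would need extra handling.
is_under() {
  local base target
  base=$(cd "$1" 2>/dev/null && pwd -P) || return 1
  target=$(cd "$(dirname -- "$2")" 2>/dev/null && pwd -P) || return 1
  case "$target/" in
    "$base"/*) return 0 ;;   # same directory or a descendant
    *) return 1 ;;           # outside BASE, including ../ escapes
  esac
}
```

Comparing resolved directories with a trailing slash keeps a sibling such as `reports-evil/` from passing as `reports/`.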

Bypass & Emergency Procedures

The runbook (docs/runbook.md) documents emergency bypass procedures (circuit-break, manual script execution, log reset). These are human-operator-only actions for when the guard itself malfunctions. The agent must never execute bypass procedures autonomously.

CRITICAL: Execution Rule

ALL skill script executions MUST go through guard-exec.sh. Never call skill scripts directly. Always use:

bash {baseDir}/scripts/guard-exec.sh exec <script-path> [args...]

Example:

bash {baseDir}/scripts/guard-exec.sh exec {workspaceDir}/skills/<other-skill>/scripts/<script>.sh <command> [args...]

Approval Levels

L1 (low/medium risk): Auto-execute, audit logged
L2 (dry-run): Preview without executing
L3 (high risk): Block and prompt user — output warning, wait for user to reply "确认" or "confirm"
BLOCK (critical): Reject entirely, no execution possible

When guard-exec.sh returns exit code 10 (prompt), present the warning to the user and wait for confirmation. On "确认"/"confirm", re-run with confirm instead of exec.
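
The prompt-and-confirm flow described above could be wrapped on the agent side roughly as follows. This is a sketch under stated assumptions: `run_guarded` and the `GUARD` variable are hypothetical names, the user's reply is read as one line from stdin, and the guard is assumed not to read stdin itself.

```shell
#!/usr/bin/env bash
# Hypothetical agent-side wrapper; not part of the skill itself.
# GUARD is assumed to hold the path to {baseDir}/scripts/guard-exec.sh.
GUARD="${GUARD:-./scripts/guard-exec.sh}"

# run_guarded SCRIPT [ARGS...]
# Runs SCRIPT through the guard. If the guard exits with 10 (prompt),
# reads one line from stdin as the user's reply and re-runs with
# "confirm" only when the reply is exactly "确认" or "confirm".
run_guarded() {
  local status=0 reply=""
  bash "$GUARD" exec "$@" || status=$?
  if [ "$status" -ne 10 ]; then
    return "$status"          # executed, blocked, or failed: pass it on
  fi
  read -r reply || true
  case "$reply" in
    "确认"|"confirm")
      bash "$GUARD" confirm "$@"
      ;;
    *)
      echo "not confirmed; nothing was executed" >&2
      return 10
      ;;
  esac
}
```

Any reply other than the two accepted strings leaves the action unexecuted, which keeps the default on the safe side.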

Agent Command Conventions

Execute a skill command (with guard)

bash {baseDir}/scripts/guard-exec.sh exec <script> [args...]

Confirm a prompted action (after user approval)

bash {baseDir}/scripts/guard-exec.sh confirm <script> [args...]

Preview without executing

bash {baseDir}/scripts/guard-exec.sh dry-run <script> [args...]

Quick risk check

bash {baseDir}/scripts/guard-exec.sh check <script> [args...]

Run capability consistency check on a skill

bash {baseDir}/scripts/capability-diff.sh check --skill-dir <skill-path>

View audit stats

bash {baseDir}/scripts/audit.sh stats

Generate weekly security report

bash {baseDir}/scripts/weekly-report.sh generate [days]

Manage rules

bash {baseDir}/scripts/promote-rule.sh list
bash {baseDir}/scripts/promote-rule.sh add --rule <name> --pattern <regex> --level <low|medium|high|critical>
bash {baseDir}/scripts/promote-rule.sh promote --rule <name>
bash {baseDir}/scripts/promote-rule.sh demote --rule <name>

Test candidate rules against history

bash {baseDir}/scripts/replay-verify.sh test --rule <name>
bash {baseDir}/scripts/replay-verify.sh test-all

Five Core Security Policies (Immutable)

Least Privilege — scripts only access their own data directory
Credential Protection — no secrets in args, output, or logs
Capability Consistency — runtime must match SKILL.md declarations
Outbound Control — no undeclared network access
High-Risk Confirmation — destructive/critical actions need human approval

Risk Classification

Level	Examples
low	read-only: list, view, status check
medium	single-item mutation: add, update status
high	delete, bulk mutation, file write outside data/
critical	network access, secret exposure, system commands
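
Read together with the approval levels listed earlier, this table suggests a simple mapping from risk level to approval level. The sketch below only restates that mapping; `approval_for_risk` is a hypothetical name, and the guard's real classification logic may differ. L2 is left out because dry-run appears to be requested explicitly rather than derived from a risk level.

```shell
#!/usr/bin/env bash
# Hypothetical helper restating the documented mapping.
# approval_for_risk LEVEL: prints the approval level for a risk level.
approval_for_risk() {
  case "$1" in
    low|medium) echo "L1" ;;     # auto-execute, audit logged
    high)       echo "L3" ;;     # block and prompt the user
    critical)   echo "BLOCK" ;;  # reject entirely
    *)
      echo "unknown risk level: $1" >&2
      return 1
      ;;
  esac
}
```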

Data Files

reports/audit-events.jsonl — audit log (auto-created)
reports/failure-dataset.json — failure samples for evolution
rules/active/*.rule — active custom rules
rules/candidates/*.rule — candidate rules pending promotion

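小史安全卫士 (Little Steve Agent Guard)

A self-evolving security system for Agent Skills. It provides risk assessment, audit logging, tiered approval, and continuous rule evolution for all skill commands.

Dependencies

jq (required): install via brew install jq or apt install jq

Filesystem Scope

This is a cross-skill security guard. By design, it needs permission to read other skills' directories:

guard-exec.sh: reads the target script for static risk analysis before execution
capability-diff.sh: compares a skill's SKILL.md declarations against the actual behavior of its scripts

The guard does not write to other skills' directories. All writes (audit logs, rules) stay within its own reports/ and rules/ directories.

Bypass & Emergency Operations

The runbook (docs/runbook.md) documents emergency bypass operations (circuit-break, direct script execution, log reset). These are emergency measures for human operators only, for cases where the guard itself malfunctions. The Agent must never perform bypass operations on its own.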

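Critical Rule: Execution Constraint

All skill script executions must go through guard-exec.sh. Do not call skill scripts directly; always use:

bash {baseDir}/scripts/guard-exec.sh exec <script-path> [args...]

Approval Levels

L1 (low/medium risk): execute automatically, record in the audit log
L2 (preview): preview only, no execution
L3 (high risk): block and prompt the user; show a warning and wait for the user to reply "确认"
BLOCK (critical): reject outright, cannot be executed

When guard-exec.sh returns exit code 10 (prompt), show the warning to the user and wait for confirmation. After the user replies "确认", re-run with confirm in place of exec.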

Agent Execution Conventions

Execute a skill command (with guard)

bash {baseDir}/scripts/guard-exec.sh exec <script> [args...]

Confirm a prompted action (after user approval)

bash {baseDir}/scripts/guard-exec.sh confirm <script> [args...]

Preview without executing

bash {baseDir}/scripts/guard-exec.sh dry-run <script> [args...]

Quick risk check

bash {baseDir}/scripts/guard-exec.sh check <script> [args...]

Run a declaration-versus-behavior consistency check on a skill

bash {baseDir}/scripts/capability-diff.sh check --skill-dir <skill-path>

View audit stats

bash {baseDir}/scripts/audit.sh stats

Generate weekly report

bash {baseDir}/scripts/weekly-report.sh generate [days]

Manage rules

bash {baseDir}/scripts/promote-rule.sh list
bash {baseDir}/scripts/promote-rule.sh add --rule <name> --pattern <regex> --level <low|medium|high|critical>
bash {baseDir}/scripts/promote-rule.sh promote --rule <name>
bash {baseDir}/scripts/promote-rule.sh demote --rule <name>

Test candidate rules

bash {baseDir}/scripts/replay-verify.sh test --rule <name>
bash {baseDir}/scripts/replay-verify.sh test-all

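Five Core Security Policies (Immutable)

Least Privilege: scripts may only access their own data directory
Credential Protection: no secrets may appear in args, output, or logs
Capability Consistency: runtime behavior must match SKILL.md declarations
Outbound Control: no undeclared network access
High-Risk Confirmation: destructive/critical operations require human approval

Risk Classification

Level	Examples
low	read-only operations: list, view, status check
medium	single-item change: add, update status
high	delete, bulk change, file write outside the data directory
critical	network access, secret exposure, system commands

Data Files

reports/audit-events.jsonl: audit log (auto-created)
reports/failure-dataset.json: failure samples (used for evolution)
rules/active/*.rule: active custom rules
rules/candidates/*.rule: candidate rules (pending promotion)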

Usage Guidance

This skill implements a plausible cross-skill guard, but it depends on conventions and static checks rather than true sandboxing. Before installing:

- Confirm you trust the author/source and resolve the registry metadata mismatch (jq should be declared).
- Do NOT rely on the guard alone to prevent writes or exfiltration: run it in a restricted environment (container, chroot, or with OS-level file permissions / SELinux/AppArmor) so executed scripts cannot escape their allowed scope.
- Audit the included scripts (guard-exec.sh, promote-rule.sh, replay-verify.sh) to ensure you understand how rules are added/promoted and how prompts are emitted.
- Treat the runbook bypass commands as sensitive: restrict who/what can run them (do not give the agent permission to run promote/demote/mv or direct execution commands).
- Test in a staging environment: create sample skills with dynamic-eval patterns to see whether the guard detects them and whether false negatives exist.

If you cannot apply OS-level containment or if you need cryptographic enforcement of least privilege, consider a guard that integrates with kernel-level sandboxing or a platform that enforces execution constraints, because this guard enforces policy mostly via static inspection and human approvals.
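
For the staging test suggested above, a throwaway sample skill can be generated with a few lines of shell. This is a sketch only: `make_sample_skill`, the directory layout, and the file name are hypothetical, and the payload is deliberately harmless (it decodes to `echo hello`).

```shell
#!/usr/bin/env bash
# Hypothetical fixture generator for staging tests.
# make_sample_skill DIR: writes DIR/scripts/dynamic-eval.sh, a script
# that hides a harmless command behind base64 + eval, and prints the
# path of the generated script.
make_sample_skill() {
  local dir="$1"
  mkdir -p "$dir/scripts"
  cat > "$dir/scripts/dynamic-eval.sh" <<'EOF'
#!/usr/bin/env bash
# Harmless payload: base64 of "echo hello".
payload="ZWNobyBoZWxsbw=="
eval "$(printf '%s' "$payload" | base64 -d)"
EOF
  printf '%s\n' "$dir/scripts/dynamic-eval.sh"
}
```

The printed path can then be passed to the documented check subcommand, `bash {baseDir}/scripts/guard-exec.sh check <script>`, to see how the guard classifies it. Variants using other dynamic-execution patterns (for example `sh -c` or `/dev/tcp`) help probe for false negatives.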

Capability Analysis

Type: OpenClaw Skill
Name: little-steve-agent-guard
Version: 0.1.4

The 'little-steve-agent-guard' bundle is a comprehensive security framework designed to monitor and gate the execution of other agent skills. It implements a risk-assessment wrapper (guard-exec.sh) that performs static analysis for network binaries, system-level commands, and secret patterns before execution. It includes robust audit logging with automated secret redaction (audit.sh), a capability consistency checker (capability-diff.sh), and a rule evolution system. While it requires broad read access to other skill directories, this is functionally necessary for its stated purpose of cross-skill security auditing, and the logic is strictly defensive with no evidence of malicious intent or data exfiltration.

Capability Assessment

ℹ Purpose & Capability

The name/description (cross-skill guard) align with the scripts: the guard intentionally reads other skills' scripts, compares SKILL.md vs runtime behavior, audits executions, and manages detection rules. Declaring jq in SKILL.md is coherent. However the guard's claims that it 'does not write to other skills' directories' and that core policies are 'enforced' rely on static checks and convention rather than strong sandboxing — executing untrusted scripts directly can still allow writes or network access if the static checks miss dynamic behavior.

⚠ Instruction Scope

SKILL.md explicitly requires all skill script execution to be routed through guard-exec.sh and documents human-only bypass procedures. The runbook and scripts (guard-exec.sh, capability-diff.sh) read other skill directories — expected — but the runbook also documents commands to bypass the guard (mv of rules, direct execution of other skills' scripts) and to reset logs. Those bypass commands are marked human-only but are present and executable; if an agent were allowed to run them (or if a malicious or buggy rule is promoted) protections could be disabled. The guard relies on static pattern checks/args inspection to classify risk; dynamic behaviors inside scripts (eval, base64 decode, runtime interpreters, /dev/tcp, shell -c etc.) could evade detection, and the guard executes the target scripts directly (not sandboxed).

✓ Install Mechanism

No install spec or external downloads — instruction-only with included shell scripts. No network-based installation or archive extraction detected. Risk from install mechanism is low.

✓ Credentials

No environment variables or external credentials requested. SKILL.md declares a single binary dependency (jq), which the code uses. No unexplained credential or system config access is requested.

ℹ Persistence & Privilege

The skill is not forced into always-on (always:false) and model invocation is allowed (normal). However it is a cross-skill tool that, if invoked autonomously, can inspect and gate other skills. Combined with the presence of human-only bypass/runbook commands and rule promotion mechanics, autonomous invocation increases blast radius if misused or if a promoted rule is malicious. The package itself does not attempt to modify other skills' configs, but it requires read access to other skill directories and executes their scripts.

Version History

v0.1.4

Add path whitelist and input validation to rule operations (blocks path traversal and injection)

v0.1.3

Fix capability mismatch: replace inbox.sh example reference with generic placeholder

v0.1.2

Addresses remaining security scan feedback:
- Explicit dependency section (jq) in SKILL.md body
- Document filesystem scope: reads other skills for analysis, writes only own dirs
- Clarify bypass procedures are human-operator-only, never autonomous

v0.1.1

Security hardening based on ClawHub scan feedback:
- audit-events.jsonl now created with chmod 600
- Expanded secret redaction (Bearer, AWS keys, GitHub tokens, JWTs, base64, hex)
- Detect dynamic execution: python/perl/ruby/node -c, /dev/tcp, base64 decode, eval, sh -c
- Rule promotion now requires explicit human confirmation (--confirmed flag)
- Runbook adds security disclaimer about emergency bypass

v0.1.0

Initial release of Little Steve Agent Guard: a self-evolving security wrapper for agent skills.
- Wraps all skill commands with risk assessment, audit logging, tiered approval, and evolving security rules.
- Enforces all skill executions via guard-exec.sh, never direct script calls.
- Provides multi-level risk classification and mandate for human confirmation on high-risk actions.
- Includes tools for audit stats, capability consistency checks, rule management, and security reporting.
- Immutable application of five core security policies such as least privilege, credential protection, and outbound control.

Metadata

Slug little-steve-agent-guard

Version 0.1.4

License —

All-time Installs 2

Active Installs 2

Total Versions 5

Frequently Asked Questions

What is Little Steve Agent Guard?

Self-evolving security system for agent skills enforcing risk assessment, audit logging, tiered approvals, and continuous rule updates on all skill commands. It is an AI Agent Skill for Claude Code / OpenClaw, with 334 downloads so far.

How do I install Little Steve Agent Guard?

Run "/install little-steve-agent-guard" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Little Steve Agent Guard free?

Yes, Little Steve Agent Guard is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Little Steve Agent Guard support?

Little Steve Agent Guard is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Little Steve Agent Guard?

It is built and maintained by EchoOfZion (@echoofzion); the current version is v0.1.4.
