Vigilance
by sanjeet-toosi · v1.0.3 · MIT-0
Downloads: 128 · Stars: 0 · Active Installs: 1 · Versions: 4
Install in OpenClaw
/install vigilance
Description
Evaluate-before-Execute (EBE) guardrail for OpenClaw agents. Issues a mandatory GO / NO-GO decision before any high-stakes tool call. Enforces child-safety p...
Usage Guidance
This package contains two related but distinct evaluator scripts, and its metadata and paths are inconsistent. Verify which eval_engine.py your agent will run and where you must place SENTINEL_CONFIG.md. Before installing:
(1) Inspect both eval_engine.py files yourself, or run them in an isolated environment.
(2) Be aware that when LLM judge mode is enabled and ANTHROPIC_API_KEY or OPENAI_API_KEY is set, the exact --data you pass (commands, URLs, booking details) is sent to external provider APIs. Avoid including secrets or sensitive personal data in --data.
(3) Confirm which file paths the agent expects, and fix the paths in SKILL.md if needed.
(4) Consider disabling LLM judgment (use the rule-based fallback) if you cannot accept third-party data transmission.
(5) If you rely on Chain-of-Thought remaining private, ensure stderr is not forwarded to external logs. CoT is printed to stderr by design.
These issues look like sloppy packaging and disclosure rather than clearly malicious code, but they merit manual review before trusting the skill in production.
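Point (2) above can be approached by scrubbing the payload before it is handed to --data. The following is a minimal sketch, not part of the skill: the function names and the two regex patterns are hypothetical, and the patterns only catch a couple of common credential shapes, so treat it as a starting point rather than a complete filter.

```python
import json
import re

# Hypothetical patterns; extend them for the secrets your own payloads may carry.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{8,}"),                 # API-key-like tokens
    re.compile(r"(?i)(password|token|secret)=[^\s&]+"),  # key=value credentials
]


def redact(text: str) -> str:
    """Replace anything matching a secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def build_data_argument(payload: dict) -> str:
    """Serialize a payload for --data after redacting top-level string values.

    Nested structures are passed through unchanged in this sketch.
    """
    cleaned = {
        key: redact(value) if isinstance(value, str) else value
        for key, value in payload.items()
    }
    return json.dumps(cleaned)
```

The returned string is what you would pass as the value of --data. Redaction of this kind reduces exposure but cannot guarantee that no sensitive content reaches a provider.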
Capability Analysis
Type: OpenClaw Skill
Name: vigilance
Version: 1.0.3
The bundle provides two defensive tools for OpenClaw agents: 'agent-sentinel' (a safety guardrail) and 'agent-eval-engine' (a quality scorer). Both tools use rule-based regex checks and LLM-as-judge calls (via Anthropic, OpenAI, or Ollama) to evaluate agent actions and outputs for safety, compliance, and accuracy. The SKILL.md files contain strict instructions to ensure the agent invokes these evaluators before executing high-stakes tools like shell commands or payments. No evidence of data exfiltration, malicious execution, or persistence was found; the code is transparent, well-documented, and aligned with its stated purpose of providing safety and quality gates.
Capability Assessment
Purpose & Capability
The top-level SKILL.md and many files describe an Evaluate-before-Execute (agent-sentinel) guardrail, which reasonably requires a local Python script and optional LLM keys. However, the package also contains a second, distinct skill (agent-eval-engine / quality scorer) with its own SKILL.md and eval_engine.py. Filenames, in-repo paths, and example invocation paths in docs differ (e.g. ~/.openclaw/skills/agent-sentinel/... vs ~/.openclaw/skills/agent-eval-engine/...), and registry metadata versions/_meta.json entries are inconsistent. This duplication and naming mismatch is unexpected and could confuse which script the agent will actually call.
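Because the package ships more than one eval_engine.py, it can help to list every copy that ends up on disk before wiring the agent to one of them. The sketch below assumes the skills are installed under a single root directory; the function name is hypothetical, and the root you pass in should be whatever your OpenClaw installation actually uses.

```python
from pathlib import Path


def find_evaluators(skills_root: Path, script_name: str = "eval_engine.py") -> list[Path]:
    """Return every copy of script_name under skills_root, sorted for stable output."""
    if not skills_root.is_dir():
        return []
    return sorted(p for p in skills_root.rglob(script_name) if p.is_file())
```

For example, `find_evaluators(Path.home() / ".openclaw" / "skills")` would list the candidates under the directory that the documented example paths point at. If it returns more than one path, compare each against the path your SKILL.md instructs the agent to call.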
Instruction Scope
The sentinel SKILL.md correctly requires running eval_engine.py before certain tool calls and defines JSON stdout parsing. But it (a) instructs callers not to parse stderr even though the script emits Chain-of-Thought to stderr — this can leak internal reasoning if stderr is captured by logs; (b) asks you to pass the exact payload (--data) including URLs, commands, and amounts which the code will include in LLM prompts; and (c) references explicit file paths that may not match the actual layout in the package. The combination means sensitive payloads (commands, URLs, personal data) may be transmitted to third-party LLMs unless you explicitly avoid using them.
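One way to address concern (a) is to call the evaluator through a wrapper that parses only stdout and discards stderr, so the Chain-of-Thought never reaches the caller's logs. This is a sketch under assumptions: only the --data flag is named in the assessment, so the real script may need further arguments, and the "decision" and "reason" keys used in the fallback are hypothetical field names.

```python
import json
import subprocess


def run_evaluator(command: list[str], payload: dict, timeout: float = 30.0) -> dict:
    """Run an evaluator command with --data and return its parsed JSON stdout.

    stderr is sent to DEVNULL so Chain-of-Thought output is never captured.
    """
    result = subprocess.run(
        command + ["--data", json.dumps(payload)],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        timeout=timeout,
        check=False,
    )
    try:
        verdict = json.loads(result.stdout)
    except json.JSONDecodeError:
        verdict = None
    if not isinstance(verdict, dict):
        # Fail closed: an unreadable verdict is treated as a block.
        return {"decision": "BLOCK", "reason": "evaluator output was not a JSON object"}
    return verdict
```

Here `command` would be something like the Python interpreter followed by the path to the eval_engine.py you verified. Failing closed on unparseable output is a design choice of this sketch, not documented behavior of the skill.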
Install Mechanism
There is no automated install spec (no download or installer), which lowers risk: the package is plain files plus a requirements.txt, and users are expected to pip-install the dependencies themselves. No remote download URLs or extract steps are present. The requirements list only standard LLM SDKs (anthropic, openai) and python-dotenv. This is moderate risk only insofar as installing provider SDKs enables sending data to external APIs.
Credentials
Registry metadata declares no required env vars, but both SKILL.md and the Python code expect ANTHROPIC_API_KEY and/or OPENAI_API_KEY (and optionally OLLAMA_HOST). That mismatch means the registry underreports required credentials. Requesting LLM API keys is proportionate for an 'LLM-as-judge' feature, but you must be aware that the tool will (by design) send user intents and --data payloads to those providers — potentially exposing sensitive content to third parties.
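Since the registry does not declare these variables, a quick pre-flight check can show whether the current environment would let the skill reach a third-party provider. The sketch below only inspects the two key names mentioned above; the function name is hypothetical, and whether the skill truly stays rule-based when both are unset is something to confirm by reading the code.

```python
import os
from typing import Mapping

# Variable names are the ones the assessment says the skill reads.
EXTERNAL_KEYS = {"ANTHROPIC_API_KEY": "Anthropic", "OPENAI_API_KEY": "OpenAI"}


def external_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """List third-party providers that have a non-empty API key configured."""
    return [name for var, name in EXTERNAL_KEYS.items() if env.get(var, "").strip()]
```

An empty result from `external_providers()` suggests no external key is available to the process. A non-empty result means --data payloads could be transmitted to the listed providers.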
Persistence & Privilege
The skill does not request 'always: true' and is user-invocable (default). There is no evidence it attempts to modify other skills or system-wide settings. The override protocol references logging overrides, but the provided artifacts do not show any system-wide persistence or privilege escalation beyond standard file reads of SENTINEL_CONFIG.md.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install vigilance
- After installation, invoke the skill by name or use /vigilance
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.3
**Major update: Initial release of the agent-sentinel skill (Evaluate-before-Execute guardrail for OpenClaw agents).**
- Introduces a mandatory evaluation layer ("agent-sentinel") enforcing safety and compliance before any high-stakes tool call.
- Implements strict GO / NO-GO decisions (ALLOW, BLOCK, ADVISE) for critical actions—booking, payment, web search, and shell commands—based on user and safety policy specified in SENTINEL_CONFIG.md.
- Provides a clear command-line interface and response schema; emits structured JSON decisions with severity and suggested alternatives.
- Enforces robust child-safety, budget, and travel preferences via a configurable policy file.
- Includes explicit user override and advisory protocols for handling preference and policy violations.
- Complete documentation of configuration, invocation, and decision handling in the new SKILL.md.
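The three decision values named in these release notes could be handled on the calling side roughly as follows. This is an illustrative sketch only: the "decision" key is a hypothetical field name, and the rule that an ADVISE verdict proceeds only after explicit user confirmation is an assumption about the advisory protocol, which SKILL.md defines authoritatively.

```python
def gate(verdict: dict, user_confirmed: bool = False) -> bool:
    """Return True only when the tool call may proceed."""
    decision = str(verdict.get("decision", "")).upper()
    if decision == "ALLOW":
        return True
    if decision == "ADVISE":
        # Assumption: an advisory proceeds only after explicit user confirmation.
        return user_confirmed
    # BLOCK, a missing field, or an unknown value all stop the call.
    return False
```

Treating anything other than a recognized ALLOW or confirmed ADVISE as a stop keeps the gate fail-closed when the evaluator output is malformed.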
v1.0.2
**agent-eval-engine v1.1.0**
- Bumped version from 1.0.0 to 1.1.0.
- Documentation improved in SKILL.md for clarity and usability.
- No functional changes; API and invocation remain the same.
v1.0.1
- Added _meta.json file for skill metadata.
- Updated eval_engine.py (details not shown).
- No user-facing changes to documentation or usage instructions.
v1.0.0
agent-eval-engine 1.0.0 — initial release
- Provides an objective quality-control evaluator for AI agent outputs.
- Scores responses 0–100 across six dimensions: Safety, Accuracy, Compliance, Intent Alignment, Transparency, and Latency.
- Returns a structured JSON report and a readable Markdown summary.
- Supports API-based evaluation for some criteria (Anthropic/OpenAI) and works partially rule-based without API keys.
- Includes clear invocation, parsing, and rendering instructions.
- Offers configurable quality and latency thresholds via environment variables.
Frequently Asked Questions
What is Vigilance?
Vigilance is an Evaluate-before-Execute (EBE) guardrail for OpenClaw agents. It issues a mandatory GO / NO-GO decision before any high-stakes tool call. Enforces child-safety p... It is an AI Agent Skill for Claude Code / OpenClaw, with 128 downloads so far.
How do I install Vigilance?
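Run "/install vigilance" in the OpenClaw or Claude Code chat to install it. The package has no automated installer for its Python dependencies, so you may also need to pip-install the packages in requirements.txt and, for LLM judge mode, set ANTHROPIC_API_KEY or OPENAI_API_KEY.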
Is Vigilance free?
Yes, Vigilance is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Vigilance support?
Vigilance is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Vigilance?
It is built and maintained by sanjeet-toosi (@sanjeet-toosi); the current version is v1.0.3.