Description

Constitutional guardrails for AI agents — define immutable behavioral rules, permission boundaries, escalation policies, and safety interlocks that prevent a...

Usage Instructions (SKILL.md)

agent-constitution-guard

Name: Agent Constitution Guard
Author: sky-lv

Constitutional guardrails for AI agents — define immutable behavioral rules, permission boundaries, escalation policies, and safety interlocks that agents cannot override regardless of context or user pressure.

Skill Metadata

Slug: agent-constitution-guard
Version: 1.1.0
Author: SKY-lv
Description: Production-grade constitutional guardrails system for AI agents. Define immutable behavioral rules, multi-level permission boundaries, human escalation policies, safety interlocks, and comprehensive audit trails. Agents must obey these rules regardless of context, prompt injection, or social engineering attempts.
Category: safety
License: MIT
Trigger Keywords: constitution, guardrail, permission guard, safety rule, behavioral constraint, boundary check, escalation policy, agent safety, immutable rule, compliance guard, red line, permission boundary

Why This Matters

AI agents with access to files, APIs, and external systems need enforceable boundaries. Without constitutional guardrails:

- An agent could delete production databases in response to a misleading prompt
- An agent could exfiltrate sensitive data to external endpoints
- An agent could spend thousands of dollars on API calls without oversight
- An agent could modify system files, breaking the host environment

This skill provides enforceable, auditable, multi-layered protection.

Architecture

```
┌─────────────────────────────────────┐
│        AGENT ACTION REQUEST         │
└──────────────────┬──────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  Layer 1: IMMUTABLE CHECK           │ ← Cannot be overridden by anyone
│  (Hard safety boundaries)           │
└──────────────────┬──────────────────┘
                   │ PASS
                   ▼
┌─────────────────────────────────────┐
│  Layer 2: POLICY ENGINE             │ ← Rule-based permission checks
│  (Context-aware rules)              │
└──────────────────┬──────────────────┘
                   │ PASS
                   ▼
┌─────────────────────────────────────┐
│  Layer 3: ESCALATION                │ ← Human approval for sensitive ops
│  (Owner confirmation)               │
└──────────────────┬──────────────────┘
                   │ APPROVED
                   ▼
┌─────────────────────────────────────┐
│  Layer 4: AUDIT LOG                 │ ← Every check is recorded
│  (Immutable audit trail)            │
└─────────────────────────────────────┘
```
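The layer ordering in the diagram can be sketched as a short-circuiting pipeline. This is a minimal illustration, not the skill's `constitution.js`: `runLayers`, the layer objects, and the `{ pass, reason }` result shape are all hypothetical names chosen for this example.

```javascript
// Hypothetical sketch: each layer exposes check(request) -> { pass, reason? }.
// None of these names come from the skill itself.
function runLayers(request, layers, auditLog) {
  let decision = { allowed: true, layer: null, reason: 'all layers passed' };
  for (const { name, check } of layers) {
    const result = check(request);
    if (!result.pass) {
      decision = { allowed: false, layer: name, reason: result.reason };
      break; // short-circuit: later layers never see a rejected request
    }
  }
  // Layer 4: record every decision, whether allowed or not.
  auditLog.push({ scope: request.scope, ...decision });
  return decision;
}

const layers = [
  {
    name: 'immutable',
    check: (r) => (r.scope === 'database_drop'
      ? { pass: false, reason: 'hard safety boundary' }
      : { pass: true }),
  },
  {
    name: 'policy',
    check: (r) => (r.scope.startsWith('file_')
      ? { pass: true }
      : { pass: false, reason: 'no policy allows this scope' }),
  },
  {
    name: 'escalation',
    check: (r) => ({ pass: r.ownerApproved !== false, reason: 'owner declined' }),
  },
];

const auditLog = [];
const blocked = runLayers({ scope: 'database_drop' }, layers, auditLog);
const allowed = runLayers({ scope: 'file_read' }, layers, auditLog);
console.log(blocked.layer, allowed.allowed, auditLog.length); // immutable true 2
```

Writing the audit entry outside the loop is what keeps the "every check is recorded" property: a request rejected at Layer 1 is logged just like one that passes all layers.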

Step-by-Step Usage

Step 1: Initialize Constitution

```bash
node constitution.js init --name "my-agent" --owner "[email protected]"
```

Creates .constitution/ directory with default rules and audit log.

Step 2: Add Rules

```bash
# Immutable rule: never call external APIs without confirmation
node constitution.js rule add \
  --level immutable \
  --action deny \
  --scope "external_write" \
  --description "Never write to external APIs without owner confirmation" \
  --escalation owner

# Owner-only rule: can modify workspace files
node constitution.js rule add \
  --level owner-only \
  --action allow \
  --scope "workspace_write" \
  --description "Can write files within workspace directory"

# Mutable rule: can read any file
node constitution.js rule add \
  --level mutable \
  --action allow \
  --scope "file_read" \
  --description "Read any local file"
```
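As a rough sketch of what adding a rule might involve internally, the snippet below validates the flag values and assigns sequential IDs. It assumes the level and action vocabulary seen in the commands above and in the rule-levels table; `createRuleStore` and its methods are hypothetical and are not taken from the skill's code.

```javascript
// Assumed vocabulary, taken from the CLI flags and the rule-levels table;
// the real CLI may accept other values.
const LEVELS = ['immutable', 'owner-only', 'mutable', 'advisory'];
const ACTIONS = ['allow', 'deny'];

// Hypothetical in-memory rule store.
function createRuleStore() {
  const rules = [];
  return {
    add({ level, action, scope, description, escalation = null }) {
      if (!LEVELS.includes(level)) throw new Error(`unknown level: ${level}`);
      if (!ACTIONS.includes(action)) throw new Error(`unknown action: ${action}`);
      if (!scope) throw new Error('scope is required');
      // IDs such as R001 mirror the decision output shown in Step 3.
      const id = 'R' + String(rules.length + 1).padStart(3, '0');
      // Freezing prevents accidental mutation of a stored rule object.
      const rule = Object.freeze({ id, level, action, scope, description, escalation });
      rules.push(rule);
      return rule;
    },
    list: () => rules.slice(), // return a copy so callers cannot edit the store
  };
}

const store = createRuleStore();
const first = store.add({
  level: 'immutable',
  action: 'deny',
  scope: 'external_write',
  description: 'Never write to external APIs without owner confirmation',
  escalation: 'owner',
});
console.log(first.id, store.list().length); // R001 1
```

Note that `Object.freeze` only guards against in-process mutation; it does not by itself make a rule immutable against someone editing the stored rule files.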

Step 3: Check Permissions Before Action

```javascript
const guard = require('./constitution.js');

// Check if an action is allowed
const decision = guard.check('external_write', {
  target: 'https://api.stripe.com/charges',
  payload: { amount: 9999 },
  userId: 'user-123'
});

console.log(decision);
// {
//   allowed: false,
//   layer: 'immutable',
//   rule: 'R001',
//   reason: 'External write requires owner confirmation',
//   escalation: 'owner',
//   escalationMessage: 'Agent wants to POST to https://api.stripe.com/charges. Approve?'
// }
```

Step 4: Handle Escalation

```javascript
// Run this inside an async function (or an ES module) so that `await` is valid.
if (!decision.allowed && decision.escalation) {
  // Send escalation request to owner
  const approved = await guard.escalate(decision, {
    channel: 'webhook', // or 'email', 'slack', 'console'
    timeout: 300000, // 5 min timeout
    details: decision.escalationMessage
  });

  if (approved) {
    await executeAction();
  }
}
```
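The timeout behavior implied by the `timeout` option can be sketched with `Promise.race`. This assumes that an unanswered or failed escalation should count as a denial, which the source does not state explicitly; `escalateWithTimeout` and `requestApproval` are hypothetical names.

```javascript
// requestApproval is a hypothetical function that resolves to true or false
// once the owner responds (via webhook, Slack, and so on).
function escalateWithTimeout(requestApproval, timeoutMs) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs); // no answer means deny
  });
  // Promise.resolve().then(...) also captures synchronous throws.
  const answer = Promise.resolve().then(requestApproval).then(Boolean);
  return Promise.race([answer, timeout])
    .catch(() => false) // a failed channel also means deny
    .finally(() => clearTimeout(timer));
}

// Usage: an owner who never answers leads to a denial after 50 ms.
const neverAnswers = () => new Promise(() => {});
escalateWithTimeout(neverAnswers, 50).then((approved) => {
  console.log('approved:', approved); // approved: false
});
```

Defaulting to denial keeps the guard fail-closed: a broken webhook or an absent owner cannot be turned into an implicit approval.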

Step 5: Review Audit Trail

```bash
# View all decisions in last 24 hours
node constitution.js audit --last 24h

# View only denied actions
node constitution.js audit --status denied

# View audit for specific scope
node constitution.js audit --scope external_write

# Export for compliance reporting
node constitution.js audit --export csv --output audit_2024_Q1.csv
```
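The filters above could plausibly be implemented as plain predicates over audit entries. The sketch below assumes an entry shape of `{ timestamp, scope, status }` with `timestamp` in milliseconds since the epoch; that shape, and the functions `parseDuration`, `filterAudit`, and `toCsv`, are illustrative assumptions rather than the skill's actual format.

```javascript
// Assumed entry shape: { timestamp: <ms since epoch>, scope, status }.
const UNIT_MS = { m: 60000, h: 3600000, d: 86400000 };

// Turns strings such as "24h" or "90d" into milliseconds.
function parseDuration(text) {
  const match = /^(\d+)([mhd])$/.exec(text);
  if (!match) throw new Error(`unsupported duration: ${text}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Every filter is optional; omitted filters match all entries.
function filterAudit(entries, { last, status, scope } = {}, now = Date.now()) {
  const cutoff = last ? now - parseDuration(last) : -Infinity;
  return entries.filter((e) =>
    e.timestamp >= cutoff &&
    (!status || e.status === status) &&
    (!scope || e.scope === scope));
}

// Naive CSV export: assumes no field contains a comma or a quote.
function toCsv(entries) {
  const header = 'timestamp,scope,status';
  const rows = entries.map((e) =>
    [new Date(e.timestamp).toISOString(), e.scope, e.status].join(','));
  return [header, ...rows].join('\n');
}

const now = Date.UTC(2024, 0, 2); // fixed clock so the example is reproducible
const entries = [
  { timestamp: now - 3600000, scope: 'external_write', status: 'denied' },
  { timestamp: now - 2 * 86400000, scope: 'file_read', status: 'allowed' },
];
console.log(filterAudit(entries, { last: '24h' }, now).length); // 1
console.log(toCsv(filterAudit(entries, { status: 'denied' }, now)));
```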

Rule Levels Explained

| Level | Who Can Modify | Override | Use Case |
|-------|----------------|----------|----------|
| Immutable | Nobody | Never | Delete production, external network access, credential access |
| Owner-only | Agent owner only | Never | Deploy to production, modify billing, send emails |
| Mutable | Agent (within bounds) | Self-adjust | File read paths, log verbosity, cache settings |
| Advisory | Anyone | Always | Performance hints, optimization suggestions |
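The "Who Can Modify" column can be read as a small lookup. The following sketch encodes it under two assumptions that the table does not spell out: the owner may also change mutable rules, and actors are identified by simple role strings. `MODIFIERS` and `canModify` are hypothetical names.

```javascript
// Encodes the "Who Can Modify" column. Role names are illustrative, and
// letting the owner edit mutable rules is an assumption, not stated in the table.
const MODIFIERS = {
  immutable: [],               // nobody
  'owner-only': ['owner'],
  mutable: ['owner', 'agent'],
  advisory: '*',               // anyone
};

function canModify(level, actor) {
  const allowed = MODIFIERS[level];
  if (allowed === undefined) throw new Error(`unknown level: ${level}`);
  return allowed === '*' || allowed.includes(actor);
}

console.log(canModify('immutable', 'owner'));  // false
console.log(canModify('owner-only', 'agent')); // false
console.log(canModify('mutable', 'agent'));    // true
console.log(canModify('advisory', 'guest'));   // true
```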

Real-World Examples

Example 1: Production Database Protection

```json
{
  "id": "DB-PROTECT",
  "level": "immutable",
  "action": "deny",
  "scope": ["database_delete", "database_drop", "database_truncate"],
  "description": "Never delete, drop, or truncate any production database",
  "conditions": {
    "environment": ["production", "prod"]
  }
}
```
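One way a rule like this might be matched against a request is sketched below. The semantics are assumed, not documented: a rule applies when the requested scope is listed and, for every key under `conditions`, the request context supplies at least one of the accepted values. `ruleApplies` is a hypothetical helper.

```javascript
// Assumed semantics: the scope must be listed, and every condition key must
// have at least one context value that appears in the rule's accepted list.
function ruleApplies(rule, scope, context = {}) {
  const scopes = Array.isArray(rule.scope) ? rule.scope : [rule.scope];
  if (!scopes.includes(scope)) return false;
  return Object.entries(rule.conditions || {}).every(([key, accepted]) => {
    const actual = [].concat(context[key] ?? []); // accept a string or an array
    return actual.some((value) => accepted.includes(value));
  });
}

const dbProtect = {
  id: 'DB-PROTECT',
  level: 'immutable',
  action: 'deny',
  scope: ['database_delete', 'database_drop', 'database_truncate'],
  conditions: { environment: ['production', 'prod'] },
};

console.log(ruleApplies(dbProtect, 'database_drop', { environment: 'prod' }));    // true
console.log(ruleApplies(dbProtect, 'database_drop', { environment: 'staging' })); // false
console.log(ruleApplies(dbProtect, 'file_read', { environment: 'prod' }));        // false
```

With these semantics, a request that omits `environment` does not match the deny rule. A stricter guard might instead treat missing context as a match so that the rule fails closed.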

Example 2: Cost Control ($100/day API budget)

```json
{
  "id": "COST-GUARD",
  "level": "owner-only",
  "action": "allow",
  "scope": "external_api_call",
  "description": "Allow external API calls within daily budget",
  "limits": {
    "maxDailyCost": 100,
    "maxCostPerCall": 10
  },
  "escalation": "owner"
}
```
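Enforcing the two limits requires some running total of spend. The sketch below keeps that total in memory and resets it on each new UTC calendar day; both the reset policy and the `createBudget` / `tryCharge` names are assumptions made for illustration.

```javascript
// Limits mirror the COST-GUARD rule; the tracker itself is hypothetical.
function createBudget({ maxDailyCost, maxCostPerCall }) {
  let day = null;
  let spent = 0;
  return {
    tryCharge(cost, now = new Date()) {
      const today = now.toISOString().slice(0, 10); // UTC calendar day
      if (today !== day) { day = today; spent = 0; } // reset at day rollover
      if (cost > maxCostPerCall) {
        return { allowed: false, reason: 'per-call limit exceeded' };
      }
      if (spent + cost > maxDailyCost) {
        return { allowed: false, reason: 'daily budget exceeded' };
      }
      spent += cost;
      return { allowed: true, remaining: maxDailyCost - spent };
    },
  };
}

const budget = createBudget({ maxDailyCost: 100, maxCostPerCall: 10 });
const day1 = new Date('2024-01-01T10:00:00Z');
console.log(budget.tryCharge(12, day1).allowed); // false: above the per-call limit
for (let i = 0; i < 10; i++) budget.tryCharge(10, day1); // spends the full 100
console.log(budget.tryCharge(1, day1).reason); // daily budget exceeded
console.log(budget.tryCharge(1, new Date('2024-01-02T00:00:01Z')).allowed); // true
```

A denied charge here would be the natural point to trigger the rule's `"escalation": "owner"` path rather than failing silently.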

Example 3: Data Privacy (GDPR Compliance)

```json
{
  "id": "GDPR-GUARD",
  "level": "immutable",
  "action": "deny",
  "scope": ["data_export", "data_share"],
  "description": "Never export or share PII data without legal approval",
  "conditions": {
    "dataTypes": ["email", "phone", "ssn", "address", "financial"]
  }
}
```

Configuration Reference

```json
{
  "constitution": {
    "version": "1.0",
    "agent": "my-agent",
    "owner": "[email protected]",
    "defaults": {
      "denyAction": "block",
      "logLevel": "all",
      "escalationTimeout": 300000
    },
    "layers": {
      "immutable": { "enabled": true, "log": true },
      "policy": { "enabled": true, "log": true },
      "escalation": { "enabled": true, "channels": ["console"] },
      "audit": { "enabled": true, "retention": "90d" }
    }
  }
}
```
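A configuration like this is worth validating before any guard relies on it. The sketch below checks a few fields under assumed constraints: the channel names come from the comment in Step 4, while the required fields, the millisecond timeout, and the `"<n>d"` retention format are guesses based on the sample values. `validateConfig` is a hypothetical helper.

```javascript
// Channel names come from the Step 4 comment; every other constraint here
// is an assumption inferred from the sample values.
const KNOWN_CHANNELS = ['webhook', 'email', 'slack', 'console'];

// Returns a list of human-readable problems; an empty list means "looks fine".
function validateConfig(config) {
  const c = config.constitution;
  if (!c) return ['missing "constitution" root key'];
  const problems = [];
  if (!c.agent) problems.push('agent is required');
  if (!c.owner) problems.push('owner is required');
  const timeout = c.defaults?.escalationTimeout;
  if (!(Number.isInteger(timeout) && timeout > 0)) {
    problems.push('defaults.escalationTimeout must be a positive integer (ms)');
  }
  for (const channel of c.layers?.escalation?.channels ?? []) {
    if (!KNOWN_CHANNELS.includes(channel)) {
      problems.push(`unknown escalation channel: ${channel}`);
    }
  }
  if (!/^\d+d$/.test(c.layers?.audit?.retention ?? '')) {
    problems.push('layers.audit.retention must look like "90d"');
  }
  return problems;
}

const sample = {
  constitution: {
    agent: 'my-agent',
    owner: 'owner@example.com', // placeholder address for this example
    defaults: { escalationTimeout: 300000 },
    layers: {
      escalation: { enabled: true, channels: ['console'] },
      audit: { enabled: true, retention: '90d' },
    },
  },
};
console.log(validateConfig(sample)); // []
console.log(validateConfig({ constitution: { agent: 'a' } }).length); // 3
```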

Integration Patterns

Pattern 1: Middleware for Express/Fastify

```javascript
app.use(async (req, res, next) => {
  const decision = guard.check('external_write', {
    method: req.method,
    url: req.url
  });
  if (!decision.allowed) return res.status(403).json(decision);
  next();
});
```
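The middleware above checks every request against `external_write`. A variant that derives the scope from the request, and that can be exercised without a running server, might look like the following. The `guardMiddleware` factory, the `scopeFor` mapping, the `external_read` scope, and the stub guard are all hypothetical; only the `(req, res, next)` signature and the `res.status().json()` chaining follow Express conventions.

```javascript
// Hypothetical factory: builds an Express-style middleware from a guard and a
// function that maps a request to a scope name.
function guardMiddleware(guard, scopeFor) {
  return (req, res, next) => {
    const decision = guard.check(scopeFor(req), { method: req.method, url: req.url });
    if (!decision.allowed) return res.status(403).json(decision);
    next();
  };
}

// Stub guard and response objects, for illustration only.
const stubGuard = {
  check: (scope) => (scope === 'external_read'
    ? { allowed: true }
    : { allowed: false, reason: `${scope} requires owner confirmation` }),
};
const scopeFor = (req) => (req.method === 'GET' ? 'external_read' : 'external_write');

function fakeRes() {
  return {
    statusCode: 200,
    body: null,
    status(code) { this.statusCode = code; return this; },
    json(body) { this.body = body; return this; },
  };
}

const middleware = guardMiddleware(stubGuard, scopeFor);
const res = fakeRes();
let nextCalled = false;
middleware({ method: 'POST', url: '/charges' }, res, () => { nextCalled = true; });
console.log(res.statusCode, nextCalled); // 403 false
```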

Pattern 2: OpenClaw Skill Wrapper

```javascript
// Before executing any tool call:
const toolGuard = guard.checkForTool(toolName, toolParams);
if (!toolGuard.allowed) {
  if (toolGuard.escalation === 'owner') {
    // Ask OpenClaw to prompt user for approval
  }
  return { blocked: true, reason: toolGuard.reason };
}
```

Pattern 3: CI/CD Pipeline Gate

```bash
# In your deployment pipeline:
node constitution.js ci-check --env production --strict
# Exit code 0 = safe to deploy, 1 = violations found
```

Safe Usage Recommendations

This package is an instructional README showing how a Node-based 'constitution' system could work. It contains no code or install spec and does not declare the credentials or binaries it expects, so do not assume the skill enforces anything by itself. If you want these guardrails: obtain the actual implementation from a trusted repository; review the source code (especially the escalation/webhook/email/Slack code and audit-log storage); verify where audit logs are written and who can read them; provision required credentials securely (do not hard-code secrets); and enable autonomous invocation only after code review and once platform-level enforcement is available. If you expected a working plugin, ask the publisher for the repository or packaged artifacts, and refuse to run undocumented Node scripts or copy unverified code into your environment.

Functional Analysis

Type: OpenClaw Skill
Name: agent-constitution-guard
Version: 1.0.1

The skill bundle provides documentation and metadata for a safety-oriented framework designed to enforce behavioral guardrails and permission boundaries for AI agents. The SKILL.md file outlines legitimate security patterns such as multi-layered rule enforcement, human-in-the-loop escalation, and audit logging. While the primary execution logic (constitution.js) is referenced but not included in the provided files, the documentation and instructions show no signs of malicious intent, data exfiltration, or harmful prompt injection.

Capability Assessment

⚠ Purpose & Capability

The skill claims to provide an enforceable, auditable constitutional guardrail system (Node runtime examples, init/audit/escalation commands) but the bundle contains no code, no install spec, and lists no required binaries. Running the documented commands (node constitution.js ...) is impossible as-is. Declared purpose (platform-level enforcement) does not match what an instruction-only README can deliver.

⚠ Instruction Scope

Runtime instructions tell the agent/user to create .constitution, run Node scripts, check actions, escalate via webhook/email/slack, and read/write audit logs and files (examples include 'read any local file'). These instructions involve file I/O and external network communication but the skill does not declare the necessary configuration, credentials, or code to perform them. The guidance is broad/vague (e.g., escalation channels) and could lead to ad-hoc implementations that leak data or misuse credentials.

ℹ Install Mechanism

No install spec or code files are present (instruction-only). That minimizes immediate supply-chain risk but also means the skill provides only documentation — there is no built binary or package to enforce guardrails. The lack of an install mechanism is coherent with an example/tutorial but conflicts with the 'production-grade' enforcement claims.

⚠ Credentials

The SKILL.md demonstrates escalation via webhooks, email, and Slack and shows interacting with external APIs (e.g., Stripe) yet the registry metadata declares no required environment variables or credentials. Escalation channels and external API calls require secrets/URLs (webhook endpoints, SMTP/Slack tokens, API keys); omitting these from the manifest is inconsistent and dangerous because users may copy/paste examples without securely provisioning credentials.

ℹ Persistence & Privilege

always:false and default autonomous invocation are reasonable. However the skill's claim of 'immutable' rules that 'cannot be overridden by anyone' is a policy/organizational assertion — there is no platform-level mechanism documented here to guarantee immutability. That mismatch between claimed privilege and actual enforcement should be treated as misleading.

Version History

v1.0.1

**Expanded features and compliance for agent constitutional guardrails**

- Description and keywords updated for clarity, adding emphasis on escalation, audit, and compliance.
- Skill metadata now includes author, license, and enriched detail about protection against prompt injection and social engineering.
- Added architectural overview, illustrating multi-layered checks (immutable, policy, escalation, audit).
- Usage instructions are more detailed, covering initialization, rule creation, permission checks, escalation flows, and audit review.
- Explained rule levels (Immutable, Owner-only, Mutable, Advisory) in a clear comparison table.
- Introduced real-world example rule configurations for production safety, cost control, and data privacy/GDPR.
- Config reference and practical integration patterns for middleware and OpenClaw provided.

v1.0.0

Initial release: Introduces a constitutional guardrail system for AI agents to enforce behavioral constraints and permission boundaries.

- Defines immutable, owner-only, and mutable rules to control agent actions.
- Provides permission checking and escalation (auto-deny, ask-owner, log-only, allow).
- Includes an audit trail for all permission checks.
- Offers configuration via JSON and supports use cases like safety, privacy, and compliance.
- Simple API and integration examples for enforcing agent guardrails.

Metadata

Slug agent-constitution-guard

Version 1.0.1

License MIT-0

Total installs 1

Current installs 1

Number of versions 2

FAQ