Description

Safety monitoring and tripwire detection for AI agents. Protects against unauthorized file access, dangerous commands, and excessive activity. Auto-halts on...

README (SKILL.md)


# Canary Agent Safety Tripwire System

Name: Canary
Author: theshadowrose

Safety monitoring and tripwire detection for AI agents.

Protects against unauthorized file access, dangerous commands, and excessive activity. Auto-halts on critical violations. Honeypot tripwires detect snooping.

---

## What It Does

Canary provides three layers of agent safety:

- **Action Monitoring** - Checks file paths and commands before execution
- **Tripwire Files** - Honeypot files that should never be accessed
- **Audit Trail** - Complete logs and pattern detection

## Core Features

**Protected Paths:**
- Block access to sensitive directories (`/etc/`, `~/.ssh/`, etc.)
- Customizable protection list
- Granular operation control (read, write, delete)

**Forbidden Patterns:**
- Regular expression matching for dangerous commands
- Detects `rm -rf /`, `chmod 777`, `curl | sh`, etc.
- Extensible pattern library

**Rate Limiting:**
- Limit file operations, network requests, command executions
- Configurable windows and thresholds
- Prevents runaway agents

**Auto-Halt:**
- Automatically stops agent after violation threshold
- Prevents cascading failures
- Requires manual review to restart

**Tripwire Files:**
- Create honeypot files that should never be accessed
- Detect modifications, deletions, or access
- Hash verification for file integrity

**Audit Trail:**
- Complete action logs
- Violation history
- Pattern detection (rapid violations, repeated targets, time clusters)
- Export to JSON or Markdown

---

## Quick Start

### Install

No dependencies. Canary uses only the Python 3.7+ standard library.

```bash
# Copy config example
cp config_example.json config.json

# Edit config with your protected paths
nano config.json
```
\r
### Basic Usage

```python
from canary import CanaryMonitor

# Initialize monitor
canary = CanaryMonitor('config.json')

# Check path before access
is_safe, reason = canary.check_path('/etc/passwd', 'read')
if not is_safe:
    print(f"Blocked: {reason}")
    exit(1)

# Check command before execution
is_safe, reason = canary.check_command('rm -rf /')
if not is_safe:
    print(f"Blocked: {reason}")
    exit(1)

# Get status
status = canary.get_status()
print(f"Violations: {status['violation_count']}/{status['halt_threshold']}")
```
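
The status line above compares a violation count against a halt threshold. As a minimal sketch of how such an auto-halt latch could behave (this is not Canary's actual implementation), the hypothetical `ViolationTracker` below halts once the count reaches the threshold and stays halted until a manual reset. Whether Canary halts at or only above the threshold is an assumption here.

```python
class ViolationTracker:
    """Hypothetical helper: counts violations and latches into a halted state."""

    def __init__(self, halt_threshold):
        self.halt_threshold = halt_threshold
        self.violation_count = 0
        self.halted = False

    def record_violation(self):
        """Register one violation; return True if the agent should now be halted."""
        self.violation_count += 1
        # Assumption: halting triggers when the count reaches the threshold.
        if self.violation_count >= self.halt_threshold:
            self.halted = True
        return self.halted

    def reset(self):
        """Manual reset after review, analogous in spirit to `canary.py reset`."""
        self.violation_count = 0
        self.halted = False


tracker = ViolationTracker(halt_threshold=3)
results = [tracker.record_violation() for _ in range(3)]
print(results)  # [False, False, True]
```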

### CLI Usage

```bash
# Check status
python3 canary.py status

# Check if path is safe
python3 canary.py check-path --path /etc/passwd --operation read

# Check if command is safe
python3 canary.py check-command --command "rm -rf /"

# Reset monitoring (clears violations)
python3 canary.py reset
```

---

## Tripwire Files

Create honeypot files that should never be accessed:

```bash
# Create tripwire
python3 canary_tripwire.py create \
  --path ~/.secrets/fake-api-key.txt \
  --severity critical \
  --description "Honeypot to detect credential snooping"

# List all tripwires
python3 canary_tripwire.py list

# Check for triggered tripwires
python3 canary_tripwire.py check

# View alert history
python3 canary_tripwire.py alerts --limit 10

# Remove tripwire
python3 canary_tripwire.py remove --path ~/.secrets/fake-api-key.txt
```

### Python API

```python
from canary_tripwire import TripwireManager

manager = TripwireManager()

# Create tripwire
manager.create_tripwire(
    path='~/.aws/fake-credentials',
    severity='critical',
    description='Detects AWS credential access'
)

# Check all tripwires
triggered = manager.check_tripwires()
if triggered:
    print(f"⚠️  {len(triggered)} tripwire(s) triggered!")
    for alert in triggered:
        print(f"  - {alert['path']}: {alert['event']}")
```
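
To illustrate the hash verification idea behind tripwires, here is a minimal standalone sketch. It does not use `TripwireManager`, and the function names `register` and `check` are hypothetical. It records a SHA-256 baseline for a decoy file and later classifies the file as intact, modified, or deleted. A content hash alone cannot reveal a read that leaves the file unchanged, so however Canary detects plain access is not covered here.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def register(path):
    """Record the resolved path and baseline hash of an existing decoy file."""
    resolved = Path(path).expanduser().resolve()
    return {"path": str(resolved), "sha256": sha256_of(resolved)}


def check(record):
    """Compare the file on disk with its baseline: 'intact', 'modified', or 'deleted'."""
    target = Path(record["path"])
    if not target.exists():
        return "deleted"
    if sha256_of(target) != record["sha256"]:
        return "modified"
    return "intact"


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    decoy = Path(tmp) / "fake-api-key.txt"
    decoy.write_text("not-a-real-key\n")
    record = register(decoy)

    states = [check(record)]       # untouched
    decoy.write_text("tampered\n")
    states.append(check(record))   # contents changed
    decoy.unlink()
    states.append(check(record))   # file removed

print(states)  # ['intact', 'modified', 'deleted']
```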

---

## Audit Reports

Analyze logs and generate safety reports:

```bash
# Summary report
python3 canary_audit.py summary

# View violations by severity
python3 canary_audit.py violations --severity critical

# Timeline of recent events
python3 canary_audit.py timeline --hours 24

# Detect suspicious patterns
python3 canary_audit.py patterns

# Export full report
python3 canary_audit.py export --output report.json --format json
python3 canary_audit.py export --output report.md --format markdown
```

### Python API

```python
from canary_audit import CanaryAuditor

auditor = CanaryAuditor('canary.log')

# Generate summary
summary = auditor.generate_summary_report()
print(f"Total violations: {summary['total_violations']}")

# Get critical violations
critical = auditor.get_violations_by_severity('critical')

# Detect patterns
patterns = auditor.detect_patterns()
if patterns['rapid_violations']:
    print("⚠️  Rapid violation sequence detected!")

# Export report
auditor.export_report('safety-report.md', format='markdown')
```
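
The `rapid_violations` pattern above is not specified in detail. One plausible reading, sketched here as an assumption rather than as `CanaryAuditor`'s real logic, is "at least N violations within T seconds". The function name and the default values (3 events within 10 seconds) are hypothetical.

```python
def has_rapid_sequence(timestamps, count=3, within_seconds=10.0):
    """Return True if any `count` events fall within `within_seconds` of each other."""
    ordered = sorted(timestamps)
    # Slide a window of `count` consecutive events over the sorted timestamps.
    for i in range(len(ordered) - count + 1):
        if ordered[i + count - 1] - ordered[i] <= within_seconds:
            return True
    return False


spread_out = [0.0, 100.0, 200.0]
burst = [0.0, 2.0, 5.0, 300.0]
print(has_rapid_sequence(spread_out))  # False
print(has_rapid_sequence(burst))       # True
```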

---

## Configuration

See `config_example.json` for all options.

### Essential Settings

```json
{
  "protected_paths": [
    "/etc/",
    "~/.ssh/",
    "~/critical-data/"
  ],
  "forbidden_patterns": [
    "rm\\s+-rf\\s+/",
    "chmod\\s+777",
    "curl.*\\|\\s*sh"
  ],
  "halt_threshold": 5,
  "rate_limits": {
    "file_operations": {"limit": 100, "window": 60},
    "command_executions": {"limit": 20, "window": 60}
  }
}
```
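
To see how the `forbidden_patterns` entries behave as regular expressions, the sketch below parses the same three patterns from JSON and tests a few commands with Python's `re` module. The helper `first_forbidden_match` is hypothetical, and the use of `re.search` without flags is an assumption about how the patterns are applied. The last command shows the kind of gap regex matching leaves: `rm -fr /` is equivalent to `rm -rf /` but does not match the pattern.

```python
import json
import re

# Same three patterns as the config above; a raw string keeps the JSON escapes intact.
CONFIG_TEXT = r'''
{
  "forbidden_patterns": [
    "rm\\s+-rf\\s+/",
    "chmod\\s+777",
    "curl.*\\|\\s*sh"
  ]
}
'''

PATTERNS = [re.compile(p) for p in json.loads(CONFIG_TEXT)["forbidden_patterns"]]


def first_forbidden_match(command):
    """Return the first pattern that matches `command`, or None if none match."""
    for pattern in PATTERNS:
        if pattern.search(command):
            return pattern.pattern
    return None


for cmd in ["rm -rf /", "chmod 777 /tmp", "curl http://example.com/x.sh | sh",
            "ls -la", "rm -fr /"]:
    print(f"{cmd!r}: {first_forbidden_match(cmd)}")
```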

---

## Integration Examples

### With Agent Runtime

```python
import shlex
import subprocess

from canary import CanaryMonitor

canary = CanaryMonitor('config.json')

def safe_file_read(path):
    """Read file with Canary check."""
    is_safe, reason = canary.check_path(path, 'read')
    if not is_safe:
        raise PermissionError(reason)

    with open(path, 'r') as f:
        return f.read()

def safe_command(cmd):
    """Execute command with Canary check."""
    is_safe, reason = canary.check_command(cmd)
    if not is_safe:
        raise PermissionError(reason)

    # shlex.split honors shell quoting, so quoted arguments stay intact
    cmd_list = shlex.split(cmd) if isinstance(cmd, str) else cmd
    return subprocess.run(cmd_list, capture_output=True)
```

### Pre-Deployment Checks

```python
# Before deploying agent, verify Canary setup
from canary import CanaryMonitor
from canary_tripwire import TripwireManager

canary = CanaryMonitor('config.json')

# Verify protected paths are configured
status = canary.get_status()
if status['protected_paths_count'] == 0:
    print("⚠️  No protected paths configured!")
    exit(1)

# Test tripwire detection
manager = TripwireManager()

# Create test tripwire
manager.create_tripwire('/tmp/canary-test.txt', severity='high')

# A freshly created, untouched tripwire should not be reported as triggered
triggered = manager.check_tripwires()
if not any(t['path'] == '/tmp/canary-test.txt' for t in triggered):
    print("✅ Tripwire system operational")

# Cleanup
manager.remove_tripwire('/tmp/canary-test.txt', delete_file=True)
```

---

## Use Cases

### 1. Autonomous Agent Safety

Deploy Canary alongside autonomous agents to prevent:
- Accidental system file deletion
- Credential exfiltration
- Runaway command execution

### 2. Multi-Agent Systems

Each agent gets its own Canary instance with custom rules:
- Research agent: limited network access
- Coding agent: no production deployments
- Admin agent: full access but strict audit

### 3. Development/Testing

Use Canary during agent development:
- Catch dangerous patterns early
- Test rate limiting behavior
- Verify safety mechanisms work

### 4. Production Monitoring

Run Canary in production:
- Real-time violation alerts
- Audit trail for compliance
- Pattern detection for anomalies

---

## Architecture

```
┌─────────────────┐
│   Your Agent    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐      ┌──────────────────┐
│ CanaryMonitor   │◄────►│  config.json     │
│ (canary.py)     │      │  (your rules)    │
└────────┬────────┘      └──────────────────┘
         │
         ├─────► canary.log (action log)
         │
         ▼
┌─────────────────┐      ┌──────────────────┐
│ TripwireManager │◄────►│ .canary_tripwires│
│ (tripwire.py)   │      │ (honeypot files) │
└────────┬────────┘      └──────────────────┘
         │
         └─────► alerts.log


┌─────────────────┐
│ CanaryAuditor   │───► reports (JSON/MD)
│ (audit.py)      │
└─────────────────┘
```

---

## Best Practices

### Start Conservative

Begin with strict rules and relax them as needed. Protecting `/` covers the entire filesystem initially, and a low `halt_threshold` catches issues early:

```json
{
  "protected_paths": ["/"],
  "halt_threshold": 3
}
```

### Use Tripwires Strategically

Place tripwires in sensitive locations:
- Fake credential files
- Empty "secrets" directories
- Decoy config files

### Review Logs Regularly

```bash
# Daily audit
python3 canary_audit.py summary

# Weekly deep dive
python3 canary_audit.py patterns
python3 canary_audit.py export --output weekly-report.md --format markdown
```

### Test Your Configuration

```python
from canary import CanaryMonitor

# Verify Canary blocks what it should
canary = CanaryMonitor('config.json')

# These should all be blocked
assert not canary.check_path('/etc/passwd', 'delete')[0]
assert not canary.check_command('rm -rf /')[0]
assert not canary.check_command('chmod 777 /tmp')[0]

print("✅ Canary configuration verified")
```

---

## Limitations

See [LIMITATIONS.md](LIMITATIONS.md) for details.

**Key constraints:**
- Pattern matching is regex-based (not semantic analysis)
- No built-in alerting (logs only)
- Tripwires detect access, not intent
- Rate limiting is per-session (doesn't survive restarts)
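
The per-session constraint in the last bullet follows naturally if rate-limit state is held in memory. As a minimal sketch (not Canary's actual implementation), the sliding-window limiter below reads `limit` and `window` the way the Essential Settings example suggests, as a maximum event count per window of seconds. The class name and the caller-supplied timestamps are assumptions. Because the timestamps live only in the `deque`, a restarted process begins with an empty window.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` events in any `window`-second span (in-memory only)."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._events = deque()  # timestamps of accepted events, oldest first

    def allow(self, now):
        """Record an event at time `now` if under the limit; return whether it was allowed."""
        # Forget events that have aged out of the window.
        while self._events and now - self._events[0] >= self.window:
            self._events.popleft()
        if len(self._events) >= self.limit:
            return False
        self._events.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window=60)
decisions = [limiter.allow(t) for t in (0, 1, 2, 60, 60.5)]
print(decisions)  # [True, True, False, True, False]
```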

---

## License

MIT License - See [LICENSE](LICENSE)

**Author:** Shadow Rose

---

## Why This Exists

AI agents can do a lot of damage quickly:
- One bad command can delete critical files
- Runaway loops can exhaust resources
- Compromised agents can exfiltrate credentials

Canary provides defense-in-depth:
- **Preventive:** Block dangerous actions before they happen
- **Detective:** Tripwires catch snooping behavior
- **Forensic:** Complete audit trail for post-incident analysis

Simple, zero-dependency safety for autonomous agents.

---

## ⚠️ Security Note: Config File

Configuration is loaded from a JSON file. Because JSON is data only, loading it executes no code, so the file is safe to share.

- Config path is validated for existence and size (1MB cap) before loading
- Must be a `.json` file: `CanaryMonitor` raises `ValueError` if given a non-JSON path
- Keep your config under version control; treat it as security policy
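
The checks listed above can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not `CanaryMonitor`'s code: the function name `load_config`, the order of the checks, and the exception types for a missing or oversized file are all hypothetical. Only the `ValueError` for a non-JSON path comes from the note above.

```python
import json
import tempfile
from pathlib import Path

MAX_CONFIG_BYTES = 1024 * 1024  # 1MB cap mentioned in the note above


def load_config(path):
    """Validate extension, existence, and size, then parse the JSON config."""
    config_path = Path(path).expanduser()
    if config_path.suffix.lower() != ".json":
        raise ValueError(f"Config must be a .json file: {config_path}")
    # Assumption: a missing file raises FileNotFoundError.
    if not config_path.is_file():
        raise FileNotFoundError(f"Config not found: {config_path}")
    if config_path.stat().st_size > MAX_CONFIG_BYTES:
        raise ValueError(f"Config larger than {MAX_CONFIG_BYTES} bytes: {config_path}")
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle)


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "config.json"
    good.write_text(json.dumps({"halt_threshold": 5}), encoding="utf-8")
    loaded = load_config(good)

    try:
        load_config(Path(tmp) / "config.py")
        rejected = False
    except ValueError:
        rejected = True

print(loaded["halt_threshold"], rejected)  # 5 True
```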

## ⚠️ Security Note: Tripwire Deployment

- **Paths are fully resolved.** `~` and relative paths are expanded via `Path.expanduser().resolve()` before creation and lookup. `'~/.aws/fake-credentials'` will be placed in your actual home directory, not a literal `~` path.
- **Use decoy paths only.** Never point tripwires at real files containing sensitive data. Tripwires are honeypots; treat them as bait, not protection.
- **`create_tripwire` will not overwrite existing files.** It checks for pre-existing files and refuses to proceed. Use dedicated empty paths for tripwires.
- **Test in a sandbox first.** Verify where logs, tripwires, and registry files are created before deploying. Confirm protected paths and auto-halt behavior in an isolated environment.
- **Protect log and alert directories.** Set filesystem permissions so alert logs are not world-readable. Canary writes plaintext logs; restrict access accordingly.
- **Canary only blocks when called.** It is not an OS-level enforcement mechanism. Layer it with containers, filesystem permissions, and `auditd` for production deployments.
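
The `Path.expanduser().resolve()` normalization described in the first bullet matters for any path comparison, because an unresolved path can hide its real target behind `..` segments or symlinks. The sketch below applies the same normalization to a protected-directory check. The helpers `normalize` and `is_protected` are hypothetical, and whether Canary compares protected paths this way is an assumption.

```python
import tempfile
from pathlib import Path


def normalize(path):
    """Expand `~` and resolve `..` segments and symlinks to an absolute path."""
    return Path(path).expanduser().resolve()


def is_protected(path, protected_paths):
    """Return True if `path` is a protected directory or lies inside one."""
    target = normalize(path)
    for entry in protected_paths:
        root = normalize(entry)
        if target == root or root in target.parents:
            return True
    return False


# Demo in a throwaway directory so the result does not depend on the host layout.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "secrets").mkdir()
    (base / "public").mkdir()
    protected = [str(base / "secrets")]

    direct = is_protected(base / "secrets" / "key.txt", protected)
    sneaky = is_protected(base / "public" / ".." / "secrets" / "key.txt", protected)
    harmless = is_protected(base / "public" / "notes.txt", protected)

print(direct, sneaky, harmless)  # True True False
```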

## ⚠️ Disclaimer

This software is provided "AS IS", without warranty of any kind, express or implied.

**USE AT YOUR OWN RISK.**

- The author(s) are NOT liable for any damages, losses, or consequences arising from
  the use or misuse of this software, including but not limited to financial loss,
  data loss, security breaches, business interruption, or any indirect/consequential damages.
- This software does NOT constitute financial, legal, trading, or professional advice.
- Users are solely responsible for evaluating whether this software is suitable for
  their use case, environment, and risk tolerance.
- No guarantee is made regarding accuracy, reliability, completeness, or fitness
  for any particular purpose.
- The author(s) are not responsible for how third parties use, modify, or distribute
  this software after purchase.

By downloading, installing, or using this software, you acknowledge that you have read
this disclaimer and agree to use the software entirely at your own risk.

**SECURITY DISCLAIMER:** This software provides supplementary security measures and
is NOT a replacement for professional security auditing, penetration testing, or
compliance frameworks. No software can guarantee complete protection against all
threats. Users operating in regulated industries (healthcare, finance, legal) should
consult qualified security professionals and verify compliance with applicable
regulations (GDPR, HIPAA, SOC2, etc.) independently.

---

## Support & Links

| | |
|---|---|
| 🐛 **Bug Reports** | [email protected] |
| ☕ **Ko-fi** | [ko-fi.com/theshadowrose](https://ko-fi.com/theshadowrose) |
| 🛒 **Gumroad** | [shadowyrose.gumroad.com](https://shadowyrose.gumroad.com) |
| 🐦 **Twitter** | [@TheShadowyRose](https://twitter.com/TheShadowyRose) |
| 🐙 **GitHub** | [github.com/TheShadowRose](https://github.com/TheShadowRose) |
| 🧠 **PromptBase** | [promptbase.com/profile/shadowrose](https://promptbase.com/profile/shadowrose) |

*Built with [OpenClaw](https://github.com/openclaw/openclaw). Thank you for making this possible.*

---

🛠️ **Need something custom?** Custom OpenClaw agents & skills starting at $500. If you can describe it, I can build it. → [Hire me on Fiverr](https://www.fiverr.com/s/jjmlZ0v)

> **📦 Install note:** The slug `canary` was already taken on ClawHub.
> Install this skill using: `clawhub install canary-sr`

Usage Guidance

This package appears to be what it claims: a local, regex-based safety monitor and tripwire manager. Before installing, consider: (1) it will create and write files in your working directory and under your home (canary.log, .canary_tripwires, any tripwire paths you create) — review and choose safe tripwire locations; (2) it uses regex matching only and can be bypassed by obfuscation/encodings or by an agent that ignores the checks (LIMITATIONS.md documents this); (3) it does not send logs off-host by default (no network calls in provided code), but you should review any modifications if you add alerting hooks; (4) run agents with least privilege (container/limited user) and confirm your agent actually invokes Canary before executing sensitive actions. If you need enforcement beyond advisory checks, add OS-level sandboxing or auditing (auditd, containers).

Capability Analysis

Type: OpenClaw Skill
Name: canary-sr
Version: 1.0.8

The 'Canary Agent Safety Tripwire System' is a security-focused utility designed to monitor AI agent activity, block dangerous commands via regex, and manage honeypot files. The codebase (canary.py, canary_tripwire.py, canary_audit.py) is well-structured, uses only Python standard libraries, and includes transparent documentation regarding its limitations. No evidence of data exfiltration, malicious execution, or prompt injection was found; the tool's high-privilege actions (file creation and path checking) are strictly aligned with its stated purpose of providing defense-in-depth for autonomous agents.

Capability Assessment

✓ Purpose & Capability

Name/description, SKILL.md examples, and included Python modules (canary.py, canary_tripwire.py, canary_audit.py, config examples) all align: functionality is focused on path/command pattern checks, tripwire honeypots, rate limiting and audit logging. There are no unrelated environment variables, cloud credentials, or external services required that would be disproportionate to the stated purpose.

✓ Instruction Scope

Runtime instructions are narrowly scoped: call check_path/check_command, create and check tripwires, and run audit scripts. The SKILL.md does direct creation of honeypot files and writing logs/registries under the user's filesystem (config.json, canary.log, .canary_tripwires), which is expected behavior for a tripwire/audit tool and is documented in LIMITATIONS.md.

✓ Install Mechanism

No install spec; it's an instruction-only skill bundled with Python source. The code claims to use only Python 3.7+ stdlib and the files provided match that claim (no external package imports). No downloads, package installs, or remote executable fetches are present.

✓ Credentials

The skill requests no environment variables or credentials. It does operate on filesystem paths (including sensitive locations when you choose to place tripwires there) and writes local log/registry files; those filesystem actions are proportional to a tripwire/audit tool but worth noting because tripwires deliberately target sensitive locations like ~/.aws/ as part of their purpose.

ℹ Persistence & Privilege

The skill persists state and logs to local files (canary.log, .canary_tripwires/registry.json, alerts.log). It does not request elevated OS privileges or try to modify other skills or system-wide agent settings. Note that 'always' is false and the agent must be written to call Canary checks — Canary does not enforce kernel-level sandboxing.

Version History

v1.0.8

- Version bump from 1.0.7 to 1.0.8 in SKILL.md.
- No other functional or documentation changes detected.

v1.0.7

- Config and usage now use JSON format instead of Python for all configuration files (e.g., `config.json` replaces `config.py`).
- Updated documentation, code samples, and quick start instructions to reflect JSON-based config.
- Minor documentation improvements and example clarifications in SKILL.md.
- Version bump to 1.0.7.

v1.0.6

Canary-SR 1.0.6 introduces improved configuration clarity and updates.

- Added example configuration file: config_example.json for easier setup.
- Updated documentation to reflect version 1.0.6.
- Improved and clarified configuration instructions and usage in SKILL.md.
- Minor documentation fixes and consistency improvements.

v1.0.5

**Canary Agent Safety Tripwire System v1.0.5**

- Version bump from 1.0.4 to 1.0.5 in SKILL.md.
- No user-facing feature or documentation changes detected.

v1.0.4

- Added a slug field to the metadata for better identification and usage.
- Fixed name to "Canary Agent Safety Tripwire System" (removed encoding artifact).
- Updated version number to 1.0.4.
- No feature or logic changes; this update is documentation/metadata only.

v1.0.3

- Updated version number in SKILL.md from 1.0.2 to 1.0.3.
- No other feature, configuration, or usage changes.
- Documentation and usage examples remain unchanged.

v1.0.2

- Updated version to 1.0.2.
- Improved the example for safe command execution in the integration section: now splits the command string before passing to subprocess for better reliability.
- No other user-facing changes documented.

v1.0.0

Initial upload

Metadata

Slug canary-sr

Version 1.0.8

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 8

Frequently Asked Questions

What is Canary?

Safety monitoring and tripwire detection for AI agents. Protects against unauthorized file access, dangerous commands, and excessive activity. Auto-halts on... It is an AI Agent Skill for Claude Code / OpenClaw, with 366 downloads so far.

How do I install Canary?

Run "/install canary-sr" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Canary free?

Yes, Canary is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Canary support?

Canary is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Canary?

It is built and maintained by Shadow Rose (@theshadowrose); the current version is v1.0.8.
