Description

Output sanitization for agent responses - prevents accidental secret leaks

README (SKILL.md)

arc-shield

Name: Arc Shield
Author: arc-claw-bot

Output sanitization for agent responses. Scans ALL outbound messages for leaked secrets, tokens, keys, passwords, and PII before they leave the agent.

⚠️ This is NOT an input scanner — clawdefender already handles that. This is an OUTPUT filter for catching things your agent accidentally includes in its own responses.

Why You Need This

Agents have access to sensitive data: 1Password vaults, environment variables, config files, wallet keys. Sometimes they accidentally include these in responses when:

Debugging and showing full command output
Copying file contents that contain secrets
Generating code examples with real credentials
Summarizing logs that include tokens

Arc-shield catches these leaks before they reach Discord, Signal, X, or any external channel.

What It Detects

🔴 CRITICAL (blocks in `--strict` mode)

API Keys & Tokens: 1Password (ops_*), GitHub (ghp_*), OpenAI (sk-*), Stripe, AWS, Bearer tokens
Passwords: Assignments like password=... or passwd: ...
Private Keys: Ethereum (0x + 64 hex), SSH keys, PGP blocks
Wallet Mnemonics: 12/24 word recovery phrases
PII: Social Security Numbers, credit card numbers
Platform Tokens: Slack, Telegram, Discord

🟠 HIGH (warns loudly)

High-entropy strings: Shannon entropy > 4.5 for strings > 16 chars (catches novel secret patterns)
Credit cards: 16-digit card numbers
Base64 credentials: Long base64 strings that look like tokens

🟡 WARN (informational)

Secret file paths: ~/.secrets/*, paths containing "password", "token", "key"
Environment variables: ENV_VAR=secret_value exports
Database URLs: Connection strings with credentials
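
A bare 16-digit rule like the credit card entry above will also match order IDs and other long numbers. One common way to reduce such false positives is a Luhn checksum filter. This README does not say whether arc-shield applies one, so the sketch below is only an illustration; `CARD_CANDIDATE`, `luhn_valid`, and `find_card_numbers` are hypothetical names.

```python
import re

# Hypothetical candidate pattern: 16 digits, optionally separated by spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return 16-digit candidates that also pass the Luhn check."""
    return [m.group(0) for m in CARD_CANDIDATE.finditer(text) if luhn_valid(m.group(0))]
```

With this filter, a well-known test number such as `4111 1111 1111 1111` is kept, while an arbitrary 16-digit string that fails the checksum is dropped.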

Installation

cd ~/.openclaw/workspace/skills
git clone <arc-shield-repo> arc-shield
chmod +x arc-shield/scripts/*.sh arc-shield/scripts/*.py

Or download as a skill bundle.

Usage

Command-line

# Scan agent output before sending
cat agent-response.txt | arc-shield.sh

# Block if critical secrets found (use before external messaging)
echo "Message text" | arc-shield.sh --strict || echo "BLOCKED"

# Redact secrets and return sanitized text
cat response.txt | arc-shield.sh --redact

# Full report
arc-shield.sh --report < conversation.log

# Python version with entropy detection
cat message.txt | output-guard.py --strict

Integration with OpenClaw Agents

Pre-send hook (recommended)

Add to your messaging skill or wrapper:

#!/bin/bash
# send-message.sh wrapper

MESSAGE="$1"
CHANNEL="$2"

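# Sanitize output (printf avoids echo mangling messages that start with -n or -e)
SANITIZED=$(printf '%s\n' "$MESSAGE" | arc-shield.sh --strict --redact)
EXIT_CODE=$?

# Fail closed: treat any non-zero exit (a block or a scanner error) as "do not send"
if [[ $EXIT_CODE -ne 0 ]]; then
    echo "ERROR: Message was blocked (critical secrets found or scanner failed)." >&2
    exit 1
fi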

# Send sanitized message
openclaw message send --channel "$CHANNEL" "$SANITIZED"

Manual pipe

Before any external message:

# Generate response
RESPONSE=$(agent-generate-response)

# Sanitize
CLEAN=$(echo "$RESPONSE" | arc-shield.sh --redact)

# Send
signal send "$CLEAN"

Testing

cd skills/arc-shield/tests
./run-tests.sh

Includes test cases for:

Real leaked patterns (1Password tokens, Instagram passwords, wallet mnemonics)
False positive prevention (normal URLs, email addresses, file paths)
Redaction accuracy
Strict mode blocking

Configuration

Patterns are defined in config/patterns.conf:

CRITICAL|GitHub PAT|ghp_[a-zA-Z0-9]{36,}
CRITICAL|OpenAI Key|sk-[a-zA-Z0-9]{20,}
WARN|Secret Path|~\/\.secrets\/[^\s]*

Edit to add custom patterns or adjust severity levels.
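
A minimal sketch of how a file in this format could be loaded and applied. It assumes one rule per line, that everything after the second `|` is the regex, and that blank lines and `#` comments are skipped (the comment handling is an assumption, not something this README documents). `parse_patterns` and `scan` are hypothetical names, not functions from arc-shield.

```python
import re
from typing import NamedTuple


class Pattern(NamedTuple):
    severity: str
    name: str
    regex: re.Pattern


def parse_patterns(text: str) -> list[Pattern]:
    """Parse SEVERITY|Category Name|regex lines into compiled patterns."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        # Split on the first two pipes only, so the regex may itself contain "|".
        severity, name, regex = line.split("|", 2)
        patterns.append(Pattern(severity, name, re.compile(regex)))
    return patterns


def scan(message: str, patterns: list[Pattern]) -> list[tuple[str, str, str]]:
    """Return (severity, name, matched_text) for every match in the message."""
    findings = []
    for p in patterns:
        for m in p.regex.finditer(message):
            findings.append((p.severity, p.name, m.group(0)))
    return findings


# The three sample rules shown above.
SAMPLE_CONF = r"""
CRITICAL|GitHub PAT|ghp_[a-zA-Z0-9]{36,}
CRITICAL|OpenAI Key|sk-[a-zA-Z0-9]{20,}
WARN|Secret Path|~\/\.secrets\/[^\s]*
"""

if __name__ == "__main__":
    rules = parse_patterns(SAMPLE_CONF)
    for finding in scan("token ghp_" + "a" * 36 + " in ~/.secrets/gh.txt", rules):
        print(finding)
```

Splitting with a maximum of two splits matters for custom rules: a regex with alternation, such as `(foo|bar)_[0-9]+`, would otherwise be cut apart at its own `|`.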

Modes

| Mode | Behavior | Exit Code | Use Case |
|------|----------|-----------|----------|
| Default | Pass through + warnings to stderr | 0 | Development, logging |
| `--strict` | Block on CRITICAL findings | 1 if critical | Production outbound messages |
| `--redact` | Replace secrets with `[REDACTED:TYPE]` | 0 | Safe logging, auditing |
| `--report` | Analysis only, no pass-through | 0 | Auditing conversations |
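
The redact and strict behaviors can be sketched in a few lines. This is an illustration of the semantics described in the table, not arc-shield's actual code: the `RULES` list, the type labels (`GITHUB_PAT`, `SECRET_PATH`), and the function names are all hypothetical, and the exact text arc-shield substitutes for `TYPE` is not specified here.

```python
import re

# Hypothetical rules for illustration: (severity, type label, compiled regex).
RULES = [
    ("CRITICAL", "GITHUB_PAT", re.compile(r"ghp_[a-zA-Z0-9]{36,}")),
    ("WARN", "SECRET_PATH", re.compile(r"~/\.secrets/\S*")),
]


def redact(message: str) -> str:
    """Replace every match with a [REDACTED:TYPE] marker."""
    for _severity, label, regex in RULES:
        message = regex.sub(f"[REDACTED:{label}]", message)
    return message


def exit_code(message: str, strict: bool) -> int:
    """Return 1 only when strict mode is on and a CRITICAL rule matches."""
    has_critical = any(
        severity == "CRITICAL" and regex.search(message)
        for severity, _label, regex in RULES
    )
    return 1 if strict and has_critical else 0
```

Under these assumptions, a WARN-only finding never changes the exit code, which is why strict mode alone does not block messages that merely mention a secret file path.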

Entropy Detection

The Python version (output-guard.py) includes Shannon entropy analysis to catch secrets that don't match regex patterns:

# Detects high-entropy strings like:
kJ8nM2pQ5rT9vWxY3zA6bC4dE7fG1hI0  # Novel API key format
Zm9vOmJhcg==                      # Base64 credentials

Threshold: 4.5 bits (configurable with --entropy-threshold)
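
For reference, a minimal sketch of a per-character Shannon entropy check using the thresholds quoted in this README (entropy above 4.5 bits, length above 16 characters). How output-guard.py actually tokenizes text and computes entropy is not shown here, so `shannon_entropy` and `looks_like_secret` are hypothetical stand-ins.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy, in bits per character, of the string's character distribution."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_like_secret(token: str, threshold: float = 4.5, min_len: int = 16) -> bool:
    """Flag tokens longer than min_len whose entropy exceeds the threshold."""
    return len(token) > min_len and shannon_entropy(token) > threshold


if __name__ == "__main__":
    for tok in ("kJ8nM2pQ5rT9vWxY3zA6bC4dE7fG1hI0", "Zm9vOmJhcg==", "a" * 24):
        print(tok, round(shannon_entropy(tok), 2), looks_like_secret(tok))
```

Under this sketch the 32-character sample scores 5.0 bits (all 32 characters are distinct) and is flagged. `Zm9vOmJhcg==` is only 12 characters and scores roughly 3.25 bits, below both cutoffs, so a string like it would need to be caught by a pattern rule such as the Base64 one rather than by entropy alone.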

Performance

Bash version: ~10ms for typical message (< 1KB)
Python version: ~50ms with entropy analysis
Zero external dependencies: bash + Python stdlib only

Fast enough to run on every outbound message without noticeable delay.

Real-World Catches

From our own agent sessions:

# 1Password token
"ops_eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."

# Instagram password in debug output
"instagram login: [email protected] / MyInsT@Gr4mP4ss!"

# Wallet mnemonic in file listing
"cat ~/.secrets/wallet-recovery-phrase.txt
abandon ability able about above absent absorb abstract..."

# GitHub PAT in git config
"[remote "origin"]
url = https://ghp_abc123:@github.com/user/repo"

All blocked by arc-shield before reaching external channels.

Best Practices

Always use --strict for external messages (Discord, Signal, X, email)
Use --redact for logs you want to review later
Run tests after adding custom patterns to check for false positives
Pipe through both bash and Python versions for maximum coverage:
```
set -o pipefail  # otherwise only the last stage's exit code is reported
echo "$MESSAGE" | arc-shield.sh --strict | output-guard.py --strict
```
Don't rely on this alone — educate your agent to avoid including secrets in the first place (see AGENTS.md output sanitization directive)

Limitations

Context-free: Can't distinguish between "here's my password: X" (bad) and "set your password to X" (instruction)
No semantic understanding: Won't catch "my token is in the previous message"
Pattern-based: New secret formats require pattern updates

Use in combination with agent instructions and careful prompt engineering.

Integration Example

Full OpenClaw agent integration:

# In your agent's message wrapper
send_external_message() {
    local message="$1"
    local channel="$2"
    
    # Pre-flight sanitization
    if ! echo "$message" | arc-shield.sh --strict > /dev/null 2>&1; then
        echo "ERROR: Message blocked by arc-shield (contains secrets)" >&2
        return 1
    fi
    
    # Double-check with entropy detection
    if ! echo "$message" | output-guard.py --strict > /dev/null 2>&1; then
        echo "ERROR: High-entropy secret detected" >&2
        return 1
    fi
    
    # Safe to send
    openclaw message send --channel "$channel" "$message"
}
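
The same two-stage gate can be expressed in Python for agents whose send path is not a shell script. This is a sketch under assumptions: it assumes both scripts are on `PATH` (as the shell examples do) and that any non-zero exit means "do not send". `is_safe_to_send` and `DEFAULT_SCANNERS` are hypothetical names.

```python
import subprocess
import sys

# Assumed invocations, mirroring the shell example above.
DEFAULT_SCANNERS = [
    ["arc-shield.sh", "--strict"],
    ["output-guard.py", "--strict"],
]


def is_safe_to_send(message: str, scanners: list[list[str]]) -> bool:
    """Pipe the message to each scanner on stdin; block on the first failure."""
    for cmd in scanners:
        try:
            result = subprocess.run(cmd, input=message, text=True, capture_output=True)
        except FileNotFoundError:
            # Fail closed: a missing scanner must not let the message through.
            print(f"ERROR: scanner not found: {cmd[0]}", file=sys.stderr)
            return False
        if result.returncode != 0:
            print(f"ERROR: message blocked by {cmd[0]}", file=sys.stderr)
            return False
    return True
```

Passing the scanner list as a parameter keeps the gate testable with stand-in commands, and failing closed on a missing script avoids silently sending unscanned messages after a broken install.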

Troubleshooting

False positives on normal text:

Adjust entropy threshold: output-guard.py --entropy-threshold 5.0
Edit config/patterns.conf to refine regex patterns
Add exceptions to the pattern file

Secrets not detected:

Check pattern file for coverage
Run with --report to see what's being scanned
Test with tests/run-tests.sh using your sample
Consider lowering entropy threshold (but watch for false positives)

Performance issues:

Use bash version only (skip entropy detection)
Limit input size with head -c 10000
Run in background: arc-shield.sh --report &

Contributing

Add new patterns to config/patterns.conf following the format:

SEVERITY|Category Name|regex_pattern

Test with tests/run-tests.sh before deploying.

License

MIT — use freely, protect your secrets.

Remember: Arc-shield is your safety net, not your strategy. Train your agent to never include secrets in responses. This tool catches mistakes, not malice.

Usage Guidance

Arc Shield appears to be a coherent, local output-sanitizer that uses regex and entropy heuristics. Before installing:

1. Verify the config/patterns.conf file is present (docs reference it but it wasn't in the manifest); without it detection may be incomplete.
2. Review scripts (scripts/arc-shield.sh and scripts/output-guard.py) yourself. They operate locally and do not call external network endpoints, but you should confirm there are no edits that add remote posting.
3. Run the included tests (./tests/quick-test.sh) in a safe environment to validate behavior and tune patterns/entropy threshold to avoid false positives.
4. When integrating as a pre-send hook, ensure internal channels are excluded (examples show this) and check where blocked attempts are logged (e.g., ~/.openclaw/logs/arc-shield-blocks.log) to avoid leaking sensitive metadata.
5. Don't rely on this alone; continue to train agents to avoid emitting secrets and treat redaction as a safety net.

If you want higher assurance, request the missing config file and full, un-truncated copies of the scripts so you can audit the complete code paths.

Capability Analysis

Type: OpenClaw Skill
Name: arc-shield
Version: 1.0.0

The OpenClaw AgentSkills skill bundle 'arc-shield' is a security tool designed for output sanitization, preventing accidental secret leaks from AI agent responses. All code (`arc-shield.sh`, `output-guard.py`) and documentation (e.g., `SKILL.md`, `README.md`) consistently implement this purpose. There is no evidence of data exfiltration, malicious execution, persistence, or obfuscation. The markdown files contain no prompt injection attempts against the agent; all instructions are clear and directly related to using the skill for its stated security function. The tool operates by scanning input (agent responses) from stdin and outputs to stdout/stderr, with no external network calls or unauthorized file system access.

Capability Assessment

ℹ Purpose & Capability

The skill's name, description, and included scripts (Bash + Python) align with an output-sanitization purpose. Requested runtime (bash, python3) is appropriate. One minor inconsistency: documentation refers to config/patterns.conf as the pattern database but that file is not present in the provided file manifest — the scripts default to loading ../config/patterns.conf, so installation will need that file (docs claim it exists).

✓ Instruction Scope

SKILL.md and examples limit actions to scanning/sanitizing outbound messages, running locally, and integrating as a pre-send hook or wrapper. The instructions do not tell the agent to read unrelated system files or to transmit data to external endpoints. Integration examples do append a local log entry (~/.openclaw/logs/arc-shield-blocks.log) when blocking — this is reasonable and documented.

ℹ Install Mechanism

There is no automated install spec (instruction-only install via git clone / manual copy), which is low-risk. The scripts themselves make no external network calls. Note: the documentation and code expect a config/patterns.conf file; that file is referenced but not present in the provided manifest — you'll need to ensure that config is supplied when installing.

✓ Credentials

The skill requests no environment variables or credentials. It analyzes text passed to it rather than reading secrets from the environment. It does not require unrelated credentials, so the requested environment access is proportionate.

✓ Persistence & Privilege

The skill is not force-included (always:false) and does not modify other skills or system components automatically. Integration requires the operator to add a pre-send hook or wrapper, which is a deliberate, user-controlled action. Example hooks log blocked attempts to a local path — this is described in the docs.

Version History

v1.0.0

arc-shield 1.0.0 initial release

- Adds outbound message sanitization to block accidental leaks of secrets, API keys, passwords, tokens, wallet mnemonics, and PII.
- Supports multiple scan modes: pass-through, strict blocking, redaction, and reporting.
- Pattern-based detection for major secret types (1Password, GitHub, OpenAI, AWS, Slack, Discord, etc.) plus high-entropy detection (Python).
- Provides easy shell and Python integration for OpenClaw agents via wrappers and hooks.
- Fully configurable patterns and severity in a single config file.
- Includes comprehensive CLI, test suite, and best practices for safe agent messaging.

Metadata

Slug arc-shield

Version 1.0.0

License —

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Arc Shield?

Output sanitization for agent responses - prevents accidental secret leaks. It is an AI Agent Skill for Claude Code / OpenClaw, with 968 downloads so far.

How do I install Arc Shield?

Run "/install arc-shield" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Arc Shield free?

Yes, Arc Shield is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Arc Shield support?

Arc Shield is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Arc Shield?

It is built and maintained by arc-claw-bot (@arc-claw-bot); the current version is v1.0.0.
