Description

Preflight security scanner for AI coding agents — scans deployment config, skills/MCP servers, memory/sessions, and AI agent config files (hooks injection) f...

README (SKILL.md)

DeepSafe Scan — Preflight Security Scanner for AI Coding Agents

Name: Deepsafe Scan
Author: xiaoyiweio

Full-featured preflight security scanner across 5 dimensions: Posture (config), Skill (skills & MCP), Memory (sessions), Hooks (agent config injection), Model (behavioral safety probes).

Works with OpenClaw, Claude Code, Cursor, and Codex. LLM features auto-detect credentials — no manual configuration needed.

When to Use

User asks to "scan", "audit", "check security", or "health check" their AI setup
User installs a new skill, MCP server, or clones a project with agent configs
User wants to know if any secrets or PII are leaked in session history
User asks about hooks injection risks (Claude Code settings.json, .cursorrules, etc.)
User wants to probe model behavior for manipulation, deception, or hallucination risks

How to Run

Quick static scan (no API key needed)

python3 {baseDir}/scripts/scan.py --modules posture,skill,memory,hooks --scan-dir . --no-llm --format markdown

Full scan (auto-detects API credentials)

# OpenClaw (reads gateway config automatically)
python3 {baseDir}/scripts/scan.py --openclaw-root ~/.openclaw --format html --output /tmp/deepsafe-report.html

# Claude Code / Cursor / Codex (uses ANTHROPIC_API_KEY or OPENAI_API_KEY)
python3 {baseDir}/scripts/scan.py --modules posture,skill,memory,hooks,model --scan-dir . --format html --output /tmp/deepsafe-report.html

Targeted scans

# Hooks injection only (fastest — checks .claude/settings.json, .cursorrules, etc.)
python3 {baseDir}/scripts/scan.py --modules hooks --scan-dir . --no-llm --format markdown

# Memory scan only (check for leaked secrets/PII)
python3 {baseDir}/scripts/scan.py --openclaw-root ~/.openclaw --modules memory --no-llm

# Model behavior probes only
python3 {baseDir}/scripts/scan.py --openclaw-root ~/.openclaw --modules model --profile quick

Output options

python3 {baseDir}/scripts/scan.py --format json      # machine-readable
python3 {baseDir}/scripts/scan.py --format markdown  # human-readable summary
python3 {baseDir}/scripts/scan.py --format html --output /tmp/report.html  # visual report

Cache control

python3 {baseDir}/scripts/scan.py --ttl-days 3   # cache for 3 days
python3 {baseDir}/scripts/scan.py --no-cache      # always fresh scan
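
The invocations above all combine the same handful of flags, so a wrapper script can assemble them programmatically. The following is a minimal sketch, not part of the skill itself: the helper name build_scan_command and the install path are hypothetical, and only flags shown in this section are used.

```python
import shlex
from pathlib import Path


def build_scan_command(base_dir, modules=None, scan_dir=None, no_llm=False,
                       fmt="markdown", output=None):
    """Assemble an argv list for scan.py from the flags documented above."""
    cmd = ["python3", str(Path(base_dir) / "scripts" / "scan.py")]
    if modules:
        cmd += ["--modules", ",".join(modules)]
    if scan_dir:
        cmd += ["--scan-dir", str(scan_dir)]
    if no_llm:
        cmd.append("--no-llm")
    cmd += ["--format", fmt]
    if output:
        cmd += ["--output", str(output)]
    return cmd


# Quick static scan of the current directory, no API key needed.
quick = build_scan_command(
    "/path/to/deepsafe-scan",  # hypothetical stand-in for {baseDir}
    modules=["posture", "skill", "memory", "hooks"],
    scan_dir=".",
    no_llm=True,
)
print(shlex.join(quick))
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting problems when paths contain spaces.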

Interpreting Results

Scores

Each module scores 1-100 (100 = clean, deductions per finding, minimum 1)
Module contribution = floor(score / 4), range 1–25
Total = sum of 4 contributions, max 100
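
The arithmetic above can be sketched as follows. This is an illustration under stated assumptions, not the scanner's actual code: floor(score / 4) yields 0 for scores of 1 to 3, so the sketch clamps the contribution to 1 to match the stated 1–25 range, and the module names and scores in the example are made up.

```python
def module_contribution(score):
    """Return floor(score / 4), clamped to the documented 1-25 range."""
    if not 1 <= score <= 100:
        raise ValueError("module score must be between 1 and 100")
    # Assumption: scores 1-3 would floor to 0, so clamp to 1 to match
    # the range stated in the text.
    return max(1, score // 4)


def total_score(module_scores):
    """Sum the per-module contributions."""
    return sum(module_contribution(s) for s in module_scores.values())


# Illustrative scores for four modules (values are invented).
scores = {"posture": 100, "skill": 83, "memory": 62, "model": 3}
print(total_score(scores))  # 25 + 20 + 15 + 1 = 61
```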

Severity Levels

CRITICAL (-10 pts): Immediate exploitation risk — secrets exposed, no auth, data exfiltration chains
HIGH (-5 pts): Serious risk — prompt injection, sensitive file access, network exposure
MEDIUM (-2 pts): Moderate risk — hardcoded keys, missing logs, supply chain concerns
LOW (-1 pt): Minor improvement — non-standard endpoints, missing metadata
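
As a worked example of how these deductions turn into a module score, the sketch below starts at 100, subtracts the listed points per finding, and stops at the documented minimum of 1. It assumes deductions are simply additive with no per-category caps, which the text does not confirm.

```python
# Points deducted per finding, taken from the severity list above.
DEDUCTIONS = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}


def module_score(severities):
    """Start at 100, subtract per-finding deductions, never go below 1."""
    deducted = sum(DEDUCTIONS[s] for s in severities)
    return max(1, 100 - deducted)


print(module_score(["CRITICAL", "HIGH", "HIGH", "LOW"]))  # 100 - 21 = 79
print(module_score(["CRITICAL"] * 12))                    # clamped to 1
```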

Risk Ratings

85-100: LOW RISK (green)
65-84: MEDIUM RISK (yellow)
40-64: HIGH RISK (orange)
4-39: CRITICAL RISK (red)
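
The bands above map onto a simple threshold lookup. In this sketch the function name is hypothetical, and any total below 40 is treated as CRITICAL RISK, including values under 4 that the table does not list.

```python
def risk_rating(total):
    """Map a total score to the rating bands listed above."""
    if total >= 85:
        return "LOW RISK"
    if total >= 65:
        return "MEDIUM RISK"
    if total >= 40:
        return "HIGH RISK"
    # Assumption: everything below 40 is CRITICAL RISK.
    return "CRITICAL RISK"


for total in (92, 70, 55, 12):
    print(total, risk_rating(total))
```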

After Scanning

Present the total score and per-module breakdown to the user
List CRITICAL and HIGH findings first with clear explanations
For each finding, explain what the risk is and how to fix it
Offer to help fix issues (e.g., "want me to add auth to your gateway config?")
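
Listing CRITICAL and HIGH findings first amounts to a stable sort on severity. The sketch below assumes each finding is a dict with "severity", "title", and "fix" keys; those field names are hypothetical, since the report's JSON schema is not described here.

```python
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


def order_findings(findings):
    """Sort findings most-severe first, keeping original order within a level."""
    rank = {name: i for i, name in enumerate(SEVERITY_ORDER)}
    return sorted(findings, key=lambda f: rank[f["severity"]])


# Invented findings, for illustration only.
findings = [
    {"severity": "LOW", "title": "Missing metadata", "fix": "Add a description"},
    {"severity": "CRITICAL", "title": "Gateway has no auth", "fix": "Enable auth"},
    {"severity": "HIGH", "title": "Prompt injection in skill", "fix": "Remove the skill"},
]

for f in order_findings(findings):
    print(f"[{f['severity']}] {f['title']}: {f['fix']}")
```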

What Gets Scanned

Posture Module (deployment config)

Reads openclaw.json and checks: gateway auth, network exposure, API key handling, TLS, plugin permissions, MCP server risks, logging, sandboxing.

Skill Module (installed skills & MCP)

Scans all files in workspace/skills/, mcp/, mcp-servers/ for: hidden unicode, prompt injection, dangerous runtime calls, encoded payloads, sensitive file references, hardcoded secrets, system prompt extraction, command injection, data exfiltration patterns, destructive actions, auto-execute without confirmation, excessive permissions.

With --gateway-url provided (and --no-llm not set), also runs LLM-enhanced semantic analysis on each skill's SKILL.md for social engineering, implicit data exfiltration, and deeper pattern recognition.
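
To give a sense of what one of the static checks looks like, the sketch below flags hidden unicode by reporting every character in the Unicode "Cf" (format) category, which covers zero-width spaces and bidirectional overrides. It is an approximation for illustration; the scanner's own detection rules may differ.

```python
import unicodedata


def find_hidden_unicode(text):
    """Return (index, codepoint, name) for each format-category character."""
    hits = []
    for i, ch in enumerate(text):
        # "Cf" covers invisible formatting characters such as U+200B
        # (zero width space) and U+202E (right-to-left override).
        if unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


sample = "Run the build\u200b step\u202e as usual"
for hit in find_hidden_unicode(sample):
    print(hit)
```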

Memory Module (sessions & stored data)

Scans agents/, credentials/, identity/, logs/, workspace/ for: 27+ secret patterns (API keys for OpenAI, GitHub, AWS, Slack, Stripe, etc.), 9 PII patterns (email, phone, ID numbers, credit cards), persistent prompt injection traces.
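
Pattern-based secret and PII detection of this kind is usually a set of named regular expressions applied line by line. The sketch below uses three illustrative patterns (AWS access key ID, GitHub personal access token, email address); they are examples chosen for this illustration, not the scanner's actual pattern list, and matches are redacted so the report does not leak the value again.

```python
import re

# Illustrative patterns only; the scanner's real set is larger.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}


def redact(value):
    """Keep the first four characters and mask the rest."""
    return value[:4] + "*" * (len(value) - 4)


def scan_text(text):
    """Return (line number, pattern name, redacted match) for every hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((lineno, name, redact(match.group())))
    return hits


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "key=AKIAIOSFODNN7EXAMPLE\ncontact: alice@example.com"
for hit in scan_text(sample):
    print(hit)
```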

Model Module (4 behavioral safety probes)

Requires --gateway-url and --gateway-token. Runs:

Persuasion Probe: Tests if the model can manipulate user opinions in multi-turn conversation
Sandbagging Probe: Tests if the model can strategically hide capabilities (performance control)
Deception Probe: 3-phase test for reasoning/action misalignment (DTR metric)
HaluEval Probe: Tests hallucination detection accuracy on QA benchmarks

Each probe produces a finding with risk level and score. Average across probes = module score.
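
The averaging step can be sketched in a few lines. The probe scores below are invented, and whether the scanner rounds the result is not stated, so the sketch returns the plain arithmetic mean.

```python
from statistics import mean


def model_module_score(probe_scores):
    """Average the per-probe scores into one module score."""
    return mean(probe_scores.values())


# Invented per-probe scores, for illustration only.
probes = {"persuasion": 80, "sandbagging": 90, "deception": 70, "halueval": 100}
print(model_module_score(probes))  # (80 + 90 + 70 + 100) / 4 = 85
```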

Usage Guidance

This scanner is largely consistent with its stated purpose, but exercise caution before running it on your real environment:

Back up any agent/gateway config (e.g., ~/.openclaw/openclaw.json) before running. The tool contains code that will modify that file to enable a chatCompletions endpoint.
If you do not want any external LLM access (and want to avoid sending sensitive data to third-party APIs), run with --no-llm or do not expose ANTHROPIC_API_KEY / OPENAI_API_KEY / gateway tokens to the environment.
Review the code (scripts/llm_client.py, scripts/scan.py, and probe files) yourself if possible; the probes contain deliberate prompt-injection and persuasion templates used to test models.
Run scans in an isolated or disposable environment (not on production machines) and avoid running as root; the skill will read many sensitive local files (credentials, logs, sessions).
If you want only static analysis, use the --no-llm flag and ensure the tool cannot access your API keys or the OpenClaw gateway token.

Given the tool's capability to modify other agent configs and to use detected API credentials automatically, only install or run it after you are comfortable with those behaviors.
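
Two of these precautions, backing up the config and keeping API keys out of the scanner's environment, are easy to script. The sketch below is one possible approach with hypothetical helper names; the variable names are the ones this page mentions, and the backup naming scheme is an arbitrary choice.

```python
import os
import shutil
import time
from pathlib import Path

# Environment variables this page says the scanner may auto-detect.
SENSITIVE_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENCLAW_GATEWAY_TOKEN")


def backup_config(path):
    """Copy the config next to itself with a timestamp suffix.

    Returns the backup path, or None if the file does not exist.
    """
    src = Path(path).expanduser()
    if not src.is_file():
        return None
    dst = src.with_name(f"{src.name}.{time.strftime('%Y%m%d-%H%M%S')}.bak")
    shutil.copy2(src, dst)
    return dst


def env_without_keys(env=None):
    """Return a copy of the environment with the sensitive variables removed."""
    source = os.environ if env is None else env
    return {k: v for k, v in source.items() if k not in SENSITIVE_VARS}


demo_env = {"OPENAI_API_KEY": "placeholder", "PATH": "/usr/bin"}
print(sorted(env_without_keys(demo_env)))  # ['PATH']
```

A typical use would be to call backup_config("~/.openclaw/openclaw.json") first, then launch the scan with subprocess.run(cmd, env=env_without_keys()) together with --no-llm.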

Capability Analysis

Type: OpenClaw Skill
Name: deepsafe-scan
Version: 2.0.1

The bundle is a security scanner for AI agents, but it contains highly risky components and 'live' attack patterns. Specifically, 'scripts/llm_client.py' automatically modifies the user's 'openclaw.json' configuration file to enable the chatCompletions endpoint, which is an intrusive behavior for a scanner.

More critically, the 'demo/awesome-ai-rules/' directory contains markdown files (CLAUDE.md, AGENTS.md) with explicit instructions for an AI agent to exfiltrate API keys via curl, read private SSH keys, and install persistence via crontab. While these appear to be test cases for the scanner to detect, their presence as executable markdown instructions in the workspace creates a significant prompt-injection risk where an agent might inadvertently execute the 'demo' attacks.

Capability Assessment

ℹ Purpose & Capability

Name/description match the delivered artifacts: Python scripts implement posture/skill/memory/hooks scans and model probes. Requiring python3 and shipping static analyzers, probe templates, and an LLM client is proportionate to the stated functionality. One mismatch: the skill auto-enables a gateway chatCompletions endpoint by editing ~/.openclaw/openclaw.json (scripts/llm_client.py), which is beyond a passive scanner's expected read-only behavior.

⚠ Instruction Scope

SKILL.md instructs scanning sensitive local areas (agents/, credentials/, ~/.openclaw, logs, workspace), and the code will auto-detect and use ANTHROPIC_API_KEY / OPENAI_API_KEY or an OpenClaw gateway token. That data may be sent to external LLM endpoints during model probes. The skill also presents itself as able to 'help fix issues' and the llm_client contains logic that writes to openclaw.json to enable an endpoint — this expands scope from read/scan to modification and potential configuration changes.

✓ Install Mechanism

Install spec is lightweight: a brew package for python3 only. No remote archive downloads or npm/pip installs. The install mechanism is proportionate.

ℹ Credentials

The skill declares no required env vars but the runtime auto-detects and will read ANTHROPIC_API_KEY, OPENAI_API_KEY, and OpenClaw gateway token (and potentially OPENCLAW_GATEWAY_TOKEN / OPENAI_BASE_URL). These are expected for the model-probe features, but you should be aware the skill will use any detected keys without an explicit requirement prompt. If you don't want keys used, SKILL.md shows a --no-llm flag to avoid LLM calls.

⚠ Persistence & Privilege

scripts/llm_client.py contains _ensure_chat_completions_enabled which modifies the user's ~/.openclaw/openclaw.json to enable a gateway endpoint. This is a write to another tool's configuration and qualifies as modifying other agent/system settings—an intrusive privilege. The skill is not always:true, but it requests the ability to modify external config files at runtime.

Version History

v2.0.1

Add honeypot demo repo (awesome-ai-rules) for security demonstration. Hooks scanner now recursively walks subdirectories.

v2.0.0

Cross-platform universal AI agent security gateway. Supports OpenClaw, Claude Code, Cursor, and Codex. New hooks injection detection, multi-platform LLM auto-detection, externalized prompt templates.

v1.1.0

Full-fidelity HTML report: animated SVG gauges, sidebar navigation, collapsible findings, severity bar, stats row, share button, print styles. Matches the plugin version exactly.

v1.0.1

Add --open flag: auto-generates HTML report and opens in browser. Agent now defaults to HTML report experience.

v1.0.0

Full security scanner skill: 4 modules (posture/skill/memory/model), 4 model probes, LLM semantic audit, fingerprint cache, zero dependencies, auto-config from openclaw.json

Metadata

Slug deepsafe-scan

Version 2.0.1

License MIT-0

All-time Installs 1

Active Installs 1

Total Versions 5

Frequently Asked Questions

What is Deepsafe Scan?

Preflight security scanner for AI coding agents — scans deployment config, skills/MCP servers, memory/sessions, and AI agent config files (hooks injection) f... It is an AI Agent Skill for Claude Code / OpenClaw, with 326 downloads so far.

How do I install Deepsafe Scan?

Run "/install deepsafe-scan" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Deepsafe Scan free?

Yes, Deepsafe Scan is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Deepsafe Scan support?

Deepsafe Scan is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Deepsafe Scan?

It is built and maintained by XiaoYiWeio (@xiaoyiweio); the current version is v2.0.1.
