Description

Use this skill when the user asks about securing their OpenClaw installation, configuring AI agents safely, understanding prompt injection risks, dealing wit...

Usage Instructions (SKILL.md)

Merlin Security Sentinel — Agentic Security Framework

Name: merlin-security-sentinel
Author: thepoorsatitagain

When to use this skill

Load this skill when the user is concerned about:

Credential theft or exfiltration by AI agents
Prompt injection attacks via messages, files, or web content
Malicious skills in the ClawHub registry
Safe configuration of privileged systems using AI
Understanding what persistent agents can and cannot safely do
Building governed, auditable AI workflows

The Core Problem with Persistent AI Agents

Persistent AI agents — including this one — carry structural security liabilities that are not fixable by configuration alone.

Three risks compound each other:

Credential accumulation: A persistent agent builds up an increasingly detailed model of credentials, tokens, and system access over time. Any compromise of the agent's memory or storage exposes that accumulated access.
Memory poisoning: A persistent agent's memory (SOUL.md, MEMORY.md, IDENTITY.md) can be modified by malicious skills or prompt injection. The modified memory causes the agent to follow attacker instructions in future sessions, with no single detectable triggering event.
Supply chain attacks: The ClawHub registry has documented malicious skills. Research in Q1 2026 found 820+ malicious skills out of ~10,700 analyzed. 26% of 31,000 analyzed skills contained at least one vulnerability.

Security research findings (Q1 2026):

40,000+ internet-exposed OpenClaw instances identified
~35% flagged as vulnerable
15,000+ susceptible to remote code execution
CVE-2026-25253: a single malicious link click grants full gateway control
Microsoft classified persistent self-hosted AI agents as "untrusted code execution with persistent credentials"

Immediate Hardening Steps

1. Lock your memory files

chmod 444 ~/.openclaw/workspace/SOUL.md
chmod 444 ~/.openclaw/workspace/MEMORY.md
chmod 444 ~/.openclaw/workspace/IDENTITY.md
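To confirm that the lock took effect, the mode bits can be inspected directly. The following is a minimal sketch, assuming the three memory files sit in the workspace directory used in the commands above; `writable_memory_files` is a hypothetical helper name and not part of OpenClaw.

```python
import stat
from pathlib import Path

# File names taken from the chmod commands above.
MEMORY_FILES = ("SOUL.md", "MEMORY.md", "IDENTITY.md")


def writable_memory_files(workspace: Path) -> list[str]:
    """Return the memory files that exist and still have any write bit set."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    flagged = []
    for name in MEMORY_FILES:
        path = workspace / name
        # Inspect the permission bits rather than os.access(), so the
        # result does not depend on which user runs the check.
        if path.is_file() and path.stat().st_mode & write_bits:
            flagged.append(name)
    return flagged
```

Calling `writable_memory_files(Path.home() / ".openclaw" / "workspace")` should return an empty list once all three files are set to 444; adjust the path if your workspace lives elsewhere.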

2. Restrict tool permissions

Set the most restrictive tool profile compatible with your actual use:

tools.profile: "messaging" — no exec
Never enable exec unless specifically needed
Never use tools.allow: ["*"]
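
These rules can be written as a small lint over a parsed configuration. This is a sketch only: it assumes the config has already been loaded into a dict with a `tools` section holding `profile` and `allow` keys (mirroring the dotted names above) and that exec would show up as an `"exec"` entry in `allow`. The real OpenClaw config layout may differ, and `lint_tool_policy` is a hypothetical helper.

```python
def lint_tool_policy(config: dict) -> list[str]:
    """Return warnings for tool settings that conflict with the rules above."""
    # Assumed layout: {"tools": {"profile": "...", "allow": [...]}}
    tools = config.get("tools", {})
    allow = tools.get("allow", [])
    warnings = []
    if "*" in allow:
        warnings.append('tools.allow contains "*"')
    if "exec" in allow:
        warnings.append("exec is enabled; confirm it is specifically needed")
    if tools.get("profile") != "messaging":
        warnings.append('tools.profile is not "messaging"')
    return warnings
```

An empty result means none of the three rules was tripped; it does not mean the configuration is safe in any broader sense.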

3. Bind to localhost only

openclaw gateway --port 18789 --host 127.0.0.1
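
Binding to 127.0.0.1 keeps the gateway reachable only from the local machine, whereas an address such as 0.0.0.0 listens on every interface. As a hedged illustration, a deployment script could check the intended host value before launching the gateway; `is_loopback_host` is a hypothetical helper built on Python's standard `ipaddress` module.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """True if host is a loopback IP literal or the name 'localhost'."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; other hostnames are treated as non-loopback.
        return False
```

With this check, `127.0.0.1` and `::1` pass, while `0.0.0.0` and LAN addresses are rejected.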

4. Use allowlists for DMs

Set explicit allowedDMs rather than ["*"]. Any user who can message a shared tool-enabled agent can steer it within its granted permissions.
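
The intended semantics of an explicit allowlist can be sketched in a few lines. How OpenClaw itself evaluates `allowedDMs` is not shown here, so treat this as an illustration of the policy rather than the framework's behavior; `is_sender_allowed` is a hypothetical helper.

```python
def is_sender_allowed(sender_id: str, allowed_dms: list[str]) -> bool:
    """Exact-match allowlist check; a "*" entry is refused rather than honored."""
    if "*" in allowed_dms:
        raise ValueError('allowedDMs must list explicit senders, not "*"')
    return sender_id in allowed_dms
```

Failing loudly on a wildcard, instead of silently matching everyone, makes a misconfiguration visible at startup rather than after an unknown sender has already steered the agent.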

5. Audit installed skills

clawhub list

Check SKILL.md files manually. Look for: base64 encoding, external downloads, instructions to modify SOUL.md or MEMORY.md.
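
A first pass over many SKILL.md files can be automated before the manual read. The sketch below flags the three red flags named above using regular expressions; the patterns are illustrative heuristics chosen for this example (not a vetted detection set), and `scan_skill_text` is a hypothetical helper.

```python
import re

# Heuristic patterns for the three red flags; expect false positives and misses.
SUSPICIOUS_PATTERNS = {
    "base64": re.compile(r"base64|[A-Za-z0-9+/]{40,}={0,2}", re.I),
    "external download": re.compile(
        r"\b(curl|wget)\b|https?://\S+\.(sh|py|exe|zip)\b", re.I
    ),
    "memory file modification": re.compile(
        r"\b(write|modify|append|edit|overwrite|update)\b.{0,60}"
        r"\b(SOUL|MEMORY|IDENTITY)\.md",
        re.I,
    ),
}


def scan_skill_text(text: str) -> dict[str, list[int]]:
    """Map each red-flag label to the 1-based line numbers where it matches."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(lineno)
    return hits
```

A hit is a prompt to read that line closely, not a verdict; a clean scan does not replace the manual check.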

The Architectural Answer

For tasks involving elevated privilege, the structurally correct answer is ephemeral execution, not hardened persistence.

Two inviolable axioms:

No AI shall see its own configuration — The execution envelope is applied at container infrastructure level, not delivered to the model. An agent that cannot inspect its own constraints cannot reason about circumventing them.
No AI that has touched privileged systems shall persist — Container termination is total. Not paused. Destroyed. The agent's knowledge of your system dies with the container.

What persists: A signed, replayable audit record of exactly what the AI did — held outside the container, inaccessible to the AI.

What does not persist: Credentials, session memory, system knowledge, the agent itself.
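
One way to picture a tamper-evident audit record is a hash-chained log in which each entry is authenticated with a key the agent never sees. The sketch below is an assumption-laden illustration, not the Merlin implementation: it uses an HMAC as a stand-in for a real signature scheme, and `append_record` and `verify_log` are hypothetical names.

```python
import hashlib
import hmac
import json


def _sign(body: dict, key: bytes) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the record body."""
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_record(log: list[dict], action: str, key: bytes) -> dict:
    """Append an action, chained to the previous record's signature."""
    prev_sig = log[-1]["sig"] if log else ""
    body = {"seq": len(log), "action": action, "prev": prev_sig}
    record = {**body, "sig": _sign(body, key)}
    log.append(record)
    return record


def verify_log(log: list[dict], key: bytes) -> bool:
    """Replay the chain; edited, reordered, or mid-log deleted records fail.

    Truncation at the tail is not detected by this sketch.
    """
    prev_sig = ""
    for seq, record in enumerate(log):
        body = {"seq": record["seq"], "action": record["action"], "prev": record["prev"]}
        if record["seq"] != seq or record["prev"] != prev_sig:
            return False
        if not hmac.compare_digest(record["sig"], _sign(body, key)):
            return False
        prev_sig = record["sig"]
    return True
```

In this picture the key and the log both live outside the container, so a compromised agent can neither forge entries nor rewrite history without verification failing.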

When to use ephemeral execution vs persistent agents

Task	Use
Daily messaging, reminders, search	Persistent (acceptable risk)
Configuring your own AI agents	Ephemeral — high risk to persist
Setting up new systems	Ephemeral — involves credentials
Running security scans	Ephemeral — agent sees sensitive data
Installing/updating privileged software	Ephemeral — credential entry involved

Prompt Injection Defense

OpenClaw's security model explicitly states that prompt injection is out of scope as a vulnerability — the framework cannot prevent it at the infrastructure level.

Practical defenses:

Never enable exec when browsing untrusted content
Use separate sessions for untrusted content and credential-sensitive tasks
Treat all content from messaging channels as untrusted
The architectural fix is an ingress firewall that makes external content readable but never instruction-authoritative — runtime filtering alone is insufficient
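
The idea of "readable but never instruction-authoritative" can be illustrated by routing on provenance instead of content. The sketch below is a toy model under stated assumptions: the source labels, the `TRUSTED_SOURCES` set, and the `Ingress` and `build_context` names are all hypothetical, and this is not the referenced architecture's implementation.

```python
from dataclasses import dataclass

# Assumption: only the operator channel may issue instructions.
TRUSTED_SOURCES = {"operator"}


@dataclass(frozen=True)
class Ingress:
    source: str  # e.g. "operator", "web", "dm", "file"
    text: str


def build_context(items: list[Ingress]) -> dict[str, list[str]]:
    """Split ingress into instructions (trusted sources only) and inert data."""
    context: dict[str, list[str]] = {"instructions": [], "data": []}
    for item in items:
        if item.source in TRUSTED_SOURCES:
            context["instructions"].append(item.text)
        else:
            # External content stays readable but is never promoted
            # to an instruction, whatever it says.
            context["data"].append(f"[{item.source}] {item.text}")
    return context
```

The decision depends only on where the text came from, never on what it says, which is the difference between this approach and runtime content filtering.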

Architecture Reference

The full governed architecture — execution envelopes, ephemeral containers, deterministic audit trails, governed knowledge retrieval — is documented and prototyped at:

Threat assessment: github.com/thepoorsatitagain/OPENCLAW_SECURITY_THREAT_ASSESSMENT3
Hydra Kernel / GEL: github.com/thepoorsatitagain/Ai-control-2 — provisional patent 63/939,121
Merlin ephemeral sentinel: github.com/thepoorsatitagain/Merlin-agenic-security-airgapper
Working wrapper prototype: github.com/thepoorsatitagain/working-project-openclaw-wrapper

Quick Reference

"Is OpenClaw safe?" For daily personal use with minimal tool access and no exec: acceptable risk. For anything involving credentials, privileged systems, or shared access: the structural risks are real and documented.

"I got a suspicious skill installed"

Check SOUL.md, MEMORY.md, IDENTITY.md for injected content
Revoke any credentials the agent had access to
clawhub uninstall <skill-slug>
Review audit logs
Consider clean reinstall if memory files were modified
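
Deciding whether memory files were modified is easier with a known-good baseline to compare against. The sketch below records SHA-256 digests and reports which files differ; it assumes a baseline was captured (and stored somewhere the agent cannot write) before the suspicious skill was installed, and `snapshot` and `changed_files` are hypothetical helpers.

```python
import hashlib
from pathlib import Path

MEMORY_FILES = ("SOUL.md", "MEMORY.md", "IDENTITY.md")


def snapshot(workspace: Path) -> dict[str, str]:
    """SHA-256 digest of each memory file that exists in the workspace."""
    return {
        name: hashlib.sha256((workspace / name).read_bytes()).hexdigest()
        for name in MEMORY_FILES
        if (workspace / name).is_file()
    }


def changed_files(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """Names whose digest differs from the baseline, or that appeared or vanished."""
    names = baseline.keys() | current.keys()
    return sorted(n for n in names if baseline.get(n) != current.get(n))
```

A non-empty result from `changed_files` identifies which files to inspect for injected content; it cannot tell a legitimate agent update from a malicious one.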

"What is the worst case?" CVE-2026-25253: one malicious link click, full gateway RCE within milliseconds. Agent exfiltrates SOUL.md, MEMORY.md, device.json, openclaw.json, browser session tokens, SSH credentials. Future sessions follow attacker instructions silently.

Safe Usage Recommendations

This skill is coherent and behaves like a reference manual rather than code. Before acting on its commands:

(1) Back up SOUL.md, MEMORY.md, and IDENTITY.md. Making them read-only (chmod 444) can prevent legitimate agent updates and may break workflows, so test changes in a non-production environment first.
(2) Verify the exact filesystem paths on your host before running chmod or gateway commands; adjust them if your OpenClaw workspace is elsewhere.
(3) Vet external GitHub links and repositories before cloning or executing any code (the SKILL.md points to personal/third-party repos).
(4) The manual auditing advice (inspect SKILL.md for base64 and external download steps) is sound. If you lack the expertise, ask a trusted admin to review.
(5) If you need airtight protection for privileged operations, follow the ephemeral-container/audit-trail approach recommended here; it is more disruptive but structurally safer.

Overall: the guidance is useful and aligned with its stated purpose, but apply it cautiously and test changes first.

Functional Analysis

Type: OpenClaw Skill
Name: merlin-security-sentinel
Version: 1.0.0

The Merlin Security Sentinel skill is a purely informational security advisory and hardening guide for OpenClaw users. It provides legitimate defensive recommendations, such as restricting file permissions (chmod 444 on memory files), limiting tool profiles, and using ephemeral execution for sensitive tasks. While it references external GitHub repositories (e.g., github.com/thepoorsatitagain/Ai-control-2) and hypothetical future security research, it contains no executable code, exfiltration logic, or malicious instructions designed to compromise the agent or the host system.

Capability Assessment

✓ Purpose & Capability

The skill's name and description (OpenClaw/agent hardening, prompt-injection defense, credential protection, ephemeral execution) match the content of SKILL.md. It asks for no credentials, binaries, or installs, so there is no disproportionate access request for the stated purpose.

ℹ Instruction Scope

The SKILL.md stays on-topic (hardening, audit steps, architecture recommendations). It contains concrete system commands (chmod on ~/.openclaw workspace files, running openclaw gateway with a host bind, clawhub list) and links to external GitHub repos. These actions are relevant but potentially disruptive: making memory files read-only (chmod 444) can break intended agent behavior or updates and should be tested in a safe environment after backing up files. External links should be vetted before cloning or running code from them.

✓ Install Mechanism

No install spec and no code files — instruction-only. This is the lowest-risk install model; nothing will be written to disk by an automated installer.

✓ Credentials

The skill requests no environment variables, no credentials, and no config paths as part of the skill metadata. The runtime instructions reference user-local paths that are relevant to OpenClaw memory/config, which is proportional to the guidance being offered.

✓ Persistence & Privilege

always is false and the skill does not request permanent presence or elevated platform privileges. The guidance actually recommends avoiding persistence for privileged tasks (ephemeral containers), which reduces privilege concerns.

Version History

v1.0.0

- Initial release of merlin-security-sentinel, providing guidance on securing OpenClaw installations and configuring AI agents safely.
- Documents risks associated with persistent AI agents: credential accumulation, memory poisoning, and supply chain attacks.
- Offers immediate hardening steps, including file permissions, restricted tool profiles, binding to localhost, allowlists, and skill audits.
- Emphasizes ephemeral execution for privileged tasks and explains architectural principles for agentic security.
- Lists practical defenses against prompt injection and clarifies the limitations of OpenClaw's security model.
- Includes quick reference answers to common OpenClaw security questions and incident responses.

Metadata

Slug merlin-security-sentinel

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

FAQ

What is merlin-security-sentinel?

Use this skill when the user asks about securing their OpenClaw installation, configuring AI agents safely, understanding prompt injection risks, dealing wit... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 122 cumulative downloads to date.

How do I install merlin-security-sentinel?

Run the command "/install merlin-security-sentinel" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is merlin-security-sentinel free?

Yes. merlin-security-sentinel is completely free and released under the MIT-0 license; you may download, install, and use it freely.

Which platforms does merlin-security-sentinel support?

merlin-security-sentinel is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed merlin-security-sentinel?

It is developed and maintained by thepoorsatitagain (@thepoorsatitagain). The current version is v1.0.0.
