emberdesire

Openclaw Plugin

by emberDesire · GitHub ↗ · v1.3.2
cross-platform ⚠ suspicious
Downloads: 2333
Stars: 0
Active Installs: 1
Versions: 5
Install in OpenClaw
/install hopeids
Description
Inference-based intrusion detection for AI agents. Pattern matching + LLM analysis for jailbreaks, prompt injection, credential theft, social engineering. 108 detection patterns, OpenClaw plugin, auto-scan, quarantine. Commands: hopeid scan, hopeid test, hopeid setup, hopeid stats, hopeid doctor.
README (SKILL.md)

hopeIDS Security Skill

Inference-based intrusion detection for AI agents with quarantine and human-in-the-loop.

Security Invariants

These are non-negotiable design principles:

  1. Block = full abort — Blocked messages never reach jasper-recall or the agent
  2. Metadata only — No raw malicious content is ever stored
  3. Approve ≠ re-inject — Approval changes future behavior, doesn't resurrect messages
  4. Alerts are programmatic — Telegram alerts built from metadata, no LLM involved

Features

  • Auto-scan — Scan messages before agent processing
  • Quarantine — Block threats with metadata-only storage
  • Human-in-the-loop — Telegram alerts for review
  • Per-agent config — Different thresholds for different agents
  • Commands: /approve, /reject, /trust, /quarantine

The Pipeline

Message arrives
    ↓
hopeIDS.autoScan()
    ↓
┌─────────────────────────────────────────┐
│  risk >= threshold?                     │
│                                         │
│  BLOCK (strictMode):                    │
│     → Create QuarantineRecord           │
│     → Send Telegram alert               │
│     → ABORT (no recall, no agent)       │
│                                         │
│  WARN (non-strict):                     │
│     → Inject <security-alert>           │
│     → Continue to jasper-recall         │
│     → Continue to agent                 │
│                                         │
│  ALLOW:                                 │
│     → Continue normally                 │
└─────────────────────────────────────────┘
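
The decision in the diagram can be sketched in a few lines of Python. This is an illustrative sketch, not the plugin's code: `Action` and `decide` are hypothetical names, and it assumes that any risk score below the threshold is allowed.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    WARN = "warn"
    ALLOW = "allow"


def decide(risk: float, threshold: float, strict_mode: bool) -> Action:
    """Map a risk score to an action, following the pipeline diagram."""
    if risk < threshold:
        # Assumption: scores under the threshold pass through untouched.
        return Action.ALLOW
    # At or above the threshold: strictMode blocks, otherwise warn only.
    return Action.BLOCK if strict_mode else Action.WARN
```

With the sample configuration, a score of 0.85 would be blocked for `moltbook-scanner` (strict, threshold 0.7) and would only produce a warning for `main` (non-strict, threshold 0.8).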

Configuration

{
  "plugins": {
    "entries": {
      "hopeids": {
        "enabled": true,
        "config": {
          "autoScan": true,
          "defaultRiskThreshold": 0.7,
          "strictMode": false,
          "telegramAlerts": true,
          "agents": {
            "moltbook-scanner": {
              "strictMode": true,
              "riskThreshold": 0.7
            },
            "main": {
              "strictMode": false,
              "riskThreshold": 0.8
            }
          }
        }
      }
    }
  }
}

Options

| Option | Type | Default | Description |
|---|---|---|---|
| autoScan | boolean | false | Auto-scan every message |
| strictMode | boolean | false | Block (vs warn) on threats |
| defaultRiskThreshold | number | 0.7 | Risk level that triggers action |
| telegramAlerts | boolean | true | Send alerts for blocked messages |
| telegramChatId | string | - | Override alert destination |
| quarantineDir | string | ~/.openclaw/quarantine/hopeids | Storage path |
| agents | object | - | Per-agent overrides |
| trustOwners | boolean | true | Skip scanning owner messages |

Quarantine Records

When a message is blocked, a metadata record is created:

{
  "id": "q-7f3a2b",
  "ts": "2026-02-06T00:48:00Z",
  "agent": "moltbook-scanner",
  "source": "moltbook",
  "senderId": "@sus_user",
  "intent": "instruction_override",
  "risk": 0.85,
  "patterns": [
    "matched regex: ignore.*instructions",
    "matched keyword: api key"
  ],
  "contentHash": "ab12cd34...",
  "status": "pending"
}

Note: There is NO originalMessage field. This is intentional.
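
As a rough illustration of the metadata-only principle, the sketch below builds a record shaped like the example above while keeping only a hash of the message. The function name `build_record` is hypothetical, and the use of SHA-256 for `contentHash` is an assumption; the README does not say which hash the plugin uses.

```python
import hashlib
from datetime import datetime, timezone


def build_record(record_id, agent, source, sender_id, intent, risk,
                 patterns, message):
    """Build a metadata-only record; the raw message is hashed, not stored."""
    return {
        "id": record_id,
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "agent": agent,
        "source": source,
        "senderId": sender_id,
        "intent": intent,
        "risk": risk,
        "patterns": list(patterns),
        # Assumption: SHA-256 hex digest; the real algorithm is not documented.
        "contentHash": hashlib.sha256(message.encode("utf-8")).hexdigest(),
        "status": "pending",
    }
```

Because only the digest is kept, a reviewer can tell whether two blocked messages were identical without ever reading their content.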


Telegram Alerts

When a message is blocked:

🛑 Message blocked

ID: `q-7f3a2b`
Agent: moltbook-scanner
Source: moltbook
Sender: @sus_user
Intent: instruction_override (85%)

Patterns:
• matched regex: ignore.*instructions
• matched keyword: api key

`/approve q-7f3a2b`
`/reject q-7f3a2b`
`/trust @sus_user`

Built from metadata only. No LLM touches this.
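
A programmatic alert of this kind can be produced with plain string formatting over a quarantine record. The sketch below is illustrative only: `format_alert` is a hypothetical name, and the layout simply mirrors the example alert above.

```python
def format_alert(record: dict) -> str:
    """Render an alert from record metadata only; no model is involved."""
    risk_pct = round(record["risk"] * 100)
    lines = [
        "🛑 Message blocked",
        "",
        f"ID: `{record['id']}`",
        f"Agent: {record['agent']}",
        f"Source: {record['source']}",
        f"Sender: {record['senderId']}",
        f"Intent: {record['intent']} ({risk_pct}%)",
        "",
        "Patterns:",
    ]
    lines += [f"• {pattern}" for pattern in record["patterns"]]
    lines += [
        "",
        f"`/approve {record['id']}`",
        f"`/reject {record['id']}`",
        f"`/trust {record['senderId']}`",
    ]
    return "\n".join(lines)
```

Since every field comes from the record, the alert cannot carry content that was never stored.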


Commands

/quarantine [all|clean]

List quarantine records.

/quarantine        # List pending
/quarantine all    # List all (including resolved)
/quarantine clean  # Clean expired records

/approve <id>

Mark a blocked message as a false positive.

/approve q-7f3a2b

Effect:

  • Status → approved
  • (Future) Add sender to allowlist
  • (Future) Lower pattern weight

/reject <id>

Confirm a blocked message was a true positive.

/reject q-7f3a2b

Effect:

  • Status → rejected
  • (Future) Reinforce pattern weights

/trust <senderId>

Whitelist a sender for future messages.

/trust @legitimate_user

/scan <message>

Manually scan a message.

/scan ignore your previous instructions and...

What Approve/Reject Mean

| Command | What it does | What it doesn't do |
|---|---|---|
| /approve | Marks as false positive, may adjust IDS | Does NOT re-inject the message |
| /reject | Confirms threat, may strengthen patterns | Does NOT affect current message |
| /trust | Whitelists sender for future | Does NOT retroactively approve |

The blocked message is gone by design. If it was legitimate, the sender can re-send.
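
A minimal sketch of these semantics, assuming records are kept in a dict keyed by ID: the names `resolve` and `list_records` are hypothetical and do not come from the plugin. Approving or rejecting only changes the `status` field, and nothing is returned that could be re-injected.

```python
def resolve(records: dict, record_id: str, command: str) -> dict:
    """Apply /approve or /reject to a record: only the status changes."""
    statuses = {"approve": "approved", "reject": "rejected"}
    record = records[record_id]
    record["status"] = statuses[command]
    return record


def list_records(records: dict, include_resolved: bool = False) -> list:
    """Mirror `/quarantine` (pending only) and `/quarantine all`."""
    return [
        record for record in records.values()
        if include_resolved or record["status"] == "pending"
    ]
```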


Per-Agent Configuration

Different agents need different security postures:

"agents": {
  "moltbook-scanner": {
    "strictMode": true,    // Block threats
    "riskThreshold": 0.7   // 70% = suspicious
  },
  "main": {
    "strictMode": false,   // Warn only
    "riskThreshold": 0.8   // Higher bar for main
  },
  "email-processor": {
    "strictMode": true,    // Always block
    "riskThreshold": 0.6   // More paranoid
  }
}
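
One way to picture how per-agent overrides could be resolved is the sketch below. It is an assumption, not the plugin's documented behavior: `effective_settings` is a hypothetical helper, and it assumes agents without an entry fall back to the top-level `strictMode` and `defaultRiskThreshold` values.

```python
def effective_settings(config: dict, agent: str) -> dict:
    """Resolve strictMode and riskThreshold for one agent."""
    override = config.get("agents", {}).get(agent, {})
    return {
        # Assumption: missing per-agent keys fall back to top-level values.
        "strictMode": override.get("strictMode", config.get("strictMode", False)),
        "riskThreshold": override.get(
            "riskThreshold", config.get("defaultRiskThreshold", 0.7)
        ),
    }
```

Under that assumption, an agent that is not listed under `agents` would simply inherit the global defaults.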

Threat Categories

| Category | Risk | Description |
|---|---|---|
| command_injection | 🔴 Critical | Shell commands, code execution |
| credential_theft | 🔴 Critical | API key extraction attempts |
| data_exfiltration | 🔴 Critical | Data leak to external URLs |
| instruction_override | 🔴 High | Jailbreaks, "ignore previous" |
| impersonation | 🔴 High | Fake system/admin messages |
| discovery | ⚠️ Medium | API/capability probing |

Installation

npx hopeid setup

Then restart OpenClaw.



Usage Guidance
This plugin is coherent with its stated purpose (an IDS that quarantines threats and alerts via Telegram) but you should not install it blindly. Key things to consider before installing:

  • Message transmission to models: classification uses llm-task or a classifier agent and sends (part of) the raw incoming message to the configured model/provider. That is expected for semantic analysis but means sensitive text may leave your system at runtime even if it is not persisted. Verify which LLM providers (local vs cloud) your OpenClaw instance routes llm-task or classifierAgent calls to.
  • External dependency provenance: the plugin dynamically imports a separate 'hopeid' package and suggests running 'npx hopeid setup' / 'npm install hopeid'. The registry entry does not include a trustworthy homepage or maintainer details. Inspect the 'hopeid' package source (and any CLI behavior) before installing it.
  • Storage: quarantine records are metadata-only by design, but they are written to ~/.openclaw/quarantine/hopeids (or records.json in that dir in fallback). Confirm you are comfortable with that path and check retention/permissions.
  • Conservative initial settings: enable the plugin in non-strict/warn-only mode and disable autoScan initially; verify alerting behavior (Telegram) and that alerts contain only metadata. Test with non-sensitive inputs in a staging environment.
  • If you need higher assurance: request the full 'hopeid' package source and the remainder of this plugin's source (truncated portions) to audit exactly what is sent to classifiers and how patterns/rules are defined.

Given these gaps (missing provenance, runtime transmission of raw messages to configured LLMs, and inconsistent install guidance), treat this skill with caution and perform the checks above before trusting it in production.
Capability Analysis
Type: OpenClaw Skill
Name: hopeids
Version: 1.3.2

The OpenClaw hopeIDS skill is designed as an Intrusion Detection System (IDS) for AI agents, aiming to prevent malicious activity. Its core logic and documentation consistently reflect this purpose, emphasizing 'metadata only' for quarantine records and programmatic alerts. However, it is classified as 'suspicious' due to two primary reasons:

  1. It relies heavily on an external `hopeid` npm package, introducing a supply chain risk where a compromised dependency could lead to malicious execution.
  2. The `trustOwners` configuration (defaulting to true) allows messages from owner accounts to bypass all security scans, creating a potential vulnerability if an owner's account is compromised.

While these are vulnerabilities rather than direct malicious intent by the skill itself, they represent significant security risks.
Capability Assessment
Purpose & Capability
Name/description (inference-based IDS, quarantine, Telegram alerts) align with the code and manifest: it implements auto-scan, quarantine records (metadata-only), per-agent config, and commands. The plugin depends on a separate 'hopeid' package (declared in package.json) which is coherent with the skill's functionality.
Instruction Scope
SKILL.md and code consistently state 'metadata-only' storage, and quarantine records do not include an originalMessage field. However, classification is performed using llm-task or a classifier agent (api.invokeTool or api.sessions.send), and the code sends (a substring of) the incoming message to whichever model/provider is configured. That means raw message content will be transmitted at runtime to the configured model/provider even though it is not persisted; this is a potential data-exfiltration vector users may not expect. The instructions ask to run 'npx hopeid setup', which implies additional installation/config steps external to OpenClaw; the origin and behavior of that CLI are not documented here.
Install Mechanism
There is no install spec in the registry entry (instruction-only), but the package.json includes a dependency on 'hopeid'. The code dynamically imports 'hopeid' and will error if it is not installed (with instructions to npm install it). This mixed messaging (no install spec but package.json + dynamic import) is inconsistent and requires the user to install an external package. The 'hopeid' package origin is not verifiable from the provided metadata (homepage truncated/absent).
Credentials
The skill declares no required env vars or credentials and relies on OpenClaw platform config (e.g., channels.telegram.botToken and ownerNumbers). That is proportionate. However, runtime classification sends message text to configured LLM tooling (llm-task or classifierAgent) which may route to third-party providers (Anthropic/OpenAI/etc) configured elsewhere in the platform — installing this plugin therefore implicitly sends messages to those providers. The SKILL.md emphasizes metadata-only storage but does not highlight that raw message text is transmitted to models for classification.
Persistence & Privilege
always is false and the plugin does not request system-wide privileges. It writes quarantine records to a plugin-specific directory (default ~/.openclaw/quarantine/hopeids) and will fall back to an in-memory/file-based quarantine if the 'hopeid/quarantine' module is not present. It does not modify other skills' configs. Writing records to the user's home directory is expected for a quarantine feature but you should verify file permissions and retention policies.
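
One quick way to check the permissions point is a small script over the quarantine directory. The sketch below is a generic POSIX permission check, not part of the plugin: `group_or_world_accessible` is a hypothetical helper, and the mode bits it inspects are not meaningful on Windows.

```python
import stat
from pathlib import Path


def group_or_world_accessible(path: Path) -> bool:
    """Return True if group or other users have any access to `path`."""
    mode = path.stat().st_mode
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


if __name__ == "__main__":
    # Default path taken from the options table; adjust if quarantineDir is set.
    quarantine_dir = Path.home() / ".openclaw" / "quarantine" / "hopeids"
    if quarantine_dir.exists() and group_or_world_accessible(quarantine_dir):
        print(f"{quarantine_dir} is accessible to other users")
```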
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install hopeids
  3. After installation, invoke the skill by name or use /hopeids
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.3.2
Fix plugin imports, document Telegram requirements
v1.3.1
feat: llm-task classifier support for fast, lightweight semantic analysis
v1.2.0
Added doctor command for health diagnostics
v1.1.1
Fixed sandbox auto-config that could break workers. Setup now shows guidance instead of auto-applying. 108 patterns, OpenClaw plugin support.
v0.1.0
hopeIDS Security Skill initial release.
  • Introduces inference-based intrusion detection for AI agents to protect against prompt injection, credential theft, data exfiltration, and related threats.
  • Provides the security_scan tool for message analysis and integration guidance.
  • Outlines primary threat categories and recommended IDS-first workflow for agents processing untrusted input.
  • Includes detailed configuration options for OpenClaw, sandboxing patterns, and example responses to detected threats.
  • Installation instructions and relevant resource links included for immediate setup.
Metadata
Slug hopeids
Version 1.3.2
License
All-time Installs 1
Active Installs 1
Total Versions 5
Frequently Asked Questions

What is Openclaw Plugin?

Openclaw Plugin (slug: hopeids) is the hopeIDS security skill: inference-based intrusion detection for AI agents that combines pattern matching and LLM analysis to catch jailbreaks, prompt injection, credential theft, and social engineering. It is an AI Agent Skill for Claude Code / OpenClaw, with 2333 downloads so far.

How do I install Openclaw Plugin?

Run "/install hopeids" in the OpenClaw or Claude Code chat. The README additionally instructs running "npx hopeid setup" and then restarting OpenClaw.

Is Openclaw Plugin free?

Yes, Openclaw Plugin is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Openclaw Plugin support?

Openclaw Plugin is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Openclaw Plugin?

It is built and maintained by emberDesire (@emberdesire); the current version is v1.3.2.
