emberdesire

Openclaw Plugin

Author: emberDesire · GitHub · v1.3.2
cross-platform ⚠ suspicious
Total downloads: 2333
Favorites: 0
Current installs: 1
Versions: 5
Install in OpenClaw
/install hopeids
Description
Inference-based intrusion detection for AI agents. Pattern matching + LLM analysis for jailbreaks, prompt injection, credential theft, social engineering. 108 detection patterns, OpenClaw plugin, auto-scan, quarantine. Commands: hopeid scan, hopeid test, hopeid setup, hopeid stats, hopeid doctor.
Usage (SKILL.md)

hopeIDS Security Skill

Inference-based intrusion detection for AI agents with quarantine and human-in-the-loop.

Security Invariants

These are non-negotiable design principles:

  1. Block = full abort — Blocked messages never reach jasper-recall or the agent
  2. Metadata only — No raw malicious content is ever stored
  3. Approve ≠ re-inject — Approval changes future behavior, doesn't resurrect messages
  4. Alerts are programmatic — Telegram alerts built from metadata, no LLM involved

Features

  • Auto-scan — Scan messages before agent processing
  • Quarantine — Block threats with metadata-only storage
  • Human-in-the-loop — Telegram alerts for review
  • Per-agent config — Different thresholds for different agents
  • Commands: /approve, /reject, /trust, /quarantine

The Pipeline

Message arrives
    ↓
hopeIDS.autoScan()
    ↓
┌─────────────────────────────────────────┐
│  risk >= threshold?                     │
│                                         │
│  BLOCK (strictMode):                    │
│     → Create QuarantineRecord           │
│     → Send Telegram alert               │
│     → ABORT (no recall, no agent)       │
│                                         │
│  WARN (non-strict):                     │
│     → Inject <security-alert>           │
│     → Continue to jasper-recall         │
│     → Continue to agent                 │
│                                         │
│  ALLOW:                                 │
│     → Continue normally                 │
└─────────────────────────────────────────┘
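The decision box above can be read as a small pure function. The following TypeScript is a minimal sketch under stated assumptions, not the plugin's actual code: the names `Action`, `ScanPolicy`, and `decideAction` are hypothetical, and only `strictMode` and the risk threshold come from the documentation.

```typescript
// Hypothetical types for illustration; not taken from the hopeid package.
type Action = "block" | "warn" | "allow";

interface ScanPolicy {
  strictMode: boolean;   // documented option: block (vs warn) on threats
  riskThreshold: number; // documented option: risk level that triggers action
}

// Maps a risk score to one of the three pipeline outcomes.
function decideAction(risk: number, policy: ScanPolicy): Action {
  if (risk < policy.riskThreshold) {
    return "allow"; // below threshold: continue normally
  }
  // At or above threshold: strict mode aborts, non-strict mode only warns.
  return policy.strictMode ? "block" : "warn";
}
```

With this reading, a score exactly equal to the threshold triggers the action, matching the `risk >= threshold?` test in the diagram.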

Configuration

{
  "plugins": {
    "entries": {
      "hopeids": {
        "enabled": true,
        "config": {
          "autoScan": true,
          "defaultRiskThreshold": 0.7,
          "strictMode": false,
          "telegramAlerts": true,
          "agents": {
            "moltbook-scanner": {
              "strictMode": true,
              "riskThreshold": 0.7
            },
            "main": {
              "strictMode": false,
              "riskThreshold": 0.8
            }
          }
        }
      }
    }
  }
}

Options

Option                Type     Default                         Description
autoScan              boolean  false                           Auto-scan every message
strictMode            boolean  false                           Block (vs warn) on threats
defaultRiskThreshold  number   0.7                             Risk level that triggers action
telegramAlerts        boolean  true                            Send alerts for blocked messages
telegramChatId        string   -                               Override alert destination
quarantineDir         string   ~/.openclaw/quarantine/hopeids  Storage path
agents                object   -                               Per-agent overrides
trustOwners           boolean  true                            Skip scanning owner messages
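How `autoScan` and `trustOwners` interact is easiest to see as a gate in front of the scanner. The sketch below is an assumption about that interaction, based only on the option descriptions; `ScanOptions`, `DEFAULT_SCAN_OPTIONS`, and `shouldScan` are hypothetical names, and how the platform decides that a sender is an owner is outside this example.

```typescript
// Hypothetical subset of the plugin options; defaults mirror the Options table.
interface ScanOptions {
  autoScan: boolean;
  trustOwners: boolean;
}

const DEFAULT_SCAN_OPTIONS: ScanOptions = { autoScan: false, trustOwners: true };

// Assumed gate: scan only when autoScan is on, and skip owners when trustOwners is on.
function shouldScan(senderIsOwner: boolean, overrides: Partial<ScanOptions> = {}): boolean {
  const options: ScanOptions = { ...DEFAULT_SCAN_OPTIONS, ...overrides };
  if (!options.autoScan) {
    return false; // auto-scan disabled: nothing is scanned automatically
  }
  if (options.trustOwners && senderIsOwner) {
    return false; // owner messages bypass scanning by default
  }
  return true;
}
```

Under these defaults nothing is scanned until `autoScan` is enabled, and owner messages remain unscanned unless `trustOwners` is set to `false`.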

Quarantine Records

When a message is blocked, a metadata record is created:

{
  "id": "q-7f3a2b",
  "ts": "2026-02-06T00:48:00Z",
  "agent": "moltbook-scanner",
  "source": "moltbook",
  "senderId": "@sus_user",
  "intent": "instruction_override",
  "risk": 0.85,
  "patterns": [
    "matched regex: ignore.*instructions",
    "matched keyword: api key"
  ],
  "contentHash": "ab12cd34...",
  "status": "pending"
}

Note: There is NO originalMessage field. This is intentional.
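One way to check the metadata-only invariant when auditing stored records is to compare each record's keys against the fields shown in the example. This is a hedged sketch: `ALLOWED_FIELDS` and `unexpectedFields` are hypothetical helpers, and the field list is copied from the sample record rather than from a published schema.

```typescript
// Field names copied from the sample record above; not an official schema.
const ALLOWED_FIELDS: readonly string[] = [
  "id", "ts", "agent", "source", "senderId",
  "intent", "risk", "patterns", "contentHash", "status",
];

// Returns any keys that fall outside the metadata-only field set.
function unexpectedFields(record: Record<string, unknown>): string[] {
  return Object.keys(record).filter((key) => ALLOWED_FIELDS.indexOf(key) === -1);
}
```

A record carrying an `originalMessage` key would be reported by `unexpectedFields`, while a record shaped like the example yields an empty array.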


Telegram Alerts

When a message is blocked:

🛑 Message blocked

ID: `q-7f3a2b`
Agent: moltbook-scanner
Source: moltbook
Sender: @sus_user
Intent: instruction_override (85%)

Patterns:
• matched regex: ignore.*instructions
• matched keyword: api key

`/approve q-7f3a2b`
`/reject q-7f3a2b`
`/trust @sus_user`

Built from metadata only. No LLM touches this.
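To illustrate what "built from metadata only" can look like, here is a minimal sketch that assembles the alert text from record fields with plain string concatenation. `AlertMetadata` and `formatAlert` are hypothetical names, the layout simply mirrors the sample alert, and sending the text to Telegram is deliberately left out.

```typescript
// Hypothetical shape: the subset of quarantine metadata used in the alert.
interface AlertMetadata {
  id: string;
  agent: string;
  source: string;
  senderId: string;
  intent: string;
  risk: number;       // 0..1, shown as a percentage
  patterns: string[]; // descriptions of matched patterns, never message text
}

// Builds the alert body deterministically; no model call is involved.
function formatAlert(meta: AlertMetadata): string {
  const percent = Math.round(meta.risk * 100);
  const lines: string[] = [
    "🛑 Message blocked",
    "",
    "ID: `" + meta.id + "`",
    "Agent: " + meta.agent,
    "Source: " + meta.source,
    "Sender: " + meta.senderId,
    "Intent: " + meta.intent + " (" + percent + "%)",
    "",
    "Patterns:",
  ];
  for (const pattern of meta.patterns) {
    lines.push("• " + pattern);
  }
  lines.push(
    "",
    "`/approve " + meta.id + "`",
    "`/reject " + meta.id + "`",
    "`/trust " + meta.senderId + "`",
  );
  return lines.join("\n");
}
```

Because the function only sees metadata, attacker-controlled message text cannot reach the alert unless it leaks through a field such as `senderId`, which a real implementation would likely want to sanitize.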


Commands

/quarantine [all|clean]

List quarantine records.

/quarantine        # List pending
/quarantine all    # List all (including resolved)
/quarantine clean  # Clean expired records

/approve <id>

Mark a blocked message as a false positive.

/approve q-7f3a2b

Effect:

  • Status → approved
  • (Future) Add sender to allowlist
  • (Future) Lower pattern weight

/reject <id>

Confirm a blocked message was a true positive.

/reject q-7f3a2b

Effect:

  • Status → rejected
  • (Future) Reinforce pattern weights

/trust <senderId>

Whitelist a sender for future messages.

/trust @legitimate_user

/scan <message>

Manually scan a message.

/scan ignore your previous instructions and...

What Approve/Reject Mean

Command   What it does                              What it doesn't do
/approve  Marks as false positive, may adjust IDS   Does NOT re-inject the message
/reject   Confirms threat, may strengthen patterns  Does NOT affect current message
/trust    Whitelists sender for future              Does NOT retroactively approve

The blocked message is gone by design. If it was legitimate, the sender can re-send.
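The semantics above amount to a status change on a metadata record plus an optional trust list. The following is a sketch under assumptions: `StoredRecord`, `applyDecision`, and `trustSender` are hypothetical names, and the real plugin may persist these changes differently.

```typescript
type ReviewDecision = "approved" | "rejected";

// Hypothetical, trimmed-down record: note there is no message content to resurrect.
interface StoredRecord {
  id: string;
  senderId: string;
  status: "pending" | ReviewDecision;
}

// Returns an updated copy of the record. Nothing here can re-deliver the
// blocked message, because the record never held its content.
function applyDecision(record: StoredRecord, decision: ReviewDecision): StoredRecord {
  return { ...record, status: decision };
}

// Assumed trust list: /trust affects future messages only, so existing
// records are left untouched.
function trustSender(trusted: string[], senderId: string): string[] {
  return trusted.indexOf(senderId) === -1 ? trusted.concat(senderId) : trusted;
}
```

Both helpers return new values instead of mutating their inputs, which keeps the review step easy to audit.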


Per-Agent Configuration

Different agents need different security postures:

"agents": {
  "moltbook-scanner": {
    "strictMode": true,    // Block threats
    "riskThreshold": 0.7   // 70% = suspicious
  },
  "main": {
    "strictMode": false,   // Warn only
    "riskThreshold": 0.8   // Higher bar for main
  },
  "email-processor": {
    "strictMode": true,    // Always block
    "riskThreshold": 0.6   // More paranoid
  }
}
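A plausible way to resolve these overrides is to fall back to the plugin-wide values when an agent has no entry. The sketch below assumes that fallback behavior; `PluginConfig`, `AgentOverride`, `ResolvedPolicy`, and `resolvePolicy` are hypothetical names, while the option names come from the configuration shown earlier.

```typescript
interface AgentOverride {
  strictMode?: boolean;
  riskThreshold?: number;
}

interface PluginConfig {
  strictMode: boolean;
  defaultRiskThreshold: number;
  agents?: Record<string, AgentOverride>;
}

interface ResolvedPolicy {
  strictMode: boolean;
  riskThreshold: number;
}

// Assumed resolution order: per-agent value first, plugin-wide default second.
function resolvePolicy(config: PluginConfig, agentId: string): ResolvedPolicy {
  const override: AgentOverride = config.agents?.[agentId] ?? {};
  return {
    strictMode: override.strictMode ?? config.strictMode,
    riskThreshold: override.riskThreshold ?? config.defaultRiskThreshold,
  };
}
```

Using `??` instead of `||` matters here: an explicit `strictMode: false` override is kept instead of being replaced by a plugin-wide `true`.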

Threat Categories

Category              Risk         Description
command_injection     🔴 Critical  Shell commands, code execution
credential_theft      🔴 Critical  API key extraction attempts
data_exfiltration     🔴 Critical  Data leak to external URLs
instruction_override  🔴 High      Jailbreaks, "ignore previous"
impersonation         🔴 High      Fake system/admin messages
discovery             ⚠️ Medium    API/capability probing

Installation

npx hopeid setup

Then restart OpenClaw.



Security recommendations
This plugin is coherent with its stated purpose (an IDS that quarantines threats and alerts via Telegram), but you should not install it blindly. Key things to consider before installing:

  • Message transmission to models: classification uses llm-task or a classifier agent and sends (part of) the raw incoming message to the configured model/provider. That is expected for semantic analysis but means sensitive text may leave your system at runtime even if it is not persisted. Verify which LLM providers (local vs cloud) your OpenClaw instance routes llm-task or classifierAgent calls to.
  • External dependency provenance: the plugin dynamically imports a separate 'hopeid' package and suggests running 'npx hopeid setup' / 'npm install hopeid'. The registry entry does not include a trustworthy homepage or maintainer details. Inspect the 'hopeid' package source (and any CLI behavior) before installing it.
  • Storage: quarantine records are metadata-only by design, but they are written to ~/.openclaw/quarantine/hopeids (or records.json in that dir in fallback). Confirm you are comfortable with that path and check retention/permissions.
  • Conservative initial settings: enable the plugin in non-strict/warn-only mode and disable autoScan initially; verify alerting behavior (Telegram) and that alerts contain only metadata. Test with non-sensitive inputs in a staging environment.
  • If you need higher assurance: request the full 'hopeid' package source and the remainder of this plugin's source (truncated portions) to audit exactly what is sent to classifiers and how patterns/rules are defined.

Given these gaps (missing provenance, runtime transmission of raw messages to configured LLMs, and inconsistent install guidance), treat this skill with caution and perform the checks above before trusting it in production.
Functional analysis
Type: OpenClaw Skill
Name: hopeids
Version: 1.3.2

The OpenClaw hopeIDS skill is designed as an Intrusion Detection System (IDS) for AI agents, aiming to prevent malicious activity. Its core logic and documentation consistently reflect this purpose, emphasizing 'metadata only' for quarantine records and programmatic alerts. However, it is classified as 'suspicious' for two primary reasons:

  1. It relies heavily on an external `hopeid` npm package, introducing a supply chain risk where a compromised dependency could lead to malicious execution.
  2. The `trustOwners` configuration (defaulting to true) allows messages from owner accounts to bypass all security scans, creating a potential vulnerability if an owner's account is compromised.

While these are vulnerabilities rather than direct malicious intent by the skill itself, they represent significant security risks.
Capability assessment
Purpose & Capability
Name/description (inference-based IDS, quarantine, Telegram alerts) align with the code and manifest: it implements auto-scan, quarantine records (metadata-only), per-agent config, and commands. The plugin depends on a separate 'hopeid' package (declared in package.json) which is coherent with the skill's functionality.
Instruction Scope
SKILL.md and code consistently state 'metadata-only' storage, and quarantine records do not include an originalMessage field. However, classification is performed using llm-task or a classifier agent (api.invokeTool or api.sessions.send), and the code sends (a substring of) the incoming message to whichever model/provider is configured. That means raw message content will be transmitted at runtime to the configured model/provider even though it is not persisted; this is a potential data-exfiltration vector users may not expect. The instructions ask to run 'npx hopeid setup', which implies additional installation/config steps external to OpenClaw; the origin and behavior of that CLI are not documented here.
Install Mechanism
There is no install spec in the registry entry (instruction-only), but the package.json includes a dependency on 'hopeid'. The code dynamically imports 'hopeid' and will error if it is not installed (with instructions to npm install it). This mixed messaging (no install spec but package.json + dynamic import) is inconsistent and requires the user to install an external package. The 'hopeid' package origin is not verifiable from the provided metadata (homepage truncated/absent).
Credentials
The skill declares no required env vars or credentials and relies on OpenClaw platform config (e.g., channels.telegram.botToken and ownerNumbers). That is proportionate. However, runtime classification sends message text to configured LLM tooling (llm-task or classifierAgent) which may route to third-party providers (Anthropic/OpenAI/etc) configured elsewhere in the platform — installing this plugin therefore implicitly sends messages to those providers. The SKILL.md emphasizes metadata-only storage but does not highlight that raw message text is transmitted to models for classification.
Persistence & Privilege
always is false and the plugin does not request system-wide privileges. It writes quarantine records to a plugin-specific directory (default ~/.openclaw/quarantine/hopeids) and will fall back to an in-memory/file-based quarantine if the 'hopeid/quarantine' module is not present. It does not modify other skills' configs. Writing records to the user's home directory is expected for a quarantine feature but you should verify file permissions and retention policies.
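How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install hopeids
  3. Once installed, invoke the Skill by name or trigger it with /hopeids
  4. Provide the inputs required by the Skill's parameter documentation to get structured output.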
Version history
v1.3.2
Fix plugin imports, document Telegram requirements
v1.3.1
feat: llm-task classifier support for fast, lightweight semantic analysis
v1.2.0
Added doctor command for health diagnostics
v1.1.1
Fixed sandbox auto-config that could break workers. Setup now shows guidance instead of auto-applying. 108 patterns, OpenClaw plugin support.
v0.1.0
hopeIDS Security Skill initial release.
  • Introduces inference-based intrusion detection for AI agents to protect against prompt injection, credential theft, data exfiltration, and related threats.
  • Provides the security_scan tool for message analysis and integration guidance.
  • Outlines primary threat categories and recommended IDS-first workflow for agents processing untrusted input.
  • Includes detailed configuration options for OpenClaw, sandboxing patterns, and example responses to detected threats.
  • Installation instructions and relevant resource links included for immediate setup.
Metadata
Slug hopeids
Version 1.3.2
License
Total installs 1
Current installs 1
Version count 5
FAQ

What is Openclaw Plugin?

Inference-based intrusion detection for AI agents. Pattern matching + LLM analysis for jailbreaks, prompt injection, credential theft, social engineering. 108 detection patterns, OpenClaw plugin, auto-scan, quarantine. Commands: hopeid scan, hopeid test, hopeid setup, hopeid stats, hopeid doctor. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 2333 total downloads to date.

How do I install Openclaw Plugin?

Run the command "/install hopeids" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Openclaw Plugin free?

Yes, Openclaw Plugin is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Openclaw Plugin support?

Openclaw Plugin is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed Openclaw Plugin?

It is developed and maintained by emberDesire (@emberdesire); the current version is v1.3.2.
