
Lerwee Alert To Fault Handling


Author: Lerwee · GitHub · v1.0.0 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 509

Install in OpenClaw

/install alert-to-fault-handling

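Description

Automated alert-handling workflow: watches for alert context, matches a remediation script, and prompts the user to run the fault-handling action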

Usage (SKILL.md)

Automated Alert-Handling Workflow

Overview

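When alert context is detected, the skill matches a fault-handling script based on the chat group and alert type, then prompts the user to run it with a single reply.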

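Workflow

Alert context detection → group identification → keyword matching → script recommendation → user confirmation → script execution → result feedback → (optional) automatic alert closure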

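Trigger conditions

The skill triggers only when all of the following hold:

The conversation contains alert information (at least one of eventid, IP, or alert name)
The current Feishu group matches the alert classification
The alert content matches a script's keywords
A corresponding preset script exists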
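
The four conditions can be read as a single gate. The following is a minimal sketch, not the skill's actual code: the function names, the `alert` dictionary layout, and the choice to match keywords against the alert name only are assumptions made for illustration.

```python
from typing import Optional


def has_alert_context(alert: dict) -> bool:
    """Condition 1: at least one of eventid, IP, or alert name is present."""
    return any(alert.get(key) for key in ("eventid", "ip", "name"))


def should_trigger(alert: dict, group_classification: int,
                   entry: Optional[dict]) -> bool:
    """Return True only when all four trigger conditions hold.

    `entry` is one value from .scripts_map.json, or None when no preset
    script exists (hypothetical calling convention).
    """
    if not has_alert_context(alert):
        return False
    if entry is None:  # condition 4: a preset script must exist
        return False
    # Condition 2: the group's classification must be listed for the script.
    if group_classification not in entry.get("classifications", []):
        return False
    # Condition 3: case-insensitive keyword match (assumed: alert name only).
    text = str(alert.get("name", "")).lower()
    return any(kw.lower() in text for kw in entry.get("keywords", []))
```

Returning early on each failed condition keeps the gate conservative: a missing field never counts as a match.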

Configuration files

Script mapping configuration (.scripts_map.json)

{
  "nginx": {
    "name": "Nginx service restart",
    "script_id": 187,
    "keywords": ["nginx", "Nginx", "NGINX", "web", "80端口", "http服务"],
    "classifications": [102],
    "chat_groups": [""],
    "description": "Restarts the Nginx service; suitable when the service has stopped or is unresponsive"
  },
  "disk": {
    "name": "Host disk space cleanup",
    "script_id": 197,
    "keywords": ["磁盘", "disk", "空间", "storage", "/var", "/tmp", "使用率", "满"],
    "classifications": [101],
    "chat_groups": [""],
    "description": "Cleans up log files and temporary files to free disk space"
  }
}
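
As an illustration of how such a map might be scanned, here is a minimal sketch. The function name `match_scripts`, the case-insensitive substring matching, and the decision to ignore `chat_groups` are assumptions; the source does not describe the exact matching rules.

```python
import json

# A trimmed copy of the mapping above, embedded so the sketch is self-contained.
SCRIPTS_MAP_JSON = """
{
  "nginx": {"script_id": 187, "keywords": ["nginx", "80端口"], "classifications": [102]},
  "disk":  {"script_id": 197, "keywords": ["磁盘", "disk", "/var"], "classifications": [101]}
}
"""


def match_scripts(alert_text: str, classification: int, scripts_map: dict) -> list:
    """Return the keys of all entries whose classification and keywords match."""
    text = alert_text.lower()
    hits = []
    for key, entry in scripts_map.items():
        if classification not in entry.get("classifications", []):
            continue
        if any(kw.lower() in text for kw in entry.get("keywords", [])):
            hits.append(key)
    return hits


scripts_map = json.loads(SCRIPTS_MAP_JSON)
print(match_scripts("Nginx service stopped on 192.168.3.137", 102, scripts_map))
```

Lowercasing both sides makes the separate "nginx", "Nginx", and "NGINX" keywords redundant under this scheme, although keeping them in the config is harmless.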

Execution log (.execution_log.json)

{
  "executions": [
    {
      "timestamp": "2026-03-10T09:50:00Z",
      "eventid": "32415666",
      "ip": "192.168.3.137",
      "script_id": 187,
      "script_name": "Nginx service restart",
      "status": "success",
      "execution_id": 970,
      "user": ""
    }
  ]
}
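
Appending a record in this shape could look like the sketch below. The helper name `append_execution` and the read-modify-write approach are assumptions; only the field names come from the example above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_execution(log_path, *, eventid, ip, script_id, script_name,
                     status, execution_id, user=""):
    """Append one execution record to the JSON log, creating it if missing."""
    path = Path(log_path)
    if path.exists():
        log = json.loads(path.read_text(encoding="utf-8"))
    else:
        log = {"executions": []}
    record = {
        # UTC timestamp in the same format as the example record.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "eventid": str(eventid),
        "ip": ip,
        "script_id": script_id,
        "script_name": script_name,
        "status": status,
        "execution_id": execution_id,
        "user": user,
    }
    log.setdefault("executions", []).append(record)
    # ensure_ascii=False keeps non-ASCII script names readable in the file.
    path.write_text(json.dumps(log, ensure_ascii=False, indent=2), encoding="utf-8")
    return record
```

This sketch rewrites the whole file on each call and does no locking, so it would need extra care if several executions can finish at the same time.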

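Group-to-classification mapping

Group name	Monitoring classification	Default script
操作系统告警群 (operating system alerts group)	101	Host disk space cleanup (197)
中间件告警群 (middleware alerts group)	102	Nginx service restart (187)
网络设备告警群 (network device alerts group)	103	(not yet configured)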
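
The table can be expressed as two lookups. This is only a sketch: the dictionary and function names are hypothetical, and the group names are used as keys exactly as they appear in the table.

```python
from typing import Optional

# Group name -> monitoring classification, copied from the table above.
GROUP_CLASSIFICATION = {
    "操作系统告警群": 101,  # operating system alerts
    "中间件告警群": 102,    # middleware alerts
    "网络设备告警群": 103,  # network device alerts
}

# Classification -> default script ID; None means not yet configured.
DEFAULT_SCRIPT = {101: 197, 102: 187, 103: None}


def default_script_for_group(group_name: str) -> Optional[int]:
    """Return the default script ID for a group, or None if there is none."""
    classification = GROUP_CLASSIFICATION.get(group_name)
    if classification is None:
        return None
    return DEFAULT_SCRIPT.get(classification)
```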

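User interaction

Scenario 1: Automatic recommendation

🤖 Nginx service-stopped alert detected
📊 Alert target: 3.137-Nginx-1.14.2 (192.168.3.137)
🔑 Alert ID: 32415666

💡 Recommended action: Nginx service restart (script ID: 187)
📝 Notes: Restarts the Nginx service; suitable when the service has stopped or is unresponsive

👉 Reply 「执行」 (execute) or 「确认」 (confirm) to run the script automatically

Scenario 2: User-specified script

User: 执行脚本 197 (run script 197)

🔧 Running script on host 192.168.3.137...
📋 Script: Host disk space cleanup (ID: 197)
⏳ Running...

Scenario 3: Execution after confirmation

User: 确认 (confirm)

🔧 Submitting execution task...
✅ Task submitted (Execution ID: 970)
⏳ Waiting for the result...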

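IP resolution rules

In descending order of priority:

1. The IP field in the alert message (highest priority)

2. Query the host details by objectid to obtain the IP

./scripts/lerwee-api.sh monitor host-view '{"hostid": 11131}'

3. An IP specified manually by the user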
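
The priority order can be sketched as a simple fallback chain. The names `resolve_ip` and `host_lookup` are hypothetical; `host_lookup` stands in for whatever wraps the host-view query, whose response format the source does not show. The function returns None rather than guessing when no valid address is available.

```python
import ipaddress
from typing import Callable, Optional


def _valid_ip(value: Optional[str]) -> Optional[str]:
    """Return the normalized address, or None if value is empty or invalid."""
    if not value:
        return None
    try:
        return str(ipaddress.ip_address(value.strip()))
    except ValueError:
        return None


def resolve_ip(alert_ip: Optional[str],
               host_lookup: Optional[Callable[[], Optional[str]]] = None,
               user_ip: Optional[str] = None) -> Optional[str]:
    """Try the alert field, then the host lookup, then the user-supplied IP."""
    ip = _valid_ip(alert_ip)
    if ip:
        return ip
    if host_lookup is not None:
        ip = _valid_ip(host_lookup())
        if ip:
            return ip
    # Last resort: the user's value. None here means "ask the user, do not guess".
    return _valid_ip(user_ip)
```

Passing the lookup as a callable means the API is only queried when the alert itself carries no usable IP.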

Script execution

Use the fault-handling skill to run the script:

python3 /home/node/.openclaw/workspace/skills/fault-handling/run_script.py \
  --hosts '192.168.3.137' \
  --script-id 187
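
If the runner is invoked from Python, building the argument list explicitly avoids shell interpolation. A minimal sketch, assuming the runner path and the two flags shown above; the helper names are hypothetical, and only a single host is handled because the source does not show a multi-host format.

```python
import ipaddress
import subprocess

RUNNER = "/home/node/.openclaw/workspace/skills/fault-handling/run_script.py"


def build_command(host: str, script_id: int) -> list:
    """Build the argv list for the runner; raises ValueError on a bad host."""
    ipaddress.ip_address(host)  # rejects empty strings and placeholders
    return ["python3", RUNNER, "--hosts", host, "--script-id", str(int(script_id))]


def run_fault_script(host: str, script_id: int) -> subprocess.CompletedProcess:
    """Run the command without a shell and capture its output."""
    return subprocess.run(build_command(host, script_id),
                          capture_output=True, text=True, check=False)
```

With `check=False`, a non-zero exit code is returned to the caller instead of raising, so the failure reason in `stderr` can be reported to the user.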

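Execution result feedback

🔧 Fault-handling execution report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Task name: Nginx service restart
Task ID: 970
Status: success
Duration: 15 s

📋 Step details
Step                    Host IP        Hostname    Status  Output
Nginx service restart   192.168.3.137  vm-3-137    ✅      nginx: [  OK  ]

📌 Summary
● ✅ 1 host succeeded
● ❌ 0 hosts failed

💾 Recorded in the execution log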

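Optional: automatically close the alert

After the script succeeds, the corresponding alert can optionally be closed automatically: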

./scripts/lerwee-api.sh alert problem-ack '{
  "eventid": "32415666",
  "action": 1,
  "message": "Script succeeded; alert closed automatically"
}'
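
The close step should only be prepared once the script is known to have succeeded. The sketch below builds the argument list for the command shown above; the function name, the "success" status string (taken from the execution log example), and the default message text are assumptions.

```python
import json
from typing import Optional


def build_ack_command(eventid, execution_status: str,
                      message: str = "Script succeeded; alert closed automatically"
                      ) -> Optional[list]:
    """Return argv for the problem-ack call, or None unless the run succeeded."""
    if execution_status != "success":
        return None
    payload = {"eventid": str(eventid), "action": 1, "message": message}
    return ["./scripts/lerwee-api.sh", "alert", "problem-ack",
            json.dumps(payload, ensure_ascii=False)]
```

Returning None for any status other than "success" means a failed or unknown run can never produce a close request by accident.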

Hard Rules

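Explicit user confirmation is required before running any script (a reply of 「执行」, 「确认」, or "yes")
Never guess a host IP or use a placeholder IP when no host IP is available
When execution fails, the reason for the failure must be stated clearly
Every execution must be recorded in the log file
Before closing an alert automatically, the script's success must be verified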
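
The first two rules lend themselves to a pre-flight check. This is a sketch under assumptions: the function names are hypothetical, and exact-match comparison after trimming and lowercasing is one possible reading of "explicit confirmation".

```python
from typing import Optional

# Accepted confirmation replies, as listed in the rules above.
CONFIRM_WORDS = {"执行", "确认", "yes"}


def is_confirmed(reply: str) -> bool:
    """True only for an exact confirmation word (whitespace and case ignored)."""
    return reply.strip().lower() in CONFIRM_WORDS


def preflight(reply: str, ip: Optional[str]) -> None:
    """Raise before execution if confirmation or the host IP is missing."""
    if not is_confirmed(reply):
        raise PermissionError("explicit user confirmation is required")
    if not ip:
        raise ValueError("no host IP available; refusing to guess")
```

Exact matching is deliberately strict: a reply such as "ok" or "yes please" does not count as confirmation.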

Files

Configuration: .scripts_map.json - script mapping configuration
Log: .execution_log.json - execution history
Main logic: handled dynamically by the Agent; no standalone script is required

Adding a new script type

Add a new entry to .scripts_map.json:

{
  "mysql": {
    "name": "MySQL service restart",
    "script_id": 198,
    "keywords": ["mysql", "MySQL", "数据库"],
    "classifications": [105],
    "chat_groups": ["oc_xxx"],
    "description": "Restarts the MySQL service"
  }
}
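
Before adding an entry it can help to check that it has the same fields as the existing ones. The validator below is a hypothetical sketch; the required field set is inferred from the examples in this document, not from a published schema.

```python
# Field name -> expected Python type, inferred from the example entries.
REQUIRED_FIELDS = {
    "name": str,
    "script_id": int,
    "keywords": list,
    "classifications": list,
    "chat_groups": list,
    "description": str,
}


def validate_entry(entry: dict) -> list:
    """Return a list of problems; an empty list means the entry looks valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
            continue
        value = entry[field]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {field}")
    if isinstance(entry.get("keywords"), list) and not entry["keywords"]:
        problems.append("keywords must not be empty")
    return problems
```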

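Security mechanisms

Confirmation: default scripts require user confirmation; custom script IDs require a second confirmation
Whitelist: only preset scripts or script IDs explicitly specified by the user are executed
Audit log: all operations are recorded in the log file
Rollback support: the execution ID is recorded so that past results can be looked up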
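
The confirmation rule can be sketched as a lookup against the preset IDs. The function name is hypothetical, and treating "custom" as "not present in .scripts_map.json" is an assumption.

```python
def confirmations_required(script_id: int, scripts_map: dict) -> int:
    """Return 1 for a preset script ID and 2 for a custom (unlisted) one."""
    preset_ids = {entry["script_id"] for entry in scripts_map.values()}
    return 1 if script_id in preset_ids else 2


# Example with the two preset scripts from this document.
presets = {"nginx": {"script_id": 187}, "disk": {"script_id": 197}}
print(confirmations_required(187, presets))
print(confirmations_required(999, presets))
```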

Safe usage recommendations

This skill will execute local helper scripts to run remediation actions (restart services, clean disks, call your alert API). Before installing or enabling it:

1) Verify the external helpers it calls (fault-handling/run_script.py and lerwee-api.sh) exist in the expected locations and review their code. Those scripts perform the actual actions and may access network/hosts.
2) Confirm the confirmation requirement is enforced by your agent workflow (the Python code does not force interactive confirmation).
3) Audit the .scripts_map.json entries and chat_group IDs to ensure only approved scripts can be triggered and that script IDs map to safe, whitelisted operations.
4) Test in a staging environment first (to avoid accidental service restarts or alert closures).

If you can't inspect the external helper scripts or cannot guarantee the agent will require explicit human confirmation, do not enable this skill in production.

Functional analysis

Type: OpenClaw Skill
Name: alert-to-fault-handling
Version: 1.0.0

The skill bundle implements an automated IT alert remediation workflow that matches system alerts to specific recovery scripts based on keywords and classifications. The logic in 'alert_workflow.py' and 'match_script.py' is consistent with the stated purpose of AIOps automation, and 'SKILL.md' explicitly mandates a human-in-the-loop confirmation ('执行' or '确认') before any action is taken. The code uses safe subprocess calls without shell=True, maintains local execution logs in '.execution_log.json', and shows no signs of data exfiltration or malicious intent.

Capability assessment

ℹ Purpose & Capability

The name/description (alert → fault handling) matches what the included Python scripts do: detect alert context, match a script, log execution, call a fault-handling runner, and optionally acknowledge alerts. However the skill references external components (a 'fault-handling' run_script.py and a 'lerwee-api.sh' script) that are not bundled — this is plausible if the environment provides them, but it is an unstated dependency and should be confirmed.

⚠ Instruction Scope

SKILL.md and the scripts direct the agent to call local scripts that perform network/API operations and service actions (e.g., calling lerwee-api.sh and running fault-handling/run_script.py). The SKILL.md claims user confirmation is required before execution, but enforcement is left to the agent/operator (the code does not itself gate executions beyond the command interface). There are small inconsistencies between example paths in SKILL.md (absolute /home/node/... and ./scripts/lerwee-api.sh) and the code's relative paths, which may cause failures or unexpected behavior if the expected helper scripts are missing or differ.

ℹ Install Mechanism

No install spec (instruction-only) and no external downloads — low install risk. But the package actually includes runnable Python files (.py) and JSON config/log files, so although there's no installer, code will run if invoked. Confirm where these files will be stored and what other local scripts they call.

✓ Credentials

The skill requests no environment variables, no credentials, and no config paths. It operates on local config files (.scripts_map.json and .execution_log.json) and calls local helper scripts. No unexplained credential requests were found.

✓ Persistence & Privilege

always:false and no installation hooks; the skill does not demand forced persistent inclusion. It writes logs to its own .execution_log.json file in-place but does not attempt to modify other skills or system-wide configuration.

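How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install alert-to-fault-handling
Once installed, invoke the Skill by name or trigger it with /alert-to-fault-handling
Provide the inputs described in the Skill's parameter documentation to receive structured output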

Version history

v1.0.0

- Initial release of the "alert-to-fault-handling" skill.
- Automatically detects alert context and recommends matched fault-handling scripts.
- Supports confirmation-based script execution with step-by-step user interaction.
- Integrates script mapping and execution log configuration via JSON files.
- Enforces strict rules for user confirmation and secure handling.
- Provides clear result feedback and optional auto-closure of alerts after successful actions.

Metadata

Slug: alert-to-fault-handling

Version: 1.0.0

License: MIT-0

Total installs: 0

Current installs: 0

Number of versions: 1

FAQ