/install claude-watchdog
Claude Watchdog 🐕
Monitor the Anthropic/Claude API for outages and latency spikes. Sends rich alerts to Telegram — no agent tokens consumed for status checks.
What It Does
Status Monitor (status-check.py)
- Polls
status.claude.comevery 15 minutes via cron - Alerts with incident name, latest update text, per-component status
- Tags incidents as "(not our model)" if e.g. Haiku is affected but you use Sonnet
- Sends all-clear on recovery
- Zero token cost
Latency Probe (latency-probe.py)
- Sends a minimal request through OpenClaw's local gateway every 15 minutes
- Measures real end-to-end latency to Anthropic API
- Maintains rolling baseline (median of last 20 samples)
- Alerts with 🟡/🟠/🔴 severity based on spike magnitude
- Sends all-clear when latency recovers
- ~$0.000001 per probe
Setup
Run the interactive setup script:
bash /path/to/skills/claude-watchdog/scripts/setup.sh
You'll need:
- Telegram Bot Token — from @BotFather
- Telegram Chat ID — send a message to your bot, then check
https://api.telegram.org/bot\x3CTOKEN>/getUpdates - OpenClaw Gateway Token — run:
python3 -c "from pathlib import Path; import json; print(json.load(open(Path.home() / '.openclaw/openclaw.json'))['gateway']['auth']['token'])" - Gateway Port — default
18789
The setup script writes config, installs cron jobs, and runs an initial check.
To uninstall (removes cron jobs, optionally config/state):
bash /path/to/skills/claude-watchdog/scripts/setup.sh --uninstall
Config
Stored in ~/.openclaw/skills/claude-watchdog/claude-watchdog.env. To reconfigure, either re-run setup.sh or edit this file directly — changes take effect on the next cron run (within 15 minutes).
TELEGRAM_BOT_TOKEN=...
TELEGRAM_CHAT_ID=...
OPENCLAW_GATEWAY_TOKEN=...
OPENCLAW_GATEWAY_PORT=18789
MONITOR_MODEL=sonnet
PROBE_MODEL=openclaw
PROBE_AGENT_ID=main
| Variable | Default | Description |
|---|---|---|
TELEGRAM_BOT_TOKEN |
(required) | Telegram bot token from @BotFather |
TELEGRAM_CHAT_ID |
(required) | Target chat for alerts |
OPENCLAW_GATEWAY_TOKEN |
(required) | Auth token for the local OpenClaw gateway |
OPENCLAW_GATEWAY_PORT |
18789 |
Port the OpenClaw gateway listens on |
MONITOR_MODEL |
sonnet |
Model name to match in status incidents (e.g. "sonnet", "haiku") |
PROBE_MODEL |
openclaw |
Model alias sent to the gateway for latency probes. openclaw uses the gateway's default model routing |
PROBE_AGENT_ID |
main |
Value of the x-openclaw-agent-id header sent with probes |
FILTER_KEYWORDS |
(none) | Comma-separated keywords to filter out of status alerts (e.g. "skills,Artifacts,Memory"). Empty = receive all alerts |
Scripts also accept these as environment variables (env file takes priority).
Security Note
The env file contains sensitive tokens (Telegram bot token, gateway token). The setup script sets permissions to 600 (owner-only read/write). If you create or edit the file manually, ensure restricted permissions:
chmod 600 ~/.openclaw/skills/claude-watchdog/claude-watchdog.env
Alert Examples
Status incident:
🟠 Anthropic Status: Partially Degraded Service
📌 Elevated error rates on Claude 3.5 Haiku (not our model)
Status: Investigating
Update: "We are investigating increased error rates..."
Components:
🟠 API: partial outage
🔗 https://status.claude.com
Latency spike:
🟡 Anthropic API — High Latency Detected
Current: 12.3s
Baseline: 3.1s (median of last 19 samples)
Ratio: 4.0×
Slow responses are expected right now.
Recovery:
✅ Anthropic API — Latency Back to Normal
Current: 2.8s
Baseline: 3.1s
Was: 12.3s when alert fired
State & Logs
All state and log files are stored in ~/.openclaw/skills/claude-watchdog/:
| File | Purpose |
|---|---|
claude-watchdog-status.json |
Status check state |
claude-watchdog-latency.json |
Latency probe state & samples |
claude-watchdog-status.log |
Status check log |
claude-watchdog-latency.log |
Latency probe log |
Tuning Thresholds
Edit constants at the top of latency-probe.py:
| Constant | Default | Meaning |
|---|---|---|
ALERT_MULTIPLIER |
2.5 | Alert if latency > N× baseline median |
ALERT_HARD_FLOOR |
10.0s | Always alert above this absolute threshold |
RECOVER_MULTIPLIER |
1.5 | Clear alert when below N× baseline |
BASELINE_WINDOW |
20 | Rolling sample window size |
BASELINE_MIN_SAMPLES |
5 | Minimum samples before alerting starts |
PROBE_TIMEOUT |
45s | Give up on probe after this long |
Requirements
- Python 3.10+ (stdlib only, no pip dependencies)
- OpenClaw gateway running locally
- Telegram bot with access to the target chat
- 确保已安装 OpenClaw(本地或 Docker 部署)
- 在对话框中输入安装命令:
/install claude-watchdog - 安装完成后,直接呼叫该 Skill 的名称或使用
/claude-watchdog触发 - 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
Skill 是什么?
Monitor the Claude API for outages and latency spikes with rich Telegram alerts. Status monitoring, latency probes, and automatic recovery notifications. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 644 次。
如何安装 Skill?
在 OpenClaw 或 Claude Code 对话框中运行命令「/install claude-watchdog」即可一键安装,无需额外配置。
Skill 是免费的吗?
是的,Skill 完全免费(开源免费),可自由下载、安装和使用。
Skill 支持哪些平台?
Skill 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。
谁开发了 Skill?
由 chapati(@chapati23)开发并维护,当前版本 v1.2.1。