
Services Watchdog

Author: dyagil · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 100
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install dyagil-services-watchdog
Description
Set up a systemd-based watchdog that keeps long-running Node.js services (Telegram bots, Express dashboards, etc.) alive across shell exits, ssh disconnects,...
Usage instructions (SKILL.md)

Services Watchdog

Problem

Long-running Node services launched from a parent shell (or as children of an agent runtime) die when the parent exits. Runtime restarts are especially aggressive — they tend to take down everything they spawned as collateral damage. Manual nohup/setsid rituals survive an ssh disconnect but not a reboot.

Architecture

my-watchdog.timer          (systemd --user; OnUnitActiveSec=2min)
    ↓
my-watchdog.service        (Type=oneshot; KillMode=process)
    ↓
services-watchdog.sh
    ↓
for each service: check → if down → systemd-run --user --scope → exec node

Two non-obvious details make this actually work:

  1. KillMode=process + systemd-run --user --scope — without this, systemd kills the children of a Type=oneshot service as soon as the service exits. The combination puts each restarted service in its own transient scope, outside the watchdog's cgroup.
  2. .env is loaded INSIDE the new scope. The watchdog wraps the start command in bash -c 'cd <project> && set -a && . ./.env; set +a && exec node <entry>'. This propagates every env var without the watchdog having to know which ones the service needs (TELEGRAM_BOT_TOKEN, OPENAI_API_KEY, …).
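
The .env loading in the second detail can be tried in isolation with a throwaway directory. This is an illustrative sketch only: DEMO_TOKEN, the temporary directory, and printenv standing in for exec node <entry> are assumptions, not part of the skill.

```shell
# Throwaway project directory with a one-line .env (hypothetical variable).
demo_dir=$(mktemp -d)
printf 'DEMO_TOKEN=abc123\n' > "$demo_dir/.env"

# Same shape as the watchdog's start command: `set -a` marks every variable
# assigned while sourcing .env for export, so the exec'd process inherits it.
# `env -u` makes sure the variable is not already set in the caller.
loaded=$(env -u DEMO_TOKEN bash -c \
  'cd "$1" && set -a && . ./.env; set +a && exec printenv DEMO_TOKEN' _ "$demo_dir")

echo "$loaded"
rm -rf "$demo_dir"
```

Because of the `;`, the `set +a && exec …` part runs even when the `cd` or the sourcing step fails, so a missing project directory would start the process from the wrong place. Checking the directory before launching avoids that.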

Files

Rename the unit files to match your own prefix (e.g. mybot-watchdog.*) when adopting.
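
The unit files themselves are not reproduced on this page. The following is a minimal sketch of what the pair could look like, built only from the settings named in the Architecture section (Type=oneshot, KillMode=process, OnUnitActiveSec=2min). The my-watchdog prefix, the UNIT_DIR staging directory, the ExecStart path, and the OnStartupSec=1min first-run delay are assumptions to adjust, not the author's files.

```shell
# Staging directory (assumption): review the files here, then copy them to
# ~/.config/systemd/user/ as described in the Install section.
UNIT_DIR="${UNIT_DIR:-./watchdog-units}"
mkdir -p "$UNIT_DIR"

cat > "$UNIT_DIR/my-watchdog.service" <<'EOF'
[Unit]
Description=Services watchdog (single pass)

[Service]
Type=oneshot
# Only the main process is killed when the unit stops.
KillMode=process
# %h expands to the user's home directory; adjust to your WORKSPACE.
ExecStart=%h/.openclaw/workspace/scripts/services-watchdog.sh
EOF

cat > "$UNIT_DIR/my-watchdog.timer" <<'EOF'
[Unit]
Description=Run the services watchdog every 2 minutes

[Timer]
# Assumption: first run one minute after the user manager starts.
OnStartupSec=1min
# From the Architecture section: re-run 2 minutes after the last activation.
OnUnitActiveSec=2min
Unit=my-watchdog.service

[Install]
WantedBy=timers.target
EOF

ls "$UNIT_DIR"
```

Writing to a staging directory first keeps the units out of systemd's search path until they have been read and edited.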

Install

WORKSPACE="$HOME/.openclaw/workspace"     # or wherever your projects live
mkdir -p "$WORKSPACE/scripts" "$WORKSPACE/logs" ~/.config/systemd/user

cp scripts/services-watchdog.sh   "$WORKSPACE/scripts/"
cp scripts/sahi-watchdog.service  ~/.config/systemd/user/
cp scripts/sahi-watchdog.timer    ~/.config/systemd/user/
chmod +x "$WORKSPACE/scripts/services-watchdog.sh"

systemctl --user daemon-reload
systemctl --user enable --now sahi-watchdog.timer
loginctl enable-linger "$USER"   # keeps the timer running when not logged in

Verify

# State after most recent run:
cat ~/.openclaw/workspace/memory/watchdog-state.json
# Recent recoveries / failures:
tail ~/.openclaw/workspace/logs/watchdog.log
# Schedule:
systemctl --user list-timers sahi-watchdog.timer --no-pager

End-to-end test (replace 4321 with the port your service listens on):

PID=$(ss -tlnp 2>/dev/null | awk '/:4321 /{print $NF}' | grep -oP 'pid=\K[0-9]+' | head -1)
kill "$PID"
systemctl --user start sahi-watchdog.service   # don't wait 2 min
ss -tln | grep 4321                            # should be listening again

(Do NOT use pkill -f "myservice/server.js" to kill the test target — your own exec shell often matches the same regex and gets SIGTERM'd.)
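
If nothing is listening on the port, the PID lookup in the end-to-end test yields an empty string and kill is called with no target. A guarded variant might look like the sketch below. The helper names listener_pid and kill_listener are hypothetical, and the parsing assumes the usual ss -tlnp column layout (local address in the fourth column, process info in the last).

```shell
# Read `ss -tlnp` output on stdin and print the PID of the listener on port $1.
# Prints nothing if no line matches.
listener_pid() {
  awk -v port="$1" '$4 ~ (":" port "$") { print $NF }' \
    | sed -n 's/.*pid=\([0-9][0-9]*\).*/\1/p' \
    | head -n 1
}

# Kill the listener only when a PID was actually found.
kill_listener() {
  pid=$(ss -tlnp 2>/dev/null | listener_pid "$1")
  if [ -z "$pid" ]; then
    echo "no listener found on port $1" >&2
    return 1
  fi
  kill "$pid"
}

# Parsing check against a sample line in the shape ss prints (made-up PID).
sample='LISTEN 0 511 0.0.0.0:4321 0.0.0.0:* users:(("node",pid=12345,fd=20))'
parsed=$(printf '%s\n' "$sample" | listener_pid 4321)
echo "$parsed"
```

With these helpers, the kill step of the test becomes kill_listener 4321, which fails with a message instead of calling kill with an empty argument.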

Adapt to a New Service

In services-watchdog.sh, add three things and append the service name to the services=() array:

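check_myservice() {
  # Match on a marker unique to this service's command line (see Gotchas).
  pgrep -f "<unique-marker-in-cmdline>" >/dev/null 2>&1
}

restart_myservice() {
  cd "$WORKSPACE/projects/myservice" || return 1
  # Launch in a transient scope so the process outlives the oneshot watchdog unit.
  systemd-run --user --scope --quiet --unit="myservice-$(date +%s%N)" \
    --setenv=PATH="$PATH" --setenv=HOME="$HOME" \
    bash -c 'cd '"$WORKSPACE"'/projects/myservice && set -a && [ -f .env ] && . ./.env; set +a && exec nohup node src/index.js >> logs/svc.log 2>&1 < /dev/null' &
  disown 2>/dev/null || true
  sleep 3          # give the service a moment to start before re-checking
  check_myservice  # return status reports whether the restart succeeded
}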

labels_myservice="My Service"

Gotchas (Learned the Hard Way)

  • Don't use Type=simple for the systemd service — that keeps the watchdog itself alive long after it should have exited, and it re-enters every 2 minutes.
  • PATH inside systemd-run --user --scope is minimal. Always pass --setenv=PATH="$PATH" if a child relies on ~/.npm-global/bin or similar; or call binaries by absolute path.
  • pgrep -f matches the watchdog shell itself. Use a unique marker (file path) when defining check_*, e.g. pgrep -f "myservice/src/index", not just pgrep -f "node src/index.js" which can collide with other projects.
  • Type=oneshot with default KillMode=control-group kills the children you just spawned. Always set KillMode=process AND launch via systemd-run --user --scope so the new process lives outside the watchdog's cgroup.
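
For the PATH gotcha, one option is to resolve a binary to an absolute path while the fuller PATH is still available and to fail loudly if it is missing. This is a sketch only; resolve_bin is a hypothetical helper, not something the skill ships.

```shell
# Print the absolute path of command $1, or fail with a message on stderr.
resolve_bin() {
  bin_path=$(command -v "$1" 2>/dev/null) || {
    echo "watchdog: '$1' not found on PATH" >&2
    return 1
  }
  # command -v prints a bare name for builtins and functions; require a path.
  case "$bin_path" in
    /*) printf '%s\n' "$bin_path" ;;
    *)  echo "watchdog: '$1' did not resolve to a file path" >&2; return 1 ;;
  esac
}

# Example with a binary that exists on typical Linux systems.
env_bin=$(resolve_bin env) && echo "$env_bin"
```

In a restart_* function this could be used as NODE_BIN=$(resolve_bin node) || return 1, with "$NODE_BIN" then passed to the start command in place of a bare node.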

See Also

  • A taskflow or cron skill for one-shot scheduled tasks. The watchdog is for "always-on" services, not periodic jobs.
Security advice
Do not enable this skill unmodified. First replace the hard-coded services and paths, remove or explicitly configure Telegram notifications, inspect or create the missing systemd service/timer files, and make sure you understand that loginctl linger keeps the watchdog active after logout.
Capability tags
requires-sensitive-credentials
Capability assessment
Purpose & Capability
A systemd watchdog for long-running Node.js services is coherent, but the included script is not a neutral template: it targets specific sahi-diet, sahi-mind, and mission-control projects and a named David Telegram chat.
Instruction Scope
The docs tell users to adapt the script, but the install path would enable the default script unless edited first, and the hard-coded Telegram notification is not clearly called out in the main setup instructions.
Install Mechanism
This is instruction-only and uses systemd user units, but the referenced service/timer files are not included in the manifest; users would need to inspect or create those units before enabling persistence.
Credentials
Loading service .env files is expected for this purpose, but the script also reads TELEGRAM_BOT_TOKEN from a specific project and uses it for a hard-coded external notification destination.
Persistence & Privilege
A 2-minute user systemd timer plus loginctl linger is central to the watchdog purpose and is disclosed, but it means the script can keep operating after logout and reboot until explicitly disabled.
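
A quick way to act on these findings before enabling the timer is to search the script for the hard-coded names mentioned above. The sketch below assumes the pattern list is enough for a first pass. audit_watchdog_script is a hypothetical helper, and api.telegram.org (the Telegram Bot API host) is included on the assumption that the notification is sent there.

```shell
# Print every line of the given script that mentions one of the hard-coded
# names flagged in the assessment. Exit status 0 means something was found.
audit_watchdog_script() {
  grep -nE 'sahi-diet|sahi-mind|mission-control|TELEGRAM_BOT_TOKEN|api\.telegram\.org' "$1"
}

# Demonstration on a made-up two-line script.
sample_script=$(mktemp)
printf '%s\n' 'cd "$WORKSPACE/projects/sahi-diet"' 'echo ok' > "$sample_script"
findings=$(audit_watchdog_script "$sample_script")
echo "$findings"
rm -f "$sample_script"
```

Run against the real file, for example audit_watchdog_script "$WORKSPACE/scripts/services-watchdog.sh", any output marks a line to edit or remove before running systemctl --user enable --now.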
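How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install dyagil-services-watchdog
  3. Once installed, invoke the skill by name or trigger it with /dyagil-services-watchdog
  4. Provide the required inputs according to the skill's parameter description to get structured output
Version history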
v1.0.0
Initial release
Metadata
Slug: dyagil-services-watchdog
Version: 1.0.0
License: MIT-0
Cumulative installs: 0
Current installs: 0
Versions released: 1
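Frequently asked questions

What is Services Watchdog?

Set up a systemd-based watchdog that keeps long-running Node.js services (Telegram bots, Express dashboards, etc.) alive across shell exits, ssh disconnects,... It is an AI agent skill plugin for Claude Code / OpenClaw, with 100 total downloads to date.

How do I install Services Watchdog?

Run the command "/install dyagil-services-watchdog" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Services Watchdog free?

Yes. Services Watchdog is completely free and is released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Services Watchdog support?

Services Watchdog runs cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed (cross-platform).

Who develops Services Watchdog?

It is developed and maintained by dyagil (@dyagil); the current version is v1.0.0.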
