
Services Watchdog

by dyagil · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 100 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install dyagil-services-watchdog
Description
Set up a systemd-based watchdog that keeps long-running Node.js services (Telegram bots, Express dashboards, etc.) alive across shell exits, ssh disconnects,...
README (SKILL.md)

Services Watchdog

Problem

Long-running Node services launched from a parent shell (or as children of an agent runtime) die when the parent exits. Runtime restarts are especially aggressive — they tend to take down everything they spawned as collateral damage. Manual nohup/setsid rituals survive an ssh disconnect but not a reboot.

Architecture

my-watchdog.timer          (systemd --user; OnUnitActiveSec=2min)
    ↓
my-watchdog.service        (Type=oneshot; KillMode=process)
    ↓
services-watchdog.sh
    ↓
for each service: check → if down → systemd-run --user --scope → exec node

Two non-obvious details make this actually work:

  1. KillMode=process + systemd-run --user --scope — without this, systemd kills the children of a Type=oneshot service as soon as the service exits. The combination puts each restarted service in its own transient scope, outside the watchdog's cgroup.
  2. .env is loaded INSIDE the new scope. The watchdog wraps the start command in bash -c 'cd <project> && set -a && . ./.env; set +a && exec node <entry>'. This propagates every env var without the watchdog having to know which ones the service needs (TELEGRAM_BOT_TOKEN, OPENAI_API_KEY, …).
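The effect of set -a in the wrapper can be checked in isolation. The sketch below is illustrative only: it uses a throwaway directory and a made-up variable, DEMO_TOKEN, neither of which belongs to the skill. It shows that a sourced .env reaches the exec'd child process only when set -a is active.

```shell
#!/usr/bin/env bash
# Throwaway project directory with a hypothetical .env (DEMO_TOKEN is illustrative).
demo_dir="$(mktemp -d)"
printf 'DEMO_TOKEN=abc123\n' > "$demo_dir/.env"

# Without set -a: sourcing creates a shell variable that is not exported,
# so the exec'd child (printenv) does not see it and prints nothing.
without_export="$(cd "$demo_dir" && bash -c 'unset DEMO_TOKEN; . ./.env; exec printenv DEMO_TOKEN')"

# With set -a: every assignment made while sourcing is exported automatically,
# so the exec'd child inherits it.
with_export="$(cd "$demo_dir" && bash -c 'unset DEMO_TOKEN; set -a; . ./.env; set +a; exec printenv DEMO_TOKEN')"

echo "without set -a: '${without_export}'"
echo "with set -a:    '${with_export}'"

rm -rf "$demo_dir"
```

Because the export happens inside the child shell, the watchdog process itself never has to hold the service's secrets in its own environment.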

Files

Rename the unit files to match your own prefix (e.g. mybot-watchdog.*) when adopting.
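The unit files themselves are not reproduced on this page. As a rough sketch of what a pair consistent with the Architecture section could look like, the script below writes a oneshot service and a 2-minute timer. PREFIX and UNIT_DIR are hypothetical knobs introduced for this example, and OnStartupSec=1min is an assumption added so the timer has a first trigger; compare the result against the files shipped with the skill before relying on it.

```shell
#!/usr/bin/env bash
# PREFIX and UNIT_DIR are hypothetical knobs for this sketch, not part of the skill.
WORKSPACE="${WORKSPACE:-$HOME/.openclaw/workspace}"
UNIT_DIR="${UNIT_DIR:-$HOME/.config/systemd/user}"
PREFIX="${PREFIX:-my-watchdog}"

mkdir -p "$UNIT_DIR"

# Service: one watchdog pass per activation. KillMode=process follows the
# Architecture section, so processes started by the pass are not killed with it.
cat > "$UNIT_DIR/$PREFIX.service" <<EOF
[Unit]
Description=Services watchdog (single pass)

[Service]
Type=oneshot
KillMode=process
ExecStart=$WORKSPACE/scripts/services-watchdog.sh
EOF

# Timer: re-run the service 2 minutes after it was last activated.
cat > "$UNIT_DIR/$PREFIX.timer" <<EOF
[Unit]
Description=Run the services watchdog every 2 minutes

[Timer]
# Assumption: first run one minute after the user manager starts.
OnStartupSec=1min
OnUnitActiveSec=2min
Unit=$PREFIX.service

[Install]
WantedBy=timers.target
EOF

echo "wrote $UNIT_DIR/$PREFIX.service and $UNIT_DIR/$PREFIX.timer"
```

The heredocs expand $WORKSPACE at write time, so the generated ExecStart line holds an absolute path. A workspace path containing spaces would need extra quoting in the unit file.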

Install

WORKSPACE="$HOME/.openclaw/workspace"     # or wherever your projects live
mkdir -p "$WORKSPACE/scripts" "$WORKSPACE/logs" ~/.config/systemd/user

cp scripts/services-watchdog.sh   "$WORKSPACE/scripts/"
cp scripts/sahi-watchdog.service  ~/.config/systemd/user/
cp scripts/sahi-watchdog.timer    ~/.config/systemd/user/
chmod +x "$WORKSPACE/scripts/services-watchdog.sh"

systemctl --user daemon-reload
systemctl --user enable --now sahi-watchdog.timer
loginctl enable-linger "$USER"   # keeps the timer running when not logged in

Verify

# State after most recent run:
cat ~/.openclaw/workspace/memory/watchdog-state.json
# Recent recoveries / failures:
tail ~/.openclaw/workspace/logs/watchdog.log
# Schedule:
systemctl --user list-timers sahi-watchdog.timer --no-pager

End-to-end test (replace 4321 with the port your service listens on):

PID=$(ss -tlnp 2>/dev/null | awk '/:4321 /{print $NF}' | grep -oP 'pid=\K[0-9]+' | head -1)
[ -n "$PID" ] && kill "$PID"                   # skip the kill if nothing is listening
systemctl --user start sahi-watchdog.service   # don't wait 2 min
ss -tln | grep 4321                            # should be listening again

(Do NOT use pkill -f "myservice/server.js" to kill the test target — your own exec shell often matches the same regex and gets SIGTERM'd.)

Adapt to a New Service

In services-watchdog.sh, add three things and append the service name to the services=() array:

check_myservice() {
  pgrep -f "<unique-marker-in-cmdline>" >/dev/null 2>&1
}

restart_myservice() {
  cd "$WORKSPACE/projects/myservice" || return 1
  systemd-run --user --scope --quiet --unit="myservice-$(date +%s%N)" \
    --setenv=PATH="$PATH" --setenv=HOME="$HOME" \
    bash -c 'cd '"$WORKSPACE"'/projects/myservice && set -a && [ -f .env ] && . ./.env; set +a && exec nohup node src/index.js >> logs/svc.log 2>&1 < /dev/null' &
  disown 2>/dev/null || true
  sleep 3
  check_myservice
}

labels_myservice="My Service"
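The loop that consumes these three pieces is not shown on this page. One plausible shape, assuming the script dispatches on the names in services=() by convention, is sketched below. The demo_up and demo_down services, the watchdog_pass function, and the message wording are all stand-ins for this example rather than the script's actual contents.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in services: one healthy, one down but restartable.
services=(demo_up demo_down)

check_demo_up()     { return 0; }
restart_demo_up()   { return 0; }
labels_demo_up="Demo (up)"

check_demo_down()   { return 1; }
restart_demo_down() { return 0; }
labels_demo_down="Demo (down)"

# One watchdog pass: for each service, call check_<name>; if it reports
# down, call restart_<name> and record whether the restart succeeded.
watchdog_pass() {
  local svc label_var label
  for svc in "${services[@]}"; do
    label_var="labels_${svc}"
    label="${!label_var:-$svc}"   # indirect lookup; falls back to the raw name
    if "check_${svc}"; then
      echo "ok: ${label}"
    elif "restart_${svc}"; then
      echo "recovered: ${label}"
    else
      echo "FAILED: ${label}"
    fi
  done
}

watchdog_pass
```

With a convention like this, adding a service means defining the two functions and the label, then appending one name to the array. The loop itself stays untouched.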

Gotchas (Learned the Hard Way)

  • Don't use Type=simple for the systemd service — that keeps the watchdog itself alive long after it should have exited, and it re-enters every 2 minutes.
  • PATH inside systemd-run --user --scope is minimal. Always pass --setenv=PATH="$PATH" if a child relies on ~/.npm-global/bin or similar; or call binaries by absolute path.
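  • pgrep -f matches against full command lines, so it can match the watchdog shell itself (or any wrapper shell whose command line contains the pattern). Use a unique marker (file path) when defining check_*, e.g. pgrep -f "myservice/src/index", not just pgrep -f "node src/index.js", which can collide with other projects.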
  • Type=oneshot with default KillMode=control-group kills the children you just spawned. Always set KillMode=process AND launch via systemd-run --user --scope so the new process lives outside the watchdog's cgroup.
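For the pgrep -f gotcha above, a common shell idiom (not something the skill's script is known to use) is to bracket the first character of the marker. The regex still matches the real command line, but no longer matches a command line that merely contains the pattern text. A minimal sketch, assuming the marker starts with a plain letter or digit; bracket_marker is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Wrap the first character of a marker in brackets:
#   "myservice/src/index" -> "[m]yservice/src/index"
# Assumes the marker starts with a plain letter or digit.
bracket_marker() {
  local marker="$1"
  printf '[%s]%s' "${marker:0:1}" "${marker:1}"
}

# Hypothetical check function built on the bracketed pattern.
check_myservice() {
  pgrep -f "$(bracket_marker 'myservice/src/index')" >/dev/null 2>&1
}

pattern="$(bracket_marker 'myservice/src/index')"

# The pattern matches a real service command line...
echo 'node myservice/src/index.js' | grep -Eq "$pattern" && echo "service cmdline: match"

# ...but not a command line that only carries the pattern text.
echo "pgrep -f $pattern" | grep -Eq "$pattern" || echo "pattern text itself: no match"
```

The grep calls only demonstrate the regex behaviour; pgrep -f applies the same extended-regex matching to each process's command line.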

See Also

  • A taskflow or cron skill for one-shot scheduled tasks. The watchdog is for "always-on" services, not periodic jobs.
Usage Guidance
Do not enable this skill unmodified. First replace the hard-coded services and paths, remove or explicitly configure Telegram notifications, inspect or create the missing systemd service/timer files, and make sure you understand that loginctl linger keeps the watchdog active after logout.
Capability Tags
requires-sensitive-credentials
Capability Assessment
Purpose & Capability
A systemd watchdog for long-running Node.js services is coherent, but the included script is not a neutral template: it targets specific sahi-diet, sahi-mind, and mission-control projects and a named David Telegram chat.
Instruction Scope
The docs tell users to adapt the script, but the install path would enable the default script unless edited first, and the hard-coded Telegram notification is not clearly called out in the main setup instructions.
Install Mechanism
This is instruction-only and uses systemd user units, but the referenced service/timer files are not included in the manifest; users would need to inspect or create those units before enabling persistence.
Credentials
Loading service .env files is expected for this purpose, but the script also reads TELEGRAM_BOT_TOKEN from a specific project and uses it for a hard-coded external notification destination.
Persistence & Privilege
A 2-minute user systemd timer plus loginctl linger is central to the watchdog purpose and is disclosed, but it means the script can keep operating after logout and reboot until explicitly disabled.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install dyagil-services-watchdog
  3. After installation, invoke the skill by name or use /dyagil-services-watchdog
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release
Metadata
Slug: dyagil-services-watchdog
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1
Frequently Asked Questions

What is Services Watchdog?

Set up a systemd-based watchdog that keeps long-running Node.js services (Telegram bots, Express dashboards, etc.) alive across shell exits, ssh disconnects,... It is an AI Agent Skill for Claude Code / OpenClaw, with 100 downloads so far.

How do I install Services Watchdog?

Run "/install dyagil-services-watchdog" in the OpenClaw or Claude Code chat to install it. Read the Usage Guidance before enabling the timer, because the bundled script has to be edited first.

Is Services Watchdog free?

Yes, Services Watchdog is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Services Watchdog support?

Services Watchdog is listed as cross-platform and can be installed wherever OpenClaw / Claude Code is available. The watchdog itself is built on systemd user units, so the host needs systemd.

Who created Services Watchdog?

It is built and maintained by dyagil (@dyagil); the current version is v1.0.0.
