Cron Failure Runbook
/install cron-failure-runbook
Use when a scheduled job, LaunchAgent, cron task, heartbeat step, or nightly automation fails, silently no-ops, produces incomplete output, or repeatedly generates dream-cycle failure proposals.
Goal
Turn unattended failures into reproducible evidence and one of three outcomes:
- Fixed and verified.
- Deferred with owner/date/reason.
- Escalated with the exact missing credential, approval, service, or runtime condition.
Procedure
1. Identify the scheduler context.
- Job name, plist/cron entry, command, cwd, shell, user, and expected environment.
- Last successful run and last failed/no-op run.
2. Reproduce in the same runtime lane.
- Run the exact command manually with the same env source where practical.
- Capture stdout, stderr, exit code, cwd, PATH, and relevant env variable presence without printing secret values.
- If the job depends on OpenClaw model calls, verify it uses gateway/Codex routing rather than raw OPENAI_API_KEY.
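The capture step above can be sketched in Python roughly as follows. This is a minimal sketch, not part of the skill itself: the `reproduce` function, its parameters, and the evidence keys are hypothetical names, and it assumes the job can be launched as an argument list through `subprocess.run`. It records only whether each named variable is set, never its value.

```python
import os
import subprocess


def reproduce(command, cwd=None, env=None, secret_names=()):
    """Run `command` once and return an evidence dict (hypothetical helper).

    Secrets are reported as present/absent only; their values are never stored.
    """
    run_env = dict(os.environ if env is None else env)
    result = subprocess.run(
        command,
        cwd=cwd,
        env=run_env,
        capture_output=True,  # collect stdout and stderr separately
        text=True,
    )
    return {
        "command": list(command),
        "cwd": cwd or os.getcwd(),
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
        "PATH": run_env.get("PATH", ""),
        # True only if the variable exists and is non-empty
        "env_present": {name: bool(run_env.get(name)) for name in secret_names},
    }
```

For a closer match to the scheduler, pass the environment the job actually receives as `env` (for example, values copied from the plist or crontab) instead of inheriting the interactive shell's environment.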
3. Run preflights before the expensive or external step.
- Auth: prove the running process can read the needed secret and make the smallest live API call.
- Files: prove input paths exist and output directories are writable.
- Network/service: prove target health endpoint or API is reachable.
- Approval: prove an external write has approval or a preapproved workflow flag.
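One way to wire these four checks into a single gate is sketched below. Everything here is an assumption for illustration: the `preflight` function, its parameter names, and the convention that an approval flag is an environment variable set to `1` are all hypothetical. The auth check only proves the variable is set; the "smallest live API call" is service-specific and is left out. The network check is a plain TCP connect, not a real health-endpoint request.

```python
import os
import socket


def preflight(required_env=(), input_paths=(), output_dirs=(),
              tcp_targets=(), approval_flag=None, env=None):
    """Return a list of failure strings; an empty list means all checks passed."""
    env = os.environ if env is None else env
    failures = []

    # Auth: presence only (a live API call would go here, per service).
    for name in required_env:
        if not env.get(name):
            failures.append(f"auth: {name} is not set")

    # Files: inputs must exist, output directories must be writable.
    for path in input_paths:
        if not os.path.exists(path):
            failures.append(f"files: missing input {path}")
    for directory in output_dirs:
        if not (os.path.isdir(directory) and os.access(directory, os.W_OK)):
            failures.append(f"files: output dir not writable {directory}")

    # Network/service: can we open a TCP connection at all?
    for host, port in tcp_targets:
        try:
            with socket.create_connection((host, port), timeout=3):
                pass
        except OSError as exc:
            failures.append(f"network: {host}:{port} unreachable ({exc})")

    # Approval: hypothetical convention, flag variable must equal "1".
    if approval_flag is not None and env.get(approval_flag) != "1":
        failures.append(f"approval: {approval_flag} is not set to 1")

    return failures
```

A job wrapper could call `preflight(...)` first and exit non-zero with the failure list before starting the expensive or external step.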
4. Classify the failure.
- auth: missing/expired token, wrong vault, wrong runtime env, insufficient scope.
- runtime: wrong shell, PATH, Python/Node version, cwd, launchd env, permissions.
- input: missing/stale source files, empty queue, unexpected schema.
- external: API outage, 401/403, rate limit, deploy provider issue.
- logic: script exits zero but produces no expected artifact/action.
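As a rough illustration of the taxonomy above, a first-pass classifier might look like the following. This is a heuristic sketch under stated assumptions: the `classify` function and the marker strings are hypothetical examples rather than an exhaustive or authoritative mapping, and a human should confirm the category against the captured evidence.

```python
# Ordered rules: the first category with a matching marker wins.
# Marker strings are illustrative assumptions, not a complete list.
RULES = [
    ("auth", ("token expired", "missing credential", "insufficient scope")),
    ("runtime", ("command not found", "permission denied", "modulenotfounderror")),
    ("input", ("no such file", "empty queue", "unexpected schema")),
    ("external", ("401", "403", "rate limit", "connection refused")),
]


def classify(exit_code, stderr="", artifact_exists=True):
    """Map captured evidence to one of the runbook's failure categories."""
    if exit_code == 0:
        # Exit zero without the expected artifact is the "logic" case.
        return "ok" if artifact_exists else "logic"
    text = stderr.lower()
    for category, markers in RULES:
        if any(marker in text for marker in markers):
            return category
    return "unclassified"
```

Keeping an explicit `unclassified` result avoids forcing ambiguous failures into the wrong bucket.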
5. Close the loop.
- Fix code/config if local and reversible.
- Add a dry-run or preflight mode if the job cannot be safely tested live.
- Update the relevant STATUS/runbook/memory with evidence.
- If unresolved, record blocker, owner, next command, and alert threshold.
Verification Evidence
Every cron fix needs at least one of:
- Manual reproduction command with exit code and expected output.
- Preflight-only or dry-run output proving dependencies are healthy.
- Scheduler log excerpt showing the next run succeeded.
- A deliberate deferred/blocked entry with owner, reason, and next check date.
Dream-Cycle Specific Checks
For dream-cycle failures:
- bash -n scripts/dream-cycle.sh
- python3 -m py_compile for every Python script touched by the cycle.
- scripts/task-quality-judge.py --since 7 --dry-run
- scripts/skill-evolver.py --since 7 --min-failures 2 --dry-run
- scripts/dream-recurring-issues.py --since 7 --min-count 3 --dry-run
- scripts/dream-cycle-action-summary.py --since-hours 26 --dry-run
Do not mark dream-cycle work complete if proposal files are merely pending. There must be a lifecycle status, a summary, and a next action.
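The `py_compile` check in the list above can be run over several scripts at once. The sketch below assumes a hypothetical helper name, `check_python_scripts`, and writes the compiled output to a scratch directory so the check leaves no `__pycache__` files next to the scripts.

```python
import os
import py_compile
import tempfile


def check_python_scripts(paths):
    """Byte-compile each script; return {path: error message} for failures."""
    failures = {}
    with tempfile.TemporaryDirectory() as scratch:
        for index, path in enumerate(paths):
            target = os.path.join(scratch, f"{index}.pyc")
            try:
                # doraise=True turns syntax errors into PyCompileError
                py_compile.compile(path, cfile=target, doraise=True)
            except (py_compile.PyCompileError, OSError) as exc:
                failures[path] = str(exc)
    return failures
```

For example, passing the four script paths listed above (`scripts/task-quality-judge.py` and the others) returns an empty dict when all of them compile. A missing file is reported as a failure as well.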
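Installation
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install cron-failure-runbook
- Once installation completes, invoke the Skill by name or trigger it with /cron-failure-runbook.
- Provide the inputs described in the Skill's parameter notes to get structured output.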
What is Cron Failure Runbook?
Runbook for diagnosing failed cron jobs, LaunchAgents, heartbeats, and unattended automation by reproducing the scheduler context, preflighting dependencies,... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 47 downloads to date.
How do I install Cron Failure Runbook?
Run the command "/install cron-failure-runbook" in the OpenClaw or Claude Code chat box. It installs in one step with no additional configuration.
Is Cron Failure Runbook free?
Yes. Cron Failure Runbook is completely free under the MIT-0 license and can be freely downloaded, installed, and used.
Which platforms does Cron Failure Runbook support?
Cron Failure Runbook is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who develops Cron Failure Runbook?
It is developed and maintained by Nissan Dookeran (@nissan). The current version is v1.0.0.