Description

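A project workflow framework for long-running agents (based on Anthropic's "Effective Harnesses for Long-Running Agents"). It creates, manages, and schedules long-term project tasks that span multiple context windows. Use when: starting a new project, initializing a project workflow, managing a project task list, dispatching sub-agents for incremental development, resuming proj...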

Usage (SKILL.md)

Long-Running Agent Workflow Framework

Name: long-running-harness
Author: aowind

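Based on the Anthropic engineering team's Effective Harnesses for Long-Running Agents methodology, adapted for the OpenClaw environment.

Core Principles

Persistence over memory: record state in the file system; do not rely on the agent's context memory
Structure over free text: keep key state in JSON and the progress log in Markdown
Verification over claims: verify every feature once it is completed; an untested "done" is not accepted
Increments over big steps: each agent session works on a single feature, so every change stays easy to roll back
Standardization over improvisation: fixed startup and shutdown routines reduce confusion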

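Project Structure

Every managed project follows this standard structure:

projects/<project-name>/
├── PROJECT.md              # Project overview, goals, tech stack
├── progress.md             # Agent work log (appended every session)
├── features.json           # Feature list (status tracking; only the passes and notes fields change)
├── init.sh                 # Environment initialization script (optional)
├── src/                    # Project source code
└── tests/                  # Test code (if any)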

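Lifecycle

Phase 1: Initialization (Init)

Run this phase when the user asks to start a new project or initialize the workflow:

Create the project directory projects/<project-name>/
Write PROJECT.md, which contains:
- Project name and goals
- Tech stack and dependencies
- Acceptance criteria
- Key constraints

Write features.json, the feature list, in the following format: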

{
  "project": "Project name",
  "created": "2026-03-18",
  "features": [
    {
      "id": "feat-001",
      "name": "Feature name",
      "description": "Detailed feature description",
      "category": "functional|infra|docs|perf|fix",
      "priority": "high|medium|low",
      "passes": false,
      "tests": [
        "Test step 1",
        "Test step 2"
      ],
      "notes": ""
    }
  ]
}

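Create progress.md from this template:

# Project work log

## Initialization
- Date: 2026-03-18
- Initialized by: main agent
- Total features: N

Initialize git: git init plus a first commit
Write init.sh, if applicable

Important: make the feature list as exhaustive as possible and break large features into small ones. 200 small features are better than 10 large ones.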
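
Because every later session trusts features.json, it can help to check the file right after writing it. The following is a minimal sketch, not part of the skill itself: the function name `validate_features` and the specific checks are illustrative assumptions derived from the format shown above. The category field is deliberately left unchecked, since the allowed values vary by project type.

```python
# Hypothetical helper: sanity-check a parsed features.json document.
REQUIRED_FIELDS = {"id", "name", "description", "category",
                   "priority", "passes", "tests", "notes"}
PRIORITIES = {"high", "medium", "low"}


def validate_features(doc):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    seen_ids = set()
    for index, feature in enumerate(doc.get("features", [])):
        label = feature.get("id", f"features[{index}]")
        missing = REQUIRED_FIELDS - set(feature)
        if missing:
            problems.append(f"{label}: missing fields {sorted(missing)}")
        if feature.get("priority") not in PRIORITIES:
            problems.append(f"{label}: priority must be one of {sorted(PRIORITIES)}")
        # passes must be a real JSON boolean, not a string such as "no"
        if not isinstance(feature.get("passes"), bool):
            problems.append(f"{label}: passes must be true or false")
        if label in seen_ids:
            problems.append(f"{label}: duplicate id")
        seen_ids.add(label)
    return problems
```

A caller would typically load the file with `json.load` and refuse to make the first commit while the returned list is non-empty.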

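Phase 2: Incremental Development (Each Session)

Run this phase when the user says "continue development", "next feature", or "continue the project", or when a sub-agent is dispatched for development:

Startup routine (mandatory in every session):

pwd: confirm the working directory
cat projects/<name>/progress.md: read the work log
git log --oneline -10: review recent commits
cat projects/<name>/features.json: read the feature list
Pick the highest-priority feature with passes: false
Run init.sh (if present) plus the basic verification tests
Once the environment is confirmed healthy, start implementing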
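
The selection step of the startup routine can be sketched as follows. This is an illustrative helper rather than code shipped with the skill; `next_feature` is a hypothetical name, and breaking ties by list order is an assumption, since the routine does not say how to choose among features of equal priority.

```python
# Hypothetical helper: pick the feature to work on in this session.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def next_feature(features):
    """Return the highest-priority feature with passes == False, or None."""
    pending = [f for f in features if not f["passes"]]
    if not pending:
        return None  # every feature already passes
    # min() keeps the first of equally ranked items, so list order breaks ties;
    # unknown priority values sort after "low"
    return min(pending,
               key=lambda f: PRIORITY_ORDER.get(f["priority"], len(PRIORITY_ORDER)))
```

Returning `None` when nothing is pending gives the caller an explicit signal that the project may be complete, rather than silently picking an already passing feature.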

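Work constraints:

Work on one feature at a time
Verify after implementing (run tests, check manually, and so on)
Change a feature's passes field in features.json to true only after verification succeeds
Never delete or modify feature entries (only the passes and notes fields may change)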
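
The last constraint can be checked mechanically by comparing features.json before and after a session. The sketch below assumes both versions are already parsed into dictionaries (for example, one from `git show HEAD:...` and one from disk); `illegal_changes` is a hypothetical name. Newly added entries are not flagged, because the constraint only mentions deletion and modification.

```python
# Hypothetical helper: detect edits the work constraints forbid.
MUTABLE_FIELDS = {"passes", "notes"}


def illegal_changes(before, after):
    """Compare two parsed features.json documents and list forbidden edits."""
    old = {f["id"]: f for f in before["features"]}
    new = {f["id"]: f for f in after["features"]}
    problems = [f"{fid}: entry was deleted" for fid in old.keys() - new.keys()]
    for fid in old.keys() & new.keys():
        # every field except passes and notes must be unchanged
        for field in (set(old[fid]) | set(new[fid])) - MUTABLE_FIELDS:
            if old[fid].get(field) != new[fid].get(field):
                problems.append(f"{fid}: field '{field}' was modified")
    return sorted(problems)
```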

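Shutdown routine (mandatory in every session):

Update the completion status in features.json

Append a session record to progress.md:

## Session N - Date
- **Target feature:** feat-XXX - feature name
- **Status:** ✅ Done / ⏳ Partially done / ❌ Failed
- **Completed:** what was actually done
- **Problems encountered:** problem description and solution
- **Next steps:** to-do items
- **Git commits:** hash - message

git add . && git commit -m "feat: <description of the completed feature>"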
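
Appending the session record can be scripted so that every entry has the same shape. This is a minimal sketch under stated assumptions: the function names, the `status` keys ("done", "partial", "failed"), and the English field labels are illustrative and should be adapted to the labels actually used in the project's progress.md.

```python
from datetime import date

# Hypothetical mapping from a short status key to the marker used in the log.
STATUS_LABELS = {"done": "✅ Done", "partial": "⏳ Partially done", "failed": "❌ Failed"}


def format_session_entry(number, feature_id, feature_name, status,
                         done, problems, next_steps, commits, day=None):
    """Render one session record in the progress.md entry layout."""
    day = day or date.today().isoformat()
    return "\n".join([
        f"## Session {number} - {day}",
        f"- **Target feature:** {feature_id} - {feature_name}",
        f"- **Status:** {STATUS_LABELS[status]}",
        f"- **Completed:** {done}",
        f"- **Problems encountered:** {problems}",
        f"- **Next steps:** {next_steps}",
        f"- **Git commits:** {commits}",
    ]) + "\n"


def append_session(path, entry):
    """Open in append mode so earlier log entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write("\n" + entry)
```

Opening the file in append mode matches the rule that the log is only ever extended, never edited.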

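Phase 3: Progress Report

Run this phase when the user asks about "project progress" or "project status":

Read features.json
Compute the completion rate (passes: true / total)
List unfinished features by priority
Read the most recent entries in progress.md
Generate a progress summary

Output format:

📋 Project progress: project name
━━━━━━━━━━━━━━━━━━
✅ Done: X / Y (Z%)
🔴 To do (high priority): ...
🟡 To do (medium priority): ...
🟢 To do (low priority): ...
━━━━━━━━━━━━━━━━━━
Latest session: [brief summary]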
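
The counting and grouping behind this report can be sketched as below. `progress_summary` is a hypothetical helper; it covers only the part derived from features.json and leaves out the decorations and the latest-session line, which comes from progress.md.

```python
# Hypothetical helper: build the text summary from a parsed features.json.
def progress_summary(doc):
    features = doc["features"]
    total = len(features)
    done = sum(1 for f in features if f["passes"])
    percent = round(100 * done / total) if total else 0  # avoid dividing by zero
    lines = [
        f"Project progress: {doc['project']}",
        f"Done: {done} / {total} ({percent}%)",
    ]
    for priority in ("high", "medium", "low"):
        names = [f["name"] for f in features
                 if not f["passes"] and f["priority"] == priority]
        lines.append(f"To do ({priority} priority): {', '.join(names) or 'none'}")
    return "\n".join(lines)
```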

Dispatching Sub-Agents (sessions_spawn)

When a single feature is delegated to a sub-agent, the task description must be self-contained:

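{
  "task": "## Task: implement feature feat-XXX\n\n### Project info\n- Path: projects/project-name/\n- Tech stack: ...\n\n### Your goal\nImplement and verify the following feature:\n[feature description]\n\n### Startup routine\n1. Read projects/project-name/features.json and find feat-XXX\n2. Read projects/project-name/progress.md to learn the history\n3. Run git log --oneline -5\n4. Run projects/project-name/init.sh (if present)\n5. Run the basic tests to confirm the environment is healthy\n\n### Work requirements\n- Work on this one feature only\n- Verify after completing it\n- Update the passes field in features.json at the end\n- Append a log entry to progress.md at the end\n- git commit at the end",
  "sessionKey": "alpha",
  "runTimeoutSeconds": 600
}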

Key point: the task must contain all of the context. The sub-agent cannot see the main conversation history.
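
Writing the task string by hand is error-prone because JSON strings cannot contain raw line breaks. One option is to assemble the text in code and let a JSON encoder escape it. The sketch below is illustrative: `build_spawn_payload` is a hypothetical helper, and only the three field names (task, sessionKey, runTimeoutSeconds) come from the template above; nothing else about how sessions_spawn is invoked is assumed.

```python
import json


# Hypothetical helper: assemble a self-contained task for one feature.
def build_spawn_payload(feature, project_path, tech_stack,
                        session_key="alpha", timeout_seconds=600):
    task = "\n".join([
        f"## Task: implement feature {feature['id']}",
        "",
        "### Project info",
        f"- Path: {project_path}/",
        f"- Tech stack: {tech_stack}",
        "",
        "### Your goal",
        "Implement and verify the following feature:",
        feature["description"],
        "",
        "### Startup routine",
        f"1. Read {project_path}/features.json and find {feature['id']}",
        f"2. Read {project_path}/progress.md to learn the history",
        "3. Run git log --oneline -5",
        f"4. Run {project_path}/init.sh (if present)",
        "5. Run the basic tests to confirm the environment is healthy",
        "",
        "### Work requirements",
        "- Work on this one feature only",
        "- Verify after completing it",
        "- Update the passes field in features.json at the end",
        "- Append a log entry to progress.md at the end",
        "- git commit at the end",
    ])
    return {"task": task, "sessionKey": session_key,
            "runTimeoutSeconds": timeout_seconds}


payload = build_spawn_payload(
    {"id": "feat-007", "description": "Add a login form"},
    "projects/demo", "Python")
# json.dumps escapes each line break as \n, so the output is valid JSON
print(json.dumps(payload, ensure_ascii=False, indent=2))
```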

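Scheduled Checks (Cron Job)

For important projects, a scheduled cron job can check progress periodically. The example expression fires at minute 0 of every fourth hour:

schedule: kind=cron, expr="0 */4 * * *"
payload: kind=agentTurn, message="Read projects/<name>/progress.md and features.json, and check whether any feature has been stuck for more than 3 sessions without completion. If so, output a brief report."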
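
The "stuck for more than 3 sessions" check could also be done deterministically instead of being left to the agent's reading of the log. The sketch below is one possible interpretation, not the skill's own logic: it assumes each session record in progress.md starts with a `## ` heading and mentions the target feature id, and it counts how many such sections mention each feature that still has passes: false. `stuck_features` is a hypothetical name.

```python
import re


# Hypothetical helper: find unfinished features that keep reappearing in the log.
def stuck_features(doc, progress_text, max_sessions=3):
    """Return (feature id, session count) pairs exceeding max_sessions."""
    # Split the log into sections at every line that starts with "## ";
    # the text before the first such heading is discarded.
    sections = re.split(r"(?m)^## ", progress_text)[1:]
    stuck = []
    for feature in doc["features"]:
        if feature["passes"]:
            continue  # finished features cannot be stuck
        # word boundaries keep feat-001 from matching feat-0011
        pattern = re.compile(rf"\b{re.escape(feature['id'])}\b")
        sessions = sum(1 for section in sections if pattern.search(section))
        if sessions > max_sessions:
            stuck.append((feature["id"], sessions))
    return stuck
```

A section that merely mentions a feature in passing is counted too, so the result is best treated as a list of candidates for review.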

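Failure Mode Prevention

Failure mode	Prevention
The agent tries to finish every feature in one go	Force it to pick only one `passes: false` feature per session
The agent declares the project done too early	`features.json` tracks status explicitly
The agent leaves buggy code behind	Run the basic tests at startup; git commit at the end so changes can be rolled back
The agent spends time figuring out the environment	Standardize startup with `init.sh`
Lost context causes repeated work	`progress.md` plus git log provide the full history
A feature is marked as passing before it is truly done	Require verification before the passes field is changed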

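General Domain Extensions

This framework is not limited to software development. For long-running non-code tasks:

Research projects: replace tests in features.json with research objectives; passes indicates whether the research is complete
Writing projects: split features into chapters or paragraphs; passes indicates whether the text is written and reviewed
Data analysis: split features into analysis steps; passes indicates whether the results have been verified

Use different category values to tell them apart: research|writing|analysis|infra|docs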

Security Recommendations

This skill appears to do what it says (manage long-running project tasks) but it will run shell commands and project-provided init.sh scripts that can install packages, start services, and execute arbitrary code. Things to consider before installing or using it:
- Only run this on repositories/projects you trust. Review any init.sh, init_db.sh, package.json, requirements.txt, and test scripts before allowing the agent to execute them.
- The skill assumes tools (git, bash, curl, npm, pip, python, pytest) are available but the metadata doesn't declare them; ensure your environment provides these or update the skill metadata to reflect requirements.
- The agent will perform git commits; ensure your git credentials and remote configuration are what you expect, and don't let it commit secrets or credentials into repos.
- If you need strong containment, run the harness in an isolated/sandboxed environment (container/VM) or deny network access so package installs cannot fetch remote code.
- Be cautious with cron job scheduling: the skill includes an example for periodic checks; only enable scheduled runs when you have control over what the agent will execute autonomously.

If you want to reduce risk: require explicit user confirmation before running any init.sh or performing installs, restrict which project paths the skill can act on, and add explicit declarations of required binaries and any expected external credentials.

Functional Analysis

Type: OpenClaw Skill
Name: long-running-harness
Version: 1.0.0

The bundle provides a structured framework for managing long-running agent projects, implementing state persistence via the file system as recommended by Anthropic's methodology. It includes templates for project initialization (init-template.md), feature tracking (feature-list-template.md), and progress logging (progress-log-template.md). While the framework facilitates the execution of shell scripts for environment setup and uses structured instructions to guide agent behavior, no evidence of malicious intent, unauthorized data access, or harmful prompt injection was found.

Capability Assessment

ℹ Purpose & Capability

The name/description (long-running project harness) matches the actions in SKILL.md: creating project folders, reading/writing features.json and progress.md, selecting tasks, committing git changes, spawning child sessions, and optionally scheduling cron checks. However the instructions assume availability of system tools (git, bash, curl, npm, pip, python, pytest) and the ability to run project-provided init.sh and tests, yet the skill metadata lists no required binaries or environment requirements — an incoherence that should be addressed.

ℹ Instruction Scope

Instructions remain within the declared purpose (manage project state, run one feature per session, enforce tests, update files, git commit). But they explicitly direct the agent to run shell commands and any project-provided init.sh, which can execute arbitrary code, perform package installs, start services, and make network requests. That behavior is expected for a harness but expands the runtime authority significantly and should only be applied to trusted project repositories or sandboxed environments.

✓ Install Mechanism

There is no install spec (instruction-only), which is low risk from the skill distribution perspective. However the provided init.sh templates instruct package installation (npm, pip) and starting services — these actions would download and run third-party code at runtime. The lack of an install step in the skill itself is consistent, but users should note that the skill will routinely execute repository scripts that may install software.

ℹ Credentials

The skill declares no required environment variables or credentials, which aligns with its general purpose. But practical execution often relies on system credentials (git remotes requiring git credentials, DB access, package registry/network access). The skill does not request or document these, so users must be aware the agent may attempt operations that implicitly depend on external credentials or network access.

✓ Persistence & Privilege

always:false and normal autonomous invocation settings are used. The skill writes/commits to project directories (its intended scope) but does not request persistent system-wide privileges. The scheduling/cron example can make the agent run periodically, so users should control whether and how those schedules are created.

Version History

v1.0.0

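# Changelog: long-running-harness

All formatting follows [Keep a Changelog](https://keepachangelog.com/zh-CN/1.1.0/), and version numbers follow [Semantic Versioning](https://semver.org/lang/zh-CN/).

---

## [0.1.0] - 2026-03-18

### Added

- Initial release
- Based on the methodology in [Effective Harnesses for Long-Running Agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents) (Anthropic Engineering Blog, Justin Young, 2025-11-26)
- **SKILL.md**, the core skill file
  - 5 core principles: persistence over memory, structure over free text, verification over claims, increments over big steps, standardization over improvisation
  - Three-phase project lifecycle: initialization → incremental development → progress report
  - Standardized startup and shutdown routines (mandatory in every session)
  - Sub-agent dispatch template (sessions_spawn, with a self-contained task)
  - Example configuration for scheduled cron job checks
  - Failure mode prevention table (6 common agent failure modes and countermeasures)
  - General domain extensions: software development, research, writing, data analysis (distinguished by category)
- **Feature list template** `references/feature-list-template.md`
  - Standard features.json format: id / name / description / category / priority / passes / tests / notes
  - Naming convention for 8 category prefixes: feat- / infra- / docs- / perf- / fix- / research- / writing- / analysis-
  - Explanation of the constraint that the agent may modify only the passes and notes fields
- **Initialization script template** `references/init-template.md`
  - Web application template (works for both Node.js and Python)
  - Python project template (with venv management)
  - Smoke-test steps so that every startup quickly verifies the environment
- **Progress log template** `references/progress-log-template.md`
  - Standard progress.md format and session record conventions
  - Status markers: ✅ Done / ⏳ Partially done / ❌ Failed
  - Writing conventions: never modify past entries; record Git commit hashes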

Metadata

Slug long-running-harness

Version 1.0.0

License MIT-0

Total installs 1

Current installs 1

Number of versions 1

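FAQ

What is long-running-harness?

A project workflow framework for long-running agents (based on Anthropic's "Effective Harnesses for Long-Running Agents"). It creates, manages, and schedules long-term project tasks that span multiple context windows. Use when: starting a new project, initializing a project workflow, managing a project task list, dispatching sub-agents for incremental development, resuming proj... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 234 cumulative downloads so far.

How do I install long-running-harness?

Run the command "/install long-running-harness" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is long-running-harness free?

Yes. long-running-harness is completely free, released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does long-running-harness support?

long-running-harness is cross-platform and works in any environment where OpenClaw / Claude Code is deployed.

Who develops long-running-harness?

It is developed and maintained by Aowind (@aowind); the current version is v1.0.0.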
