
Harness Engineer

Name: Harness Engineer
Author: louis-szeto

By Louis Szeto · GitHub · v5.3.0 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 485

Install in OpenClaw

/install harness-engineer

Description

A persistent autonomous engineering harness runtime that transforms any repository into a self-improving software system. Use this skill whenever the user wa...

Usage instructions (SKILL.md)

HARNESS ENGINEER

A production-grade skill for Claude Code and OpenClaw that transforms a repository into a self-improving software system using six core harness engineering principles.

SIX CORE PRINCIPLES

P1: CONTEXT ENGINEERING

Treat context as a finite, precious resource. Curate aggressively. See: runtime/context-engineering.md, runtime/compaction.md

P2: TOOL USAGE

Each sub-agent receives only the tools it needs -- no more. See: tools/TOOL_REGISTRY.md, references/mcp-tools.md

P3: VERIFICATION MECHANISM

Every output is verified by someone other than who produced it. See: agents/reviewer.md, references/testing-standards.md

P4: STATUS MANAGEMENT

State lives outside the context window, in the repo. See: runtime/status-management.md, templates/handoff.md

P5: OBSERVABILITY AND FEEDBACK CLOSED-LOOP

Track what happens. Feed failures back into the harness, not the code. See: runtime/observability.md, runtime/memory-system.md

P6: HUMAN SUPERVISION

Humans approve high-impact events. The harness surfaces them explicitly. See: runtime/autonomy-rules.md, runtime/prioritization.md
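
As an illustration of P2, a per-agent tool allowlist might look like the sketch below. The role names come from the agents/ files listed in this document, and read_file, write_file, and web_search are tool names mentioned in the capability assessment. The mapping of tools to roles, the `run_tests` tool name, and the `allowed` helper are assumptions made for illustration; the real sets are defined in tools/TOOL_REGISTRY.md and references/mcp-tools.md.

```python
# Hypothetical per-agent tool sets (illustrative only; the authoritative
# sets are defined in references/mcp-tools.md, not here).
AGENT_TOOLS = {
    "researcher": {"read_file", "web_search"},
    "planner": {"read_file", "write_file"},
    "implementer": {"read_file", "write_file", "run_tests"},
    "reviewer": {"read_file", "run_tests"},
}


def allowed(agent: str, tool: str) -> bool:
    """Deny by default: a tool is usable only if it is on the agent's allowlist."""
    return tool in AGENT_TOOLS.get(agent, set())
```

Under this sketch a reviewer can read and test but cannot write, which is one way to keep generation and review separate (P3).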

NON-NEGOTIABLE RULES

CLAUDE.md / AGENTS.md IS GROUND TRUTH -- read it first, every session.
CODEBASE OVER DOCS -- when they conflict, trust the code.
40% CONTEXT RULE -- compact or delegate to a sub-agent before crossing 40% of the context window.
NO IMPLEMENTATION WITHOUT a research output, a plan, and validation criteria.
GENERATION AND REVIEW ARE ALWAYS SEPARATE -- never the same agent.
FAILURE = HARNESS GAP -- fix the harness, not just the symptom.
OPTIMIZATION PRIORITY: Security => Correctness => Reliability => Performance => Memory => Maintainability => Cost
MINIMAL SCOPE PER SUBAGENT — Estimate codebase size first (use platform file-count or line-count capability — never raw shell). If >5K lines, split into multiple subagents by module/feature/layer. Pin exact files to read (no wandering). One research doc + one code area per subagent max. If a subagent gets killed or times out, the scope was too large — split further. Adaptive timeouts: Default timeouts are guidelines, not hard kills. Check process logs before killing — if the agent is actively producing output, extend the timeout instead of killing. Only kill-and-split if the agent is silent/stuck for >10min or producing garbage. Scale timeouts by effort: S-effort=15min, M-effort=20min, L-effort=30-40min.
SUBAGENT PERMISSION MODE — Subagents are spawned by the platform using its native agent mechanism. The permission mode is set by the platform, NOT by this skill. The skill MUST NOT mandate any specific spawn command or permission mode — that decision belongs to the platform's enforcement layer (see PLATFORM_REQUIREMENTS.md Section 8). If the platform's default permission mode is insufficient, the platform operator configures it — the skill never overrides it.
ACTIVE MONITORING — Every time you launch a new batch of subagents, track session IDs, expected output files, and remaining queue. If the platform provides a cron/scheduler, use it to detect dead agents. If no scheduler is available, check agent status before each dispatch step. Dead agents stall the pipeline — detect them early.
MAX PARALLEL = 5 -- Up to 5 Claude Code agents may run simultaneously. If rate or API limit errors occur, drop to 4, then 3, and so on until the errors stop. Resume increasing after 5 clean minutes.
TOKEN EXHAUSTION RECOVERY: If ALL active agents hit rate/API limits (429/500), tokens are exhausted. Wait for token refresh before retrying. If the platform provides a scheduler, set a recovery job to resume after refresh. If no scheduler is available, the human operator must manually restart the cycle.
10-MIN STUCK KILL -- If any agent produces no output for more than 10 minutes, log the issue, kill it, and split the task into smaller subtasks before respawning. Whenever a subagent is given a long-running command, always set a cron job to check on its progress periodically.
TRACKING EVERYWHERE — Every phase, cycle, and step writes to tracking logs. DISPATCH-TRACK, error log, compact summaries, progress logs. Recovery must be able to pick up from any interruption point.
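
Several of the rules above are numeric thresholds, so they can be sketched as small decision functions. The following is a minimal sketch, not the harness's own implementation: the function names, the floor of one agent when backing off, the one-step-at-a-time ramp-up, and the use of the upper bound (40 minutes) for L-effort tasks are all assumptions. The "producing garbage" condition is not modeled.

```python
CONTEXT_LIMIT_PCT = 40        # 40% CONTEXT RULE
MAX_PARALLEL = 5              # MAX PARALLEL = 5
STUCK_AFTER_MIN = 10          # 10-MIN STUCK KILL
CLEAN_MIN_BEFORE_RAMP = 5     # "Resume increasing after 5 clean minutes"
# Effort-scaled timeout guidelines in minutes (L uses the 40-minute upper bound).
TIMEOUT_MIN = {"S": 15, "M": 20, "L": 40}


def must_compact(projected_tokens: int, window_size: int) -> bool:
    """True if the next step would push usage past 40% of the context window."""
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return projected_tokens * 100 > CONTEXT_LIMIT_PCT * window_size


def next_parallelism(current: int, limit_error: bool, clean_minutes: float) -> int:
    """Back off by one agent on a rate/API limit error; ramp up after a clean period."""
    if limit_error:
        return max(1, current - 1)          # floor of 1 is an assumption
    if clean_minutes >= CLEAN_MIN_BEFORE_RAMP:
        return min(MAX_PARALLEL, current + 1)
    return current


def timeout_action(effort: str, elapsed_min: float, silent_min: float) -> str:
    """Return 'continue', 'extend', or 'kill_and_split' for a running subagent."""
    if silent_min > STUCK_AFTER_MIN:
        return "kill_and_split"             # silent for more than 10 minutes
    if elapsed_min > TIMEOUT_MIN[effort]:
        return "extend"                     # past the guideline but still producing output
    return "continue"
```

For example, `timeout_action("S", 16, 1)` yields `"extend"`: the agent has passed its 15-minute guideline but produced output a minute ago, so the timeout is extended rather than enforced.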

SAFE START GUIDE

Before anything else: read PLATFORM_REQUIREMENTS.md and verify every item. The harness depends on platform enforcement that cannot be checked from these files alone.

Step 1 -- Verify platform requirements (PLATFORM_REQUIREMENTS.md). Run through the five platform capability checks before any other step.

Step 2 -- Sandbox first. Run on a throwaway branch. Observe one single-pass cycle before enabling continuous mode.

PRs always require human approval. There is no auto-merge.

Step 3 -- Protect main branch. Require human reviewers on main/trunk in your git host.

Step 4 -- Graduation path: single-pass => maintenance => continuous
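
The graduation path in Step 4 can be read as a one-way progression with no skipped stages. A minimal sketch follows; the `next_mode` helper and the requirement that a human approves each promotion are assumptions in the spirit of P6, not something the skill files are confirmed to specify.

```python
# Mode names taken from Step 4; the ordering is the graduation path itself.
MODES = ["single-pass", "maintenance", "continuous"]


def next_mode(current: str, human_approved: bool) -> str:
    """Advance one stage along the graduation path, never skipping a stage."""
    index = MODES.index(current)            # raises ValueError for unknown modes
    if not human_approved or index == len(MODES) - 1:
        return current                      # stay put without approval or at the end
    return MODES[index + 1]
```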

HOW TO USE THIS SKILL

When activated in Claude Code or OpenClaw, read in this order:

1. CLAUDE.md or AGENTS.md if present (base context)
2. CONFIG.yaml (runtime settings)
3. runtime/loop.md (execution model)
4. runtime/context-engineering.md (context budget rules)
5. runtime/status-management.md (restore checkpoint if resuming)
6. MEMORY.md (prior failure context)
7. agents/dispatcher.md (task decomposition model, worktree agent)
8. Begin the loop
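
The reading order above could be resolved against a checkout roughly as follows. This is a sketch under stated assumptions: it treats all files as living under a single root directory, prefers CLAUDE.md when both CLAUDE.md and AGENTS.md exist, and silently skips files that are absent. The `startup_files` helper is hypothetical.

```python
from pathlib import Path

# Each entry lists acceptable alternatives for one step of the reading order.
READ_ORDER = [
    ("CLAUDE.md", "AGENTS.md"),             # base context, whichever is present
    ("CONFIG.yaml",),                       # runtime settings
    ("runtime/loop.md",),                   # execution model
    ("runtime/context-engineering.md",),    # context budget rules
    ("runtime/status-management.md",),      # restore checkpoint if resuming
    ("MEMORY.md",),                         # prior failure context
    ("agents/dispatcher.md",),              # task decomposition model
]


def startup_files(root_dir: str) -> list[Path]:
    """Return the files that exist under root_dir, in the documented order."""
    root = Path(root_dir)
    found = []
    for alternatives in READ_ORDER:
        for name in alternatives:
            path = root / name
            if path.is_file():
                found.append(path)
                break                       # first available alternative wins
    return found
```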

REFERENCE FILES

File	When to read
CLAUDE.md / AGENTS.md	First, every session -- base knowledge
CONFIG.yaml	At startup
MEMORY.md	At startup and after every failure
runtime/loop.md	Each loop cycle
runtime/context-engineering.md	Continuously -- governs context budget
runtime/compaction.md	When compacting context within a phase
runtime/status-management.md	At startup (resume) and after each task
runtime/observability.md	After VERIFY phase
runtime/memory-system.md	When writing or querying memory
runtime/self-improvement.md	After any failure
runtime/prioritization.md	When selecting the next task
runtime/autonomy-rules.md	When blocked or at human gate
agents/dispatcher.md	Before decomposing any task (worktree agent)
agents/researcher.md	Research phase (Q-Agent + R-Agent model)
agents/planner.md	Plan phase (3-phase: design, outline, master plan)
agents/implementer.md	Implement phase (worktree-driven execution)
agents/reviewer.md	Review cycle
agents/debugger.md	On any failure
agents/optimizer.md	Optimization mode
agents/garbage-collector.md	GC interval
tools/TOOL_REGISTRY.md	Before any tool call
tools/tool-router.md	Routing and redaction rules
tools/execution-protocol.md	Full tool call lifecycle
references/harness-rules.md	Core constraints
references/testing-standards.md	Before writing or running tests
references/security-performance.md	Before any implementation
references/simplification-checklist.md	During review and refactoring
references/git-workflow.md	Before any commit or PR
references/mcp-tools.md	MCP tool definitions and per-agent sets
references/sensitive-paths.md	Forbidden read paths -- enforced in-skill
references/constraints.md	Active prevention rules
templates/	Plans, ADRs, handoffs, status docs

Security recommendations

This skill is a documentation-driven orchestration runtime (no code bundled) and is internally consistent with its purpose. It delegates all enforcement to the host platform — before enabling autonomous runs, verify the platform actually provides: (1) a tool router that blocks protected-path writes, redacts secrets, and logs BLOCKED_READ/BLOCKED_WRITE events; (2) sandboxed test execution with no access to host env vars or harness files; (3) fine‑grained git credentials scoped to the current repo and branch protections (no direct push to main); and (4) human approval gates for plan/PR/critical actions. If you cannot confirm those controls, run the skill only in single-pass/manual mode and do not enable continuous/autonomous operation. Because the skill can orchestrate commits, tests, and subagents, lack of platform enforcement increases the risk of unwanted changes or data exposure — ensure you test the harness in a safe repository first and confirm logs and redaction behavior before trusting it with production repos.
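
The four controls listed above lend themselves to a pre-flight checklist. The sketch below is illustrative only: the control identifiers and both helper functions are hypothetical, and the operator is assumed to confirm each control manually, since the skill cannot check platform enforcement from its own files.

```python
# Hypothetical identifiers for the four platform controls named above.
REQUIRED_CONTROLS = (
    "tool_router",              # blocks protected-path writes, redacts secrets, logs blocks
    "sandboxed_tests",          # no access to host env vars or harness files
    "scoped_git_credentials",   # current repo only, with branch protections
    "human_approval_gates",     # for plan, PR, and critical actions
)


def missing_controls(confirmed: set) -> list:
    """List the required controls the operator has not yet confirmed."""
    return [control for control in REQUIRED_CONTROLS if control not in confirmed]


def may_run_continuous(confirmed: set) -> bool:
    """Continuous/autonomous operation is allowed only when nothing is missing."""
    return not missing_controls(confirmed)
```

If `may_run_continuous` returns False, the advice above applies: stay in single-pass/manual mode.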

Capability analysis

Type: OpenClaw Skill
Name: harness-engineer
Version: 5.3.0

The harness-engineer skill bundle is a highly sophisticated and security-conscious autonomous engineering framework. It implements a multi-agent system with strict role separation (Researcher, Planner, Implementer, Reviewer) and incorporates extensive defensive measures, including a centralized tool router that enforces sensitive path exclusions (references/sensitive-paths.md), blocks raw shell execution (tools/TOOL_REGISTRY.md), and mandates human approval gates for all critical phases (runtime/autonomy-rules.md). The system is designed to operate within a sandboxed environment with scoped git credentials, and its self-improvement mechanisms are restricted to human-approved variants, effectively mitigating risks associated with autonomous code generation.

Capability tags

crypto
can-make-purchases

Capability assessment

✓ Purpose & Capability

The skill is an instruction-only 'harness' for running multi-agent engineering cycles. It requests no binaries, env vars, or installs and all declared tool usage (read_file, write_file, git_*, web_search, test runners) matches the stated purpose of transforming repositories and orchestrating agents. The extensive protected-path and router requirements are appropriate for a harness.

ℹ Instruction Scope

SKILL.md and supporting docs instruct agents to read repo files, run tests, create branches/PRs, and spawn subagents. These are within the declared purpose, but they assume the platform enforces a central tool router, sandboxed test execution, scoped git credentials, and human approval gates. If the host platform does not implement those enforcement points, the instructions could enable risky autonomous behavior. The skill repeatedly warns to verify platform requirements before use.

✓ Install Mechanism

Instruction-only; no install spec, no downloads, and no code execution packaged with the skill. This minimizes on-disk footprint and avoids installing arbitrary binaries.

✓ Credentials

The skill declares no required environment variables, no credentials, and no config paths. It explicitly instructs the platform to manage scheduler credentials and git tokens and forbids writing credentials to memory or logs. The requested environment access is proportionate to an orchestration/instruction-only skill.

ℹ Persistence & Privilege

always:false (not force-included). Model invocation/autonomous invocation is allowed (platform default). This is expected for an autonomous harness, but because the skill orchestrates actions that can modify a repo (commits/PRs, tests, staging docs) users must ensure platform-enforced human gates, tool-routing, and git scoping are present. The skill itself repeatedly mandates human approval gates and protected paths.

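How to use

Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install harness-engineer
Once installation completes, invoke the Skill by name or trigger it with /harness-engineer
Provide the required inputs according to the Skill's parameter description to get structured output.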

Version history

v5.3.0

- More extensive review scope
- Detailed nested subtasks breakdown instructions
- Accept Obsidian-type LLM Wiki
- Allow clean pass only for reviewers
- Reinforced ITR loop
- Updated the "10-MIN STUCK KILL" rule to require that a cron job is always set when a subagent is given a long-running command, to periodically check its progress.

v5.2.2

- Slight security improvement

v5.2.1

- Clarified subagent file/line counting: specify use of platform capabilities, not raw shell commands.
- Revised subagent permission mode rule: defer to platform-native agent spawning and permissions, do not mandate specific commands.
- Updated dispatcher guidance: specify worktree agent role.
- Minor copyedits and headings for clarity.
- No file or functional changes; documentation/instructions only.

v5.2.0

- Minor documentation update: clarified the purpose of the agents/dispatcher.md file in the usage sequence ("task decomposition model, worktree agent").
- No code or functional changes.

v5.1.0

harness-engineer v5.1.0
- Extended non-negotiable rules with detailed subagent scoping, adaptive timeouts, permission handling, agent monitoring, and parallelism limits.
- Added new reference: references/simplification-checklist.md.
- Updated platform enforcement and failure recovery guidelines.
- Improved process tracking and robustness for multi-agent orchestration.
- Enhanced stuck/timeout handling and token exhaustion recovery instructions.

v5.0.0

**Major update: new orchestration rules and automation reliability improvements.**
- Added detailed new non-negotiable rules to enforce minimal subagent scope, adaptive timeouts, agent monitoring via cron, hard rules for stuck agents, and full Claude Code agentization.
- Increased emphasis and detail on tracking, recovery from token exhaustion, and parallel agent limits with auto-scaling on errors.
- Introduced references/simplification-checklist.md as a new documentation resource.
- Enhanced agent management sections for improved robustness and automation stability.
- No breaking changes to existing harness usage; compatible with prior configuration.

v4.0.0

### Summary: Major feature expansion and improved modularity for harness-engineer runtime.
- Added 11 new reference and runtime files, including compaction, config-system, cost-tracking, error-recovery, hook-system, instruction-discovery, mechanical-enforcement, self-assessment, and more.
- Core principles now reference new modular files (e.g., compaction.md), clarifying where to read for details.
- Expanded and reorganized the reference files list in SKILL.md for easier lookup and more granular process documentation.
- New README.md and template files included for agents, contracts, and planning, improving onboarding and extensibility.
- Internal documentation and runtime architecture now better support self-assessment, mechanical enforcement, and context management.

v3.2.2

- Added metadata section specifying required platform enforcement and safe defaults.
- Declared explicit platform requirements: MCP tool router, sandboxed test execution, git credential scoping, and human approval gates.
- Emphasized instruction-only use and operation safety depending on platform features.
- No functional or procedural logic changes; documentation/metadata only.

v3.2.1

harness-engineer 3.2.1
- Added new reference file: references/sensitive-paths.md.
- No changes to core logic or workflow.
- Provides additional documentation for sensitive paths handling.

v3.2.0

Version 3.1.1
- Added templates/final-review.md for standardized final review procedures.
- Added templates/gap-report.md to document and track harness gaps.
- Added details on small tasks breakdown and parallel processing.
- No changes to core harness logic or principles; these new templates support improved documentation and continuous improvement processes.

v3.1.0

- Break tasks into pieces, each handled by an agent-group loop cycle
- Mild security refinement

v3.0.1

## harness-engineer 3.0.1
- Added PLATFORM_REQUIREMENTS.md to specify essential platform prerequisites.
- Updated SAFE START GUIDE to require reviewing and verifying PLATFORM_REQUIREMENTS.md before any other setup step.
- Emphasized platform capability checks as a prerequisite for running the harness.

v3.0.0

**Major update: Introduces six core harness engineering principles, new agent roles, and explicit runtime/observability mechanisms.**
- Added detailed documentation for agent roles (planner, researcher) and core runtime mechanisms (context engineering, observability, status management).
- Introduced six explicit harness engineering principles to guide runtime, context, verification, tool usage, feedback loops, and human supervision.
- Expanded reference material with new files covering MCP tool definitions, status management, and observability.
- Clarified the file reading order, stricter context window management (40% rule), and reinforced generation-review role separation.
- Updated guides and references for Claude Code and OpenClaw environments; skill is now explicitly tailored for these runtimes.

v2.0.2

fix: sensitive data leaking into logs/MEMORY.md, and unvetted web search content written directly into the repo

v2.0.1

- Updated documentation: added a "Safe Start Guide" section in SKILL.md with onboarding, usage, and safety instructions.
- Clarified workflow: emphasized branch protection, manual pull request (PR) approvals, and tool auditing.
- Noted removal: auto-merge is now always disabled; PRs require human approval.
- No functional code changes; this update is focused on documentation and template file improvements for safer and clearer usage.

v2.0.0

**Major rewrite: skill now delivers a persistent, autonomous multi-agent runtime for self-improving codebases.**
- Introduces a recursive, continuous improvement loop: UNDERSTAND → DOCUMENT → PLAN → BUILD → VERIFY → REFLECT → IMPROVE.
- Adds formal in-repo memory and runtime config (CONFIG.yaml, MEMORY.md) for persistent context and parameters.
- Modularizes agents by specialization in `agents/` directory (architect, dispatcher, debugger, etc.) and corresponding docs.
- Defines strict, non-negotiable engineering rules and optimization priorities, enforced at every loop cycle.
- Provides detailed protocols for task decomposition, agent spawning, tool execution, and loop progression via new reference and template files.
- Removes the previous monolithic markdown instructions in favor of composable documents and a doc-driven workflow.

v1.0.0

- Initial release of the "harness-engineer" skill for transforming codebases into agent-first development environments.
- Enables analysis of an existing project structure and documentation to generate harness-compliant markdown files (AGENTS.md, docs/architecture.md, docs/domains.md, docs/quality.md, docs/golden-principles.md, docs/plans/).
- Outlines a structured multi-phase workflow: analyze the codebase, generate harness engineering documentation, then spawn Claude Code agents using clear contracts to implement or refactor features.
- Provides templates and detailed documentation standards for agent onboarding, architecture mapping, domain boundaries, quality tracking, and golden principles enforcement.
- Supports recursive agent execution loops with automated self-review and convergence checks to ensure code correctness and project coherence.

Metadata

Slug harness-engineer

Version 5.3.0

License MIT-0

Total installs 4

Current installs 4

Version count 17

FAQ