Chapter 16

ACP Protocol: Using Claude Code, Codex and Gemini CLI as External Harnesses

Chapter Overview

OpenClaw provides a protocol layer called ACP (Agent Client Protocol) for programming tasks that are too complex for a single model conversation. Through ACP, OpenClaw acts as an orchestrator and delegates concrete code-execution work to specialized external agents called Harnesses. This chapter covers ACP's architectural design, the supported Harness catalog, the core operation commands, and when to choose ACP over other integration approaches.

16.1 ACP's Architectural Role

Orchestration Layer vs. Runtime

ACP establishes two clearly delineated roles:

Layer	Role	Owned Capabilities
OpenClaw (orchestration)	Scheduler, coordinator	Routing decisions, background task state, delivery mode, bindings, policy enforcement
Harness (runtime)	Executor, specialist	Provider credentials, model catalog access, filesystem behavior, native tools (terminal, code runners)

The core value of this separation is that OpenClaw never needs to know how a Harness works internally. It only needs to know whether the task is done and what the output is; every execution detail is delegated to whichever Harness is best suited to the job.
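The separation can be sketched in a few lines of Python. This is an illustration only, not OpenClaw's actual API: `Harness`, `TaskResult`, `EchoHarness`, and `orchestrate` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class TaskResult:
    done: bool
    output: str


class Harness(Protocol):
    """Hypothetical interface: the only surface the orchestrator sees."""

    def run(self, task: str) -> TaskResult: ...


class EchoHarness:
    """Stand-in runtime; a real Harness would own credentials, tools, and files."""

    def run(self, task: str) -> TaskResult:
        return TaskResult(done=True, output=f"handled: {task}")


def orchestrate(harness: Harness, task: str) -> str:
    # The orchestrator asks only two questions: is it done, and what came back?
    result = harness.run(task)
    return result.output if result.done else "pending"
```

Because `orchestrate` depends only on the `run` method, any runtime that satisfies the interface can be swapped in without changing the orchestration code.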

Why Not Have OpenClaw Write Code Directly?

OpenClaw is a general-purpose orchestration engine designed for cross-domain coordination, not deep code engineering. For the following scenarios, a dedicated Harness has a clear edge:

Large multi-file refactors: require simultaneous understanding of dozens of files while maintaining cross-file consistency
Complex C# / Java projects: type-system reasoning benefits from dedicated language-server integration
CI-loop tasks: require running tests, waiting for CI results, and iteratively fixing failures

Routing such tasks to a dedicated Harness is reported to improve accuracy by 25–35%. The underlying reasons are analyzed in section 16.8.

16.2 The 10 Supported Harnesses

OpenClaw officially supports the following Harnesses, each with distinct technical characteristics:

Harness	Developer	Model Access	Best For	Notes
Claude Code	Anthropic	API Key / Claude.ai	Multi-file understanding, long-context refactors	Deepest integration with OpenClaw ecosystem
Codex	OpenAI	API Key	Fast completion, single-file tasks	Fast; low token cost
GitHub Copilot	GitHub/MS	GitHub account	In-IDE suggestions, PR review	Requires GitHub authorization
Cursor	Anysphere	Cursor account	Multi-file agent inside the editor	Composer mode supports deep refactors
Droid	Community	Configurable	Mobile-focused code tasks	Lightweight; resource-constrained environments
Gemini CLI	Google	Google account/API	Ultra-long context (1M tokens)	Very large context window
OpenCode	Community	Configurable	Open-source alternative	Fully local deployment
Qwen	Alibaba Cloud	API Key	Chinese codebase tasks, Chinese comments	Strong Chinese-language understanding
Kimi	Moonshot AI	API Key	Long-document code tasks	Long-context Chinese optimization
iFlow / Kilo Code	Community	Configurable	Workflow-driven code generation	Suited for CI/CD integration

Harness Selection Decision Tree

Does the task involve a Chinese-language codebase or comments?
  ├─ Yes → Prefer Qwen or Kimi
  └─ No  → More than 20 files?
             ├─ Yes → Context priority: Gemini CLI (1M tokens)
             │        Accuracy priority: Claude Code
             └─ No  → Speed priority: Codex
                       Integration priority: Claude Code
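The decision tree above translates directly into a small function. The sketch below is hypothetical: the function name, the `priority` parameter, and its string values are assumptions made for illustration, and the file threshold of 20 comes from the tree itself.

```python
def choose_harness(chinese_codebase: bool, file_count: int, priority: str) -> str:
    """Return the Harness suggested by the decision tree.

    `priority` is one of "context", "accuracy", "speed", "integration";
    the parameter and its values are assumptions made for this sketch.
    """
    if chinese_codebase:
        return "Qwen or Kimi"
    if file_count > 20:
        if priority == "context":
            return "Gemini CLI"
        if priority == "accuracy":
            return "Claude Code"
        raise ValueError("for more than 20 files, priority must be 'context' or 'accuracy'")
    if priority == "speed":
        return "Codex"
    if priority == "integration":
        return "Claude Code"
    raise ValueError("for 20 files or fewer, priority must be 'speed' or 'integration'")
```

For the 47-file refactor in section 16.6, `choose_harness(False, 47, "accuracy")` returns `"Claude Code"`.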

16.3 Core Operations: spawn / status / close

spawn: Start a Harness Instance

# Basic syntax
/acp spawn <harness-name> [--bind <path>] [--cwd <dir>] [--thread <mode>] [--delivery <mode>]

# Example 1: bind current directory to Claude Code
/acp spawn claude --bind here --cwd /workspace/repo

# Example 2: launch Codex in background with auto thread management
/acp spawn codex --thread auto --delivery background

# Example 3: specify a working directory for Gemini CLI
/acp spawn gemini --cwd /workspace/big-monorepo --thread new

Parameter Reference:

Parameter	Description	Default
`--bind here`	Bind current directory as the Harness working root	None (must be explicit)
`--bind <path>`	Specify an absolute path to bind	None
`--cwd <dir>`	Set the Harness process's initial working directory	Inherits OpenClaw's current directory
`--thread auto`	Reuse an existing thread or create a new one	No (the default thread mode is `new`)
`--thread new`	Force creation of a fresh conversation thread	Yes (default)
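When spawn commands are generated by a script rather than typed by hand, the parameters in the table can be assembled programmatically. The helper below is a hypothetical sketch: `build_spawn_command` is not part of OpenClaw, and it uses only the flags documented above.

```python
from typing import Optional


def build_spawn_command(
    harness: str,
    bind: Optional[str] = None,
    cwd: Optional[str] = None,
    thread: Optional[str] = None,
) -> str:
    """Assemble an `/acp spawn` command line from the documented parameters."""
    if thread is not None and thread not in ("auto", "new"):
        raise ValueError("thread must be 'auto' or 'new'")
    parts = ["/acp", "spawn", harness]
    if bind is not None:
        parts += ["--bind", bind]
    if cwd is not None:
        parts += ["--cwd", cwd]
    if thread is not None:
        parts += ["--thread", thread]
    return " ".join(parts)
```

Calling `build_spawn_command("claude", bind="here", cwd="/workspace/repo")` reproduces the command from Example 1. Paths containing spaces would need quoting, which this sketch does not handle.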

status: Query Task State

# List all active ACP sessions
/acp status

# Query a specific session
/acp status --id <session-id>

Sample output:

ACP Sessions:
  [abc123] claude  RUNNING   task: "Refactor auth module"  elapsed: 4m32s
  [def456] codex   COMPLETE  task: "Generate unit tests"   elapsed: 1m15s
  [ghi789] gemini  WAITING   task: "Analyze 80k-line codebase" elapsed: 12m01s
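If status output needs to be consumed by a script, each session line can be parsed with a regular expression. The pattern below is inferred solely from the sample output above, so the real format may differ; `AcpSession` and `parse_status_line` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class AcpSession(NamedTuple):
    session_id: str
    harness: str
    state: str
    task: str
    elapsed_seconds: int


# Pattern inferred from the sample output; treat it as an assumption.
_LINE = re.compile(
    r'\[(?P<id>\w+)\]\s+(?P<harness>\S+)\s+(?P<state>[A-Z]+)\s+'
    r'task:\s+"(?P<task>[^"]*)"\s+elapsed:\s+(?P<min>\d+)m(?P<sec>\d+)s'
)


def parse_status_line(line: str) -> Optional[AcpSession]:
    """Return the parsed session, or None for lines that are not session rows."""
    match = _LINE.search(line)
    if match is None:
        return None
    seconds = int(match["min"]) * 60 + int(match["sec"])
    return AcpSession(match["id"], match["harness"], match["state"], match["task"], seconds)
```

Header lines such as `ACP Sessions:` return `None`, so a caller can filter the output with a simple comprehension.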

close: Terminate a Session

# Graceful shutdown (waits for current task to finish)
/acp close <session-id>

# Force terminate immediately
/acp close <session-id> --force

16.4 Interactive vs. Background Delivery Modes

ACP's Delivery Mode controls how a Harness returns results to the user.

Interactive Mode (real-time streaming)

/acp spawn claude --bind here --delivery interactive

Harness output streams in real time back to the OpenClaw conversation UI
Users can ask questions or redirect the task while it is running
Best for tasks requiring frequent human decisions ("ask me when requirements are ambiguous")
Trade-off: occupies the foreground context; no parallel tasks possible

Typical use cases:

Live API design discussions
Database schema migrations requiring real-time author confirmation
Code review sessions where the author must be present to explain intent

Background Mode (announce on completion)

/acp spawn codex --bind here --delivery background

Harness runs independently in the background; OpenClaw is non-blocking
On completion, the announce mechanism notifies the user (notification + summary)
Multiple Background Harnesses can run concurrently
Best for: long-running tasks (>5 minutes) that require no live intervention

Typical use cases:

Generating unit tests for an entire module (may take 10–20 minutes)
Analyzing dependency graphs in large codebases
Bulk rename/refactor across 100+ files

Mode Selection Matrix

Scenario	Recommended Mode
Task duration < 2 minutes	Interactive
Mid-task decisions needed	Interactive
Task duration > 5 minutes	Background
Parallel tasks desired	Background
Real-time feedback required	Interactive
Batch processing tasks	Background
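The matrix can be read as a short rule set. The sketch below is one possible encoding, with two assumptions the matrix does not state: human-in-the-loop needs take precedence when rows conflict, and tasks in the unlisted 2 to 5 minute range fall back to Interactive. The function and parameter names are hypothetical.

```python
def choose_delivery(
    minutes: float,
    needs_decisions: bool = False,
    needs_live_feedback: bool = False,
    parallel: bool = False,
    batch: bool = False,
) -> str:
    """Map the matrix rows to "interactive" or "background"."""
    # Assumption: human-in-the-loop needs take precedence over duration.
    if needs_decisions or needs_live_feedback:
        return "interactive"
    if parallel or batch or minutes > 5:
        return "background"
    # Covers tasks under 2 minutes; treating the unlisted 2 to 5 minute
    # range the same way is an assumption.
    return "interactive"
```

A 15-minute task that still needs mid-task decisions resolves to Interactive under this ordering; reversing the first two checks would favor Background instead.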

16.5 Bindings: Directory Binding Mechanism

Bindings are one of the most important concepts in ACP. They determine which filesystem paths a Harness can access.

# Bind the current directory (most common)
/acp spawn claude --bind here

# Bind a specific directory
/acp spawn claude --bind /workspace/backend

# Bind multiple directories (comma-separated)
/acp spawn claude --bind /workspace/frontend,/workspace/backend

Security implications:

A Harness can only access paths within its bindings
Files outside bindings are invisible to the Harness, even if the Harness itself has broad filesystem capabilities
OpenClaw's Policy layer can further restrict read/write permissions within bindings
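The containment rule behind bindings can be illustrated with a purely lexical path check. This is a sketch of the idea, not OpenClaw's enforcement code: `is_within_bindings` is a hypothetical helper, and it does not resolve symlinks, which a real implementation would have to consider.

```python
from pathlib import PurePosixPath
from typing import Iterable


def is_within_bindings(path: str, bindings: Iterable[str]) -> bool:
    """Return True if `path` is a bound directory or lies beneath one."""
    target = PurePosixPath(path)
    if ".." in target.parts:
        # Lexical check only: reject anything that could climb out of a binding.
        return False
    for bound in bindings:
        root = PurePosixPath(bound)
        if target == root or root in target.parents:
            return True
    return False
```

Comparing path components rather than string prefixes matters here: `/workspace/backend-old/app.js` shares a prefix with `/workspace/backend` but is correctly treated as outside the binding.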

16.6 Hands-On Case Study: Using Claude Code for a Large Multi-File Refactor

Scenario

A Node.js project needs all callback-style async functions migrated to async/await. The work spans 47 files and roughly 12,000 lines of code.

Step 1: Plan and Launch

# Start Claude Code in background mode from the project root
/acp spawn claude \
  --bind here \
  --cwd /workspace/node-project \
  --delivery background \
  --thread new

Step 2: Send the Task Description

[ACP → Claude Code]
Task: Migrate all callback-style async functions to async/await across the entire codebase.

Constraints:
- Do NOT modify test files (*.test.js, *.spec.js)
- Preserve all existing JSDoc comments
- Run `npm test` after each file to verify no regressions
- If changing a file causes more than 3 test failures, pause and report back

Start with src/api/, then src/services/, finally src/utils/

Step 3: Monitor Progress

/acp status
# Output:
# [abc123] claude  RUNNING  task: "Callback→async/await migration"
#   Progress: 12/47 files complete, 0 test failures
#   Current: src/services/payment.service.js
#   Elapsed: 8m45s

Step 4: Receive the Result

When the task finishes, OpenClaw announces the result:

✅ ACP Task Complete [abc123]
Harness: Claude Code
Duration: 34m12s
Files modified: 47
Test results: 284 passed, 0 failed
Summary: Successfully migrated all callback-style functions to async/await.
Notable: Found 3 cases where callback error handling was incomplete — added proper try/catch blocks.

16.7 Three-Way Comparison: ACP vs. Sub-agents vs. Native Plugins

These three integration approaches are frequently confused. The comparison below separates them:

ACP (External Harness)

OpenClaw ──ACP protocol──▶ Separate process (Claude Code / Codex / etc.)
                             ↑
                        Runs OUTSIDE OpenClaw's sandbox
                        Has full filesystem access within its bindings
                        Can execute arbitrary shell commands

NOT subject to OpenClaw's sandbox
Has its own model access, network access, and filesystem access (limited to the bound paths described in section 16.5)
Best for: programming tasks requiring full system access

Sub-agents (Internal Sub-agents)

OpenClaw ──internal dispatch──▶ Sub-agent
                                  ↑
                             Subject to OpenClaw's sandbox
                             Limited to OpenClaw-authorized tools
                             Shares the parent's context budget

Subject to OpenClaw's sandbox
Shares parent agent's context budget and permissions
Best for: task decomposition, parallel queries, data aggregation

Native Plugins

OpenClaw ──plugin API──▶ Extension module
                          ↑
                     Runs inside the OpenClaw process
                     Accesses OpenClaw internal APIs
                     Does NOT launch an independent Agent

Not an agent — a capability extension
Best for: adding new tools, commands, or UI components

Summary Comparison Table

Dimension	ACP	Sub-agents	Native Plugin
Independent process	✅ Yes	❌ No	❌ No
Sandbox constraints	❌ Unconstrained	✅ Constrained	✅ Constrained
Filesystem access	Full within bindings	Restricted	Restricted
Independent model access	✅ Yes	❌ No	❌ No
Best task type	Complex coding	Task decomposition	Feature extension
Startup overhead	High (seconds)	Low (milliseconds)	Minimal
Error isolation	Fully isolated	Partially isolated	None

16.8 Why Accuracy Improves 25–35%: The Four Mechanisms

Routing complex coding tasks to a dedicated Harness is reported to improve accuracy by 25–35%. Four mechanisms explain this:

1. Dedicated Context Window (Most Important Factor)

When Claude Code runs as a Harness, it has an independent, clean context window that is not consumed by OpenClaw's conversation history, Skill content, or tool definitions. For a 12,000-line codebase this means:

Direct OpenClaw handling: usable code context ≈ total window − system prompt − conversation history ≈ 60–70% of the window
Claude Code Harness: usable code context ≈ 95%+ of the window
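A worked example makes the arithmetic concrete. Every token count below is an illustrative assumption chosen to land inside the ranges quoted above, not a measurement of OpenClaw or Claude Code.

```python
def usable_fraction(total_window: int, *overheads: int) -> float:
    """Share of the context window left for code after fixed overheads."""
    return (total_window - sum(overheads)) / total_window


# All token counts are illustrative assumptions, not measurements.
WINDOW = 200_000
# Direct handling: system prompt plus tool definitions, then conversation history.
direct = usable_fraction(WINDOW, 30_000, 40_000)
# Harness: only the Harness's own system prompt occupies the window.
harness = usable_fraction(WINDOW, 8_000)

print(f"direct: {direct:.0%}, harness: {harness:.0%}")
# prints: direct: 65%, harness: 96%
```

Under these assumptions the Harness has 62,000 more tokens available for code than the direct path does.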

2. Specialized Toolset

A Harness's native tools (e.g., Claude Code's file-reading, file-editing, and Bash tools) are optimized for code engineering and can outperform generic OpenClaw tool calls in both speed and precision.

3. Self-Repair Loop Capability

A Harness can run code, observe results, fix errors, and iterate — forming an autonomous repair loop:

Write code → Run tests → Analyze failures → Fix → Run tests again → ...

OpenClaw itself has no persistent execution environment to support this pattern.
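The repair loop can be sketched as a bounded iteration. This is a generic illustration rather than any Harness's real implementation: `repair_loop`, `run_tests`, `fix`, and the `max_rounds` cap are hypothetical, with the two callables standing in for the Harness's test runner and its code-editing step.

```python
from typing import Callable, List


def repair_loop(
    run_tests: Callable[[], List[str]],
    fix: Callable[[List[str]], None],
    max_rounds: int = 5,
) -> bool:
    """Run tests, hand failures to `fix`, and repeat until green or out of rounds."""
    for _ in range(max_rounds):
        failures = run_tests()
        if not failures:
            return True
        fix(failures)
    # Final check after the last fix attempt.
    return not run_tests()
```

The cap on rounds is a design choice for the sketch: without it, a failure the Harness cannot fix would loop forever instead of being reported back.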

4. Model Specialization

Different Harnesses use different underlying models, enabling task-type-optimal model selection:

Gemini CLI → ultra-long-context tasks
Claude Code → reasoning-intensive tasks
Codex → high-frequency, low-complexity completions

16.9 ACP Policy and Security Controls

OpenClaw's Policy layer remains active during ACP calls:

# Example ACP policy configuration for openclaw.json (written in YAML-style notation for readability; adapt the syntax to the file's actual format)
acp:
  allowed_harnesses:
    - claude
    - codex
  default_delivery: background
  max_concurrent_sessions: 3
  binding_restrictions:
    - allow: /workspace/**
    - deny: /workspace/secrets/**

Key security principles:

OpenClaw cannot constrain a Harness's internal behavior, but it can restrict which Harnesses are permitted to launch
Binding paths are the primary security boundary OpenClaw can enforce
In production, explicitly listing allowed_harnesses prevents users from launching unaudited Harnesses
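To show how such a policy might be evaluated before a spawn, the sketch below checks the three controls from the example configuration. The evaluation rules are assumptions, in particular that deny patterns override allow patterns; `may_spawn` and the flattened `POLICY` dictionary are hypothetical and do not reflect OpenClaw's internals.

```python
from fnmatch import fnmatchcase

# Mirrors the example policy; the evaluation rules below are assumptions.
POLICY = {
    "allowed_harnesses": ["claude", "codex"],
    "max_concurrent_sessions": 3,
    "allow": ["/workspace/**"],
    "deny": ["/workspace/secrets/**"],
}


def may_spawn(harness: str, bind_path: str, active_sessions: int, policy: dict = POLICY) -> bool:
    """Return True if the spawn request passes every policy check."""
    if harness not in policy["allowed_harnesses"]:
        return False
    if active_sessions >= policy["max_concurrent_sessions"]:
        return False
    # Append a slash so "/workspace/secrets" itself matches "/workspace/secrets/**".
    probe = bind_path.rstrip("/") + "/"
    if any(fnmatchcase(probe, pattern) for pattern in policy["deny"]):
        return False  # assumption: deny rules override allow rules
    return any(fnmatchcase(probe, pattern) for pattern in policy["allow"])
```

Note that this check looks only at the requested binding path. Binding `/workspace` as a whole would pass even though it contains `/workspace/secrets`, so a real policy layer would also need to filter access inside the binding.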

16.10 Chapter Summary

ACP positions OpenClaw as an orchestration layer, delegating code execution to specialized Harnesses
Ten Harnesses are available, each suited to different task types and context needs
spawn / status / close are ACP's three core commands
Interactive vs. Background delivery depends on task duration and whether live intervention is needed
ACP Harnesses are not subject to OpenClaw's sandbox — the key distinction from Sub-agents
The 25–35% accuracy improvement comes primarily from dedicated context windows and autonomous repair loops

The next chapter dives into the Skills system's internals: the SKILL.md format, the lazy-loading mechanism, and how a Skill's description becomes the model's activation trigger.
