ACP Protocol: Using Claude Code, Codex and Gemini CLI as External Harnesses
Chapter 16: The ACP Protocol — Invoking Claude Code, Codex, and Gemini CLI as External Harnesses
Chapter Overview
When the complexity of a programming task exceeds what a single model conversation can handle, OpenClaw provides a protocol layer called ACP (Agent Client Protocol). This allows OpenClaw to act as an orchestrator, delegating concrete code-execution work to specialized external agents called Harnesses. This chapter explores ACP's architectural design, the supported Harness catalog, core operation commands, and when to choose ACP over other integration approaches.
16.1 ACP's Architectural Role
Orchestration Layer vs. Runtime
ACP establishes two clearly delineated roles:
| Layer | Role | Owned Capabilities |
|---|---|---|
| OpenClaw (orchestration) | Scheduler, coordinator | Routing decisions / background task state / Delivery mode / Bindings / Policy enforcement |
| Harness (runtime) | Executor, specialist | Provider credentials / model catalog access / filesystem behavior / Native tools (terminal, code runners) |
The core value of this separation is that OpenClaw never needs to know how a Harness works internally. It cares about only two questions: is the task done, and what is the output? All execution details are delegated to whichever Harness is best suited to the job.
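One way to picture this separation is as a narrow interface in which the orchestrator sees only a completion flag and an output. The Python sketch below is illustrative only: `Harness`, `HarnessResult`, `EchoHarness`, and `delegate` are hypothetical names chosen for this example, not OpenClaw APIs.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class HarnessResult:
    """The only two things the orchestrator inspects."""
    done: bool
    output: str


class Harness(Protocol):
    """Hypothetical interface: any runtime that can execute a task."""
    name: str

    def run(self, task: str) -> HarnessResult: ...


class EchoHarness:
    """Stand-in runtime used only for illustration."""
    name = "echo"

    def run(self, task: str) -> HarnessResult:
        # A real harness would use its own model, tools, and filesystem here.
        return HarnessResult(done=True, output=f"handled: {task}")


def delegate(harness: Harness, task: str) -> str:
    """Orchestrator side: hand off the task, then look only at done/output."""
    result = harness.run(task)
    if not result.done:
        raise RuntimeError(f"{harness.name} did not finish: {task}")
    return result.output


print(delegate(EchoHarness(), "Refactor auth module"))
```

Because `delegate` depends only on the interface, swapping one harness for another would not change the orchestrator code in this sketch.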
Why Not Have OpenClaw Write Code Directly?
OpenClaw is a general-purpose orchestration engine designed for cross-domain coordination, not deep code engineering. For the following scenarios, a dedicated Harness has a clear edge:
- Large multi-file refactors: require simultaneous understanding of dozens of files while maintaining cross-file consistency
- Complex C# / Java projects: type-system reasoning benefits from dedicated language-server integration
- CI-loop tasks: require running tests, waiting for CI results, and iteratively fixing failures
Empirical data shows a 25–35% accuracy improvement when such tasks are routed to a dedicated Harness instead of being handled directly by OpenClaw. The underlying reasons are analyzed in Section 16.8.
16.2 The 10 Supported Harnesses
OpenClaw officially supports the following Harnesses, each with distinct technical characteristics:
| Harness | Developer | Model Access | Best For | Notes |
|---|---|---|---|---|
| Claude Code | Anthropic | API Key / Claude.ai | Multi-file understanding, long-context refactors | Deepest integration with OpenClaw ecosystem |
| Codex | OpenAI | API Key | Fast completion, single-file tasks | Fast; low token cost |
| GitHub Copilot | GitHub/MS | GitHub account | In-IDE suggestions, PR review | Requires GitHub authorization |
| Cursor | Anysphere | Cursor account | Multi-file agent inside the editor | Composer mode supports deep refactors |
| Droid | Community | Configurable | Mobile-focused code tasks | Lightweight; resource-constrained environments |
| Gemini CLI | Google | Google account / API Key | Ultra-long context (1M tokens) | Largest context window available |
| OpenCode | Community | Configurable | Open-source alternative | Fully local deployment |
| Qwen | Alibaba Cloud | API Key | Chinese codebase tasks, Chinese comments | Best Chinese language understanding |
| Kimi | Moonshot AI | API Key | Long-document code tasks | Long-context Chinese optimization |
| iFlow / Kilo Code | Community | Configurable | Workflow-driven code generation | Suited for CI/CD integration |
Harness Selection Decision Tree
Does the task involve a Chinese-language codebase or comments?
├─ Yes → Prefer Qwen or Kimi
└─ No → More than 20 files?
    ├─ Yes → Context priority: Gemini CLI (1M tokens)
    │        Accuracy priority: Claude Code
    └─ No  → Speed priority: Codex
             Integration priority: Claude Code
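The decision tree above can be expressed as a small function. This is a minimal sketch: `choose_harness` and the priority labels (`"context"`, `"accuracy"`, `"speed"`, `"integration"`) are hypothetical names introduced here, and the returned strings reuse the harness names from the spawn examples in this chapter.

```python
from typing import Tuple


def choose_harness(chinese_codebase: bool, file_count: int, priority: str) -> Tuple[str, ...]:
    """Return candidate harness names by walking the decision tree.

    priority is one of "context", "accuracy", "speed", "integration"
    (labels invented for this sketch).
    """
    if chinese_codebase:
        return ("qwen", "kimi")
    if file_count > 20:
        if priority == "context":
            return ("gemini",)
        if priority == "accuracy":
            return ("claude",)
        raise ValueError("with more than 20 files, use 'context' or 'accuracy'")
    if priority == "speed":
        return ("codex",)
    if priority == "integration":
        return ("claude",)
    raise ValueError("with 20 files or fewer, use 'speed' or 'integration'")


print(choose_harness(False, 47, "accuracy"))  # ('claude',)
```

Raising on an unexpected priority, rather than guessing, keeps the sketch faithful to the tree, which defines only two priorities per branch.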
16.3 Core Operations: spawn / status / close
spawn: Start a Harness Instance
# Basic syntax
/acp spawn <harness-name> [--bind <path>] [--cwd <dir>] [--thread <mode>] [--delivery <mode>]
# Example 1: bind current directory to Claude Code
/acp spawn claude --bind here --cwd /workspace/repo
# Example 2: launch Codex in background with auto thread management
/acp spawn codex --thread auto --delivery background
# Example 3: specify a working directory for Gemini CLI
/acp spawn gemini --cwd /workspace/big-monorepo --thread new
Parameter Reference:
| Parameter | Description | Default |
|---|---|---|
| `--bind here` | Bind current directory as the Harness working root | None (must be explicit) |
| `--bind <path>` | Specify an absolute path to bind | None |
| `--cwd <dir>` | Set the Harness process's initial working directory | Inherits OpenClaw's current directory |
| `--thread auto` | Reuse an existing thread or create a new one | new |
| `--thread new` | Force creation of a fresh conversation thread | n/a |
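To make the defaults in the parameter table concrete, the sketch below parses a spawn command into a structured request. It covers only the three flags in the table; `SpawnRequest` and `parse_spawn` are hypothetical names, and how OpenClaw actually parses the command is not documented here.

```python
import argparse
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpawnRequest:
    harness: str
    bind: Optional[str]  # None: no binding given (the table says it must be explicit)
    cwd: Optional[str]   # None: inherit OpenClaw's current directory
    thread: str          # "new" is the documented default


def parse_spawn(command: str) -> SpawnRequest:
    """Parse '/acp spawn <harness> [--bind ...] [--cwd ...] [--thread ...]'."""
    tokens = shlex.split(command)
    if tokens[:2] != ["/acp", "spawn"]:
        raise ValueError("not an /acp spawn command")
    parser = argparse.ArgumentParser(prog="/acp spawn", add_help=False)
    parser.add_argument("harness")
    parser.add_argument("--bind", default=None)
    parser.add_argument("--cwd", default=None)
    parser.add_argument("--thread", choices=["auto", "new"], default="new")
    ns = parser.parse_args(tokens[2:])
    return SpawnRequest(ns.harness, ns.bind, ns.cwd, ns.thread)


print(parse_spawn("/acp spawn claude --bind here --cwd /workspace/repo"))
```

Keeping `bind` and `cwd` as `None` when omitted lets later code distinguish "not provided" from an explicit value, which matters because the table treats the two cases differently.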
status: Query Task State
# List all active ACP sessions
/acp status
# Query a specific session
/acp status --id <session-id>
Sample output:
ACP Sessions:
[abc123] claude RUNNING task: "Refactor auth module" elapsed: 4m32s
[def456] codex COMPLETE task: "Generate unit tests" elapsed: 1m15s
[ghi789] gemini WAITING task: "Analyze 80k-line codebase" elapsed: 12m01s
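If the status output needs to be consumed by a script, lines in the shape shown above can be parsed with a regular expression. This is a sketch that assumes the exact layout of the sample output; `Session` and `parse_status` are hypothetical names, and the real output format may differ.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches lines such as: [abc123] claude RUNNING task: "..." elapsed: 4m32s
LINE = re.compile(
    r'^\[(?P<id>\w+)\]\s+(?P<harness>\S+)\s+(?P<state>[A-Z]+)\s+'
    r'task:\s+"(?P<task>[^"]*)"\s+elapsed:\s+(?P<min>\d+)m(?P<sec>\d+)s$'
)


@dataclass
class Session:
    session_id: str
    harness: str
    state: str
    task: str
    elapsed_seconds: int


def parse_status(text: str) -> List[Session]:
    """Return one Session per matching line; other lines are skipped."""
    sessions = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            elapsed = int(m["min"]) * 60 + int(m["sec"])
            sessions.append(Session(m["id"], m["harness"], m["state"], m["task"], elapsed))
    return sessions


SAMPLE = '''ACP Sessions:
[abc123] claude RUNNING task: "Refactor auth module" elapsed: 4m32s
[def456] codex COMPLETE task: "Generate unit tests" elapsed: 1m15s
[ghi789] gemini WAITING task: "Analyze 80k-line codebase" elapsed: 12m01s'''

for session in parse_status(SAMPLE):
    print(session.session_id, session.state, session.elapsed_seconds)
```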
close: Terminate a Session
# Graceful shutdown (waits for current task to finish)
/acp close <session-id>
# Force terminate immediately
/acp close <session-id> --force
16.4 Interactive vs. Background Delivery Modes
ACP's Delivery Mode controls how a Harness returns results to the user.
Interactive Mode (real-time streaming)
/acp spawn claude --bind here --delivery interactive
- Harness output streams in real time back to the OpenClaw conversation UI
- Users can ask questions or redirect the task while it is running
- Best for tasks requiring frequent human decisions ("ask me when requirements are ambiguous")
- Trade-off: occupies the foreground context; no parallel tasks possible
Typical use cases:
- Live API design discussions
- Database schema migrations requiring real-time author confirmation
- Code review sessions where the author must be present to explain intent
Background Mode (announce on completion)
/acp spawn codex --bind here --delivery background
- Harness runs independently in the background; OpenClaw is non-blocking
- On completion, the announce mechanism notifies the user (notification + summary)
- Multiple Background Harnesses can run concurrently
- Best for: long-running tasks (>5 minutes) that require no live intervention
Typical use cases:
- Generating unit tests for an entire module (may take 10–20 minutes)
- Analyzing dependency graphs in large codebases
- Bulk rename/refactor across 100+ files
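The background pattern (non-blocking launch, concurrent tasks, announce on completion) can be imitated with `asyncio`. This is a simulation under stated assumptions: `run_in_background` stands in for a real harness, `asyncio.sleep` stands in for the actual work, and the "announcement" is just a string appended to a list.

```python
import asyncio
from typing import List


async def run_in_background(name: str, seconds: float, announcements: List[str]) -> None:
    """Simulate a harness task; a real one would wait on an external process."""
    await asyncio.sleep(seconds)
    # Announce on completion: a notification plus a short summary.
    announcements.append(f"{name}: complete")


async def main() -> List[str]:
    announcements: List[str] = []
    # Launch both tasks without blocking the orchestrator.
    tasks = [
        asyncio.create_task(run_in_background("codex", 0.05, announcements)),
        asyncio.create_task(run_in_background("claude", 0.01, announcements)),
    ]
    # The orchestrator would be free to do other work at this point.
    await asyncio.gather(*tasks)
    return announcements


print(asyncio.run(main()))
```

The announcements arrive in completion order rather than launch order, which is the behavior one would expect from an announce-on-completion mechanism.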
Mode Selection Matrix
| Scenario | Recommended Mode |
|---|---|
| Task duration < 2 minutes | Interactive |
| Mid-task decisions needed | Interactive |
| Task duration > 5 minutes | Background |
| Parallel tasks desired | Background |
| Real-time feedback required | Interactive |
| Batch processing tasks | Background |
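The matrix can be encoded as a short rule function. The matrix does not say what to do when rows conflict or when a task falls between 2 and 5 minutes, so this sketch makes two labeled assumptions: human-in-the-loop needs take precedence over duration, and the uncovered range defaults to interactive. `choose_delivery` and its parameter names are hypothetical.

```python
def choose_delivery(duration_minutes: float,
                    needs_decisions: bool = False,
                    needs_live_feedback: bool = False,
                    wants_parallel: bool = False,
                    is_batch: bool = False) -> str:
    """Return "interactive" or "background" following the selection matrix."""
    # Assumption: anything that needs a human in the loop wins over duration.
    if needs_decisions or needs_live_feedback:
        return "interactive"
    if wants_parallel or is_batch or duration_minutes > 5:
        return "background"
    # Tasks under 2 minutes are interactive per the matrix. The 2-5 minute
    # range is not covered; defaulting to interactive is an arbitrary choice.
    return "interactive"


print(choose_delivery(15))                        # background
print(choose_delivery(15, needs_decisions=True))  # interactive
```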
16.5 Bindings: Directory Binding Mechanism
Bindings are one of the most important concepts in ACP. They determine which filesystem paths a Harness can access.
# Bind the current directory (most common)
/acp spawn claude --bind here
# Bind a specific directory
/acp spawn claude --bind /workspace/backend
# Bind multiple directories (comma-separated)
/acp spawn claude --bind /workspace/frontend,/workspace/backend
Security implications:
- A Harness can only access paths within its bindings
- Files outside bindings are invisible to the Harness, even if the Harness itself has broad filesystem capabilities
- OpenClaw's Policy layer can further restrict read/write permissions within bindings
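The containment rule ("a Harness can only access paths within its bindings") amounts to a path-prefix check on whole path components. The sketch below is an illustration, not OpenClaw's enforcement code: `is_within_bindings` is a hypothetical helper, it normalizes `..` segments lexically, and it does not resolve symlinks, which a real implementation would need to consider.

```python
import posixpath
from pathlib import PurePosixPath
from typing import List


def is_within_bindings(path: str, bindings: List[str]) -> bool:
    """True if path equals a binding root or lies beneath one."""
    # Collapse "." and ".." first so "/a/b/../c" is checked as "/a/c".
    target = PurePosixPath(posixpath.normpath(path))
    for root in bindings:
        root_path = PurePosixPath(posixpath.normpath(root))
        if target == root_path or root_path in target.parents:
            return True
    return False


# Comma-separated form, as in the multi-directory example above.
bindings = "/workspace/frontend,/workspace/backend".split(",")

print(is_within_bindings("/workspace/backend/src/app.js", bindings))     # True
print(is_within_bindings("/workspace/backend-old/app.js", bindings))     # False
print(is_within_bindings("/workspace/backend/../secrets/key", bindings)) # False
```

Comparing path components instead of raw string prefixes avoids treating `/workspace/backend-old` as if it were inside `/workspace/backend`.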
16.6 Hands-On Case Study: Using Claude Code for a Large Multi-File Refactor
Scenario
A Node.js project needs all callback-style async functions migrated to async/await. The work spans 47 files and roughly 12,000 lines of code.
Step 1: Plan and Launch
# Start Claude Code in background mode from the project root
/acp spawn claude \
  --bind here \
  --cwd /workspace/node-project \
  --delivery background \
  --thread new
Step 2: Send the Task Description
[ACP → Claude Code]
Task: Migrate all callback-style async functions to async/await across the entire codebase.
Constraints:
- Do NOT modify test files (*.test.js, *.spec.js)
- Preserve all existing JSDoc comments
- Run `npm test` after each file to verify no regressions
- If a file has > 3 failures, pause and report back
Start with src/api/, then src/services/, finally src/utils/
Step 3: Monitor Progress
/acp status
# Output:
# [abc123] claude RUNNING task: "Callback→async/await migration"
# Progress: 12/47 files complete, 0 test failures
# Current: src/services/payment.service.js
# Elapsed: 8m45s
Step 4: Receive the Result
When the task finishes, OpenClaw displays an announce:
✅ ACP Task Complete [abc123]
Harness: Claude Code
Duration: 34m12s
Files modified: 47
Test results: 284 passed, 0 failed
Summary: Successfully migrated all callback-style functions to async/await.
Notable: Found 3 cases where callback error handling was incomplete — added proper try/catch blocks.
16.7 Three-Way Comparison: ACP vs. Sub-agents vs. Native Plugins
These three integration approaches are easily confused. The comparison below sets them side by side:
ACP (External Harness)
OpenClaw ──ACP protocol──▶ Separate process (Claude Code / Codex / etc.)
                               ↑
                           Runs OUTSIDE OpenClaw's sandbox
                           Has full filesystem access
                           Can execute arbitrary shell commands
- NOT subject to OpenClaw's sandbox
- Has independent model access, filesystem, and network access
- Best for: programming tasks requiring full system access
Sub-agents (Internal Sub-agents)
OpenClaw ──internal dispatch──▶ Sub-agent
                                    ↑
                                Subject to OpenClaw's sandbox
                                Limited to OpenClaw-authorized tools
                                Shares the parent's context budget
- Subject to OpenClaw's sandbox
- Shares parent agent's context budget and permissions
- Best for: task decomposition, parallel queries, data aggregation
Native Plugins
OpenClaw ──plugin API──▶ Extension module
                             ↑
                         Runs inside the OpenClaw process
                         Accesses OpenClaw internal APIs
                         Does NOT launch an independent Agent
- Not an agent — a capability extension
- Best for: adding new tools, commands, or UI components
Summary Comparison Table
| Dimension | ACP | Sub-agents | Native Plugin |
|---|---|---|---|
| Independent process | ✅ Yes | ❌ No | ❌ No |
| Sandbox constraints | ❌ Unconstrained | ✅ Constrained | ✅ Constrained |
| Filesystem access | Full (within its bindings) | Restricted | Restricted |
| Independent model access | ✅ Yes | ❌ No | ❌ No |
| Best task type | Complex coding | Task decomposition | Feature extension |
| Startup overhead | High (seconds) | Low (milliseconds) | Minimal |
| Error isolation | Fully isolated | Partially isolated | None |
16.8 Why Accuracy Improves 25–35%: The Four Mechanisms
Empirical results consistently show a 25–35% accuracy improvement when complex coding tasks are routed to a dedicated Harness. Four mechanisms explain this:
1. Dedicated Context Window (Most Important Factor)
When Claude Code runs as a Harness, it has an independent, clean context window that is not consumed by OpenClaw's conversation history, Skill content, or tool definitions. For a 12,000-line codebase this means:
- Direct OpenClaw handling: usable code context ≈ total window − system prompt − conversation history ≈ 60–70%
- Claude Code Harness: usable code context ≈ 95%+
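A worked instance of the formula above, using made-up numbers: the 200,000-token window and the individual overhead figures are assumptions chosen so the results land in the ranges quoted, not measurements from OpenClaw or Claude Code.

```python
def usable_tokens(total_window: int, *overheads: int) -> int:
    """usable code context = total window - sum of overheads."""
    return total_window - sum(overheads)


WINDOW = 200_000  # hypothetical context window size, in tokens

# Direct handling: system prompt and tool definitions (20k) plus history (50k).
direct = usable_tokens(WINDOW, 20_000, 50_000)
# Dedicated harness: only its own system prompt (10k) is spent up front.
harness = usable_tokens(WINDOW, 10_000)

print(f"direct:  {direct} tokens ({direct * 100 // WINDOW}%)")    # 130000 tokens (65%)
print(f"harness: {harness} tokens ({harness * 100 // WINDOW}%)")  # 190000 tokens (95%)
print(f"extra room for code: {harness - direct} tokens")          # 60000 tokens
```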
2. Specialized Toolset
A Harness's native tools (e.g., Claude Code's file read, write, and edit tools and its Bash tool) are optimized for code engineering and outperform generic OpenClaw tool calls in both speed and precision.
3. Self-Repair Loop Capability
A Harness can run code, observe results, fix errors, and iterate — forming an autonomous repair loop:
Write code → Run tests → Analyze failures → Fix → Run tests again → ...
OpenClaw itself has no persistent execution environment to support this pattern.
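The repair loop can be sketched as a bounded iteration over two callables. This is an illustration of the pattern, not how any particular Harness implements it: `repair_loop`, `fake_run_tests`, and `fake_fix` are hypothetical, and the fake functions simulate a test suite in which each fix resolves one failure.

```python
from typing import Callable, List


def repair_loop(run_tests: Callable[[], List[str]],
                fix: Callable[[List[str]], None],
                max_iterations: int = 5) -> bool:
    """Run tests, fix failures, and repeat until green or out of budget."""
    for _ in range(max_iterations):
        failures = run_tests()
        if not failures:
            return True
        fix(failures)
    # Budget exhausted: report whether the last fix made the suite pass.
    return not run_tests()


# Simulated project state: names of currently failing tests.
pending = ["test_login", "test_refund"]


def fake_run_tests() -> List[str]:
    return list(pending)


def fake_fix(failures: List[str]) -> None:
    # Pretend each fix attempt resolves exactly one failing test.
    pending.remove(failures[0])


print(repair_loop(fake_run_tests, fake_fix))  # True
```

The iteration cap matters in practice: without it, a fix that never converges would loop indefinitely.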
4. Model Specialization
Different Harnesses use different underlying models, enabling task-type-optimal model selection:
- Gemini CLI → ultra-long-context tasks
- Claude Code → reasoning-intensive tasks
- Codex → high-frequency, low-complexity completions
16.9 ACP Policy and Security Controls
OpenClaw's Policy layer remains active during ACP calls:
# Example ACP policy configuration in openclaw.json
acp:
  allowed_harnesses:
    - claude
    - codex
  default_delivery: background
  max_concurrent_sessions: 3
  binding_restrictions:
    - allow: /workspace/**
    - deny: /workspace/secrets/**
Key security principles:
- OpenClaw cannot constrain a Harness's internal behavior, but it can restrict which Harnesses are permitted to launch
- Binding paths are the primary security boundary OpenClaw can enforce
- In production, explicitly listing `allowed_harnesses` prevents users from launching unaudited Harnesses
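To show how the policy fields above might be evaluated before a spawn, here is a minimal sketch. The evaluation semantics are assumptions, since the chapter does not specify them: a binding must match at least one `allow` pattern, any matching `deny` pattern wins, and patterns are treated as simple globs. `POLICY`, `binding_permitted`, and `may_spawn` are hypothetical names.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Tuple

# Mirrors the example policy as a plain dict (assumed schema).
POLICY = {
    "allowed_harnesses": ["claude", "codex"],
    "max_concurrent_sessions": 3,
    "binding_restrictions": [
        {"allow": "/workspace/**"},
        {"deny": "/workspace/secrets/**"},
    ],
}


def binding_permitted(path: str, restrictions: List[Dict[str, str]]) -> bool:
    """Assumed semantics: some allow rule must match, and any deny rule wins."""
    # A trailing slash lets "/workspace/secrets" itself match "/workspace/secrets/**".
    candidate = path.rstrip("/") + "/"
    allowed = any(fnmatchcase(candidate, r["allow"]) for r in restrictions if "allow" in r)
    denied = any(fnmatchcase(candidate, r["deny"]) for r in restrictions if "deny" in r)
    return allowed and not denied


def may_spawn(harness: str, bind_path: str, active_sessions: int,
              policy: dict) -> Tuple[bool, str]:
    """Return (permitted, reason) for a spawn request."""
    if harness not in policy["allowed_harnesses"]:
        return False, f"harness '{harness}' is not in allowed_harnesses"
    if active_sessions >= policy["max_concurrent_sessions"]:
        return False, "max_concurrent_sessions reached"
    if not binding_permitted(bind_path, policy["binding_restrictions"]):
        return False, f"binding '{bind_path}' is not permitted"
    return True, "ok"


print(may_spawn("claude", "/workspace/repo", 0, POLICY))
print(may_spawn("gemini", "/workspace/repo", 0, POLICY))
```

This sketch does not reject a binding that is a parent of a denied subtree (for example, binding `/workspace` itself); a production check would need to decide how to handle that case.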
16.10 Chapter Summary
- ACP positions OpenClaw as an orchestration layer, delegating code execution to specialized Harnesses
- Ten Harnesses are available, each suited to different task types and context needs
- `spawn` / `status` / `close` are ACP's three core commands
- Interactive vs. Background delivery depends on task duration and whether live intervention is needed
- ACP Harnesses are not subject to OpenClaw's sandbox — the key distinction from Sub-agents
- The 25–35% accuracy improvement comes primarily from dedicated context windows and autonomous repair loops
The next chapter dives into the Skills system's internals: the SKILL.md format, the lazy-loading mechanism, and how description becomes the model's activation trigger.