Chapter 35: Sub-agents — Non-Blocking Delegated Execution, Cross-Agent Memory Search, and Structured Communication
"Sub-agents transform the main Agent from 'doing everything itself' to 'project manager' โ assigning tasks, monitoring progress, integrating results." โ OpenClaw Sub-agents Design Document
35.1 The Core Design Philosophy of Sub-agents
In Chapter 34, we explored how to route different incoming messages to different Agents through Bindings. Sub-agents solve a different problem: what happens when an Agent, in the middle of processing a task, needs to delegate sub-tasks to other specialized Agents.
These are two different concurrency patterns:
```
Multi-Agent routing (Chapter 34):         Sub-agents (this chapter):

Message A ──→ Agent 1                     Main Agent receives request
Message B ──→ Agent 2                                │
Message C ──→ Agent 3                     Decompose into sub-tasks
                                                     │
(Different messages,                      ┌──────────┼──────────┐
 different Agents)                      Sub1       Sub2       Sub3
                                       Task A     Task B     Task C
                                          └──────────┼──────────┘
                                                     │
                                    Aggregate results, main Agent replies
```
35.2 sessions_spawn: Non-Blocking Mechanism in Detail
35.2.1 The Fundamental Difference Between Blocking and Non-Blocking
Blocking call (traditional approach):

```
Main Agent starts task
        ↓
Call sub-task, wait for completion
        ↓  (wait 10 minutes...)
Sub-task completes, continue to next step
        ↓
Call another sub-task, wait for completion
        ↓  (wait 8 minutes...)
...

Total time = sum of all sub-task times
```
Non-blocking call (sessions_spawn):

```
Main Agent starts task
        ↓
Spawn sub-task 1 → immediately returns runId1, continue executing
        ↓
Spawn sub-task 2 → immediately returns runId2, continue executing
        ↓
Spawn sub-task 3 → immediately returns runId3, continue executing
        ↓
(Sub-tasks 1, 2, 3 run in parallel)
        ↓
Wait for all sub-tasks to complete (configurable timeout)
        ↓
Aggregate results

Total time ≈ longest sub-task's time (not the sum of all)
```
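The timing difference can be simulated with a short asyncio sketch. `run_subtask` below is a stand-in for a spawned sub-Agent session, not OpenClaw's actual API; the point is only that gathering concurrent tasks costs roughly the longest single task:

```python
import asyncio
import time

async def run_subtask(name: str, duration: float) -> str:
    # Stand-in for one spawned sub-Agent run.
    await asyncio.sleep(duration)
    return f"{name}: done"

async def measure():
    # Blocking style: each sub-task waits for the previous one.
    start = time.monotonic()
    for name, d in [("sub1", 0.03), ("sub2", 0.02), ("sub3", 0.01)]:
        await run_subtask(name, d)
    serial = time.monotonic() - start

    # Non-blocking style: spawn everything, then wait once.
    start = time.monotonic()
    results = await asyncio.gather(
        run_subtask("sub1", 0.03),
        run_subtask("sub2", 0.02),
        run_subtask("sub3", 0.01),
    )
    parallel = time.monotonic() - start
    return serial, parallel, results

serial_t, parallel_t, results = asyncio.run(measure())
print(f"serial={serial_t:.3f}s parallel={parallel_t:.3f}s")
```

The serial run takes roughly the sum of the three sleeps; the gathered run takes roughly the longest one.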
35.2.2 The sessions_spawn Call Interface
In an Agent's tool calls, sessions_spawn is used as follows:
```json
// Tool call issued by the main Agent
{
  "tool": "sessions_spawn",
  "parameters": {
    "agentId": "research-agent",    // Agent ID to delegate to
    "task": "Search for academic papers on LLM security in the past 30 days, return summaries of the top 10",
    "context": {                    // Context passed to the sub-Agent
      "keywords": ["LLM security", "prompt injection", "jailbreak"],
      "dateRange": "2026-03-01 to 2026-04-01",
      "outputFormat": "markdown-list"
    },
    "timeout": 300,                 // Timeout in seconds (300 s = 5 minutes)
    "priority": "normal"            // "high" | "normal" | "low"
  }
}
```
Immediately returned result:
```json
{
  "runId": "sub-7f8a9b2c-4d3e-11ec-81d3-0242ac130003",
  "status": "spawned",
  "estimatedDuration": 240,
  "agentId": "research-agent",
  "createdAt": "2026-04-26T10:00:00Z"
}
```
After the main Agent receives the runId, it can continue handling other work; no waiting is needed.
35.2.3 Concurrent Spawn Example: Research Report Generation
Here's a complete example of a main Agent simultaneously spawning multiple sub-Agents to generate a research report in parallel:
Main Agent receives request:
"Generate a complete report on generative AI applications in healthcare,
covering four sections: technical status, regulatory environment,
market data, and case studies"
Main Agent processing logic (pseudocode):
```python
# The main Agent decomposes the task and concurrently spawns sub-Agents
tasks = [
    {
        "agentId": "tech-researcher",
        "task": "Analyze the technical status of generative AI in healthcare:\n"
                "1. Current major application scenarios (diagnostic assistance, drug discovery, imaging)\n"
                "2. Technical maturity of each scenario\n"
                "3. Major technical challenges and breakthroughs",
    },
    {
        "agentId": "legal-researcher",
        "task": "Analyze the global regulatory environment for generative AI in healthcare:\n"
                "1. FDA, EMA, NMPA AI medical device regulatory frameworks\n"
                "2. Significant regulatory changes in 2025-2026\n"
                "3. Impact of compliance requirements on AI development",
    },
    {
        "agentId": "market-analyst",
        "task": "Collect generative AI healthcare market data including:\n"
                "1. Market size (2024-2026)\n"
                "2. Major players and market share\n"
                "3. Investment and funding trends",
    },
    {
        "agentId": "case-researcher",
        "task": "Collect 3-5 real-world cases of generative AI healthcare applications, each including:\n"
                "company name, application scenario, technical approach, outcome data, lessons learned",
    },
]

# Concurrent spawn
run_ids = []
for task in tasks:
    result = await sessions_spawn(
        agentId=task["agentId"],
        task=task["task"],
        timeout=300,
    )
    run_ids.append(result["runId"])

# The main Agent can now do other work (e.g., inform the user of progress)
await send_message("Researching four dimensions in parallel, estimated completion in 5 minutes...")

# Wait for all sub-tasks to complete (timeout set to 360 seconds)
results = await wait_for_runs(run_ids, timeout=360)

# Aggregate results
final_report = synthesize_report(results)
await send_message(final_report)
```
Time efficiency comparison:
```
Serial execution (blocking):
  Tech research:       240 seconds
  Regulatory analysis: 180 seconds
  Market data:         150 seconds
  Case research:       200 seconds
  Total:               770 seconds (~13 minutes)

Parallel execution (sessions_spawn):
  Four tasks running in parallel
  Total:               ~240 seconds (longest task)

Efficiency improvement: ~3.2x
```
35.3 Sub-Agent Isolated Sessions and Restricted Tool Access
35.3.1 Sub-Agent Session Isolation
Each sub-Agent instance created through sessions_spawn has its own independent Session:
```
Main Agent Session (session-main-abc123)
├── Conversation history
├── Tool call records
└── Current task state

Sub-Agent Session (session-sub-def456)   ← independent Session
├── task and context passed at spawn time
├── Sub-Agent's execution history
└── Sub-task results

Sub-Agent Session (session-sub-ghi789)   ← another independent Session
├── Different task and context
└── Independent execution history
```
Sub-Agents cannot access the main Agent's conversation history; they can only see the task and context passed in at sessions_spawn time.
35.3.2 Sub-Agent Tool Permission Inheritance
Sub-Agents by default inherit the main Agent's tool permissions, but can be further restricted (cannot be expanded):
```json
// Main Agent configuration
{
  "tools": {
    "allow": ["read", "write", "browser.search", "python"]
  }
}
```

```json
// Restrict tool permissions when spawning the sub-Agent
{
  "tool": "sessions_spawn",
  "parameters": {
    "agentId": "research-agent",
    "task": "...",
    "toolRestrictions": {
      "allow": ["browser.search"],           // Only allow browser.search
      "deny": ["read", "write", "python"]    // Explicitly deny the other tools
    }
  }
}
```
The significance of this design: even if a sub-Agent is infected by a malicious Skill, the damage it can cause is limited to its tool permission scope.
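The "restrict, never expand" rule amounts to a set intersection. The sketch below is an illustrative model of that rule, not OpenClaw's implementation; `effective_tools` is a hypothetical helper name:

```python
def effective_tools(parent_allow: set[str], requested_allow: set[str]) -> set[str]:
    # The sub-Agent's tool set is the intersection of what the parent has
    # and what the spawn call requests: permissions can narrow, never widen.
    return parent_allow & requested_allow

parent = {"read", "write", "browser.search", "python"}

# Requesting a tool the parent lacks ("shell") has no effect:
print(effective_tools(parent, {"browser.search", "shell"}))  # → {'browser.search'}
```

This is why a compromised sub-Agent cannot grant itself tools the main Agent never had: the intersection can only shrink the set.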
35.4 Automatic Result Announcement Mechanism
35.4.1 Notification Flow When Sub-Agent Completes
When a sub-Agent completes a task, results are automatically passed to the main Agent through the following mechanism:
```
Sub-Agent completes task
        ↓
Write results to the shared Result Buffer
        ↓
Notify the main Agent (event push)
        ↓
Main Agent reads the result from the Result Buffer
        ↓
Continue subsequent processing
```
35.4.2 How Main Agent Receives Results
The main Agent can receive sub-Agent results in two ways:
Method 1: Poll-wait (recommended for known number of sub-tasks)
```json
{
  "tool": "sessions_wait",
  "parameters": {
    "runIds": ["sub-7f8a9b2c", "sub-8a9b3c4d", "sub-9b0c4d5e"],
    "waitMode": "all",              // "all" = wait for all, "any" = wait for any one
    "timeout": 360,
    "onTimeout": "return-partial"   // "fail" | "return-partial"
  }
}
```
Method 2: Event-driven (recommended for unknown number of sub-tasks)
```json
{
  "tool": "sessions_on_complete",
  "parameters": {
    "runId": "sub-7f8a9b2c",
    "callback": {
      "action": "notify_main",
      "message": "research-task-1-done"
    }
  }
}
```
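The two wait modes map naturally onto asyncio primitives. The following is a toy model of the assumed semantics, not the real `sessions_wait`; `wait_for_runs` here is a local helper of our own:

```python
import asyncio

async def wait_for_runs(tasks, wait_mode="all", timeout=None):
    # "all" gathers every run; "any" returns as soon as one completes.
    # On timeout, whatever completed is returned (return-partial behaviour).
    return_when = (asyncio.FIRST_COMPLETED if wait_mode == "any"
                   else asyncio.ALL_COMPLETED)
    done, pending = await asyncio.wait(tasks, timeout=timeout,
                                       return_when=return_when)
    for p in pending:          # abandon runs that did not finish in time
        p.cancel()
    return [t.result() for t in done]

async def demo():
    async def run(name, duration):
        await asyncio.sleep(duration)
        return f"{name} done"
    tasks = [asyncio.create_task(run(f"run{i}", 0.02 * i)) for i in (1, 2, 3)]
    return await wait_for_runs(tasks, wait_mode="any")

first = asyncio.run(demo())
print(first)  # → ['run1 done']
```

With `wait_mode="any"` only the fastest run is returned; with `"all"` and a timeout, the returned list is exactly the partial-result set.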
35.5 Structured Agent-to-Agent Communication (New in 2026.2.17)
The OpenClaw version released on February 17, 2026 introduced deterministic sub-Agent generation and structured inter-Agent communication.
35.5.1 Deterministic Sub-Agent Generation
Before this version, sessions_spawn created a new sub-Agent instance on every call with no fixed Agent ID. Version 2026.2.17 introduced deterministic agentId binding:
```json
{
  "tool": "sessions_spawn",
  "parameters": {
    "agentId": "research-agent",
    "deterministicId": "report-2026-q1-research",  // New: deterministic ID
    "task": "...",
    "deduplication": true   // If a same-ID task is already running, don't create a duplicate
  }
}
```
This solves the idempotency problem: if the main Agent retries due to a network issue, the same task won't be executed twice.
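The idempotency guarantee can be sketched as a registry keyed by `deterministicId`. This is an illustrative model of the behaviour described above; `SpawnRegistry` is a hypothetical class, not OpenClaw code:

```python
import uuid

class SpawnRegistry:
    """Deduplicate spawns by deterministic ID (illustrative sketch)."""

    def __init__(self) -> None:
        self._active: dict[str, str] = {}  # deterministicId -> runId

    def spawn(self, deterministic_id: str, deduplication: bool = True) -> str:
        if deduplication and deterministic_id in self._active:
            # A retry (e.g., after a network hiccup) gets the existing run.
            return self._active[deterministic_id]
        run_id = f"sub-{uuid.uuid4().hex[:8]}"
        self._active[deterministic_id] = run_id
        return run_id

reg = SpawnRegistry()
first = reg.spawn("report-2026-q1-research")
retry = reg.spawn("report-2026-q1-research")  # network retry
assert first == retry  # the same task is not executed twice
```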
35.5.2 Structured Message Passing Protocol
The new version defines a structured message format for inter-Agent communication:
```json
// Inter-Agent message format (IPC Message)
{
  "messageId": "msg-a1b2c3d4",
  "from": {
    "agentId": "research-agent",
    "runId": "sub-7f8a9b2c"
  },
  "to": {
    "agentId": "main",
    "sessionId": "session-main-abc123"
  },
  "type": "task_result",   // "task_result" | "progress" | "error" | "query"
  "payload": {
    "status": "completed",
    "data": {
      "summary": "Found 47 relevant papers...",
      "papers": [...]
    },
    "metadata": {
      "duration": 218,
      "toolCallCount": 12,
      "tokensUsed": 8432
    }
  },
  "timestamp": "2026-04-26T10:04:58Z"
}
```
35.5.3 Bidirectional Communication: Sub-Agent Querying Main Agent
The new feature allows sub-Agents to send queries to the main Agent during execution, requesting clarification or additional information:
```json
// Query message sent from a sub-Agent to the main Agent
{
  "type": "query",
  "payload": {
    "question": "When analyzing the regulatory framework, do you want me to focus on a specific region (US/EU/China)?",
    "options": ["US only", "EU only", "China only", "Global (takes longer)"],
    "timeout": 30,   // If no reply within 30 seconds, use the default
    "default": "Global (takes longer)"
  }
}
```
The main Agent can choose to:
- Reply to the query immediately
- Wait for user input before replying
- Let the timeout trigger the default value
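The timeout-with-default behaviour maps directly onto `asyncio.wait_for`. A minimal sketch under that assumption; `ask_main_agent` and the demo names are ours, not OpenClaw APIs:

```python
import asyncio

async def ask_main_agent(reply_future, timeout: float, default: str) -> str:
    # Wait for the main Agent's reply; fall back to the query's declared
    # default when the timeout elapses.
    try:
        return await asyncio.wait_for(reply_future, timeout=timeout)
    except asyncio.TimeoutError:
        return default

async def demo():
    loop = asyncio.get_running_loop()
    reply = loop.create_future()   # the main Agent never replies in this demo
    return await ask_main_agent(reply, timeout=0.05,
                                default="Global (takes longer)")

choice = asyncio.run(demo())
print(choice)  # → Global (takes longer)
```

If the main Agent (or the user behind it) resolves the future before the timeout, that answer is returned instead of the default.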
35.6 Cross-Agent Memory Search Configuration
35.6.1 memorySearch.qmd.extraCollections Configuration
Sub-Agents by default can only access their own memory collections. By configuring memorySearch.qmd.extraCollections, sub-Agents can reach beyond their own boundaries when executing memory searches, accessing other Agents' memories:
```json
// In the sub-Agent's configuration (or passed in at spawn time)
{
  "agents": {
    "list": [
      {
        "id": "research-agent",
        "memorySearch": {
          "qmd": {
            "enabled": true,
            "extraCollections": [
              {
                "agentId": "main",
                "collections": ["user-profile", "research-history"],
                "permission": "read-only"
              },
              {
                "agentId": "work",
                "collections": ["business-knowledge"],
                "permission": "read-only"
              }
            ]
          }
        }
      }
    ]
  }
}
```
35.6.2 Practical Applications of Cross-Agent Memory Search
Scenario: the user previously told the main Agent, "When researching healthcare AI, I'm particularly focused on the Chinese market," and the main Agent stored this preference in memory. When the main Agent later spawns a research sub-Agent, the sub-Agent accesses the main Agent's memory via extraCollections, discovers this preference, and automatically focuses on Chinese market data during its research. The user doesn't need to repeat the preference: cross-Agent memory sharing passes this context automatically.
35.6.3 Balancing Memory Isolation and Sharing
Memory Access Matrix (with Sub-agents); each row shows what that Agent can do with the memory owned by the Agent in each column:

| (row accesses column's memory) | Main Agent | Sub-Research | Sub-Legal | Sub-Market |
|---|---|---|---|---|
| Main Agent | R/W | R/W(*) | R/W(*) | R/W(*) |
| Sub-Research | R (config) | R/W | - | - |
| Sub-Legal | R (config) | - | R/W | - |
| Sub-Market | R (config) | - | - | R/W |

R/W = read/write, R = read-only (via explicit config), - = no access
(*) = the main Agent writes the sub-task results it chooses to keep
Principle: Sub-Agents can read the main Agent's memory (via explicit configuration) but cannot write to it. Sub-Agent task results are written to memory only if the main Agent decides to retain them.
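The matrix can be encoded directly as a lookup table. A sketch using a few of the rows above; the agent names and helper functions are illustrative, not OpenClaw APIs:

```python
# Access matrix: (reader, memory owner) -> permission ("rw", "r", or None)
MEMORY_ACCESS = {
    ("main", "main"): "rw",
    ("sub-research", "sub-research"): "rw",
    ("sub-research", "main"): "r",        # read-only, via explicit extraCollections config
    ("sub-research", "sub-legal"): None,  # sibling sub-Agents cannot see each other
}

def can_read(reader: str, owner: str) -> bool:
    return MEMORY_ACCESS.get((reader, owner)) in ("r", "rw")

def can_write(reader: str, owner: str) -> bool:
    return MEMORY_ACCESS.get((reader, owner)) == "rw"

assert can_read("sub-research", "main")        # configured read access
assert not can_write("sub-research", "main")   # but never write access
assert not can_read("sub-research", "sub-legal")
```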
35.7 Sub-agents vs ACP Harness vs Native Plugin: Three-Way Comparison
When you need to extend OpenClaw's capabilities, three different mechanisms are available. Understanding their differences is crucial for making correct architectural decisions.
35.7.1 Three-Way Comparison Table
| Dimension | Sub-agents | ACP Harness | Native Plugin |
|---|---|---|---|
| Execution location | Inside OpenClaw | External Runtime | Inside OpenClaw process |
| Sandbox constraints | Subject to OpenClaw sandbox | Not subject to OpenClaw sandbox | Partial constraints |
| Tool access | Limited by main Agent's tool permissions | Independent tool access | Direct Pi framework access |
| Memory access | Configurable cross-Agent memory read | Independent memory system | Direct memory API access |
| Performance overhead | Low (in-process communication) | High (cross-process/network) | Minimal (direct call) |
| Development complexity | Low (JSON configuration) | High (implement ACP protocol) | High (requires Pi framework knowledge) |
| Use cases | Task decomposition / Parallel processing | Complex external integration | Core capability extension |
| Security boundary | OpenClaw security boundary | External security boundary | Minimal security boundary |
35.7.2 Detailed Analysis
Sub-agents:
Best for:
+ Parallel execution of multiple related sub-tasks
+ Expert specialization (different Agents focusing on different domains)
+ Task isolation (prevent sub-task pollution of main Agent context)
+ Progressive result announcement
Not suitable for:
- Deep integration requiring access to external systems (databases, message queues)
- Tasks requiring persistent processes (background services)
- Operations needing permissions beyond OpenClaw's sandbox
ACP Harness (External Harness Runtime):
Best for:
+ Deep integration with enterprise internal systems (ERP, CRM, databases)
+ Operations that OpenClaw's sandbox doesn't permit (e.g., direct filesystem access)
+ Multi-framework usage (simultaneously using OpenClaw and other AI frameworks)
+ Building independent AI services that collaborate with OpenClaw via ACP protocol
Not suitable for:
- Simple task decomposition (Sub-agents suffice)
- Latency-sensitive scenarios (cross-process overhead is significant)
Native Plugin (deepest embedding):
Best for:
+ Extending OpenClaw's core capabilities (new memory backends, new channel adapters)
+ Performance-sensitive features (requiring direct memory access)
+ Deep customization of OpenClaw behavior (modifying Pi framework core logic)
Not suitable for:
- Most business logic extensions (Sub-agents more appropriate)
- Teams without OpenClaw internal development experience
35.7.3 Selection Decision Tree
```
Need to extend OpenClaw's capabilities?
        ↓
Need full OS permissions or deep external-system integration?
├── Yes → consider ACP Harness
└── No
        ↓
Need to modify OpenClaw core behavior or push past performance limits?
├── Yes → consider Native Plugin
└── No  → use Sub-agents
```
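The decision tree reduces to two boolean questions checked in order. A direct encoding (the function name is ours):

```python
def choose_extension_mechanism(needs_os_or_external_integration: bool,
                               needs_core_changes: bool) -> str:
    # Mirrors the decision tree: external/OS needs trump everything,
    # then core-behavior changes, then the default of Sub-agents.
    if needs_os_or_external_integration:
        return "ACP Harness"
    if needs_core_changes:
        return "Native Plugin"
    return "Sub-agents"

print(choose_extension_mechanism(False, False))  # → Sub-agents
print(choose_extension_mechanism(True, False))   # → ACP Harness
print(choose_extension_mechanism(False, True))   # → Native Plugin
```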
35.8 Typical Multi-Agent Architecture: Main Agent Scheduling + Sub-Agent Parallel Processing
35.8.1 Architecture Diagram
```
            User / External Channel
                      │
            Main Coordinator Agent
            ├── Receive user request
            ├── Decompose task
            ├── Spawn sub-Agents
            ├── Monitor progress
            ├── Aggregate results
            └── Reply to user
                      │
        ┌───────┬─────┴─────┬───────┐
      Sub-A   Sub-B       Sub-C   Sub-D      ← execute in parallel, unaware of each other
   (Research)(Analysis) (Writing)(Review)
        └───────┴─────┬─────┴───────┘
                      │
        Result aggregation (main Agent)
                      │
           Final report → User
```
35.8.2 Complete YAML Task Configuration Example
```yaml
# Task: Generate a competitive analysis report
name: competitive-analysis-workflow
version: "1.0"

main_agent:
  id: coordinator
  model: claude-sonnet-4

workflow:
  - step: intake
    agent: coordinator
    action: |
      1. Parse the competitor list provided by the user
      2. Confirm analysis dimensions (features/pricing/market/technology)
      3. Estimate completion time

  - step: parallel_research
    parallel: true
    sub_tasks:
      - agent: research-agent
        task: "Deep search public information on Competitor A: website/blog/GitHub/financials"
        timeout: 240
      - agent: research-agent
        task: "Deep search public information on Competitor B: website/blog/GitHub/financials"
        timeout: 240
      - agent: market-agent
        task: "Collect market data for Competitors A and B: downloads/users/funding"
        timeout: 180

  - step: synthesis
    agent: coordinator
    depends_on: parallel_research
    action: |
      1. Aggregate all sub-Agent research results
      2. Identify key differences and commonalities
      3. Generate a structured comparison report

  - step: quality_check
    agent: critic-agent
    depends_on: synthesis
    action: |
      Review the report for accuracy, objectivity, and completeness
      Return revision suggestions

  - step: final_output
    agent: coordinator
    depends_on: quality_check
    action: |
      Revise the report based on review feedback
      Format as Markdown/PDF
      Reply to the user
```
35.9 Sub-agents Concurrency Limit and Task Design
35.9.1 Concurrency Limit: 8
OpenClaw's Sub-agents support a maximum of 8 concurrent sub-Agent instances: the most a single main Agent can have running at once.
Design considerations for this limit:
| Factor | Explanation |
|---|---|
| LLM API concurrency | Most LLM providers have rate limits on concurrent requests |
| Memory consumption | Each sub-Agent instance maintains independent context; memory overhead grows linearly |
| Result aggregation complexity | With more than 8 parallel results, the main Agent's aggregation quality degrades |
| Task decomposition granularity | 8 is typically sufficient for effective task decomposition |
35.9.2 Strategies for More Than 8 Tasks
```python
# Strategy: batching
async def process_large_task_list(tasks):
    BATCH_SIZE = 8  # OpenClaw's concurrent sub-Agent limit
    all_results = []
    for i in range(0, len(tasks), BATCH_SIZE):
        batch = tasks[i:i + BATCH_SIZE]

        # Spawn the current batch
        run_ids = [await sessions_spawn(task) for task in batch]

        # Wait for the current batch to complete
        batch_results = await sessions_wait(run_ids, timeout=300)
        all_results.extend(batch_results)

        # Optional: report progress to the user
        progress = min(i + BATCH_SIZE, len(tasks))
        await report_progress(f"Completed {progress}/{len(tasks)} sub-tasks")

    return all_results
```
35.9.3 Task Design Principles
Good task design:
+ Sub-tasks have no dependency on each other (can genuinely run in parallel)
+ Each sub-task has clear input and output format
+ Sub-tasks have similar expected run times (avoid bottleneck effect)
+ Sub-task outputs can be directly integrated by the main Agent
Poor task design:
- Sub-task B depends on sub-task A's output (should be serial, not parallel)
- Sub-task A takes 10 seconds, sub-task B takes 300 seconds (wasted parallel efficiency)
- Sub-task output formats are inconsistent (main Agent integration is difficult)
- Too many sub-Agents spawned for similar tasks (exceeds actual necessity)
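The "similar expected run times" principle can be quantified: under ideal parallelism, speedup is the serial total divided by the slowest sub-task. A quick check with a hypothetical helper, using the durations from the research-report example earlier:

```python
def parallel_speedup(durations: list[float]) -> float:
    # Ideal speedup when all sub-tasks run fully in parallel:
    # serial total / bottleneck (the longest single task).
    return sum(durations) / max(durations)

# Balanced tasks parallelise well (matches the ~3.2x figure above):
print(round(parallel_speedup([240, 180, 150, 200]), 2))  # → 3.21

# A single slow task wastes the parallelism (bottleneck effect):
print(round(parallel_speedup([10, 300]), 2))  # → 1.03
```

This is why pairing a 10-second task with a 300-second task in the same batch buys almost nothing: the batch is only as fast as its slowest member.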
35.10 Error Handling and Timeout Configuration
35.10.1 Sub-Agent Error Types
Error types:

1. timeout – the sub-task did not complete within the specified time
2. tool_error – a tool called by the sub-Agent returned an error
3. llm_error – the LLM API call failed (rate limit / network)
4. context_overflow – the sub-task content exceeds the model's context window
5. spawn_limit – attempted to spawn a 9th sub-Agent
35.10.2 Complete Error Handling Configuration
```json
{
  "tool": "sessions_spawn",
  "parameters": {
    "agentId": "research-agent",
    "task": "...",
    "timeout": 300,
    "retryPolicy": {
      "maxRetries": 2,
      "retryOn": ["llm_error", "tool_error"],
      "backoffSeconds": 5
    },
    "onError": {
      "action": "return-partial",         // "fail" | "return-partial" | "use-fallback"
      "fallback": {
        "agentId": "simple-search-agent", // Retry with a simpler Agent on failure
        "task": "..."
      }
    },
    "onTimeout": {
      "action": "return-partial",         // Return partial results on timeout
      "includePartialResult": true
    }
  }
}
```
35.10.3 Main Agent Fault-Tolerant Handling Pattern
```python
# Robust task-handling pattern for the main Agent
async def robust_parallel_task(sub_tasks):
    run_ids = []
    for task in sub_tasks:
        run_id = await sessions_spawn(task, timeout=300)
        run_ids.append(run_id)

    # Wait for results; continue on timeout (don't block the entire workflow)
    results = await sessions_wait(
        run_ids,
        timeout=360,
        on_timeout="return-partial",
    )

    successful = [r for r in results if r["status"] == "completed"]
    failed = [r for r in results if r["status"] != "completed"]

    if len(successful) < len(sub_tasks) * 0.5:
        # More than half failed: treat the overall task as failed
        return {"status": "failed", "reason": "Too many sub-tasks failed"}

    # Aggregate successful results; note the failed sub-tasks in the report
    final = synthesize(
        successful_results=successful,
        failed_tasks=failed,
        fallback_note="Some data could not be collected due to timeout; report may be incomplete",
    )
    return final
```
35.11 Sub-agents Debugging and Monitoring
35.11.1 Viewing Sub-Agent Run Status
```bash
# View all active sub-Agent runs
openclaw subagents list --active

# Output:
# RunID                        Agent           Status     Duration  Progress
# ───────────────────────────────────────────────────────────────────────────
# sub-7f8a9b2c-4d3e-11ec-81d3  research-agent  running    1m 23s    ~60%
# sub-8a9b3c4d-5e4f-22fd-92e4  legal-agent     running    1m 23s    ~40%
# sub-9b0c4d5e-6f5g-33ge-a3f5  market-agent    completed  58s       100%

# View execution logs for a specific sub-Agent
openclaw subagents logs sub-7f8a9b2c-4d3e-11ec-81d3

# Manually cancel a sub-Agent
openclaw subagents cancel sub-7f8a9b2c-4d3e-11ec-81d3
```
35.11.2 Sub-agents Performance Analysis
```bash
# Generate a task-execution performance report
openclaw subagents report --session session-main-abc123

# Output:
# Sub-agent Performance Report
# ───────────────────────────────────────────
# Session: session-main-abc123
# Main Agent: coordinator
#
# Sub-task Summary:
#   Total tasks: 4
#   Completed: 3
#   Timed out: 1
#   Total elapsed: 248s
#   Serial equivalent: 712s
#   Speedup factor: 2.87x
#
# Token Usage:
#   Main agent: 3,241 tokens
#   Sub-agents: 22,847 tokens (avg 7,615/agent)
#   Total: 26,088 tokens
#   Estimated cost: $0.42
#
# Recommendations:
#   - sub-9b0c4d5e timed out: consider increasing timeout to 360s
#   - sub-market-agent ran in 58s (fastest): good task design
```
35.12 Summary
Sub-agents are the OpenClaw architectural feature that most embodies the concept of "Agent as work coordinator." Through sessions_spawn's non-blocking mechanism, the main Agent transforms from "doing everything one by one" to "intelligently assigning and coordinating the parallel work of multiple specialists."
Core points summarized:

- Non-blocking is key: `sessions_spawn` immediately returns a `runId`; the main Agent doesn't wait and can manage multiple parallel tasks simultaneously
- Isolation is the security guarantee: each sub-Agent has an independent Session; tool permissions can only be inherited and restricted, never expanded, effectively preventing privilege escalation
- Structured communication (2026.2.17): deterministic Agent IDs plus a standardized IPC message format make complex inter-Agent collaboration possible
- Cross-Agent memory sharing: via the `extraCollections` configuration, sub-Agents can read the main Agent's contextual knowledge
- Concurrency limit of 8: design tasks accordingly; more than 8 parallel tasks must be processed in batches
- Error handling must be robust: configure `retryPolicy`, `onTimeout`, and `fallback` to prevent a single sub-task failure from causing the entire task to collapse
In the book's final chapters, we will synthesize content from all 35 chapters to explore how to design a complete, production-ready OpenClaw multi-Agent system architecture.
Chapter keywords: Sub-agents, sessions_spawn, non-blocking execution, runId, isolated Session, structured communication, cross-Agent memory, ACP Harness, Native Plugin, concurrency limit, error handling