
FlowSwarm — Swarm Coding Framework for OpenClaw

by windseeker1111 · GitHub ↗ · v2.1.1 · MIT-0
cross-platform · ⚠ suspicious · 117 downloads · 0 stars · 0 active installs · 2 versions
Install in OpenClaw
/install flow-swarm
Description
Multi-agent swarm orchestration via RuFlo + Claude Code. Turns single coding sessions into coordinated agent teams (architect/coder/tester/reviewer). Proven...
README (SKILL.md)

FlowSwarm v2.1

Multi-agent swarm orchestration for Claude Code via RuFlo. One prompt, coordinated agents, production results.

What Changed in v2.1

Critical fix: MCP tools were disabled (autoStart: false). This prevented Claude Code from calling mcp__claude-flow__swarm_init, memory_store, agent_spawn, etc. during every swarm run we'd done so far. Fixed.

Also fixed: --print mode does not auto-discover .mcp.json. You must pass --mcp-config .mcp.json explicitly.

| Change | Why |
| --- | --- |
| autoStart: true in .mcp.json | Was false; all 150+ MCP tools were disabled in every prior run |
| --mcp-config .mcp.json flag added to exec pattern | --print doesn't auto-load project MCP config |
| MCP tool reference table added | 150+ tools now documented: swarm_init, agent_spawn, memory_store, etc. |
| Prompt templates updated to call MCP tools | Without explicit instructions, Claude may not use them |
| Setup script auto-fixes autoStart | ruflo init defaults to false; setup script corrects it |

What Changed in v2.0

v1.0 was theory. v2.0 is battle-tested across 5 production runs (355 tests, 5/5 green, 4/5 zero-iteration).

| Change | Why |
| --- | --- |
| Tiered task routing replaces one-size-fits-all | Pure-data modules don't need GenServer test isolation advice |
| Target selection protocol added | Picking the RIGHT module matters more than swarm config |
| Pre-flight context injection | Feeding the swarm grep output of public functions = dramatically better coverage |
| Daemon reality check | Workers time out or fail often (20-50% success); swarm value comes from prompt orchestration, not daemon workers |
| Removed WASM Booster claims | Never observed in practice; hooks + prompt patterns drive all real value |
| Real performance data | Actual timing, test counts, iteration rates from production runs |

Architecture

Two layers working together:

Layer 1: MCP Tools (150+ tools via @claude-flow/cli)

When autoStart: true is set in .mcp.json, Claude Code gets access to real coordination tools:

  • swarm_init — creates swarm with topology, persists to .claude-flow/swarm/swarm-state.json
  • agent_spawn — registers agents with model routing (haiku/sonnet/opus/inherit)
  • memory_store / memory_search — sql.js + HNSW vector embeddings for semantic recall
  • task_create / task_complete — tracks task state with assignment
  • session_save / session_restore — persists session state between runs
  • claims_claim / claims_release — prevents agents from editing same files
  • coordination_consensus — multi-agent agreement on decisions

Layer 2: Prompt Orchestration (our FlowSwarm patterns)

The SWARM MODE prefix causes Claude Code to think in roles (architect/coder/reviewer). Combined with pre-flight context injection (grepping public APIs), this produced zero-iteration success on 4 of 5 production runs (80%).

Both layers matter. Releases before v2.1 ran on Layer 2 only (autoStart was false, so the MCP tools never loaded). v2.1 enables both.

OpenClaw → exec (background) → Claude Code
                                    ↓
                    MCP Server starts (autoStart: true)
                    150+ tools available via @claude-flow/cli
                                    ↓
                    SWARM MODE prompt → swarm_init tool called
                    agent_spawn × N → task_create → execute
                                    ↓
                    memory_store (findings) → task_complete
                                    ↓
                              Output + persisted state

Prerequisites

ruflo --version    # 3.5.x+
claude --version   # Claude Code CLI

Setup (One-Time Per Machine)

# Full setup: install RuFlo + register MCP + init project
./scripts/setup-flow-swarm.sh /path/to/project

# Verify
./scripts/setup-flow-swarm.sh --verify /path/to/project

Or manually:

npm install -g ruflo@latest
claude mcp add ruflo -- npx -y ruflo@latest mcp start
cd /path/to/project && ruflo init && ruflo memory init && ruflo daemon start

CRITICAL: Enable MCP Server

After ruflo init, the .mcp.json file defaults to autoStart: false. This disables ALL 150+ MCP tools during Claude Code sessions. Fix it:

# Check current state
python3 -c "import json; d=json.load(open('.mcp.json')); print('autoStart:', d['mcpServers']['claude-flow'].get('autoStart'))"

# Enable (REQUIRED for full swarm functionality)
python3 -c "
import json
with open('.mcp.json') as f: d = json.load(f)
d['mcpServers']['claude-flow']['autoStart'] = True
with open('.mcp.json', 'w') as f: json.dump(d, f, indent=2)
print('MCP autoStart enabled')
"

Without this, Claude Code runs without swarm tools. The prompt patterns still work (v1.0 proved this), but you lose: persistent swarm state, agent memory, task tracking, session persistence, and inter-agent coordination.
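The enable step above can also be written as a small idempotent helper that reports whether it changed anything. This is a minimal sketch, assuming the same .mcp.json layout used above (mcpServers → claude-flow → autoStart); the function name ensure_autostart is hypothetical and not part of RuFlo.

```python
import json
from pathlib import Path


def ensure_autostart(config_path, server="claude-flow"):
    """Set autoStart to True for one MCP server entry.

    Returns True if the file was changed, False if it was already enabled.
    Raises KeyError if the server entry is absent, rather than creating one.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    entry = config["mcpServers"][server]  # layout assumed from the snippet above
    if entry.get("autoStart") is True:
        return False
    entry["autoStart"] = True
    path.write_text(json.dumps(config, indent=2) + "\n")
    return True
```

Calling it twice in a row should return True and then False, which makes it safe to run from a setup script on every invocation.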

The FlowSwarm Protocol (3 Steps)

Step 1: Select Your Target

This is the highest-leverage decision. Pick wrong and you waste a swarm run.

Best targets (in order):

  1. Large modules with zero tests — highest ROI, swarm excels here
  2. Pure data/logic modules — no IO mocking needed, near-100% first-pass success
  3. Modules with thin test coverage — swarm fills gaps the original author skipped
  4. Feature builds with clear specs — architect/coder/reviewer shines on greenfield

Find targets fast:

# List untested modules by size (biggest = best target)
# Note: ** needs recursive globbing (bash 4+: shopt -s globstar; zsh: on by default)
for f in lib/**/*.ex; do
  base=$(basename "$f" .ex)
  count=$(find test/ -name "${base}_test.exs" 2>/dev/null | wc -l | tr -d ' ')
  [ "$count" = "0" ] && echo "$(wc -l < "$f")L $f"
done | sort -rn | head -10
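The shell loop above relies on recursive ** globbing, which plain bash only supports with globstar enabled. A portable alternative is sketched below in Python, assuming the same layout (sources under lib/, tests named <module>_test.exs under test/). The function name untested_modules is hypothetical.

```python
from pathlib import Path


def untested_modules(project_root, limit=10):
    """Return (line_count, path) pairs for lib/ modules with no matching test file."""
    root = Path(project_root)
    # Collect every test file name once, e.g. "foo_test.exs".
    test_names = {p.name for p in (root / "test").rglob("*_test.exs")}
    results = []
    for source in (root / "lib").rglob("*.ex"):
        if f"{source.stem}_test.exs" in test_names:
            continue
        with source.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        results.append((line_count, str(source.relative_to(root))))
    # Largest untested modules first.
    return sorted(results, reverse=True)[:limit]
```

Like the shell version, this matches tests by file name only, so a module tested under a differently named file will still be listed.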

Step 2: Build the Prompt (Context-Rich)

The secret sauce: feed the swarm a pre-flight scan of the module. Don't just say "test this file" — tell it exactly what functions exist, what patterns the project uses, what edge cases matter.

# Pre-flight: scan public API
grep -n "^  def " lib/your_module.ex
# Pre-flight: check existing test patterns
head -30 test/some_existing_test.exs

Then build the prompt with that intel baked in.
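One way to bake the pre-flight scan into the prompt is to generate both from the same source text. The sketch below mirrors the grep pattern above ("^  def ") and fills only the TASK and Public API parts of a prompt. The helper names public_api and build_prompt are hypothetical, and the prompt wording is abbreviated from the templates in this README.

```python
import re

PUBLIC_DEF = re.compile(r"^  def (\w+[?!]?)")  # same pattern as grep "^  def "


def public_api(source_text):
    """Return 'line_number:signature line' entries for public functions."""
    entries = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if PUBLIC_DEF.match(line):
            entries.append(f"{number}:{line.rstrip()}")
    return entries


def build_prompt(module_path, source_text, description):
    """Fill the TASK and Public API parts of a test-generation prompt."""
    line_count = len(source_text.splitlines())
    api = "\n".join(public_api(source_text))
    return (
        "SWARM MODE: Initialize hierarchical swarm with MCP tools.\n\n"
        f"TASK: Write comprehensive ExUnit tests for {module_path} "
        f"({line_count} lines, {description}).\n\n"
        f"Public API:\n{api}\n"
    )
```

Private functions (defp) and nested definitions with deeper indentation are skipped, matching what the grep command would return.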

Step 3: Launch and Verify

# Launch (ALWAYS background, NEVER nohup)
exec(
  command='cd /project && claude --permission-mode bypassPermissions --mcp-config .mcp.json --print "SWARM MODE: ... TASK: ..."',
  background=True,
  timeout=300
)

# Poll for completion
process(action="poll", sessionId="xxx", timeout=120000)

# Verify the output actually compiles/passes
mix test test/path/to/new_test.exs

Critical exec rules:

  • --print buffers ALL output until exit. Use background: true + poll.
  • NEVER use nohup — Node.js stdout capture breaks silently (empty files).
  • Timeout 300s minimum for complex tasks. Simple test gen: 60-120s.
  • Always run mix test (or equivalent) on swarm output before committing.
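The launch command in Step 3 wraps the prompt in double quotes, so a prompt that itself contains quotes or newlines can break the shell command. A minimal sketch of building that command string safely is shown below. It uses only the flags that appear in this README; the function name build_claude_command is hypothetical.

```python
import shlex


def build_claude_command(project_dir, prompt, mcp_config=".mcp.json"):
    """Build the shell command string for a --print swarm run.

    Only flags that appear in this README are used. shlex.quote keeps
    quotes and newlines inside the prompt from breaking the command.
    """
    args = [
        "claude",
        "--permission-mode", "bypassPermissions",
        "--mcp-config", mcp_config,
        "--print", prompt,
    ]
    quoted = " ".join(shlex.quote(arg) for arg in args)
    return f"cd {shlex.quote(project_dir)} && {quoted}"
```

The returned string can be passed as the command argument of the exec call shown in Step 3.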

Prompt Templates (Battle-Tested)

Test Generation — Pure Data Module

Proven: 147/147, 66/66, 58/58, 41/41 zero-iteration

Best for: static catalogs, type definitions, translation modules, config builders.

SWARM MODE: Initialize hierarchical swarm with MCP tools.

COORDINATION:
1. Call swarm_init with topology "hierarchical", maxAgents 4, strategy "specialized"
2. Call agent_spawn for: architect (analyze module), coder (write tests), reviewer (verify)
3. Call task_create for the test generation task
4. After completion: call memory_store with key findings and task_complete

TASK: Write comprehensive ExUnit tests for [MODULE_PATH] ([LINE_COUNT] lines, [DESCRIPTION]).

Public API:
[PASTE grep -n "^  def " output here]

Key data to validate:
- [List specific assertions: required struct keys, value ranges, URL formats, etc.]
- [List known edge cases: unknown inputs, nil, empty string, integer where string expected]

Requirements:
- File: test/[matching_path]_test.exs
- Use async: true (pure functions, no state)
- Group tests by function (describe blocks)
- Test ALL variants, not just a sample (e.g., all 8 hotels, not just 2)
- Include edge cases: nil input, empty string, unknown keys
- Do NOT modify any source files

When done: call memory_store with test count + key findings, then output results.

Test Generation — GenServer / Stateful Module

Proven: 43/43, required 3 iterations (test isolation)

SWARM MODE: Initialize hierarchical swarm with MCP tools.

COORDINATION:
1. Call swarm_init with topology "hierarchical", maxAgents 4, strategy "specialized"
2. Call agent_spawn for: architect (analyze GenServer behavior), coder (write tests), reviewer (verify)
3. Call task_create for the test generation task
4. After each iteration: call memory_store with what failed and why
5. After completion: call task_complete with final results

TASK: Write comprehensive ExUnit tests for [MODULE_PATH] ([LINE_COUNT] lines, GenServer).

Public API:
[PASTE grep output]

CRITICAL — Test Isolation for GenServers:
- The module registers as a named process (__MODULE__). It's already supervised globally.
- Do NOT use start_supervised! — it conflicts with the app-supervised instance.
- Pattern: stop the global instance, restart with test config, re-stop at end.
- OR: if start_link accepts a name: option, use unique names per test.
- async: false for GenServer tests that touch global state.

Requirements:
- Test GenServer lifecycle (init, handle_call, handle_cast, handle_info)
- Test crash recovery: missing catch-all handlers are REAL BUGS worth flagging
- Test state transitions and side effects
- Do NOT modify any source files

Test Generation — Module with External Dependencies

SWARM MODE: Initialize hierarchical swarm (maxAgents 4, strategy specialized).
Spawn: architect (analyze deps + plan mocks), coder (write tests), reviewer (verify coverage).

TASK: Write comprehensive ExUnit tests for [MODULE_PATH].

This module depends on: [LIST DEPENDENCIES]
Mock strategy: [Mox / manual mock / test config override]
Reference existing mocks in test/support/ if any.

Requirements:
- Mock all external calls (HTTP, DB, external services)
- Test happy path AND error paths (timeouts, 4xx, 5xx, malformed responses)
- async: true if using Mox with allowances
- Do NOT modify source files

Feature Build (Greenfield)

SWARM MODE: Initialize hierarchical swarm (maxAgents 6, strategy specialized).
Spawn: architect (plan structure), coder (implement), tester (tests), reviewer (quality).
Architect plans FIRST. Coder implements. Tester validates. Reviewer catches issues.

TASK: [Feature description with clear acceptance criteria]

Architecture constraints:
- [List patterns to follow from existing codebase]
- [List modules/files to reference for conventions]

HARD LIMIT: Maximum 5 iterations if quality loop needed.

Refactor (Anti-Drift)

SWARM MODE: Initialize anti-drift hierarchical swarm (maxAgents 4).
Spawn: architect (plan + checkpoint), coder (execute), reviewer (validate each step).

ANTI-DRIFT RULES:
- Architect creates numbered plan before ANY code is written
- Coder implements ONE step at a time
- Reviewer validates EACH step before proceeding
- If reviewer rejects twice: STOP and report
- Checkpoint state after each successful step

TASK: [Refactor description]
HARD LIMIT: Maximum 8 iterations.
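The anti-drift rules describe a control flow that the prompt asks Claude Code to follow. As an illustration only, the same flow can be sketched in Python. Here implement and review are hypothetical caller-supplied callables standing in for the coder and reviewer roles, and counting one iteration per implementation attempt is an assumption, since the template does not define what an iteration is.

```python
def run_anti_drift(steps, implement, review, max_iterations=8):
    """Apply plan steps one at a time; stop after two rejections of a step.

    `implement(step)` and `review(step, result)` are caller-supplied
    callables (hypothetical stand-ins for the coder and reviewer roles).
    Returns (completed_steps, status).
    """
    completed = []
    iterations = 0
    for step in steps:
        rejections = 0
        while True:
            if iterations >= max_iterations:
                return completed, "hard-limit"
            iterations += 1
            result = implement(step)
            if review(step, result):
                completed.append(step)  # checkpoint after each accepted step
                break
            rejections += 1
            if rejections == 2:
                return completed, f"rejected-twice: {step}"
    return completed, "done"
```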

Security Audit

SWARM MODE: Security-focused hierarchical swarm (maxAgents 5, strategy specialized).
Spawn: security-architect (threat model), auditor (scan), coder (fix), tester (verify).

TASK: Security audit of [scope].

Checklist:
- [ ] Dependency vulnerabilities (mix audit / npm audit)
- [ ] Hardcoded secrets in source
- [ ] Injection vectors (SQL, XSS, command)
- [ ] Auth/authz bypass paths
- [ ] Missing GenServer catch-all handlers (handle_info, handle_cast): these are REAL BUGS
- [ ] Error messages leaking internal state
- [ ] Rate limiting gaps
- [ ] CORS/CSP headers

Output: findings table with severity, file, line, fix.
HARD LIMIT: Maximum 5 iterations.

Quality Loop (Ralph-Style)

SWARM MODE: Initialize ring swarm (maxAgents 4, strategy adaptive).
Spawn: coder, tester, reviewer, coordinator.
HARD LIMIT: Maximum 10 iterations.

TASK: Iterate on [target] until [score threshold].

Per iteration:
1. Coder fixes based on reviewer feedback
2. Tester runs full suite, reports pass/fail count
3. Reviewer scores against rubric
4. Score >= threshold → STOP, report final score
5. Iteration == 10 → STOP regardless, report score and remaining gaps
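The stop conditions in this template (threshold reached, or iteration 10) amount to a bounded loop. A minimal sketch follows, where score_iteration is a hypothetical callable that performs one coder/tester/reviewer pass and returns the reviewer's score.

```python
def quality_loop(score_iteration, threshold, max_iterations=10):
    """Run iterations until the score meets the threshold or the cap is hit.

    `score_iteration(n)` is a hypothetical callable that performs one
    coder/tester/reviewer pass and returns the reviewer's score.
    Returns (iterations_used, final_score, reached_threshold).
    """
    score = None
    for iteration in range(1, max_iterations + 1):
        score = score_iteration(iteration)
        if score >= threshold:
            return iteration, score, True
    return max_iterations, score, False
```

Returning the final score even when the cap is hit matches rule 5, which asks for the score and remaining gaps to be reported regardless.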

MCP Tools Reference (Available When autoStart: true)

These tools become available to Claude Code during swarm sessions. Include instructions to USE them in your prompts.

Core Swarm (must-use)

| Tool | Purpose |
| --- | --- |
| swarm_init | Create swarm with topology + strategy. Persists to .claude-flow/swarm/ |
| swarm_status | Check swarm health mid-run |
| swarm_shutdown | Clean shutdown with state persistence |
| agent_spawn | Register agents with model routing (haiku/sonnet/opus) |
| agent_status | Check individual agent state |
| memory_store | Persist findings to sql.js + HNSW (semantic search) |
| memory_search | Retrieve relevant context from prior runs |
| task_create | Track task with assignment + status |
| task_complete | Mark task done with summary |

Coordination (high-value for complex tasks)

| Tool | Purpose |
| --- | --- |
| session_save | Save session state between runs |
| session_restore | Resume from prior session |
| claims_claim | Lock a file/resource (prevents agent conflicts) |
| claims_release | Release lock |
| coordination_consensus | Multi-agent agreement |
| coordination_sync | Synchronize agent state |

Analysis (useful for reviews)

| Tool | Purpose |
| --- | --- |
| analyze_diff | Review code changes |
| analyze_diff_risk | Assess risk of changes |
| performance_report | Bottleneck detection |

Why This Matters

Without autoStart: true, Claude Code has ZERO access to these tools. It runs on prompt intelligence alone (which works, as v1.0 proved). With them enabled, the swarm can:

  • Persist memories between runs — learn from prior sessions
  • Track tasks formally — not just in-context reasoning
  • Coordinate agents — prevent file conflicts, reach consensus
  • Route by model — use haiku for simple subtasks, opus for architecture

Swarm Topologies

| Topology | When | Track Record |
| --- | --- | --- |
| hierarchical | Test gen, features, refactors | 5/5 green (March 23, 2026 runs) |
| ring | Quality loops, pipelines | Proven in Ralph loops |
| mesh | Research, exploration | Untested in production |
| star | Simple delegation | Untested in production |

Default: hierarchical. It has the strongest anti-drift properties and all production wins used it.

Performance Data (Real, Not Theoretical)

Test Generation Runs (March 23, 2026)

| Module | Lines | Tests | Time | Iterations | Result |
| --- | --- | --- | --- | --- | --- |
| AssetHealthCheck (GenServer) | 288 | 43 | ~120s | 3 | 43/43 ✅ + found 2 real bugs |
| WidgetHotelAssets (data) | 2,017 | 147 | ~90s | 0 | 147/147 ✅ |
| SquadBuilder (config builder) | 889 | 58 | ~90s | 0 | 58/58 ✅ |
| I18n (translations) | 272 | 41 | ~60s | 0 | 41/41 ✅ |
| TravelClick Types (structs) | 637 | 66 | ~90s | 0 | 66/66 ✅ |

Totals: 355 new tests, 5/5 modules green, 4/5 zero-iteration (80%)
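The totals can be checked directly against the per-module rows. The snippet below copies the test and iteration counts from the table above and recomputes the sums.

```python
# (module, tests, iterations) copied from the table above
runs = [
    ("AssetHealthCheck", 43, 3),
    ("WidgetHotelAssets", 147, 0),
    ("SquadBuilder", 58, 0),
    ("I18n", 41, 0),
    ("TravelClick Types", 66, 0),
]

total_tests = sum(tests for _, tests, _ in runs)
zero_iteration = sum(1 for _, _, iterations in runs if iterations == 0)
zero_iteration_rate = zero_iteration / len(runs)

print(total_tests)                      # 355
print(f"{zero_iteration}/{len(runs)}")  # 4/5
print(f"{zero_iteration_rate:.0%}")     # 80%
```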

Key Observations

  • Pure data modules: near-100% first-pass success. No iteration needed.
  • GenServer modules: expect 2-3 iterations for test isolation issues.
  • Real bugs found: 2 missing catch-all handlers (handle_cast, handle_info) — production-grade findings.
  • Data quality issues found: 1 (duplicate image URLs across hotel rooms).
  • Execution time: 60-120s per module regardless of size (bottleneck is Claude Code --print buffering, not swarm complexity).

Daemon Worker Reality Check

| Worker | Success Rate | Notes |
| --- | --- | --- |
| map | 100% | Fast (1ms), just indexes project structure |
| consolidate | 100% | Fast (9ms), memory compaction |
| audit | 20% | Times out at 300s, falls back to local mode |
| optimize | 33% | Times out; deferred on high CPU load |
| testgaps | 50% | Deferred on high CPU load |
| predict | 0% | Disabled by default |
| document | 0% | Disabled by default |

Takeaway: Don't rely on daemon workers for task quality. The prompt pattern does the heavy lifting. Daemon adds marginal background value (map + consolidate work; audit/optimize are unreliable).

Parallel Swarm Runs

You can run multiple swarms simultaneously on different modules. Each gets its own exec session:

# Launch 2 parallel swarms
exec(command='cd /project && claude --mcp-config .mcp.json --print "SWARM: ... TASK: test module_a"', background=True)
exec(command='cd /project && claude --mcp-config .mcp.json --print "SWARM: ... TASK: test module_b"', background=True)

# Poll both
process(action="poll", sessionId="session-a", timeout=120000)
process(action="poll", sessionId="session-b", timeout=120000)

Observed: 2 parallel swarms work cleanly. 3+ may cause CPU load deferrals on daemon workers (irrelevant for prompt-driven value).
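To stay at the two-swarm level that was observed to work cleanly, a launcher can cap concurrency. The sketch below is generic. run_one is a hypothetical callable that launches one swarm run and blocks until it finishes; it is not an OpenClaw or RuFlo API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_swarms(prompts, run_one, max_parallel=2):
    """Run one swarm per prompt, at most `max_parallel` at a time.

    `run_one(prompt)` is a hypothetical callable that launches a single
    swarm run and blocks until it finishes. Results keep prompt order.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, prompts))
```

With three or more prompts, the extra runs wait in the queue until one of the two active runs finishes.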

Self-Improvement Protocol

FlowSwarm can analyze and improve itself:

SWARM MODE: Initialize meta-analysis hierarchical swarm (maxAgents 4).
Spawn: architect (analyze skill files), analyst (review production data), coder (rewrite), reviewer (validate).

TASK: Analyze the FlowSwarm skill at [path] against production run data.
Review: what worked, what failed, what's missing. Generate v[N+1].

Troubleshooting

No output from Claude Code: --print buffers until completion. Use background: true on exec, poll with generous timeout. Never use nohup.

Swarm didn't fire:

  1. Check .claude/settings.json has hooks
  2. Run ruflo doctor
  3. Verify MCP: claude mcp list | grep ruflo
  4. Restart: ruflo daemon stop && ruflo daemon start

GenServer test isolation failures: The global supervised instance conflicts with test instances. Solutions:

  • Stop global, restart for test, cleanup after
  • Use unique names: name: :"test_#{System.unique_integer()}"
  • Set async: false for stateful tests

Daemon workers timing out: Normal. Workers like audit and optimize regularly time out at 300s (20-33% success rate). The swarm's value comes from prompt orchestration, not daemon workers. Ignore worker failures unless you specifically need their output.

Memory shows 0 entries: Before v2.1 (autoStart: false), the MCP server never started, so memory_store was never called. With v2.1 (autoStart: true), Claude Code can call memory_store directly. Check after a run:

ruflo memory stats
ruflo memory search -q "test results"

CPU load deferrals: Workers defer when system CPU load exceeds 8. This is protective. During active swarm runs, expect deferrals. Workers catch up when the load drops.
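To see whether a machine is currently in the deferral range, the load can be compared against the same threshold. This is a sketch only: the threshold of 8 comes from the note above, but treating it as the 1-minute load average is an assumption about how RuFlo measures load. The function name should_defer is hypothetical.

```python
import os


def should_defer(threshold=8.0, load=None):
    """Return True when the load exceeds the threshold.

    If no load is passed, the 1-minute load average is read from the OS.
    How RuFlo measures load internally is an assumption here, and
    os.getloadavg() is only available on Unix-like systems.
    """
    if load is None:
        load = os.getloadavg()[0]
    return load > threshold
```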

Files

skills/flow-swarm/
├── SKILL.md                          # This file (v2.1)
├── scripts/
│   └── setup-flow-swarm.sh           # Install + init + verify
└── references/
    └── template-examples.md          # Extended templates with context

Changelog

v2.1.0 (2026-03-23)

  • CRITICAL FIX: Enabled MCP autoStart (was false, disabling ALL 150+ swarm tools)
  • Added MCP tool reference table (swarm_init, agent_spawn, memory_store, etc.)
  • Updated prompt templates to instruct Claude Code to USE MCP tools
  • Setup script now auto-fixes autoStart: false → true
  • Verify mode checks autoStart status
  • Documented full MCP tool inventory (150+ tools across 18 categories)
  • Root cause of "memory shows 0 entries" identified: MCP server wasn't running

v2.0.0 (2026-03-23)

  • Complete rewrite based on 5 production swarm runs (355 tests generated)
  • Added tiered prompt templates (pure-data vs GenServer vs external deps)
  • Added target selection protocol + --targets script flag
  • Added pre-flight context injection pattern
  • Added real performance data table with actual timing and iteration counts
  • Added parallel swarm run documentation
  • Added daemon worker reality check (success rates, what to ignore)
  • Added self-improvement protocol
  • Removed unverified WASM Booster performance claims
  • Fixed: documented that --print mode doesn't persist swarm memories

v1.0.0 (2026-03-23)

  • Initial release based on first swarm run (AssetHealthCheck)
Usage Guidance
This skill appears to implement what it claims, but it asks you (via its scripts and README) to enable a local MCP server and daemon that give Claude Code programmatic access to many tools (file read/write, spawn agents, persistent memory, logs, and project config). Before installing:

  1. Treat the setup as a privileged operation: run it only in an isolated environment (throwaway VM/container) and not on sensitive repos.
  2. Inspect .mcp.json and the list of MCP tools to understand exactly what the model can do; do not accept blanket autoStart=true without review.
  3. Verify the origin of the ruflo npm package and avoid running global npm installs as root; prefer installing in an isolated node environment.
  4. If the repo contains secrets or sensitive data, do not enable 'use real data' or run the memory/daemon steps.
  5. Ask the author for provenance (homepage, source repo, maintainer identity) and for an explicit list of the MCP tools that will be enabled.

If you need to proceed: run the script with --verify first, and consider auditing memory.db/.claude logs and .mcp.json afterwards. If you want reduced risk, use only the prompt templates locally (without enabling the MCP server) or run the full flow in a sandboxed environment.
Capability Assessment
Purpose & Capability
The name/description (RuFlo + Claude Code swarm orchestration) aligns with the SKILL.md and the setup script: it installs/initializes RuFlo, registers an MCP server, and enables many MCP tools. However, the registry metadata claims no required binaries/env vars, while SKILL.md and the script require node/npm, ruflo, claude CLI, git, python3 and a project .mcp.json; that mismatch (manifest says 'none' but instructions require many tools) is a transparency issue.
Instruction Scope
SKILL.md and templates explicitly instruct the agent to read project files, grep source (public functions), read .claude logs and daemon logs, inspect git history, access/modify .mcp.json, initialize persistent memory DBs, and use 'real data' from the repo. The meta templates and security-audit templates also instruct scanning for hardcoded secrets. Combined with enabling MCP tools, the runtime guidance grants the model broad file-system and persistence access beyond a simple prompt helper.
Install Mechanism
There is no formal install spec in the registry (instruction-only), but a provided setup script performs global npm installs (npm install -g ruflo@latest) and uses 'npx -y ruflo@latest mcp start' to register/start the MCP server. Pulling and running code from npm and using npx is common but still introduces moderate risk (remote code pulled at install time). The script also edits project files (.mcp.json) in-place.
Credentials
The manifest requests no credentials or env vars, which superficially looks least-privilege. But the SKILL.md and scripts require network access (npm, npx), write access to project files, and will spawn persistent services and databases. The skill also suggests scanning for hardcoded secrets and using real production data; those behaviors increase the chance of sensitive-data exposure even without explicit credential inputs.
Persistence & Privilege
The instructions and setup script explicitly enable and start a persistent MCP server/daemon and flip .mcp.json 'autoStart' to true so Claude Code gains access to 150+ MCP tools (memory_store, agent_spawn, file/task management, session persistence). Although the skill metadata does not set always:true, enabling these persistent background capabilities effectively grants the model enduring programmatic control and local persistence — a high-privilege outcome the user should not enable lightly.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install flow-swarm
  3. After installation, invoke the skill by name or use /flow-swarm
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.1.1
v2.1.1: Expanded description for ClawHub listing. Multi-agent swarm orchestration via RuFlo + Claude Code. 430+ tests, 83% zero-iteration, 150+ MCP tools, persistent cross-run memory, battle-tested prompt templates. Works with any language.
v2.1.0
Multi-agent swarm orchestration via RuFlo + Claude Code. Turns single coding sessions into coordinated agent teams. 430 tests generated across 50K+ line Elixir codebase, 83% zero-iteration rate, 7 consecutive production runs. v2.1 fixes critical MCP autoStart bug (150+ tools were disabled), adds full tool reference, and includes battle-tested prompt templates.
Metadata
Slug flow-swarm
Version 2.1.1
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 2
Frequently Asked Questions

What is FlowSwarm — Swarm Coding Framework for OpenClaw?

Multi-agent swarm orchestration via RuFlo + Claude Code. Turns single coding sessions into coordinated agent teams (architect/coder/tester/reviewer). Proven... It is an AI Agent Skill for Claude Code / OpenClaw, with 117 downloads so far.

How do I install FlowSwarm — Swarm Coding Framework for OpenClaw?

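Run "/install flow-swarm" in the OpenClaw or Claude Code chat to install the skill. The one-time setup described in the README (RuFlo install, MCP registration, autoStart) is still required before the first swarm run.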

Is FlowSwarm — Swarm Coding Framework for OpenClaw free?

Yes, FlowSwarm — Swarm Coding Framework for OpenClaw is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does FlowSwarm — Swarm Coding Framework for OpenClaw support?

FlowSwarm is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created FlowSwarm — Swarm Coding Framework for OpenClaw?

It is built and maintained by windseeker1111 (@windseeker1111); the current version is v2.1.1.
