tanganke

fusion-bench

by tanganke · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · Security Clean · 205 downloads · 0 stars · 0 active installs · 1 version
Install in OpenClaw
/install fusion-bench
Description
Use FusionBench to run model fusion experiments. Covers running benchmarks, adding new merging algorithms, evaluating fused models, and managing model pools....
README (SKILL.md)

FusionBench Skill

FusionBench is a comprehensive benchmark/toolkit for deep model fusion (model merging).

Paper: arXiv:2406.03280
PyPI: pip install fusion-bench
Repo: https://code.tanganke.com/tanganke/fusion_bench
Docs: https://tanganke.github.io/fusion_bench/

Quick Start

# Install
pip install fusion-bench

# Run a simple experiment (CLIP ViT-B/32, task arithmetic on 8 tasks)
fusion_bench method=task_arithmetic modelpool=clip-vit-base-patch32 taskpool=clip-vit-base-patch32_8tasks

# Run with different merging method
fusion_bench method=ties_merging modelpool=clip-vit-base-patch32 taskpool=clip-vit-base-patch32_8tasks

Architecture Overview

fusion_bench/
├── method/           # Merging algorithms (30+)
├── modelpool/        # Model loading & management
├── config/           # Hydra YAML configs
├── tasks/            # Task evaluation
├── utils/            # Helpers (state_dict ops, lazy loading, etc.)
└── scripts/          # CLI & web UI

Key Components

  1. ModelPool: Loads and manages pre-trained/fine-tuned models

    • AutoModelPool: Auto-selects based on config
    • CLIPVisionModelPool: For CLIP ViT models
    • CausalLMPool: For Llama, GPT-2, etc.
  2. Method: The merging algorithm

    • Inherits from BaseModelFusionAlgorithm
    • Implements run(modelpool) → merged model
  3. TaskPool: Evaluation tasks

    • CLIP: 8-38 classification tasks
    • LLM: ARC, HellaSwag, MMLU, etc.
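
How the three components fit together can be sketched in plain Python. This is a toy illustration only: ToyModelPool, ToyAverage, and ToyTaskPool are hypothetical stand-ins, models are reduced to lists of floats, and none of this is FusionBench's actual API.

```python
# Toy sketch of the ModelPool -> Method -> TaskPool flow.
# All class names are hypothetical; a "model" is a plain list of floats.

class ToyModelPool:
    """Holds named models and hands them out on request."""

    def __init__(self, models):
        self._models = models

    @property
    def model_names(self):
        return list(self._models)

    def load_model(self, name):
        return list(self._models[name])  # return a copy


class ToyAverage:
    """A method: consumes a pool and returns one merged model."""

    def run(self, modelpool):
        models = [modelpool.load_model(n) for n in modelpool.model_names]
        return [sum(ws) / len(ws) for ws in zip(*models)]


class ToyTaskPool:
    """Scores a model per task (here: negative squared distance to a target)."""

    def __init__(self, targets):
        self._targets = targets

    def evaluate(self, model):
        return {
            task: -sum((w - t) ** 2 for w, t in zip(model, target))
            for task, target in self._targets.items()
        }


pool = ToyModelPool({"task_a": [1.0, 0.0], "task_b": [0.0, 1.0]})
merged = ToyAverage().run(pool)
report = ToyTaskPool({"task_a": [1.0, 0.0], "task_b": [0.0, 1.0]}).evaluate(merged)
print(merged, report)
```

The point of the split is that a method only needs the pool's loading interface, and a task pool only needs the merged model, so each piece can be swapped independently.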

Supported Merging Methods

Basic

| Method | Config Name | Description |
|---|---|---|
| Simple Average | simple_average | Uniform weight averaging |
| Weighted Average | weighted_average | Learnable task weights |
| Task Arithmetic | task_arithmetic | task_vector = fine-tuned - base |
| Slerp | slerp | Spherical interpolation |
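
To make the basic methods concrete, here is a minimal sketch on toy weight vectors (plain lists of floats rather than real state dicts). The function names and example numbers are illustrative assumptions, not FusionBench code.

```python
import math

def simple_average(models):
    """Uniform average of equally shaped weight vectors."""
    return [sum(ws) / len(ws) for ws in zip(*models)]

def task_arithmetic(base, finetuned, scaling_coefficient):
    """base + scaling_coefficient * sum_i (finetuned_i - base)."""
    merged = list(base)
    for model in finetuned:
        for j, (w, b) in enumerate(zip(model, base)):
            merged[j] += scaling_coefficient * (w - b)
    return merged

def slerp(a, b, t):
    """Spherical interpolation between two non-zero vectors a and b."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    if omega < 1e-8:  # nearly parallel: fall back to linear interpolation
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(omega)
    ca, cb = math.sin((1 - t) * omega) / s, math.sin(t * omega) / s
    return [ca * x + cb * y for x, y in zip(a, b)]

base = [1.0, 1.0]
ft_a = [2.0, 1.0]   # fine-tuned on task A (made-up numbers)
ft_b = [1.0, 3.0]   # fine-tuned on task B (made-up numbers)

avg = simple_average([ft_a, ft_b])
ta = task_arithmetic(base, [ft_a, ft_b], 0.3)
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
print(avg, ta, mid)
```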

Sparse/Pruning

| Method | Config Name | Description |
|---|---|---|
| TIES | ties_merging | Trim, Elect, Sign + merge |
| DARE | dare | Drop And REscale |
| Magnitude Pruning | magnitude_pruning | Prune by magnitude |
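
The two sparsification ideas can be sketched on flat task vectors as follows. This is a simplified illustration under stated assumptions (toy lists instead of tensors, hypothetical function names, a keep_ratio parameter chosen for the example); the merged TIES vector would then typically be scaled and added back to the base weights.

```python
import random

def ties_merge(task_vectors, keep_ratio):
    """Trim small entries, elect a sign per coordinate, average agreeing entries."""
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(round(keep_ratio * len(tv))))
        threshold = sorted((abs(x) for x in tv), reverse=True)[k - 1]
        trimmed.append([x if abs(x) >= threshold else 0.0 for x in tv])
    merged = []
    for column in zip(*trimmed):
        # The sign of the summed values is the sign with the larger total magnitude.
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [x for x in column if x * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged

def dare(task_vector, drop_rate, rng):
    """Drop each entry with probability drop_rate; rescale survivors by 1 / (1 - drop_rate)."""
    scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else x * scale for x in task_vector]

# Made-up task vectors; the first coordinate has a sign conflict.
tv_a = [0.9, 0.1, 0.5, 0.0]
tv_b = [-0.6, 0.8, 0.1, 0.0]
tv_c = [0.7, -0.2, 0.1, 0.9]

ties = ties_merge([tv_a, tv_b, tv_c], keep_ratio=0.5)
sparse = dare([1.0, -2.0, 3.0, 4.0], 0.5, random.Random(0))
print(ties, sparse)
```

In the first coordinate the positive sign wins the election, so the dissenting value from tv_b is excluded from the average instead of cancelling the other two.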

Advanced

| Method | Config Name | Description |
|---|---|---|
| AdaMerging | adamerging | Learn layer-wise coefficients |
| Fisher Merging | fisher_merging | Fisher-weighted merging |
| RegMean | regmean | Regression mean (closed-form) |
| RegMean++ | regmean_plusplus | Enhanced RegMean with cross-layer deps |
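
As one example from this group, Fisher-weighted merging can be sketched as a per-parameter weighted average, assuming a diagonal Fisher approximation. The function name, the eps guard, and the numbers below are assumptions made for illustration; in practice the Fisher values are estimated from data.

```python
def fisher_merge(models, fishers, eps=1e-8):
    """Per-parameter weighted average: sum_i F_i * theta_i / sum_i F_i."""
    merged = []
    for j in range(len(models[0])):
        numerator = sum(f[j] * m[j] for m, f in zip(models, fishers))
        denominator = sum(f[j] for f in fishers)
        merged.append(numerator / (denominator + eps))  # eps avoids division by zero
    return merged

# Two toy models with made-up diagonal Fisher estimates.
models = [[1.0, 0.0], [3.0, 4.0]]
fishers = [[1.0, 3.0], [1.0, 1.0]]

fisher_merged = fisher_merge(models, fishers)
print(fisher_merged)
```

A parameter with a large Fisher value in one model pulls the merged value toward that model, which is why the second coordinate lands closer to 0.0 than a plain average would.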

MoE-Based

| Method | Config Name | Description |
|---|---|---|
| WE-MoE | we_moe | Weight Ensembling MoE |
| PWE-MoE | pwe_moe | Pareto-optimal WE-MoE |
| RankOne-MoE | rankone_moe | Rank-1 expert decomposition |
| Sparse-WE-MoE | sparse_we_moe | Sparse weight ensembling |

Continual Merging

| Method | Config Name | Description |
|---|---|---|
| OPCM | opcm | Orthogonal Projection Continual Merging |
| DOP | dop | Dual Orthogonal Projection |
| Gossip | gossip | Gossip-based continual merging |

Specialized

| Method | Config Name | Description |
|---|---|---|
| ISO-C/CTS | isotropic_merging | Isotropic merging in common/task subspace |
| AdaSVD | ada_svd | SVD-based adaptive merging |
| WUDI | wudi | Data-free merging that reduces task-vector interference |
| ExPO | expo | Model extrapolation (weak-to-strong) |

Running Experiments

1. Basic Merging (CLI)

# Task Arithmetic on CLIP ViT-B/32
fusion_bench \
  method=task_arithmetic \
  modelpool=clip-vit-base-patch32 \
  taskpool=clip-vit-base-patch32_8tasks

# TIES merging with custom scaling
fusion_bench \
  method=ties_merging \
  method.scaling_coefficient=0.3 \
  modelpool=clip-vit-base-patch32 \
  taskpool=clip-vit-base-patch32_8tasks

2. LLM Merging

# Merge Llama models
fusion_bench \
  method=task_arithmetic \
  modelpool=llama2-7b \
  taskpool=llama2-7b_tasks

# With DARE
fusion_bench \
  method=dare \
  method.type=task_arithmetic \
  modelpool=llama2-7b

3. Using Fabric (Distributed/Mixed Precision)

fusion_bench \
  fabric=deepspeed_stage_2 \
  method=adamerging \
  modelpool=clip-vit-base-patch32

Adding a New Method

Step 1: Create method file

# fusion_bench/method/my_method.py
from fusion_bench.method.base_algorithm import BaseModelFusionAlgorithm
from fusion_bench.modelpool import BaseModelPool
import torch

class MyMergingAlgorithm(BaseModelFusionAlgorithm):
    """
    My custom merging algorithm.
    """
    def __init__(self, scaling_coefficient: float = 1.0, **kwargs):
        super().__init__(**kwargs)
        self.scaling_coefficient = scaling_coefficient
    
    @torch.no_grad()
    def run(self, modelpool: BaseModelPool):
        # 1. Load base model
        base_model = modelpool.load_model("_base_")
        base_sd = base_model.state_dict()

        # 2. Sum the scaled task vectors (fine-tuned - base) over all models
        merged_tv = {}
        for model_name in modelpool.model_names:
            if model_name == "_base_":
                continue
            model = modelpool.load_model(model_name)
            for k, v in model.state_dict().items():
                # Skip integer buffers (e.g. position ids): adding a float
                # update to them in place would raise a dtype error
                if not torch.is_floating_point(v):
                    continue
                tv = (v - base_sd[k]) * self.scaling_coefficient
                # Your merging logic here
                merged_tv[k] = merged_tv[k] + tv if k in merged_tv else tv

        # 3. Apply merged task vector to the base weights
        for k, tv in merged_tv.items():
            base_sd[k] += tv

        base_model.load_state_dict(base_sd)
        return base_model

Step 2: Register in __init__.py

# fusion_bench/method/__init__.py
_import_structure = {
    ...
    "my_method": ["MyMergingAlgorithm"],
}

Step 3: Create config

# config/method/my_method.yaml
_target_: fusion_bench.method.my_method.MyMergingAlgorithm
scaling_coefficient: 1.0
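
The _target_ key tells Hydra which class to construct, with the remaining keys passed as keyword arguments; in a real run this is handled by hydra.utils.instantiate. The mechanism can be sketched with the standard library alone. The helper name instantiate_from_target is hypothetical, and a stdlib class is used as the target so the snippet runs without FusionBench installed.

```python
import importlib

def instantiate_from_target(config):
    """Import the class named by `_target_` and call it with the remaining keys."""
    kwargs = dict(config)  # copy so the caller's config is not modified
    module_path, _, class_name = kwargs.pop("_target_").rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**kwargs)

# Stand-in target from the standard library (an assumption for the demo).
obj = instantiate_from_target(
    {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
)
print(obj)
```

With the config above, the same idea would resolve fusion_bench.method.my_method.MyMergingAlgorithm and pass scaling_coefficient=1.0 to its constructor.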

Step 4: Run

fusion_bench method=my_method modelpool=clip-vit-base-patch32

Model Pool Configuration

CLIP Models

# config/modelpool/clip-vit-base-patch32.yaml
_target_: fusion_bench.modelpool.CLIPVisionModelPool
model_names:
  - _base_
  - Cars
  - DTD
  - EuroSAT
  - GTSRB
  - MNIST
  - RESISC45
  - SUN397
  - SVHN
model_dir: ${oc.env:HOME}/.cache/fusion_bench/models

LLM Models

# config/modelpool/llama2-7b.yaml
_target_: fusion_bench.modelpool.CausalLMPool
model_names:
  - _base_
  - arc
  - hellaswag
  - mmlu
model_dir: ${oc.env:HOME}/.cache/fusion_bench/llama_models
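
The ${oc.env:HOME} syntax in model_dir is an OmegaConf interpolation that reads an environment variable when the config is resolved. A rough stdlib sketch of what that expansion amounts to is shown below; expand_env is a hypothetical helper written for illustration, not part of OmegaConf or FusionBench.

```python
import os
import re

# Matches ${oc.env:NAME} and ${oc.env:NAME,default}
_ENV_PATTERN = re.compile(r"\$\{oc\.env:([^,}]+)(?:,([^}]*))?\}")

def expand_env(value, environ=None):
    """Replace oc.env interpolations in a string with environment values."""
    environ = os.environ if environ is None else environ

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in environ:
            return environ[name]
        if default is not None:
            return default
        raise KeyError(f"environment variable {name!r} is not set")

    return _ENV_PATTERN.sub(substitute, value)

# A fake environment is passed in so the result does not depend on the machine.
path = expand_env("${oc.env:HOME}/.cache/fusion_bench/models", {"HOME": "/home/alice"})
print(path)
```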

Utilities

State Dict Arithmetic

from fusion_bench.utils.state_dict_arithmetic import StateDict

# Convenient operations on state dicts
sd1 = StateDict(model1.state_dict())
sd2 = StateDict(model2.state_dict())

merged = sd1 + sd2           # Add
diff = sd1 - sd2             # Subtract
scaled = sd1 * 0.5           # Scale
tv_merged = sd1 + 0.3 * sd2  # Linear combination
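
For intuition, operator support of this kind can be sketched with a small dict subclass. ToyStateDict below is a hypothetical stand-in that stores lists of floats instead of tensors; it illustrates the idea and is not the library's implementation.

```python
class ToyStateDict(dict):
    """Dict of name -> list of floats with elementwise +, -, and scalar *."""

    def _combine(self, other, op):
        if self.keys() != other.keys():
            raise KeyError("state dicts must have identical keys")
        return ToyStateDict(
            {k: [op(a, b) for a, b in zip(self[k], other[k])] for k in self}
        )

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    def __sub__(self, other):
        return self._combine(other, lambda a, b: a - b)

    def __mul__(self, scalar):
        return ToyStateDict({k: [a * scalar for a in v] for k, v in self.items()})

    __rmul__ = __mul__  # allows `0.5 * sd` as well as `sd * 0.5`


toy_sd1 = ToyStateDict({"w": [1.0, 2.0]})
toy_sd2 = ToyStateDict({"w": [3.0, 4.0]})

toy_diff = toy_sd2 - toy_sd1
toy_combo = toy_sd1 + 0.5 * toy_sd2
print(toy_diff, toy_combo)
```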

Lazy State Dict

from fusion_bench.utils.lazy_state_dict import LazyStateDict

# Load large models without OOM
lazy_sd = LazyStateDict.from_file("model.safetensors")
# Only loads tensors when accessed
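
The load-on-access behavior can be illustrated with a small read-only mapping that defers to a loader function and caches what it has read. ToyLazyDict and fake_loader are hypothetical names for this sketch; a real implementation would read tensors from the checkpoint file instead.

```python
from collections.abc import Mapping

class ToyLazyDict(Mapping):
    """Mapping that calls `loader(key)` on first access and caches the result."""

    def __init__(self, keys, loader):
        self._keys = list(keys)
        self._loader = loader
        self._cache = {}

    def __getitem__(self, key):
        if key not in self._keys:
            raise KeyError(key)
        if key not in self._cache:
            self._cache[key] = self._loader(key)
        return self._cache[key]

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


loaded = []

def fake_loader(key):
    loaded.append(key)  # record which entries were actually read
    return [0.0] * 4

lazy = ToyLazyDict(["layer1.weight", "layer2.weight"], fake_loader)
names = list(lazy)              # listing keys loads nothing
first = lazy["layer1.weight"]   # only this entry is loaded
print(names, loaded)
```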

Common Workflows

1. Evaluate a single merged model

from fusion_bench import AutoModelPool
from fusion_bench.method import SimpleAverageAlgorithm

pool = AutoModelPool.from_config("config/modelpool/clip-vit-base-patch32.yaml")
method = SimpleAverageAlgorithm()
merged_model = method.run(pool)

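# Evaluate on tasks (`evaluate` is a placeholder for your own
# evaluation function; it is not defined in this snippet)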
for task_name in pool.model_names:
    if task_name == "_base_":
        continue
    acc = evaluate(merged_model, task_name)
    print(f"{task_name}: {acc:.2%}")
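
The evaluate call in this workflow is left to the reader. One possible shape for it is a plain accuracy function, sketched below with toy stand-ins. The extra datasets argument, the threshold model, and the sample data are all assumptions for illustration.

```python
def evaluate(model, task_name, datasets):
    """Fraction of (input, label) pairs in datasets[task_name] predicted correctly."""
    samples = datasets[task_name]
    correct = sum(1 for x, y in samples if model(x) == y)
    return correct / len(samples)

# Toy stand-ins: a "model" that thresholds its input, and two tiny labelled datasets.
def toy_model(x):
    return int(x > 0.5)

toy_datasets = {
    "MNIST": [(0.9, 1), (0.1, 0), (0.7, 1), (0.4, 1)],
    "SVHN": [(0.2, 0), (0.8, 1)],
}

for task_name in toy_datasets:
    acc = evaluate(toy_model, task_name, toy_datasets)
    print(f"{task_name}: {acc:.2%}")
```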

2. Hyperparameter search

# Sweep scaling coefficient
for coeff in 0.2 0.4 0.6 0.8 1.0; do
  fusion_bench \
    method=task_arithmetic \
    method.scaling_coefficient=$coeff \
    modelpool=clip-vit-base-patch32
done
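
The same sweep can be driven from Python, which makes it easier to collect results afterwards. The sketch below only assembles the command lines using the overrides shown above; build_command is a hypothetical helper, and the launch step is left as a comment so the snippet runs without FusionBench installed.

```python
import shlex

def build_command(method, modelpool, taskpool=None, **overrides):
    """Assemble a fusion_bench command line from Hydra-style key=value overrides."""
    args = ["fusion_bench", f"method={method}"]
    args += [f"method.{key}={value}" for key, value in overrides.items()]
    args.append(f"modelpool={modelpool}")
    if taskpool is not None:
        args.append(f"taskpool={taskpool}")
    return args

commands = [
    build_command("task_arithmetic", "clip-vit-base-patch32", scaling_coefficient=coeff)
    for coeff in (0.2, 0.4, 0.6, 0.8, 1.0)
]

for command in commands:
    print(shlex.join(command))
    # To actually launch a run: subprocess.run(command, check=True)
```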

3. Compare multiple methods

for method in simple_average task_arithmetic ties_merging dare; do
  echo "=== $method ==="
  fusion_bench \
    method=$method \
    modelpool=clip-vit-base-patch32 \
    taskpool=clip-vit-base-patch32_8tasks
done

Tips

  1. Memory: Use fabric=deepspeed_stage_2 for large models
  2. Caching: Models are cached in ~/.cache/fusion_bench/
  3. Reproducibility: Set seed=42 in config
  4. Debugging: Use hydra.verbose=true for detailed logs
  5. Web UI: Run fusion_bench_webui for interactive exploration

Related Papers

  1. FusionBench (arXiv:2406.03280) - The benchmark paper
  2. SMILE (arXiv:2408.10174) - Sparse MoE from pre-trained models
  3. WE-MoE - Weight Ensembling MoE for multi-task merging
  4. OPCM/DOP - Continual model merging methods
  5. RegMean++ (arXiv:2508.03121) - Enhanced RegMean
Usage Guidance
This skill is coherent for running FusionBench experiments, but take these precautions before installing or running it:
  • Verify the PyPI package and source repository: check the fusion-bench package page on PyPI, confirm the package owner, review the package files, and inspect the linked repository (the SKILL.md repo is on code.tanganke.com rather than GitHub). Malicious packages can be distributed via PyPI.
  • Inspect the code before installing, or run the installation in a sandbox/container. pip install will download and run code on your machine.
  • Expect large downloads and heavy compute: merging LLMs and CLIP models can require substantial disk, memory, and possibly cloud/GPU resources. Make sure you understand where models will be pulled from (local paths vs. model hubs) and whether tokens/keys are needed.
  • If you load models from model hubs (Hugging Face, private storage), grant access tokens only to trusted code and revoke them if unsure.
  • If you need higher assurance, ask the publisher for source verification (a public VCS like GitHub with tags/releases) or request a signed release. If you cannot audit the package, consider running it in an isolated environment or using a vetted alternative.
Capability Analysis
Type: OpenClaw Skill
Name: fusion-bench
Version: 1.0.0
The fusion-bench skill bundle is a legitimate integration for the FusionBench model merging toolkit (arXiv:2406.03280). The SKILL.md file contains standard documentation, installation instructions via pip, and CLI usage examples consistent with the tool's purpose. There are no signs of data exfiltration, malicious execution, or prompt injection attacks.
Capability Assessment
Purpose & Capability
Name, description, and SKILL.md all describe running model-fusion experiments and adding merging algorithms; nothing requested by the skill (no env vars, no unusual binaries, no config paths) appears unrelated to that purpose.
Instruction Scope
SKILL.md contains CLI usage, example commands, and code snippets for adding methods. It does not instruct the agent to read unrelated files, exfiltrate data, or access unrelated system credentials. It does assume loading model weights and optionally using distributed runtimes (deepspeed/Fabric), which is consistent with the task.
Install Mechanism
The skill is instruction-only (no install spec), but the docs instruct the user to 'pip install fusion-bench' (PyPI) and link to a repo hosted at code.tanganke.com rather than a well-known host. Installing the PyPI package will execute code from an external source — verify the PyPI package and repository before installing.
Credentials
The skill declares no required environment variables and the instructions do not request secrets. However, at runtime loading certain models (e.g., Llama variants or models on huggingface.co) or using cloud/deepspeed could require access tokens, cloud credentials, or large compute resources; these are not requested by the skill itself but could be needed by the underlying tooling.
Persistence & Privilege
Skill is not always-enabled, is user-invocable, has no install spec or code that would modify other skills or agent-wide settings. It does not request persistent privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install fusion-bench
  3. After installation, invoke the skill by name or use /fusion-bench
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
FusionBench skill v1.0.0 initial release:
  • Provides a comprehensive toolkit for deep model fusion and benchmarking.
  • Supports 30+ merging algorithms (simple average, TIES, AdaMerging, MoE-based, continual, specialized, and more).
  • Enables benchmarking and evaluation for CLIP models and LLMs on a wide variety of tasks.
  • Includes utilities for state dict arithmetic and lazy loading for large model files.
  • Offers clear architecture, extensibility guides, and step-by-step instructions for adding new model merging methods.
Metadata
Slug fusion-bench
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is fusion-bench?

fusion-bench is an AI agent skill for Claude Code / OpenClaw that uses FusionBench to run model fusion experiments. It covers running benchmarks, adding new merging algorithms, evaluating fused models, and managing model pools. It has 205 downloads so far.

How do I install fusion-bench?

Run "/install fusion-bench" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is fusion-bench free?

Yes, fusion-bench is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does fusion-bench support?

fusion-bench is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created fusion-bench?

It is built and maintained by tanganke (@tanganke); the current version is v1.0.0.
