1 change: 1 addition & 0 deletions README.md
@@ -272,6 +272,7 @@ For detailed explanations, parameter descriptions, and use cases for each method
| [**Multi-SLERP** (`multislerp`)](docs/merge_methods.md#multi-slerp-multislerp) | Barycentric SLERP for multiple models. | ≥2 | * | Spherical interpolation for >2 models. |
| [**Karcher Mean** (`karcher`)](docs/merge_methods.md#karcher-mean-karcher) | Riemannian barycenter of model parameters. | ≥2 | - | Geometrically sound averaging on manifolds. |
| [**Task Arithmetic** (`task_arithmetic`)](docs/merge_methods.md#task-arithmetic-task_arithmetic) | Linearly combine "task vectors" (differences from a base). | ≥2 | ✓ | Transferring/combining fine-tuned skills. |
| [**Core Space** (`core_space`)](docs/merge_methods.md#core-space-core_space) | SVD-aligned LoRA merging in a compact core subspace. | ≥2 | ✓ | Efficient LoRA merging, heterogeneous ranks, subspace alignment. |
| [**TIES** (`ties`)](docs/merge_methods.md#ties-merging-ties) | Task arithmetic + sparsification & sign consensus. | ≥2 | ✓ | Merging many models, reducing interference. |
| [**DARE** (`dare_linear`, `dare_ties`)](docs/merge_methods.md#dare-dare_linear-dare_ties) | Task arithmetic + random pruning & rescaling. | ≥2 | ✓ | Robust skill retention, similar to TIES. |
| [**DELLA** (`della`, `della_linear`)](docs/merge_methods.md#della-della-della_linear) | Task arithmetic + adaptive magnitude-based pruning. | ≥2 | ✓ | Prioritizing important changes, reducing interference. |
41 changes: 41 additions & 0 deletions docs/merge_methods.md
@@ -12,6 +12,7 @@
- [Karcher Mean (`karcher`)](#karcher-mean-karcher)
- [Task Vector Methods](#task-vector-methods)
- [Task Arithmetic (`task_arithmetic`)](#task-arithmetic-task_arithmetic)
- [Core Space (`core_space`)](#core-space-core_space)
- [TIES-Merging (`ties`)](#ties-merging-ties)
- [DARE (`dare_linear`, `dare_ties`)](#dare-dare_linear-dare_ties)
- [DELLA (`della`, `della_linear`)](#della-della-della_linear)
@@ -149,6 +150,46 @@ This guide provides detailed information about the various model merging algorit

**Reference:** [Editing Models with Task Arithmetic](https://arxiv.org/abs/2212.04089)

### Core Space (`core_space`)

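**Concept:** Merges LoRA-adapted models by projecting each adapter's low-rank update into a shared, aligned core space whose reference bases are computed by SVD over all adapters. The merge runs on small core matrices instead of full weight matrices, which keeps it cheap while preserving the information in each adapter.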

**Algorithm:**

1. Extract the LoRA factors (B_i, A_i) from each model, where ΔW_i = B_i @ A_i
2. Compute reference bases via SVD: concatenate all B_i horizontally and all A_i vertically, then take the left singular vectors of the stacked B matrices as U_B and the right singular vectors of the stacked A matrices as V_A (both orthonormal)
3. Project each model to core space: Core_i = U_B^T @ B_i @ A_i @ V_A
4. Merge the Core_i with a weighted average to obtain Core_merged
5. Reconstruct: ΔW_merged = U_B @ Core_merged @ V_A^T, then W_final = W_base + ΔW_merged
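
The five steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the code added by this PR: the function name `core_space_merge`, the normalization of the weighted average by the sum of the weights, and the random toy adapters are all hypothetical.

```python
import numpy as np


def core_space_merge(loras, weights=None):
    """Merge LoRA updates in a shared core space (illustrative sketch).

    loras:   list of (B, A) pairs; B has shape (d_out, r_i), A has shape (r_i, d_in).
    weights: optional per-model weights; equal weights if omitted (assumption).
    Returns the merged dense update of shape (d_out, d_in).
    """
    if weights is None:
        weights = [1.0] * len(loras)
    w = np.asarray(weights, dtype=float)

    # Step 2: reference bases from the stacked factors.
    B_cat = np.concatenate([B for B, _ in loras], axis=1)  # (d_out, sum r_i)
    A_cat = np.concatenate([A for _, A in loras], axis=0)  # (sum r_i, d_in)
    U_B, _, _ = np.linalg.svd(B_cat, full_matrices=False)  # left singular vectors
    _, _, Vt_A = np.linalg.svd(A_cat, full_matrices=False)
    V_A = Vt_A.T                                           # right singular vectors

    # Step 3: project each update; the grouping avoids forming a dense d_out x d_in matrix.
    cores = [(U_B.T @ B) @ (A @ V_A) for B, A in loras]

    # Step 4: weighted average of the small core matrices.
    core_merged = sum(wi * c for wi, c in zip(w, cores)) / w.sum()

    # Step 5: map back to the original parameter space.
    return U_B @ core_merged @ V_A.T


# Two hypothetical adapters with different ranks (4 and 8) on a 16x12 layer.
rng = np.random.default_rng(0)
loras = [
    (rng.normal(size=(16, 4)), rng.normal(size=(4, 12))),
    (rng.normal(size=(16, 8)), rng.normal(size=(8, 12))),
]
delta_merged = core_space_merge(loras)
print(delta_merged.shape)  # (16, 12)
```

In this sketch the bases are not truncated, so U_B and V_A span every adapter's factors and the weighted average in core space matches the weighted average of the dense updates B_i @ A_i. The final weights would then be `W_base + delta_merged`.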

**Inputs:** Requires two or more models, plus one `base_model`.

**Parameters:**

- `weight` (per-model, float, default: 1.0): Relative weight of each model in the core-space average. The current implementation uses equal weights for all models.

**Use Cases:**

- Efficiently merging multiple LoRA adapters
- Creating a multi-task model from specialized adapters
- Merging adapters that have different ranks
- Merging in resource-constrained environments
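
To give a rough sense of why the core space suits mixed ranks and limited memory, the arithmetic below compares the size of one core matrix with the size of a dense update. The helper `core_shape`, the adapter ranks, and the layer dimensions are illustrative assumptions, not values from this PR; the bound follows from Core_i = U_B^T @ B_i @ A_i @ V_A when the bases are not truncated.

```python
def core_shape(ranks, d_out, d_in):
    """Upper bound on the shape of each core matrix U_B^T @ B_i @ A_i @ V_A.

    U_B has at most min(sum of ranks, d_out) columns and V_A has at most
    min(sum of ranks, d_in) columns, so the core is at most that size.
    """
    total_rank = sum(ranks)
    return (min(total_rank, d_out), min(total_rank, d_in))


# Three hypothetical adapters with different ranks on a 4096 x 4096 layer.
ranks = [8, 16, 64]
d_out = d_in = 4096

rows, cols = core_shape(ranks, d_out, d_in)
core_entries = rows * cols    # entries per core matrix
dense_entries = d_out * d_in  # entries in a dense update

print(f"core matrix: {rows} x {cols} = {core_entries} entries")
print(f"dense update: {d_out} x {d_in} = {dense_entries} entries")
```

Adapters of different ranks fit naturally here because the factors are concatenated along the rank dimension, so only the total rank determines the size of the core.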

**Example:**

```yaml
models:
- model: meta-llama/Llama-2-7b-hf
- model: username/llama2-lora-math
- model: username/llama2-lora-code

merge_method: core_space
base_model: meta-llama/Llama-2-7b-hf
dtype: bfloat16
```

**Reference:** [Accurate and Efficient Low-Rank Model Merging in Core Space](https://arxiv.org/abs/2509.17786) (Panariello et al., NeurIPS 2025)

### TIES-Merging (`ties`)

**Concept:** Builds on Task Arithmetic by sparsifying task vectors and applying a sign consensus algorithm. This helps to resolve interference when merging multiple models and retain more of their individual strengths.
11 changes: 11 additions & 0 deletions examples/core_space.yml
@@ -0,0 +1,11 @@
models:
  - model: gpt2
    parameters:
      weight: 0.5
  - model: gpt2
    parameters:
      weight: 1.0

merge_method: core_space
base_model: gpt2
dtype: float32