
mamba2: remove hardcoded 2x expansion factor and invalid d_inner % d_state check #23082

Merged
ggerganov merged 7 commits into ggml-org:master from limloop:fix/mamba2-any-expand
Jun 26, 2026

limloop commented May 15, 2026

Contributor

This PR removes two unnecessary restrictions in Mamba2 that prevent loading models with custom architectures.

Changes:

  1. Remove hardcoded 2x expansion factor (GGML_ASSERT(2 * n_embd == d_inner))

    • The assert assumed all Mamba2 models have expand=2
    • expand is not stored in GGUF, only d_inner is
    • Removing it allows models with any expansion factor (1, 2, 3, etc.)
  2. Remove invalid d_inner % d_state check

    • In Mamba2, d_inner and d_state are unrelated parameters
    • This assert has no architectural justification for Mamba2
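The two removed asserts can be modeled in a few lines of Python to show which shapes they used to reject. This is only a sketch: `failed_old_checks` is a hypothetical helper, not llama.cpp code, and the `d_state` value of 128 and the 160/320 shape are assumed purely for illustration.

```python
def failed_old_checks(n_embd: int, d_inner: int, d_state: int) -> list[str]:
    """Return the removed conditions that a given shape would have violated."""
    failed = []
    # Old assert 1: only expand == 2 was accepted.
    if 2 * n_embd != d_inner:
        failed.append("2 * n_embd == d_inner")
    # Old assert 2: d_inner had to be a multiple of d_state.
    if d_inner % d_state != 0:
        failed.append("d_inner % d_state == 0")
    return failed


# d_state = 128 is an assumed value, used here for illustration only.
print(failed_old_checks(n_embd=512, d_inner=1024, d_state=128))  # expand=2 shape
print(failed_old_checks(n_embd=512, d_inner=512, d_state=128))   # expand=1 shape
print(failed_old_checks(n_embd=160, d_inner=320, d_state=128))   # hypothetical shape
```

Under these assumed values, the expand=2 shape violates neither condition, the expand=1 shape trips only the first, and the hypothetical 160/320 shape trips only the second.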

Testing:

  • ✅ Default Mamba2 (expand=2) — loads and runs correctly
  • ✅ Custom model (expand=1, d_inner=512, d_model=512) — loads and generates coherent output

Backward compatibility: Models with expand=2 work identically to before.

Related discussion: #21346

limloop requested a review from CISC as a code owner May 15, 2026 01:48
github-actions (bot) added the "model" (Model specific) and "python" (python script changes) labels May 15, 2026
ggml-gh-bot (bot) commented May 15, 2026

Hi @limloop, thanks for your contribution!

Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:

  • AI-generated content: This project does not accept PRs, descriptions or commit messages that are fully or predominantly AI-generated. If you have used AI to assist you in writing code, please make sure to disclose that explicitly.

Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.

CISC left a comment

Member

Do you have links to models with differing expand?

Comment thread convert_hf_to_gguf.py Outdated
CISC requested a review from compilade May 15, 2026 12:03
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
limloop commented May 15, 2026
Contributor Author

@CISC, here's a model with non-standard expand:

limloop/whiff-mamba2-50M-v0.1
https://huggingface.co/limloop/whiff-mamba2-50M-v0.1

Config values:

  • expand = 1
  • d_model = 512
  • d_inner = 512 (computed as expand * d_model)

With current llama.cpp, this model fails to load because of the hardcoded 2 * n_embd == d_inner assert (2 * 512 = 1024, but d_inner is 512).

With my changes (this PR), it loads and generates coherent text.
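Since the PR description notes that only d_inner is stored and expand is not, the expansion factor can be recovered as a ratio when it is needed. A minimal sketch, assuming d_model corresponds to n_embd; `implied_expand` is a hypothetical helper name, not part of llama.cpp or the conversion script.

```python
from fractions import Fraction


def implied_expand(d_inner: int, d_model: int) -> Fraction:
    """Recover the expansion factor as an exact ratio d_inner / d_model."""
    if d_inner <= 0 or d_model <= 0:
        raise ValueError("d_inner and d_model must be positive")
    return Fraction(d_inner, d_model)


# Values from the config above: d_inner = 512, d_model = 512.
whiff_expand = implied_expand(512, 512)
# A default-style shape with twice the inner width, for comparison.
default_expand = implied_expand(1024, 512)

print(whiff_expand, default_expand)
```

Using an exact fraction rather than integer division keeps non-integer ratios visible instead of silently truncating them.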

CISC commented May 15, 2026
Member

Thanks, please rebase and adjust for the refactor (the code moved to conversion/mamba.py).

limloop commented May 15, 2026
Contributor Author

@CISC updated, ready for review

CISC left a comment

Member

Sorry for the long delay, I had hoped @compilade would take a look.

CISC added the "merge ready" label (a maintainer can use this label to indicate that they consider the changes final and ready to merge) Jun 25, 2026
ggerganov merged commit 960d628 into ggml-org:master Jun 26, 2026
21 of 27 checks passed

Labels

conversion, merge ready, model, python
