mamba2: remove hardcoded 2x expansion factor and invalid d_inner % d_state check #23082
|
Hi @limloop, thanks for your contribution! Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:
Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below. |
CISC
left a comment
Do you have links to models with differing expand?
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
|
@CISC, here's a model with non-standard `expand`: limloop/whiff-mamba2-50M-v0.1
Config values:
With current With my changes (this PR), it loads and generates coherent text. |
|
Thanks, please rebase and adjust to the refactor (moved to |
|
@CISC updated, ready for review |
CISC
left a comment
Sorry for the long delay, I had hoped @compilade would take a look.
This PR removes two unnecessary restrictions in Mamba2 that prevent loading models with custom architectures.

Changes:
- Remove the hardcoded 2x expansion factor (`GGML_ASSERT(2 * n_embd == d_inner)`)
  - The assertion only holds for `expand=2`
  - `expand` is not stored in GGUF, only `d_inner` is
- Remove the invalid `d_inner % d_state` check
  - `d_inner` and `d_state` are unrelated parameters

Testing:
- Model with `expand=2`: loads and runs correctly
- Model with `expand=1`, `d_inner=512`, `d_model=512`: loads and generates coherent output

Backward compatibility: Models with `expand=2` work identically to before.

Related discussion: #21346
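
To make the effect of the two removed checks concrete, here is a minimal sketch that restates them as standalone predicates. It is not the llama.cpp loader code: the `mamba2_dims` struct, the function names, and the sample values are hypothetical, chosen only for illustration. The `standard` sizes and `d_state=96` in particular are assumptions; only `expand=1`, `d_inner=512`, `d_model=512` come from the description above.

```cpp
#include <cstdint>

// Hypothetical stand-in for the relevant Mamba2 hyperparameters.
// Field names are illustrative, not llama.cpp's actual hparams struct.
struct mamba2_dims {
    int64_t n_embd;   // d_model
    int64_t d_inner;  // expand * d_model in the model config
    int64_t d_state;  // SSM state size
};

// The removed assertion as a predicate: true only when expand == 2.
constexpr bool old_expand_check(const mamba2_dims & d) {
    return 2 * d.n_embd == d.d_inner;
}

// The removed divisibility check as a predicate.
constexpr bool old_state_check(const mamba2_dims & d) {
    return d.d_inner % d.d_state == 0;
}

// Assumed sizes for an expand=2 model (illustrative values).
constexpr mamba2_dims standard = { 768, 1536, 128 };

// expand=1, d_inner=512, d_model=512 as in the Testing list;
// d_state=96 is an assumed value picked to show the second check failing.
constexpr mamba2_dims custom = { 512, 512, 96 };

// An expand=2 model passes both old checks, so removing them changes nothing for it.
static_assert(old_expand_check(standard), "expand=2 satisfies 2 * n_embd == d_inner");
static_assert(old_state_check(standard),  "1536 is a multiple of 128");

// An expand=1 model is rejected by the old expansion check.
static_assert(!old_expand_check(custom), "2 * 512 != 512");

// With the assumed d_state, the old divisibility check would also reject it.
static_assert(!old_state_check(custom), "512 is not a multiple of 96");
```

Because everything is `constexpr`, the `static_assert` lines are evaluated at compile time: the file compiles only if an `expand=2` configuration satisfies both old checks while the `expand=1` configuration fails them, which mirrors the backward-compatibility claim above.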