models: fix Qwen3.5 dense/MoE load when MTP block is absent (trunk-only GGUF) by rohithj7 · Pull Request #25024 · ggml-org/llama.cpp

rohithj7 · 2026-06-26T00:03:26Z

Overview

Fixes loading of Qwen3.5 dense (Qwen3_5ForCausalLM) and MoE (Qwen3_5MoeForCausalLM) GGUFs that fail at load time with:

llama_model_load: error loading model: missing tensor 'blk.<N>.attn_norm.weight'

where <N> == num_hidden_layers (the first index past the trunk).

The converter writes block_count = num_hidden_layers + mtp_num_hidden_layers and a nextn_predict_layers key whenever config.json declares mtp_num_hidden_layers, even when the checkpoint contains no mtp.* weights. The runtime then derives n_layer_all = block_count and unconditionally constructs the trailing MTP/NextN block, marking blk.<N>.attn_norm.weight (and the other MTP tensors) as required. For a trunk-only GGUF this block is never present, so load aborts.
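The arithmetic behind the failing index can be sketched as follows. This is a minimal model of the behaviour described above, not the converter's or the runtime's actual code: the function names are hypothetical, the metadata keys are shortened to the names used in this description, and `mtp_num_hidden_layers = 1` is an assumed value chosen for illustration.

```python
from typing import Optional


def gguf_layer_metadata(config: dict) -> dict:
    """Layer-count metadata as this PR describes the converter writing it."""
    n_trunk = config["num_hidden_layers"]
    # Taken from config.json as declared; not checked against the weights.
    n_mtp = config.get("mtp_num_hidden_layers", 0)
    return {
        "block_count": n_trunk + n_mtp,
        "nextn_predict_layers": n_mtp,
    }


def first_missing_attn_norm(metadata: dict, present: set) -> Optional[str]:
    """Return the first attn_norm tensor the file lacks, or None."""
    n_layer_all = metadata["block_count"]  # runtime: n_layer_all = block_count
    for il in range(n_layer_all):
        name = f"blk.{il}.attn_norm.weight"
        if name not in present:
            return name
    return None


# Assumed example: 32 trunk layers, one declared MTP layer, no MTP weights.
config = {"num_hidden_layers": 32, "mtp_num_hidden_layers": 1}
trunk_only_tensors = {f"blk.{il}.attn_norm.weight" for il in range(32)}

metadata = gguf_layer_metadata(config)
missing = first_missing_attn_norm(metadata, trunk_only_tensors)
```

With these assumed values `block_count` is 33, so the first tensor the loader cannot find is the one at index 32, which is `num_hidden_layers`.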

src/models/step35.cpp already handles this: it probes for the defining MTP tensor and, when absent, marks the MTP block tensors TENSOR_NOT_REQUIRED ("trunk-only"). This PR ports that same trunk_only handling to src/models/qwen35.cpp and src/models/qwen35moe.cpp, which previously hardcoded the MTP block tensors as required.
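The probe-then-relax pattern can be modelled as below. This is a simplified Python sketch of the idea, not the C++ in step35.cpp or in this PR: the flag values, the function names, and the tensor suffixes (including `nextn.eh_proj.weight` as the probed tensor) are assumptions chosen for illustration.

```python
# Illustrative flag values; only the name TENSOR_NOT_REQUIRED comes from the PR text.
TENSOR_REQUIRED = 0
TENSOR_NOT_REQUIRED = 1

# Hypothetical suffixes standing in for the MTP block's tensors.
MTP_SUFFIXES = ("attn_norm.weight", "nextn.eh_proj.weight")


def plan_mtp_tensors(n_layer_all, n_nextn, present, probe="nextn.eh_proj.weight"):
    """Decide per MTP tensor whether it must exist in the file."""
    plan = {}
    for il in range(n_layer_all - n_nextn, n_layer_all):
        # Probe for one defining tensor; if it is absent, treat the block as trunk-only.
        trunk_only = f"blk.{il}.{probe}" not in present
        flag = TENSOR_NOT_REQUIRED if trunk_only else TENSOR_REQUIRED
        for suffix in MTP_SUFFIXES:
            plan[f"blk.{il}.{suffix}"] = flag
    return plan


def load_planned(plan, present):
    """Return the tensors that load; raise if a required one is missing."""
    for name, flag in plan.items():
        if flag == TENSOR_REQUIRED and name not in present:
            raise RuntimeError(f"missing tensor '{name}'")
    return [name for name in plan if name in present]


# Trunk-only file: no MTP tensors present, so nothing is required and nothing loads.
trunk_only_loaded = load_planned(plan_mtp_tensors(33, 1, set()), set())

# Bundled file: the probe tensor is present, so the whole block stays required.
bundled = {"blk.32.attn_norm.weight", "blk.32.nextn.eh_proj.weight"}
bundled_loaded = load_planned(plan_mtp_tensors(33, 1, bundled), bundled)
```

In this model a file that carries the probed tensor but lacks another MTP tensor still fails to load, which matches the stated intent that GGUFs bundling the MTP block are unchanged.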

After the change:

Trunk-only GGUFs load and run normal inference (the MTP block is never executed in the main graph; n_layer() excludes nextn layers).
GGUFs that actually bundle the MTP block are unchanged - the tensors are still required and the speculative (graph_mtp) path keeps working.

Closes #24737.
Closes #24211.

Additional information

Same failure family reported in #24737 (Qwen3.5-4B, blk.32), #24211 (Nex N2 Pro / Qwen3.5 397B MoE, blk.60), and the Qwen3.5-122B MoE GGUF discussion (blk.48). The MTP-in-GGUF mapping and runtime were added in #20533 / #22673; the step35 trunk-only fix landed in #24340 but was not ported to the qwen35 loaders.

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES - I used an AI assistant to help me understand the issue and identify what needed to change, and to get a more thorough understanding of the relevant code. It helped me realize that step35 already had this change so I had to replicate that for qwen3.5. I made the changes myself. Further, I used AI to write this PR description.

CISC · 2026-06-26T08:22:57Z

> The converter writes block_count = num_hidden_layers + mtp_num_hidden_layers and a nextn_predict_layers key whenever config.json declares mtp_num_hidden_layers, even when the checkpoint contains no mtp.* weights. The runtime then derives n_layer_all = block_count and unconditionally constructs the trailing MTP/NextN block, marking blk.<N>.attn_norm.weight (and the other MTP tensors) as required. For a trunk-only GGUF this block is never present, so load aborts.

First of all, a model's config.json should not declare MTP layers if it does not have any; this is a model bug. Failing to load such a GGUF is perfectly valid (and can be fixed by editing the config or using --no-mtp at conversion, or alternatively by updating the GGUF with gguf-set-metadata).

Secondly, allowing this probably leads to other subtle issues, as hparams.n_layer_all is now incorrect. In fact, the correct fix is to remove this handling from step35.
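The config-side fix mentioned in this comment could look roughly like the sketch below. It is a hypothetical helper, not part of the converter: it assumes the checkpoint's tensor names are available as a list, that MTP weights carry the `mtp.` prefix as in the PR description, and that dropping the `mtp_num_hidden_layers` key is enough to stop the converter from declaring MTP layers. The tensor names in the example are made up.

```python
def has_mtp_weights(tensor_names):
    """True if any checkpoint tensor lives under the mtp.* prefix."""
    return any(name.startswith("mtp.") for name in tensor_names)


def drop_unbacked_mtp_declaration(config, tensor_names):
    """Return a copy of config without an MTP declaration that has no weights."""
    fixed = dict(config)  # shallow copy; the caller's dict is left untouched
    if fixed.get("mtp_num_hidden_layers", 0) and not has_mtp_weights(tensor_names):
        fixed.pop("mtp_num_hidden_layers")
    return fixed


# Assumed example config and illustrative tensor names.
config = {"num_hidden_layers": 32, "mtp_num_hidden_layers": 1}

trunk_only_fixed = drop_unbacked_mtp_declaration(
    config, ["model.layers.0.input_layernorm.weight"]
)
bundled_fixed = drop_unbacked_mtp_declaration(
    config, ["model.layers.0.input_layernorm.weight", "mtp.fc.weight"]
)
```

The declaration is removed only when no `mtp.*` tensor exists, so a checkpoint that really ships an MTP block keeps its config as is.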

Rohith Iyengar and others added 3 commits June 25, 2026 13:06

fix: update tensor loading logic for MTP layers in qwen35 and qwen35moe models

4c0b1ad

Merge pull request #1 from rohithj7/rohith/missing-tensor-issues (missing tensor issue)

7d71418

Merge branch 'ggml-org:master' into master

e8bcb77

rohithj7 requested a review from CISC as a code owner June 26, 2026 00:03

github-actions bot added the model (Model specific) label Jun 26, 2026

CISC closed this Jun 26, 2026

rohithj7 wants to merge 3 commits into ggml-org:master from rohithj7:master
