support mtp_shared_weights #41
Conversation
Code Review
This pull request introduces support for shared weights in Multi-Token Prediction (MTP) by decoupling the number of physical layers from the number of unroll steps. Key changes include updating ModelConfig to handle weight sharing logic, modifying gpt_model.py to use unroll steps for loss calculation and state chunking, and updating the MTP layer and patcher to support dynamic layer indexing for rotary embeddings. Feedback suggests refining the initialization logic to prevent accidental activation of MTP when layers are zero and explicitly declaring mtp_unroll_steps in the configuration dataclass.
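For context, a minimal sketch of the idea, using hypothetical names (`run_mtp_unroll`, `mtp_layers`, `layer_number`) rather than the PR's actual code: with shared weights there is a single physical MTP layer, but the forward pass still unrolls `mtp_unroll_steps` prediction steps, and the loss is scaled by the number of unroll steps rather than the number of layers.

```python
# Hypothetical sketch, not the PR's implementation.
def run_mtp_unroll(hidden_states, mtp_layers, mtp_unroll_steps):
    losses = []
    num_physical = len(mtp_layers)  # 1 when mtp_shared_weights is True
    for step in range(mtp_unroll_steps):
        # With shared weights every step reuses the same physical layer,
        # while the logical step index is still passed through (e.g. so
        # rotary embeddings can be offset per step).
        layer = mtp_layers[step % num_physical]
        hidden_states, loss = layer(hidden_states, layer_number=step)
        losses.append(loss)
    # Average over unroll steps, not over physical layers.
    return hidden_states, sum(losses) / max(mtp_unroll_steps, 1)
```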
Any related update to MS-SWIFT? I suggest that mtp_unroll_steps should be preserved in the metadata (i.e. config.json).
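A hedged sketch of that suggestion, assuming the key names and the fallback behaviour (none of this is confirmed by the PR): write both MTP fields into config.json on export, and fall back to mtp_num_layers when loading older configs that predate mtp_unroll_steps.

```python
import json

def export_mtp_metadata(config, path="config.json"):
    # Hypothetical: record both fields so downstream tools (e.g. MS-SWIFT)
    # can reconstruct the unroll behaviour from metadata alone.
    with open(path) as f:
        metadata = json.load(f)
    metadata["mtp_shared_weights"] = config.mtp_shared_weights
    metadata["mtp_unroll_steps"] = config.mtp_unroll_steps
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2)

def load_mtp_unroll_steps(metadata):
    # Hypothetical fallback for configs written before mtp_unroll_steps existed.
    return metadata.get("mtp_unroll_steps", metadata.get("mtp_num_layers", 0))
```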
/gemini review
Code Review
This pull request introduces support for Multi-Token Prediction (MTP) with shared weights by adding mtp_shared_weights and mtp_unroll_steps to the configuration. The logic in gpt_model.py and patcher.py has been updated to use mtp_unroll_steps for loop iterations and loss scaling, allowing the model to unroll more steps than the number of physical layers when weights are shared. Feedback highlights the need to explicitly declare mtp_unroll_steps as a field in the ModelConfig dataclass and to improve the robustness of its initialization to avoid potential TypeError when mtp_num_layers is missing.
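A minimal sketch of the explicit declaration being asked for, assuming ModelConfig is a dataclass (the defaults shown are assumptions):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelConfig:
    mtp_num_layers: Optional[int] = None
    mtp_shared_weights: bool = False
    # Declared up front rather than created dynamically in __post_init__,
    # so type checkers and serializers see the field.
    mtp_unroll_steps: int = 0
```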
The snippet under review, from ModelConfig's initialization logic:

```python
if self.mtp_shared_weights:
    assert self.mtp_num_layers is not None
    self.mtp_unroll_steps = self.mtp_num_layers
    self.mtp_num_layers = 1
else:
    self.mtp_unroll_steps = self.mtp_num_layers
```
The initialization logic for mtp_unroll_steps should be more robust. If mtp_num_layers is None and mtp_shared_weights is False, mtp_unroll_steps will be assigned None, which will cause a TypeError in gpt_model.py (line 423) when calculating 1 + self.config.mtp_unroll_steps. Additionally, it is safer to ensure mtp_num_layers is positive when weight sharing is enabled.
Suggested change:

```diff
 if self.mtp_shared_weights:
-    assert self.mtp_num_layers is not None
+    assert self.mtp_num_layers is not None and self.mtp_num_layers > 0, \
+        "mtp_num_layers must be > 0 when mtp_shared_weights is True"
     self.mtp_unroll_steps = self.mtp_num_layers
     self.mtp_num_layers = 1
 else:
-    self.mtp_unroll_steps = self.mtp_num_layers
+    self.mtp_unroll_steps = self.mtp_num_layers or 0
```
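To illustrate the failure mode and the fallback, a tiny example (the `1 + mtp_unroll_steps` expression mirrors the gpt_model.py usage cited above; `num_chunks` is an assumed name):

```python
mtp_unroll_steps = None            # original else-branch when mtp_num_layers is None
# 1 + mtp_unroll_steps             # TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

mtp_unroll_steps = None or 0       # the suggested fallback resolves to 0
num_chunks = 1 + mtp_unroll_steps  # 1: only the main prediction, no extra MTP steps
```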
#29