
[bugfix] Fix mtp fp8 #35

Merged
Jintao-Huang merged 9 commits into modelscope:main from Jintao-Huang:fix_mtp_fp8 on Apr 17, 2026

Conversation

@Jintao-Huang
Collaborator

No description provided.


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the Multi-Token Prediction (MTP) layer by moving the patched logic from patcher.py into a dedicated MultiTokenPredictionLayer class in src/mcore_bridge/model/modules/mtp_layer.py. It also introduces _fp8_skip_modules tracking in the bridge and model conversion logic. Feedback highlights a potential AttributeError caused by undefined parallel state attributes, dead code in the form of the unused _fp8_skip_modules set, and performance concerns about local imports within the forward pass.
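
For readers unfamiliar with the local-import concern, here is a generic sketch of the pattern, not the code from this PR. The class names `LocalImportLayer` and `HoistedImportLayer` are hypothetical, and `math` stands in for whatever module the real forward pass imports. Python caches imported modules in `sys.modules`, so a repeated local import is a cache lookup rather than a full reload, but it still runs on every call.

```python
import math  # module-level import: resolved once, when this module is loaded


class LocalImportLayer:
    """Hypothetical layer that executes an import statement on every call."""

    def forward(self, x: float) -> float:
        # Runs each time forward() is called; after the first call this is a
        # lookup in the sys.modules cache, not a fresh module load.
        import math as _math
        return _math.sqrt(x)


class HoistedImportLayer:
    """The same hypothetical layer with the import hoisted to module scope."""

    def forward(self, x: float) -> float:
        # Uses the name bound once at module load time.
        return math.sqrt(x)
```

Both variants return the same result. Local imports are sometimes kept deliberately, for example to avoid a circular import or to defer an optional dependency, so hoisting is a judgment call rather than a rule.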

Comment thread src/mcore_bridge/model/modules/mtp_layer.py
Comment thread src/mcore_bridge/bridge/gpt_bridge.py
Comment thread src/mcore_bridge/model/gpts/qwen3_next.py
Comment thread src/mcore_bridge/model/modules/mtp_layer.py
Comment thread src/mcore_bridge/model/modules/mtp_layer.py Outdated
@Jintao-Huang
Collaborator Author

/gemini review


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the Multi-Token Prediction (MTP) implementation by replacing monkey-patching with a dedicated MultiTokenPredictionLayer subclass that includes FP8 support. It also updates model registration and bridge logic to accommodate these changes. Feedback highlights a potential logic error in sequence parallel handling that could lead to incorrect tensor sharding, identifies dead code related to the unused _fp8_skip_modules variable, and suggests correcting the forward method's docstring to match its actual return type.
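
As a generic illustration of the refactoring direction described in this summary (not the actual implementation in this PR), the difference between monkey-patching a method and overriding it in a subclass might look like the sketch below. `BaseLayer`, `patch_base_layer`, and `CustomLayer` are hypothetical names, and the arithmetic is a placeholder for real layer logic.

```python
class BaseLayer:
    """Stand-in for an upstream layer class (hypothetical)."""

    def forward(self, x: int) -> int:
        return x + 1


def patch_base_layer() -> None:
    """Monkey-patching: replace the method on the upstream class itself.

    Once this runs, every user of BaseLayer sees the changed behavior.
    """
    original = BaseLayer.forward

    def patched_forward(self, x: int) -> int:
        return original(self, x) * 2

    BaseLayer.forward = patched_forward


class CustomLayer(BaseLayer):
    """Subclassing: the override lives in its own class and BaseLayer is untouched."""

    def forward(self, x: int) -> int:
        return super().forward(x) * 2
```

With the subclass, the modified behavior applies only where `CustomLayer` is instantiated, whereas the patch changes `BaseLayer` globally from the moment `patch_base_layer()` is called.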

Comment thread src/mcore_bridge/model/modules/mtp_layer.py
Comment thread src/mcore_bridge/bridge/gpt_bridge.py
Comment thread src/mcore_bridge/model/modules/mtp_layer.py Outdated
@Jintao-Huang Jintao-Huang merged commit 2ba71e8 into modelscope:main Apr 17, 2026
1 check passed

2 participants