perf: Qwen Image Optimize. #1242

Open
shan-chen-feng wants to merge 2 commits into jd-opensource:release/v0.9.0 from shan-chen-feng:090_optimize

@shan-chen-feng
Collaborator

No description provided.

@shan-chen-feng changed the title from "perf: Quentin" to "perf: Qwen Image Optimize." on Apr 9, 2026
Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request refactors the attention mechanism in the Qwen image editing pipeline, particularly for NPU environments, so that communication can overlap with computation under sequence parallelism. The key changes are:

- The dit_sp_communication_overlap flag changes from an integer to a boolean.
- Management and application of Rotary Positional Embeddings (RoPE) are centralized in QwenImageEditPlusPipelineImpl.
- The attention processor is refactored into a base class with two derived implementations: one for standard processing and another (QwenDoubleStreamAttnProcessorCMO2_0Impl) designed for communication-computation overlap using all_to_all_4D operations and npu_fusion_attention. The processor is now chosen dynamically based on the new boolean flag.
- The AttentionImpl class now explicitly tracks query heads, key-value heads, and head dimensions, and its output projection structure has been simplified.

There are no review comments to address.
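
As a rough illustration of the flag-based processor selection summarized in this review, the following Python sketch picks between two classes derived from a common base. All names here (`AttnProcessorBase`, `StandardAttnProcessor`, `OverlapAttnProcessor`, `make_attn_processor`) are hypothetical stand-ins rather than identifiers from the PR, the method bodies are placeholders for the real attention kernels, and the strict boolean check is an assumption of this sketch, not confirmed behavior.

```python
from abc import ABC, abstractmethod


class AttnProcessorBase(ABC):
    """Hypothetical base class for logic shared by both processors."""

    @abstractmethod
    def describe(self) -> str:
        """Return a short label for the processing path (placeholder)."""


class StandardAttnProcessor(AttnProcessorBase):
    """Stand-in for the standard attention path."""

    def describe(self) -> str:
        return "standard"


class OverlapAttnProcessor(AttnProcessorBase):
    """Stand-in for the communication-computation overlap path."""

    def describe(self) -> str:
        return "communication-computation overlap"


def make_attn_processor(communication_overlap: bool) -> AttnProcessorBase:
    # Assumption: only a real boolean is accepted, echoing the flag's
    # change from an integer to a boolean described in the review.
    if not isinstance(communication_overlap, bool):
        raise TypeError("communication_overlap must be a bool")
    if communication_overlap:
        return OverlapAttnProcessor()
    return StandardAttnProcessor()
```

Keeping the choice in one factory function means callers depend only on the base type, so the overlap path can be toggled by the flag without touching the code that invokes the processor.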
