vulkan: use flops instead of weight tensor size for submission heuristic #25005

Open
0cc4m wants to merge 3 commits into master from 0cc4m/vulkan-submission-threshold-flops
@0cc4m 0cc4m commented Jun 25, 2026

Contributor

Overview

My guess is that the long-running DeviceLost errors on AMD and Intel, which are caused by submission timeouts, come from the submission batching logic treating only matmuls as special, by taking the combined size of their weight matrices into account. But Flash Attention and convolutions are also very heavy operations that should be considered here. Weight matrix size does not work as a cost measure for them in the same way, so in this PR I'm trying to use FLOPs instead of weight matrix size for submission estimation. That should also be easier to extend to other operators that might be added in the future.
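
To illustrate the idea, here is a minimal sketch of FLOPs-based batching. It is not the code from this PR: `matmul_flops` and `SubmitBatcher` are hypothetical names, and the 2*m*n*k estimate is the usual rough cost of a dense matmul, assumed here for illustration.

```cpp
#include <cstdint>

// Hypothetical cost model: multiplying an (m x k) matrix by a (k x n) matrix
// takes roughly 2*m*n*k floating-point operations (one multiply and one add
// per accumulated term).
constexpr uint64_t matmul_flops(uint64_t m, uint64_t n, uint64_t k) {
    return 2 * m * n * k;
}

// Hypothetical batcher: accumulates the estimated FLOPs of recorded operators
// and signals a submit once the configured threshold is reached.
struct SubmitBatcher {
    uint64_t flops_per_submit;
    uint64_t pending_flops;

    explicit SubmitBatcher(uint64_t threshold)
        : flops_per_submit(threshold), pending_flops(0) {}

    // Adds one operator's estimated cost. Returns true when the caller
    // should submit the current batch; the counter is reset in that case.
    bool add(uint64_t op_flops) {
        pending_flops += op_flops;
        if (pending_flops >= flops_per_submit) {
            pending_flops = 0;
            return true;
        }
        return false;
    }
};
```

Because the batcher only sees a FLOPs number, other heavy operators such as Flash Attention or convolutions could plug in their own estimate functions without changing the batching logic.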

This also fixes a bug where previously `uint64_t mul_mat_bytes_per_submit = std::min(uint64_t(100*1000*1000), ctx->last_total_mul_mat_bytes / 40u);` would always submit each operator separately on the first run: `ctx->last_total_mul_mat_bytes` starts out as 0, so the second argument is 0, which is always smaller than the 100 MB cap, and the threshold becomes 0.
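
The first-run behaviour can be reproduced in isolation. In the sketch below, `threshold_old` mirrors the expression quoted above, while `threshold_guarded` is a hypothetical guard shown only for contrast; it is an assumption and not necessarily how this PR fixes the problem.

```cpp
#include <algorithm>
#include <cstdint>

// Threshold as computed by the expression quoted above.
uint64_t threshold_old(uint64_t last_total_mul_mat_bytes) {
    return std::min(uint64_t(100 * 1000 * 1000), last_total_mul_mat_bytes / 40u);
}

// Hypothetical guard (an assumption, not the PR's actual fix): when no
// previous total is known yet, fall back to the cap instead of 0.
uint64_t threshold_guarded(uint64_t last_total_mul_mat_bytes) {
    const uint64_t cap = 100 * 1000 * 1000;
    if (last_total_mul_mat_bytes == 0) {
        return cap;
    }
    return std::min(cap, last_total_mul_mat_bytes / 40u);
}
```

With a previous total of 0, `threshold_old` returns 0, so any operator with a non-zero cost immediately exceeds the threshold and is submitted on its own.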

I'm still trying to reproduce the DeviceLost error on one of my devices.

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES, Claude was used for assistance, code was reviewed and tested by me.

@0cc4m 0cc4m requested a review from a team as a code owner June 25, 2026 13:02
@github-actions github-actions Bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Jun 25, 2026