cuda: sanitize invalid Blackwell sharedMemPerBlockOptin by wgu9 · Pull Request #24991 · ggml-org/llama.cpp

wgu9 · 2026-06-25T01:53:20Z

Some Blackwell CUDA driver/device combinations can report an invalid sharedMemPerBlockOptin value. Sanitize that value during CUDA device initialization and fall back to sharedMemPerBlock when the opt-in value is zero or larger than sharedMemPerMultiprocessor.
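
The fallback rule described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `SharedMemProps` and `sanitize_smpbo` are hypothetical names, and the struct only mirrors the three `cudaDeviceProp` fields mentioned in the description so that the logic can be read and compiled without the CUDA toolkit.

```cpp
#include <cstddef>

// Hypothetical stand-in for the three cudaDeviceProp fields named in the
// PR description; the real code would read them from the CUDA runtime.
struct SharedMemProps {
    std::size_t sharedMemPerBlock;          // default per-block limit
    std::size_t sharedMemPerBlockOptin;     // opt-in per-block limit (may be bogus)
    std::size_t sharedMemPerMultiprocessor; // total shared memory per SM
};

// Returns the opt-in shared-memory size to use. If the reported opt-in value
// is zero or exceeds the per-SM total, fall back to the conservative
// sharedMemPerBlock value; otherwise pass the reported value through.
inline std::size_t sanitize_smpbo(const SharedMemProps & p) {
    const bool invalid = p.sharedMemPerBlockOptin == 0 ||
                         p.sharedMemPerBlockOptin > p.sharedMemPerMultiprocessor;
    return invalid ? p.sharedMemPerBlock : p.sharedMemPerBlockOptin;
}
```

A valid opt-in value passes through unchanged, so under these assumptions the guard only changes behavior on devices that report an out-of-range value.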

Validation:

RTX 5090
SM120 CUDA build passed
test-backend-ops CUDA0 MUL_MAT passed 1134/1134

am17an · 2026-06-25T11:30:42Z

@ggml-org/nvidia there have been multiple PRs that attempt to "fix" this issue. I'm now wondering whether this is a real issue.

wgu9 · 2026-06-26T05:08:39Z

Thanks for calling that out. I agree this should not merge unless the device-property issue is real and this PR is not just another speculative Blackwell workaround.

What I verified before opening this:

This came from an RTX 5090 / SM120 CUDA build where sharedMemPerBlockOptin was reported outside the usable per-SM limit during ggml CUDA init.
Falling back to sharedMemPerBlock let CUDA initialization continue with a conservative value instead of propagating an invalid opt-in shared-memory size into later launch/resource decisions.
After the guard, my local CUDA validation passed: SM120 CUDA build and test-backend-ops CUDA0 MUL_MAT passed 1134/1134.
I also searched current open and closed PRs/issues for the same sharedMemPerBlockOptin / sharedMemPerMultiprocessor guard and did not find a direct duplicate.

If the NVIDIA maintainers think the driver/device report should be treated as impossible or fixed lower in the stack, I am fine closing this. The intent here is only to add a narrow defensive guard around an invalid device property, not to mask unrelated Blackwell issues.

cuda : sanitize invalid Blackwell smpbo values

b01b4fa

wgu9 requested a review from a team as a code owner June 25, 2026 01:53

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and CUDA (related to the CUDA backend) labels Jun 25, 2026

wgu9 wants to merge 1 commit into ggml-org:master from wgu9:fix-cuda-blackwell-smpbo-sanitize
