cuda : prevent integer truncation and overflow errors when using KQ mask strides in flash_attn_mask_to_KV_max kernel #24945

Open
fairydreaming wants to merge 1 commit into ggml-org:master from fairydreaming:stride-narrow-conv-fix