fix: correct subtyping for sparse matrices in cuSPARSE by gdalle · Pull Request #3105 · JuliaGPU/CUDA.jl

gdalle · 2026-04-16T10:53:38Z

Fixes #3104

gdalle · 2026-04-16T11:14:19Z

Should I add tests?

kshyatt · 2026-04-16T11:40:07Z

Is this going to break the broadcasting tests, which rely on things being subtypes of the abstract GPUArrays types?

gdalle · 2026-04-16T11:41:34Z

No, this shouldn't be breaking because AbstractCuSparseMatrix itself subtypes AbstractGPUSparseMatrix

kshyatt · 2026-04-16T11:44:22Z

That wasn't what I asked. Is it going to break this logic?

gdalle · 2026-04-16T11:46:54Z

If I understand correctly, this line still applies to CuSparseMatrixCSC/CSR so all should be good?

https://github.com/JuliaGPU/GPUArrays.jl/blob/21d6bc6f3c3c8dc2a1d2dbd07c5eb177fdaf179d/src/host/sparse.jl#L290

kshyatt · 2026-04-16T11:47:23Z

I don't think everything will be fine because of this

gdalle · 2026-04-16T11:50:49Z

Oh right I understand now. Yeah, this will break.
I opened #3104 because a recent version of CUDA broke SparseMatrixColorings, where I assumed that all CUDA sparse matrices are subtypes of AbstractCuSparseMatrix (some even have the subtyping in their docstring by mistake). If that is no longer true because the CSC/CSR distinction happens at the GPUArrays level and we prefer safeguarding that hierarchy, what is AbstractCuSparseMatrix even for?

Missed that AbstractCuSparseMatrix doesn't have AbstractGPUSparseMatrixCSR in its hierarchy.

kshyatt · 2026-04-16T11:52:18Z

Right now it's kind of functioning as a dumping ground for things that aren't CSC and CSR 😅 . Ideally what we'd do is add broadcasting support for COO and BSR so we can fold them in to the wider AbstractGPUSparseMatrix*** system

gdalle · 2026-04-16T11:55:26Z

Okay then feel free to close this PR, and I'll make a note to not use AbstractCuSparseMatrix anymore (see JuliaDiff/SparseMatrixColorings.jl#314)

kshyatt · 2026-04-16T11:55:49Z

Sorry for the confusion here! It's kind of an ongoing process, as you can see...

github-actions

CUDA.jl Benchmarks

Details

Benchmark suite	Current: `2c11d9a`	Previous: `af6961a`	Ratio
`array/accumulate/Float32/1d`	`101885` ns	`101502.5` ns	`1.00`
`array/accumulate/Float32/dims=1`	`77180` ns	`76629.5` ns	`1.01`
`array/accumulate/Float32/dims=1L`	`1586558` ns	`1585691.5` ns	`1.00`
`array/accumulate/Float32/dims=2`	`144533.5` ns	`143840` ns	`1.00`
`array/accumulate/Float32/dims=2L`	`658693.5` ns	`657866.5` ns	`1.00`
`array/accumulate/Int64/1d`	`119291` ns	`119085` ns	`1.00`
`array/accumulate/Int64/dims=1`	`80352` ns	`79723` ns	`1.01`
`array/accumulate/Int64/dims=1L`	`1695194.5` ns	`1694532` ns	`1.00`
`array/accumulate/Int64/dims=2`	`156647` ns	`156114` ns	`1.00`
`array/accumulate/Int64/dims=2L`	`962826` ns	`961892` ns	`1.00`
`array/broadcast`	`20872` ns	`20629` ns	`1.01`
`array/construct`	`1267` ns	`1256.5` ns	`1.01`
`array/copy`	`18257` ns	`17886` ns	`1.02`
`array/copyto!/cpu_to_gpu`	`218082` ns	`212623` ns	`1.03`
`array/copyto!/gpu_to_cpu`	`283779.5` ns	`283142` ns	`1.00`
`array/copyto!/gpu_to_gpu`	`10733` ns	`10702` ns	`1.00`
`array/iteration/findall/bool`	`135007` ns	`134713` ns	`1.00`
`array/iteration/findall/int`	`150383.5` ns	`149574` ns	`1.01`
`array/iteration/findfirst/bool`	`82540` ns	`81484` ns	`1.01`
`array/iteration/findfirst/int`	`84719` ns	`83536` ns	`1.01`
`array/iteration/findmin/1d`	`88840` ns	`85967.5` ns	`1.03`
`array/iteration/findmin/2d`	`117471` ns	`116482` ns	`1.01`
`array/iteration/logical`	`200391` ns	`199753` ns	`1.00`
`array/iteration/scalar`	`67244` ns	`67105.5` ns	`1.00`
`array/permutedims/2d`	`52813` ns	`52568` ns	`1.00`
`array/permutedims/3d`	`53243` ns	`53057` ns	`1.00`
`array/permutedims/4d`	`52187` ns	`51868.5` ns	`1.01`
`array/random/rand/Float32`	`13794` ns	`12842` ns	`1.07`
`array/random/rand/Int64`	`26019` ns	`25396` ns	`1.02`
`array/random/rand!/Float32`	`8899.666666666666` ns	`8414.666666666666` ns	`1.06`
`array/random/rand!/Int64`	`22129` ns	`21981` ns	`1.01`
`array/random/randn/Float32`	`37987` ns	`37382.5` ns	`1.02`
`array/random/randn!/Float32`	`30899` ns	`30580` ns	`1.01`
`array/reductions/mapreduce/Float32/1d`	`34669` ns	`34346` ns	`1.01`
`array/reductions/mapreduce/Float32/dims=1`	`39945` ns	`39918.5` ns	`1.00`
`array/reductions/mapreduce/Float32/dims=1L`	`51321` ns	`51509` ns	`1.00`
`array/reductions/mapreduce/Float32/dims=2`	`56515` ns	`56563` ns	`1.00`
`array/reductions/mapreduce/Float32/dims=2L`	`69450.5` ns	`69236` ns	`1.00`
`array/reductions/mapreduce/Int64/1d`	`43066` ns	`42783` ns	`1.01`
`array/reductions/mapreduce/Int64/dims=1`	`51270` ns	`42869` ns	`1.20`
`array/reductions/mapreduce/Int64/dims=1L`	`87296` ns	`87545` ns	`1.00`
`array/reductions/mapreduce/Int64/dims=2`	`59402` ns	`59325` ns	`1.00`
`array/reductions/mapreduce/Int64/dims=2L`	`84700` ns	`85131` ns	`0.99`
`array/reductions/reduce/Float32/1d`	`34994` ns	`34338` ns	`1.02`
`array/reductions/reduce/Float32/dims=1`	`49664` ns	`48895` ns	`1.02`
`array/reductions/reduce/Float32/dims=1L`	`51712` ns	`51567` ns	`1.00`
`array/reductions/reduce/Float32/dims=2`	`56672` ns	`56622` ns	`1.00`
`array/reductions/reduce/Float32/dims=2L`	`69967` ns	`69625` ns	`1.00`
`array/reductions/reduce/Int64/1d`	`42893` ns	`42697` ns	`1.00`
`array/reductions/reduce/Int64/dims=1`	`42295` ns	`47176.5` ns	`0.90`
`array/reductions/reduce/Int64/dims=1L`	`87293` ns	`87361` ns	`1.00`
`array/reductions/reduce/Int64/dims=2`	`59586` ns	`59337` ns	`1.00`
`array/reductions/reduce/Int64/dims=2L`	`84606` ns	`84717` ns	`1.00`
`array/reverse/1d`	`17873` ns	`18051` ns	`0.99`
`array/reverse/1dL`	`68581` ns	`68628` ns	`1.00`
`array/reverse/1dL_inplace`	`65762` ns	`65835` ns	`1.00`
`array/reverse/1d_inplace`	`10334` ns	`8676.333333333334` ns	`1.19`
`array/reverse/2d`	`20742` ns	`20541` ns	`1.01`
`array/reverse/2dL`	`72870` ns	`72496` ns	`1.01`
`array/reverse/2dL_inplace`	`65853` ns	`65804` ns	`1.00`
`array/reverse/2d_inplace`	`9986` ns	`10104` ns	`0.99`
`array/sorting/1d`	`2736964` ns	`2734890.5` ns	`1.00`
`array/sorting/2d`	`1070069` ns	`1069333` ns	`1.00`
`array/sorting/by`	`3305529.5` ns	`3305442` ns	`1.00`
`cuda/synchronization/context/auto`	`1143.3` ns	`1142.4` ns	`1.00`
`cuda/synchronization/context/blocking`	`923.5405405405405` ns	`918.9428571428572` ns	`1.01`
`cuda/synchronization/context/nonblocking`	`7128.8` ns	`7080.9` ns	`1.01`
`cuda/synchronization/stream/auto`	`1001.5714285714286` ns	`983.3846153846154` ns	`1.02`
`cuda/synchronization/stream/blocking`	`823.6436781609195` ns	`819.060606060606` ns	`1.01`
`cuda/synchronization/stream/nonblocking`	`7053.8` ns	`7951.1` ns	`0.89`
`integration/byval/reference`	`143725` ns	`143938` ns	`1.00`
`integration/byval/slices=1`	`145783` ns	`145713` ns	`1.00`
`integration/byval/slices=2`	`284767` ns	`284653` ns	`1.00`
`integration/byval/slices=3`	`422924` ns	`422978` ns	`1.00`
`integration/cudadevrt`	`102332` ns	`102487` ns	`1.00`
`integration/volumerhs`	`23447510` ns	`23490453` ns	`1.00`
`kernel/indexing`	`13255` ns	`13409` ns	`0.99`
`kernel/indexing_checked`	`13922` ns	`14040` ns	`0.99`
`kernel/launch`	`2139.4444444444443` ns	`2267.4444444444443` ns	`0.94`
`kernel/occupancy`	`706.8611111111111` ns	`828.1463414634146` ns	`0.85`
`kernel/rand`	`14503` ns	`14466` ns	`1.00`
`latency/import`	`3805709559` ns	`3809615751` ns	`1.00`
`latency/precompile`	`4569868964` ns	`4560767959.5` ns	`1.00`
`latency/ttfp`	`4380765383` ns	`4401798424` ns	`1.00`

This comment was automatically generated by workflow using github-action-benchmark.

fix: correct subtyping for sparse matrices in cuSPARSE

2c11d9a

maleadt previously approved these changes Apr 16, 2026

View reviewed changes

gdalle marked this pull request as draft April 16, 2026 11:50

github-actions Bot reviewed Apr 16, 2026

View reviewed changes

gdalle closed this Apr 18, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: correct subtyping for sparse matrices in cuSPARSE#3105

fix: correct subtyping for sparse matrices in cuSPARSE#3105
gdalle wants to merge 1 commit intoJuliaGPU:masterfrom
gdalle:gd/subtyping

gdalle commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

gdalle commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

gdalle commented Apr 16, 2026

Uh oh!

kshyatt commented Apr 16, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

CUDA.jl Benchmarks

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants