TracedRScalar → TracedRNumber
246cd00 to 40af781
113c2b5 to 10495fc
Reactant.jl Benchmarks
| Benchmark suite | Current: 8a9f06c | Previous: f2c0e8a | Ratio |
|---|---|---|---|
| ViT base (256 x 256 x 3 x 32)/forward/CUDA/Reactant | 1318556698 ns | 1315729546 ns | 1.00 |
| ViT base (256 x 256 x 3 x 32)/forward/CUDA/Lux | 213965204 ns | 212083499 ns | 1.01 |
| ViT base (256 x 256 x 3 x 32)/forward/CPU/Reactant | 6804934419 ns | 5286469750 ns | 1.29 |
| ViT base (256 x 256 x 3 x 32)/forward/CPU/Lux | 18511487331 ns | 23583347555 ns | 0.78 |
| ViT small (256 x 256 x 3 x 4)/forward/CUDA/Reactant | 1255068856 ns | 1254858296 ns | 1.00 |
| ViT small (256 x 256 x 3 x 4)/forward/CUDA/Lux | 8751128 ns | 8478570 ns | 1.03 |
| ViT small (256 x 256 x 3 x 4)/forward/CPU/Reactant | 1643973338 ns | 1636237670 ns | 1.00 |
| ViT small (256 x 256 x 3 x 4)/forward/CPU/Lux | 2060290806 ns | 2376437823 ns | 0.87 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CUDA/Reactant | 1304838533 ns | 1266018905 ns | 1.03 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CUDA/Lux | 93074730.5 ns | 84820407 ns | 1.10 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CPU/Reactant | 2253189844 ns | 2170879105 ns | 1.04 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CPU/Lux | 6064963189 ns | 4675094299 ns | 1.30 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CUDA/Reactant | 1297977475 ns | 1263496480 ns | 1.03 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CUDA/Lux | 7525710.5 ns | 7782824 ns | 0.97 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CPU/Reactant | 1466869310 ns | 1467043032.5 ns | 1.00 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CPU/Lux | 1361212717 ns | 1685775445 ns | 0.81 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CUDA/Reactant | 1314689632 ns | 1306815930 ns | 1.01 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CUDA/Lux | 11418734.5 ns | 11611908 ns | 0.98 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CPU/Reactant | 1752787751.5 ns | 1752808523 ns | 1.00 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CPU/Lux | 2629684857 ns | 2463987825.5 ns | 1.07 |
| ViT small (256 x 256 x 3 x 16)/forward/CUDA/Reactant | 1276847603.5 ns | 1325877558.5 ns | 0.96 |
| ViT small (256 x 256 x 3 x 16)/forward/CUDA/Lux | 86382264 ns | 90330187 ns | 0.96 |
| ViT small (256 x 256 x 3 x 16)/forward/CPU/Reactant | 2216207055 ns | 2213119086 ns | 1.00 |
| ViT small (256 x 256 x 3 x 16)/forward/CPU/Lux | 3548437058 ns | 4023816395 ns | 0.88 |
| ViT small (256 x 256 x 3 x 32)/forward/CUDA/Reactant | 1308980482.5 ns | 1270812264 ns | 1.03 |
| ViT small (256 x 256 x 3 x 32)/forward/CUDA/Lux | 116489754.5 ns | 113097539 ns | 1.03 |
| ViT small (256 x 256 x 3 x 32)/forward/CPU/Reactant | 3037687031 ns | 3042643080 ns | 1.00 |
| ViT small (256 x 256 x 3 x 32)/forward/CPU/Lux | 9576755622 ns | 8210106924.5 ns | 1.17 |
| ViT base (256 x 256 x 3 x 16)/forward/CUDA/Reactant | 1332370009 ns | 1324054039 ns | 1.01 |
| ViT base (256 x 256 x 3 x 16)/forward/CUDA/Lux | 126824571 ns | 127669686.5 ns | 0.99 |
| ViT base (256 x 256 x 3 x 16)/forward/CPU/Reactant | 3191878044 ns | 3203794253 ns | 1.00 |
| ViT base (256 x 256 x 3 x 16)/forward/CPU/Lux | 7124639562 ns | 11004907984 ns | 0.65 |
| ViT base (256 x 256 x 3 x 4)/forward/CUDA/Reactant | 1280201506 ns | 1299288245 ns | 0.99 |
| ViT base (256 x 256 x 3 x 4)/forward/CUDA/Lux | 83942324 ns | 96277750 ns | 0.87 |
| ViT base (256 x 256 x 3 x 4)/forward/CPU/Reactant | 1900998509 ns | 2155333265.5 ns | 0.88 |
| ViT base (256 x 256 x 3 x 4)/forward/CPU/Lux | 2374950424 ns | 2863535293.5 ns | 0.83 |
This comment was automatically generated by workflow using github-action-benchmark.
Benchmark Results
Benchmark Plots: A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
|
Let me just merge #163 after CI checks pass and rebase this PR on top of it because it adds some tests that should be passed by |
wsmoses left a comment
LGTM, only a minor comment: would defining a traced value type as Union{TracedRArray, TracedRNumber} simplify some things?
|
Do we also need a ConcreteRScalar? If we want to extend |
|
For now we can leave concrete as is |
|
I don't think we need a concrete scalar type, because indexing on a concrete array is just a normal number. |
I meant more from a tracing perspective. What should |
|
So I think these are two different questions.
For now I'm fine leaving them as is, which is to compile them as constants.
For example, we may want a function with a user-defined index offset; without a concrete scalar we would need to recompile for each index. I do think we need such a scalar, even if the default conversion behavior only converts arrays.
|
Even more simply though, we may need it as the return type of a function with array inputs and a scalar output. i.e. |
|
Looking at the above example, I also realized the mapreduce semantics are incorrect at present. For example
We could store it like an array in the ConcreteRArray but expose it to the end user as a scalar. JuliaGPU/GPUArrays.jl#550 is something similar in the GPUArrays world. But let's keep this PR simple and avoid this for now.
|
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

    @@            Coverage Diff             @@
    ##             main     #161      +/-   ##
    ==========================================
    - Coverage   33.09%   30.62%   -2.47%
    ==========================================
      Files          37       38       +1
      Lines        5107     5175      +68
    ==========================================
    - Hits         1690     1585     -105
    - Misses       3417     3590     +173

wsmoses left a comment
a small comment but otherwise lgtm
|
There's a bug on concatenation of TracedRNumber. In particular, if we run this kernel:

    using Reactant

    x = fill(true)
    x_concrete = Reactant.to_rarray(x)

    function traced_vcat(x)
        a = x[]
        [a; a; a]
    end

    f = @compile traced_vcat(x_concrete)
    f(x_concrete)

It fails with the following error:

    ERROR: BoundsError: attempt to access 3-element Vector{ConcreteRArray{Bool}} at index [1]
    Stacktrace:
     [1] traced_getfield
       @ ~/Developer/Reactant.jl/src/Compiler.jl:18 [inlined]
     [2] macro expansion
       @ ~/Developer/Reactant.jl/src/Compiler.jl:649 [inlined]
     [3] (::Reactant.Compiler.Thunk{Symbol("##test_vcat_reactant#229")})(args::ConcreteRArray{Bool, 0})
       @ Reactant.Compiler ~/Developer/Reactant.jl/src/Compiler.jl:665
     [4] top-level scope
       @ REPL[10]:1

The error seems to be that it's not doing any XLA call and it's setting the returned result to 3 empty buffers. Check out the generated Julia code of f:

    quote
        #= /Users/mofeing/Developer/Reactant.jl/src/Compiler.jl:639 =#
        #= /Users/mofeing/Developer/Reactant.jl/src/Compiler.jl:640 =#
        nothing
        #= /Users/mofeing/Developer/Reactant.jl/src/Compiler.jl:647 =#
        usbuf_1 = (getindex(args, 1)).data
        sbuf_1 = XLA.synced_buffer(usbuf_1)
        #= /Users/mofeing/Developer/Reactant.jl/src/Compiler.jl:648 =#
        ()
        #= /Users/mofeing/Developer/Reactant.jl/src/Compiler.jl:649 =#
        result = (ConcreteRArray{Bool})[ConcreteRArray{Bool, 0}(Reactant.XLA.AsyncBuffer(Reactant.XLA.Buffer(Ptr{Nothing} @0x0000000000000000), nothing), ()), ConcreteRArray{Bool, 0}(Reactant.XLA.AsyncBuffer(Reactant.XLA.Buffer(Ptr{Nothing} @0x0000000000000000), nothing), ()), ConcreteRArray{Bool, 0}(Reactant.XLA.AsyncBuffer(Reactant.XLA.Buffer(Ptr{Nothing} @0x0000000000000000), nothing), ())]
        (traced_getfield(result, $(Expr(:quote, 1)))).data = (args[1]).data
        (traced_getfield(result, $(Expr(:quote, 2)))).data = (args[1]).data
        (traced_getfield(result, $(Expr(:quote, 3)))).data = (args[1]).data
        #= /Users/mofeing/Developer/Reactant.jl/src/Compiler.jl:650 =#
        return result
    end

|
That actually makes sense here, since it's making an array of values that are all available as args. What's the issue?
|
I don't get it. Its behavior should match the 0-dim array test, since we are passing a 0-dim array and it should return a vector. From the MLIR point of view, it should be calling |
|
ah I see, I think this is a bug on the new number and cat perhaps? |
|
I think we won't do any XLA call here; see the Julia fallbacks for Number:

    vcat(X::Number...) = hvcat_fill!(Vector{promote_typeof(X...)}(undef, length(X)), X)
    hcat(X::Number...) = hvcat_fill!(Matrix{promote_typeof(X...)}(undef, 1,length(X)), X)

It is going to fill into a regular array.
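The fallback behavior described above can be reproduced in plain Julia, without Reactant. The sketch below is only an illustration: `FakeTraced` and `FakeTracedVector` are hypothetical stand-ins invented here, not Reactant types, and the overload at the end is one possible way to intercept the call, not necessarily what this PR does.

```julia
# Hypothetical stand-in for a traced scalar: any subtype of Number.
struct FakeTraced <: Number
    val::Float64
end

a = FakeTraced(1.0)

# `[a; a; a]` lowers to `vcat(a, a, a)`. With no specialized method, this
# dispatches to Base's generic method for Numbers and builds an ordinary
# Julia Vector, so nothing array-like is ever traced.
v = [a; a; a]
@assert v isa Vector{FakeTraced}

# Hypothetical stand-in for a traced array result.
struct FakeTracedVector
    vals::Vector{Float64}
end

# A more specific method takes precedence over the Base fallback, which is
# where a tracing package could emit a concatenate operation instead.
Base.vcat(xs::FakeTraced...) = FakeTracedVector([x.val for x in xs])

w = [a; a; a]
@assert w isa FakeTracedVector
```

After the overload is defined, the same `[a; a; a]` syntax yields the custom container rather than a `Vector`, which is the kind of dispatch hook the comment above says is missing.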
|
Should work now:

    julia> function traced_vcat(x)
               a = x[];
               Float64[a; a; a]
           end
    traced_vcat (generic function with 1 method)

    julia> @code_hlo optimize=false traced_vcat(x_concrete)
    Module:
    module {
      func.func @main(%arg0: tensor<i1>) -> (tensor<3xf64>, tensor<i1>) {
        %0 = stablehlo.transpose %arg0, dims = [] : (tensor<i1>) -> tensor<i1>
        %1 = stablehlo.broadcast_in_dim %0, dims = [] : (tensor<i1>) -> tensor<1xi1>
        %2 = stablehlo.broadcast_in_dim %0, dims = [] : (tensor<i1>) -> tensor<1xi1>
        %3 = stablehlo.broadcast_in_dim %0, dims = [] : (tensor<i1>) -> tensor<1xi1>
        %4 = stablehlo.convert %1 : (tensor<1xi1>) -> tensor<1xf64>
        %5 = stablehlo.convert %2 : (tensor<1xi1>) -> tensor<1xf64>
        %6 = stablehlo.convert %3 : (tensor<1xi1>) -> tensor<1xf64>
        %7 = stablehlo.concatenate %4, %5, %6, dim = 0 : (tensor<1xf64>, tensor<1xf64>, tensor<1xf64>) -> tensor<3xf64>
        %8 = stablehlo.transpose %7, dims = [0] : (tensor<3xf64>) -> tensor<3xf64>
        %9 = stablehlo.transpose %0, dims = [] : (tensor<i1>) -> tensor<i1>
        return %8, %9 : tensor<3xf64>, tensor<i1>
      }
    }

    julia> @code_hlo traced_vcat(x_concrete)
    Module:
    module attributes {transform.with_named_sequence} {
      func.func @main(%arg0: tensor<i1>) -> tensor<3xf64> {
        %0 = stablehlo.convert %arg0 : (tensor<i1>) -> tensor<f64>
        %1 = stablehlo.broadcast_in_dim %0, dims = [] : (tensor<f64>) -> tensor<3xf64>
        return %1 : tensor<3xf64>
      }
    }

currently very WIP