Update to OrdinaryDiffEq.jl v7 and related SciML packages #2910
github-actions[bot] wants to merge 51 commits into main
5d38e62 to 4a12427
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2910 +/- ##
==========================================
- Coverage 97.13% 97.10% -0.03%
==========================================
Files 625 626 +1
Lines 48514 48544 +30
==========================================
+ Hits 47122 47137 +15
- Misses 1392 1407 +15
JoshuaLampert left a comment:
Needs dependencies like SummationByPartsOperators.jl to be updated first.
We need jlchan/StartUpDG.jl#219 to be merged and released first.
Co-authored-by: Copilot <copilot@github.com>
The current Windows MPI job still seems to hang, but Ubuntu now runs. It's also interesting that CI was consistently hanging in this PR, while I don't think we saw that in other PRs.
I am wondering if it has something to do with running on CI.
Looks like that was premature. Now it also seems to hang, but at a later stage than before.
On AArch64 it is:
Yes, that is because for some reason the OrdinaryDiffEqCore.jl extension is not loaded, see https://github.com/trixi-framework/Trixi.jl/actions/runs/25315807694/job/74212954070?pr=2910#step:7:1514.
Content of `stderr`:
Fatal error in internal_Init_thread: Other MPI error, error stack:
internal_Init_thread(71): MPI_Init_thread(argc=0x0, argv=0x0, required=1, provided=0x10a996010) failed
MPII_Init_thread(203)...:
MPIR_pmi_init(150)......:
pmi1_init(24)...........: PMI_Get_appnum returned -1
┌ Error: Error during loading of extension TrixiOrdinaryDiffEqCoreExt of Trixi, use `Base.retry_load_extensions()` to retry.
│ exception =
│ 1-element ExceptionStack:
│ Failed to precompile TrixiOrdinaryDiffEqCoreExt [7d8822bd-13ad-5232-9571-183eca7c0a3b] to "/Users/runner/.julia/compiled/v1.10/TrixiOrdinaryDiffEqCoreExt/jl_N1hfJh".
│ Stacktrace:
│ [1] error(s::String)
│ @ Base ./error.jl:35
│ [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool; loadable_exts::Vector{Base.PkgId})
│ @ Base ./loading.jl:2547
│ [3] compilecache
│ @ ./loading.jl:2414 [inlined]
│ [4] (::Base.var"#973#974"{Base.PkgId})()
│ @ Base ./loading.jl:2058
│ [5] mkpidlock(f::Base.var"#973#974"{Base.PkgId}, at::String, pid::Int32; kwopts::@Kwargs{stale_age::Int64, wait::Bool})
│ @ FileWatching.Pidfile ~/hostedtoolcache/julia/1.10.11/aarch64/share/julia/stdlib/v1.10/FileWatching/src/pidfile.jl:93
│ [6] #mkpidlock#6
│ @ ~/hostedtoolcache/julia/1.10.11/aarch64/share/julia/stdlib/v1.10/FileWatching/src/pidfile.jl:88 [inlined]
│ [7] trymkpidlock(::Function, ::Vararg{Any}; kwargs::@Kwargs{stale_age::Int64})
│ @ FileWatching.Pidfile ~/hostedtoolcache/julia/1.10.11/aarch64/share/julia/stdlib/v1.10/FileWatching/src/pidfile.jl:111
│ [8] #invokelatest#2
│ @ ./essentials.jl:894 [inlined]
│ [9] invokelatest
│ @ ./essentials.jl:889 [inlined]
│ [10] maybe_cachefile_lock(f::Base.var"#973#974"{Base.PkgId}, pkg::Base.PkgId, srcpath::String; stale_age::Int64)
│ @ Base ./loading.jl:3062
│ [11] maybe_cachefile_lock
│ @ ./loading.jl:3059 [inlined]
│ [12] _require(pkg::Base.PkgId, env::Nothing)
│ @ Base ./loading.jl:2044
│ [13] __require_prelocked(uuidkey::Base.PkgId, env::Nothing)
│ @ Base ./loading.jl:1886
│ [14] #invoke_in_world#3
│ @ ./essentials.jl:926 [inlined]
│ [15] invoke_in_world
│ @ ./essentials.jl:923 [inlined]
│ [16] _require_prelocked
│ @ ./loading.jl:1877 [inlined]
│ [17] _require_prelocked
│ @ ./loading.jl:1876 [inlined]
│ [18] run_extension_callbacks(extid::Base.ExtensionId)
│ @ Base ./loading.jl:1372
│ [19] run_extension_callbacks(pkgid::Base.PkgId)
│ @ Base ./loading.jl:1404
│ [20] run_package_callbacks(modkey::Base.PkgId)
│ @ Base ./loading.jl:1228
│ [21] _tryrequire_from_serialized(modkey::Base.PkgId, path::String, ocachepath::Nothing, sourcepath::String, depmods::Vector{Any})
│ @ Base ./loading.jl:1561
│ [22] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt128)
│ @ Base ./loading.jl:1648
│ [23] _require(pkg::Base.PkgId, env::String)
│ @ Base ./loading.jl:2012
│ [24] __require_prelocked(uuidkey::Base.PkgId, env::String)
│ @ Base ./loading.jl:1886
│ [25] #invoke_in_world#3
│ @ ./essentials.jl:926 [inlined]
│ [26] invoke_in_world
│ @ ./essentials.jl:923 [inlined]
│ [27] _require_prelocked(uuidkey::Base.PkgId, env::String)
│ @ Base ./loading.jl:1877
│ [28] macro expansion
│ @ ./loading.jl:1864 [inlined]
│ [29] macro expansion
│ @ ./lock.jl:270 [inlined]
│ [30] __require(into::Module, mod::Symbol)
│ @ Base ./loading.jl:1827
│ [31] #invoke_in_world#3
│ @ ./essentials.jl:926 [inlined]
│ [32] invoke_in_world
│ @ ./essentials.jl:923 [inlined]
│ [33] require(into::Module, mod::Symbol)
│ @ Base ./loading.jl:1820
│ [34] trixi_include(mapexpr::typeof(identity), mod::Module, elixir::String; enable_assignment_validation::Bool, replace_assignments_recursive::Bool, kwargs::@Kwargs{})
│ @ TrixiBase ~/.julia/packages/TrixiBase/MGeKl/src/trixi_include.jl:0
│ [35] trixi_include
│ @ ~/.julia/packages/TrixiBase/MGeKl/src/trixi_include.jl:49 [inlined]
│ [36] #trixi_include#6
│ @ ~/.julia/packages/TrixiBase/MGeKl/src/trixi_include.jl:86 [inlined]
│ [37] trixi_include
│ @ ~/.julia/packages/TrixiBase/MGeKl/src/trixi_include.jl:85 [inlined]
│ [38] (::Main.TestExamplesMPI.TestExamplesMPITreeMesh.TrixiTestModule.var"#2#4")()
│ @ Main.TestExamplesMPI.TestExamplesMPITreeMesh.TrixiTestModule ~/.julia/packages/TrixiTest/6CVoC/src/macros.jl:15
│ [39] (::Base.RedirectStdStream)(thunk::Main.TestExamplesMPI.TestExamplesMPITreeMesh.TrixiTestModule.var"#2#4", stream::IOStream)
│ @ Base ./stream.jl:1434
│ [40] #1
│ @ ~/.julia/packages/TrixiTest/6CVoC/src/macros.jl:14 [inlined]
│ [41] open(::Main.TestExamplesMPI.TestExamplesMPITreeMesh.TrixiTestModule.var"#1#3", ::String, ::Vararg{String}; kwargs::@Kwargs{})
│ @ Base ./io.jl:396
│ [42] open(::Function, ::String, ::String)
│ @ Base ./io.jl:393
│ [43] macro expansion
│ @ ~/.julia/packages/TrixiTest/6CVoC/src/macros.jl:13 [inlined]
│ [44] macro expansion
│ @ ~/work/Trixi.jl/Trixi.jl/test/test_trixi.jl:35 [inlined]
│ [45] macro expansion
│ @ ~/work/Trixi.jl/Trixi.jl/test/test_mpi_tree.jl:20 [inlined]
│ [46] macro expansion
│ @ ~/hostedtoolcache/julia/1.10.11/aarch64/share/julia/stdlib/v1.10/Test/src/Test.jl:1582 [inlined]
│ [47] macro expansion
│ @ ~/work/Trixi.jl/Trixi.jl/test/test_mpi_tree.jl:20 [inlined]
│ [48] top-level scope
│ @ ~/.julia/packages/TrixiTest/6CVoC/src/macros.jl:196
│ [49] eval(m::Module, e::Any)
│ @ Core ./boot.jl:385
│ [50] macro expansion
│ @ ~/.julia/packages/TrixiTest/6CVoC/src/macros.jl:172 [inlined]
│ [51] macro expansion
│ @ ~/work/Trixi.jl/Trixi.jl/test/test_mpi_tree.jl:19 [inlined]
│ [52] macro expansion
│ @ ~/hostedtoolcache/julia/1.10.11/aarch64/share/julia/stdlib/v1.10/Test/src/Test.jl:1582 [inlined]
│ --- the last 2 lines are repeated 1 more time ---
│ [55] top-level scope
│ @ ~/work/Trixi.jl/Trixi.jl/test/test_mpi_tree.jl:17
│ [56] include(mod::Module, _path::String)
│ @ Base ./Base.jl:495
│ [57] include(x::String)
│ @ Main.TestExamplesMPI ~/work/Trixi.jl/Trixi.jl/test/test_mpi.jl:1
│ [58] macro expansion
│ @ ~/work/Trixi.jl/Trixi.jl/test/test_mpi.jl:21 [inlined]
│ [59] macro expansion
│ @ ~/hostedtoolcache/julia/1.10.11/aarch64/share/julia/stdlib/v1.10/Test/src/Test.jl:1582 [inlined]
│ [60] top-level scope
│ @ ~/work/Trixi.jl/Trixi.jl/test/test_mpi.jl:21
│ [61] include(mod::Module, _path::String)
│ @ Base ./Base.jl:495
│ [62] exec_options(opts::Base.JLOptions)
│ @ Base ./client.jl:316
└ @ Base loading.jl:1378
Content of `stderr`:
[cli_1]: PMIU_write error; fd=10 buf=:cmd=init pmi_version=1 pmi_subversion=1
:
system msg for write_line failure : Bad file descriptor
[cli_1]: PMIU_write error; fd=10 buf=:cmd=abort exitcode=-1 message=PMI_Init failed
:
system msg for write_line failure : Bad file descriptor
[cli_1]: PMIU_write error; fd=10 buf=:cmd=get_appnum
:
system msg for write_line failure : Bad file descriptor
Any idea what is causing this?
During the precompilation of the package extension we call And we re-trigger precompilation because we use |
But shouldn't Lines 9 to 11 in 94cfb43 |
Precompilation runs in a new, isolated process, but that process still sees the environment variables that the MPI job launcher sets up. So it won't see that the parent process has set that global (but process-local) flag. (I am struggling to reproduce this locally on Arch Linux, so it might be something specific to how MPI works on macOS.)
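The distinction above, inherited environment variables versus process-local globals, can be sketched in plain Julia without MPI. This is only an illustration under stated assumptions: `DEMO_LAUNCHER_RANK` is a made-up variable name standing in for whatever the MPI launcher exports, and `DEMO_FLAG` is a hypothetical stand-in for the global flag mentioned above. Neither is part of Trixi.jl or MPI.jl.

```julia
# Process-local flag in the parent (hypothetical stand-in for the real global).
const DEMO_FLAG = Ref(false)
DEMO_FLAG[] = true

# Code executed by a fresh child process. It defines its own copy of the flag,
# which starts at the default value because nothing from the parent carries over.
child_code = """
const DEMO_FLAG = Ref(false)
println(get(ENV, "DEMO_LAUNCHER_RANK", "unset"))
println(DEMO_FLAG[])
"""

# `withenv` temporarily sets the variable in the parent; child processes
# spawned inside the block inherit it.
child_output = withenv("DEMO_LAUNCHER_RANK" => "1") do
    read(`$(Base.julia_cmd()) --startup-file=no -e $child_code`, String)
end

print(child_output)
```

The child reports the launcher variable but not the flag the parent set, which matches the situation described for the precompilation process.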
Ah, makes sense. Do you have an idea how to fix it?
Besides looking with sad eyes at @sloede and complaining about Trixi initializing MPI without user consent?
Let's see if e55a4d2 helps with anything...
Well, at least it is now correctly failing instead of hanging:
Of course, this time we didn't run the MPI test in the first place, and it is not ideal that the MPI CI job will now take twice as long. It also looks like some of these tests are not part of the normal CI? I can't find
That seems to have done it. We could also run
Looks good. But running the MPI tests twice is not good.
(see https://github.com/trixi-framework/Trixi.jl/actions/runs/25325340551/job/74244809681?pr=2910#step:7:20206) are something to worry about? These also appear before this PR though.
because when CI hangs, it looks like it already hangs for the first elixir it tries to run.
The hangs I reproduced locally all looked like #2990. The other ones seem to be related to the potential precompilation issue with the package extension, which I did not manage to reproduce locally.
Yeah, that feels like a precompilation/loading issue. We could also try a simple `using Trixi, OrdinaryDiffEqXXX` just to defend against the precompilation issue.
Yeah, that's particularly weird since that PR didn't have the extension yet. We don't have the log anymore for the |
This reverts commit e55a4d2.
Hangs on Windows again.
As far as I was told it is nothing to worry about. And it will be gone with t8code v4. |
We also had this again on aarch64:
if hasproperty(integrator, :qold)
    attributes(file)["time_integrator_qold"] = integrator.qold
elseif hasproperty(controller, :errold)
    attributes(file)["time_integrator_qold"] = controller.errold
elseif hasproperty(controller, :qold)
    attributes(file)["time_integrator_qold"] = controller.qold
elseif hasproperty(controller, :dt_factor)
    attributes(file)["time_integrator_qold"] = controller.dt_factor
end
Instead of checking for the individual fields, wouldn't it be better (prospectively) to have some store_controller! function, symmetric to the load function? Note that more controllers than PI and PID are already implemented.
Yes, I think so, too. I also just realized that in OrdinaryDiffEqCore.jl v3.32 and v3.33 the interface even differs from the one in <v3.31, see #2938. So to also support these versions, we need additional case distinctions. I am working on a refactor that includes a store_controller function defined in the extension.
    controller.err[:] = read(attributes(file)["time_integrator_controller_err"])
end

load_controller!(integrator, controller, file)
We just unpack the controller for the call, so we can encapsulate the "complexity" in the extension behind a load_controller!(integrator, file) call.
But then we cannot dispatch on the controller (cache) type?
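One way to reconcile the two points above is a two-argument entry point that unpacks the controller and forwards to methods dispatching on its type. The following is a minimal sketch, not the actual refactor: all `Demo...` types are hypothetical stand-ins (not OrdinaryDiffEqCore.jl types), a `Dict` replaces the HDF5 attributes, and storing the controller as a field of the integrator is an assumption made only for this illustration. The attribute keys are the ones from the diff snippets above.

```julia
# Hypothetical stand-ins; these are NOT the OrdinaryDiffEqCore.jl types.
abstract type AbstractDemoController end
mutable struct DemoPIController <: AbstractDemoController
    qold::Float64
end
struct DemoPIDController <: AbstractDemoController
    err::Vector{Float64}
end
struct DemoIntegrator{C <: AbstractDemoController}
    controller::C
end

# Two-argument entry points: unpack once, then dispatch on the controller type.
load_controller!(integrator::DemoIntegrator, file::AbstractDict) =
    load_controller!(integrator, integrator.controller, file)
store_controller!(file::AbstractDict, integrator::DemoIntegrator) =
    store_controller!(file, integrator, integrator.controller)

# One load/store pair per controller type keeps the two directions symmetric.
function load_controller!(integrator, controller::DemoPIController, file)
    controller.qold = file["time_integrator_qold"]
    return nothing
end
function store_controller!(file, integrator, controller::DemoPIController)
    file["time_integrator_qold"] = controller.qold
    return nothing
end

function load_controller!(integrator, controller::DemoPIDController, file)
    controller.err[:] = file["time_integrator_controller_err"]
    return nothing
end
function store_controller!(file, integrator, controller::DemoPIDController)
    file["time_integrator_controller_err"] = copy(controller.err)
    return nothing
end

# Usage: round-trip a PI controller through a Dict acting as the restart file.
file = Dict{String, Any}()
store_controller!(file, DemoIntegrator(DemoPIController(0.5)))
restored = DemoIntegrator(DemoPIController(0.0))
load_controller!(restored, file)
```

With this layout the call site only needs `load_controller!(integrator, file)`, while the type-specific logic still lives in methods selected by dispatch.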
# Trixi automatically initializes MPI, this causes issues if precompilation occurs under MPI.
# The below MPI test uses different compilation flags and thus we want to ensure that precompilation is done with the same flags.
run(`$(Base.julia_cmd()) --threads=1 --check-bounds=yes --heap-size-hint=0.5G -e "using Trixi, OrdinaryDiffEqCore"`)
This pull request changes the compat entry for the `RecursiveArrayTools` package from `3.37` to `3.37, 4`. This keeps the compat entries for earlier versions.
Note: I have not tested your package with this new compat entry.
It is your responsibility to make sure that your package tests pass before you merge this pull request.
Closes #2918, closes #2919, closes #2949, closes #2950, closes #2960, closes #2961, closes #2962, closes #2963, closes #2965, closes #2966, closes #2967, closes #2968, closes #2969, closes #2970, closes #2971, closes #2972, closes #2973, closes #2981, closes #2983, closes #2984.
The required changes are:
- `u_modified!` is replaced by `derivative_discontinuity!`
- `VectorOfArray` of `SVector`s now returns the underlying floating-point values, not the `SVector`; fixed by replacing `u` by `parent(u)`
- `thread` is now passed as `FastBroadcast.Serial()`/`FastBroadcast.Threaded()` instead of `True()`/`False()`.