Prevent init_mpi from being automatically called during precompilation. #2993
Ideally we would avoid all precompilation under MPI, and that is our recommendation to users. Yet we have seen the situation in CI where an extension of Trixi gets recompiled due to a difference in flags, leading to crashes and hangs.
sloede
left a comment
Checking :jl_generating_output is essentially free, right? Meaning, it will not cost us anything in terms of I/O, compute time, etc. that would be annoying when done on 10k ranks in parallel.
Also, is a (reasonable) situation possible where only a subset of ranks might trigger "precompiling = yes", subsequently causing hangs because global (in the MPI sense) init operations are not executed on all ranks? Or would these situations already be causes for other types of crashes?
In practice, crashing is not so bad (just annoying). What is really bad is a 10k-rank job running for 12 hours while being stuck in initialization...
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #2993 +/- ##
=======================================
Coverage 97.13% 97.13%
=======================================
Files 625 625
Lines 48514 48516 +2
=======================================
+ Hits 47122 47124 +2
Misses 1392 1392
Yeah it's a cheap check.
That's not possible. The point is that we want to initialize MPI only during the actual run, not in the precompilation process that we happened to create.
In practice, the current state leads to hangs, and it may lead to crashes. On CI we have seen it cause hangs on Ubuntu and Windows, and only on macOS did we get a crash.
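For readers following along, the kind of guard discussed here might look roughly like the sketch below. It is not the code from this PR: `is_precompiling`, `init_mpi_stub`, `maybe_init_mpi`, and `MPI_INITIALIZED` are hypothetical names, and the stub stands in for the real MPI setup so that the example runs without MPI installed. The only piece taken from the discussion is the `:jl_generating_output` query.

```julia
# Sketch only: all names below are hypothetical, not Trixi.jl's actual API.

# True while Julia is generating precompilation output, false in a normal run.
is_precompiling() = ccall(:jl_generating_output, Cint, ()) == 1

# Records whether the (stubbed) MPI setup has been executed.
const MPI_INITIALIZED = Ref(false)

# Stand-in for the real MPI initialization (assumption: the real code
# would call into MPI.jl here).
function init_mpi_stub()
    MPI_INITIALIZED[] = true
    return nothing
end

# Run the MPI setup only outside of precompilation.
# Returns true if initialization was performed, false if it was skipped.
# The keyword argument exists only so the behavior can be exercised by hand.
function maybe_init_mpi(; precompiling = is_precompiling())
    precompiling && return false
    init_mpi_stub()
    return true
end
```

In a package, a function like `maybe_init_mpi()` would typically be called from the module's `__init__`, so the check runs once per process when the module is loaded.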
Ideally we would avoid all precompilation under MPI, and that is our
recommendation to users. Yet we have seen the situation in CI where
an extension of Trixi gets recompiled due to a difference in flags,
leading to crashes and hangs.
See #2910 (comment) for an example
of that.