Switching of MPI communicator #139

Merged
sanathkeshav merged 6 commits into DataAnalyticsEngineering:develop from Snapex2409:switch-mpi-comm
May 7, 2026

@Snapex2409 Snapex2409 commented May 6, 2026

Contributor

This PR enables selecting between MPI_COMM_WORLD and MPI_COMM_SELF as the communicator.
The primary use case is pyFANS within the Micro Manager, which should support both single- and multi-level parallelization (workers).
Without workers, pyFANS instances are distributed across Micro Manager ranks. However, pyFANS and the Micro Manager then share the same MPI_COMM_WORLD communicator, which leads to incorrect computation. In such cases, parallel computation must be deactivated within pyFANS.
With workers, this issue does not arise, because each worker is a separate process and hence has its own MPI_COMM_WORLD communicator.

Checklist:

  • I made sure that the CI passed before asking for a review.
  • I added a summary of the changes (compared to the last release) in the CHANGELOG.md.
  • If necessary, I made changes to the documentation and/or added new content.
  • I will remember to squash-and-merge, providing a useful summary of the changes of this PR.

@Snapex2409 Snapex2409 marked this pull request as ready for review May 6, 2026 11:03
@IshaanDesai IshaanDesai requested a review from sanathkeshav May 6, 2026 11:24

@sanathkeshav sanathkeshav left a comment

Member

Hi @Snapex2409, thanks for the PR!

I agree with the communicator abstraction, but I do not think "no_mpi" belongs in the JSON input file. The JSON should describe the simulation problem, while the communicator is a runtime/execution concern. As it stands, it is potentially unsafe if "no_mpi": true is set and one runs mpirun -n 4 FANS inp.json result.h5.

If MPI_COMM_SELF functionality is really needed, I would prefer passing the communicator to the Reader constructor:

Reader reader(MPI_COMM_WORLD);  // or Reader reader(MPI_COMM_SELF);
reader.ReadInputFile(input_fn);

Then we can remove the "no_mpi" key entirely. With this design, I also do not see a need to split ReadInputFile() into ReadInputFile() and ReadJson().
This would make the PR much smaller.

So I would suggest simplifying this PR by removing "no_mpi", passing the communicator explicitly through the Reader constructor, and keeping ReadInputFile() as the single parsing function.
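
As an illustration of the constructor-based design suggested above, the following is a minimal sketch, not the actual FANS code. `ReaderSketch`, the `Comm` enum, and the accessor names are hypothetical; `Comm` stands in for the real `MPI_Comm` handle so that the example compiles without an MPI installation.

```cpp
#include <cassert>
#include <string>

// Assumption: stand-in for MPI_Comm. In FANS, the values would be the real
// handles MPI_COMM_WORLD and MPI_COMM_SELF.
enum class Comm { World, Self };

// Hypothetical reader that receives its communicator from the caller instead
// of reading a "no_mpi" key from the JSON input file.
class ReaderSketch {
public:
    explicit ReaderSketch(Comm comm) : comm_(comm) {}

    // Single parsing entry point; the actual JSON parsing is omitted here.
    void ReadInputFile(const std::string& input_fn) {
        assert(!input_fn.empty());  // precondition: a file name was given
        input_fn_ = input_fn;
    }

    Comm comm() const { return comm_; }
    const std::string& input_fn() const { return input_fn_; }

private:
    Comm comm_;             // chosen by the caller, never by the input file
    std::string input_fn_;  // remembered only for illustration
};
```

With this shape, the input file stays free of execution settings, and the caller decides which communicator the reader uses.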

Please let me know if I overlooked/misunderstood something. Thanks a lot!

@Snapex2409

Snapex2409 commented May 6, 2026

Contributor Author

If it is passed as an argument to the Reader, how should the communicator type be determined then?
Should it be fixed at compile time, as a CLI argument, or another option?

Based on how I understood your suggestion, it seemed to be fixed at compile time.
This is possible, but with the later PRs for pyFANS this will result in 2 builds per micro material type.

PS: Ideally, I require a method to dynamically change the MPI comm based on some config file for pyFANS.

@sanathkeshav

sanathkeshav commented May 6, 2026

Member

Thanks, that makes sense. I think there is a small misunderstanding, though: passing the communicator to the Reader constructor would not make this a compile-time decision.
The communicator can still be selected dynamically, and then passed to Reader as

Reader reader(comm);
reader.ReadInputFile(input_fn);

where comm is determined by you before constructing the Reader.
For example, for the FANS executable, I would set MPI_COMM_WORLD as comm in src/main.cpp when creating the Reader object.

For pyFANS, you should decide this dynamically from your runtime config and then pass the selected communicator.

My main point is that the communicator choice should live one level above Reader::ReadInputFile(). The input JSON describes the simulation problem, while the driver/pyFANS describes how the problem is executed.
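
The layering described here can be sketched as follows, under stated assumptions: `Comm` is a stand-in for `MPI_Comm`, and `ReaderSketch`, `make_reader_for_executable`, `make_reader_for_library`, and the `use_workers` flag are hypothetical names, not part of FANS or pyFANS. The mapping from workers to communicator follows the PR description: without workers, parallel computation inside pyFANS has to be deactivated.

```cpp
// Assumption: stand-in for MPI_Comm (MPI_COMM_WORLD / MPI_COMM_SELF).
enum class Comm { World, Self };

// Hypothetical reader that only stores the communicator it is given.
struct ReaderSketch {
    explicit ReaderSketch(Comm c) : comm(c) {}
    Comm comm;
};

// Driver for the standalone executable: always uses the world communicator,
// as suggested for src/main.cpp.
ReaderSketch make_reader_for_executable() {
    return ReaderSketch(Comm::World);
}

// Driver for a library front end such as pyFANS: the choice is made at run
// time, one level above the reader.
ReaderSketch make_reader_for_library(bool use_workers) {
    // With workers, each worker is its own process with its own world
    // communicator; without workers, the instance must stay serial.
    return ReaderSketch(use_workers ? Comm::World : Comm::Self);
}
```

In both drivers the reader itself is unchanged; only the code that constructs it differs.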

Please let me know if I have overlooked or misunderstood something.

@Snapex2409

Contributor Author

From a design perspective, I fully understand your point.

Regarding:

you should decide this dynamically from your runtime config...

In pyFANS, I cannot extend the constructor to accommodate that without requiring another API change for the Micro Manager. Since the input file was the only component configurable at run time, I initially implemented it there. Would it be acceptable, for the time being, to load a separate pyFANS config file?

@sanathkeshav

Member

Thanks, I understand the constraint.

Regarding pyFANS, you and @IshaanDesai, of course, have complete and absolute liberty to design the runtime/API in whatever way that works best for you (this also applies to subsequent PRs from you).

So yes, a separate pyFANS config file sounds much better to me, at least as an intermediate solution. Then pyFANS can read its own runtime config, decide whether it wants MPI_COMM_WORLD, MPI_COMM_SELF, or something else, and pass the corresponding communicator to the Reader.
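
One way such a runtime config could drive the choice is sketched below. The thread does not specify the file format, so the `key = value` layout, the `communicator` key, its values `world` and `self`, and the function names are all assumptions made for illustration; `Comm` again stands in for `MPI_Comm`.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Assumption: stand-in for MPI_Comm (MPI_COMM_WORLD / MPI_COMM_SELF).
enum class Comm { World, Self };

// Remove leading and trailing whitespace.
std::string trim(const std::string& s) {
    const auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    const auto first = std::find_if_not(s.begin(), s.end(), is_space);
    const auto last = std::find_if_not(s.rbegin(), s.rend(), is_space).base();
    return first < last ? std::string(first, last) : std::string();
}

// Hypothetical parser for a runtime config with lines like
// "communicator = self". Falls back to the world communicator when the
// key is absent and rejects unknown values.
Comm comm_from_config(std::istream& in) {
    Comm comm = Comm::World;
    std::string line;
    while (std::getline(in, line)) {
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;  // not a key/value line
        if (trim(line.substr(0, eq)) != "communicator") continue;
        const std::string value = trim(line.substr(eq + 1));
        if (value == "self") {
            comm = Comm::Self;
        } else if (value == "world") {
            comm = Comm::World;
        } else {
            throw std::invalid_argument("unknown communicator: " + value);
        }
    }
    return comm;
}
```

The result would then be handed to the Reader constructor, so the simulation input JSON never needs to mention the communicator.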

@IshaanDesai

Collaborator

Having a separate config file for pyFANS would be fine in my opinion 👍

@sanathkeshav

Member

@Snapex2409, thanks a lot! I am merging this now!

@sanathkeshav sanathkeshav merged commit ea32461 into DataAnalyticsEngineering:develop May 7, 2026
8 checks passed
@Snapex2409 Snapex2409 deleted the switch-mpi-comm branch May 7, 2026 08:24

3 participants