Optimization Cycle I: Loop merge & transient refine #465

Draft
FlorianDeconinck wants to merge 115 commits into NOAA-GFDL:develop from FlorianDeconinck:opt_cycle_I/loop_merge

@FlorianDeconinck commented May 14, 2026

Collaborator

Description

Readying the following Schedule Tree transforms for mainline:

  • CartesianMerge
  • InlineVertical2DWrite
  • CartesianRefineTransients
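As a rough illustration of what loop merging and transient refinement aim for, here is a minimal plain-Python sketch. It does not use the Schedule Tree API or the passes above; the function names and the 1D K-loop are assumptions made for the example, and reading "refine" as shrinking a temporary array is likewise an assumption.

```python
def run_separate(a):
    """Two loops over the same K range, linked by a full-size temporary."""
    n = len(a)
    b = [0.0] * n  # transient: full array, only used between the two loops
    c = [0.0] * n
    for k in range(n):
        b[k] = a[k] * 2.0
    for k in range(n):
        c[k] = b[k] + 1.0
    return c


def run_merged(a):
    """Same computation after merging the loops and shrinking the transient."""
    n = len(a)
    c = [0.0] * n
    for k in range(n):
        b_k = a[k] * 2.0  # transient reduced from an array to a scalar
        c[k] = b_k + 1.0
    return c


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print(run_separate(data) == run_merged(data))
```

Merging is only valid here because iteration `k` of the second loop reads nothing but `b[k]`; a read at an offset such as `b[k - 1]` would need extra care.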

QOL / Tooling:

  • TreeOptimizationStatistics will record the before/after count of maps, fors and transients

🐞 Regressions / bugs worked around

  • Locals are now non-transient on GPU because of bugs that show up during tree optimization
  • CartesianRefineTransients is not applied on GPU, for the same reason

⚠️ This PR includes an update to temporary branches of gt4py/dace to consolidate all changes needed for the June presentation

How has this been tested?

New tests were added where needed

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. add new modules to docs/docstrings/)
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules
  • New check tests, if applicable, are included

@FlorianDeconinck requested review from romanc and twicki May 14, 2026 16:06
FlorianDeconinck and others added 26 commits May 14, 2026 15:28
Move pipeline defaults inside the Pipeline itself and have orchestration call default
Mockup of passes required for merging to behave
Use symbols in the replacement directory. Update DaCe to a version
that doesn't re-initialize the symbols. Also fix the test failure in
Python 3.13.
This has been replaced with the `InlineOffgridConditionals` pass
- Locals are no longer transient on GPU
- RefineTransients is deactivated
romanc and others added 18 commits June 23, 2026 11:39
Because we have a "dace/" directory in `ndsl/dsl`, the previous import
could be resolved as a local import. If that happened (depending on
import order), DaCe's `Config` object would not be found there.
Resolved by importing `Config` from `dace.config`, which resolves
unambiguously.
The default merging order for `CartesianMerge` is to follow the loop
order of the given backend. This commit adds support for a custom merge
order override.
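A minimal sketch of how such a default-plus-override choice could look. `resolve_merge_order` and its parameters are hypothetical names for illustration, not the `CartesianMerge` interface, and the sketch only picks an axis order; it does not perform any merging.

```python
from typing import List, Optional

AXES = ("I", "J", "K")


def resolve_merge_order(
    backend_loop_order: str, override: Optional[str] = None
) -> List[str]:
    """Pick the axis order to merge in (hypothetical helper).

    Falls back to the backend's loop order unless an override is given.
    Accepts spellings such as "IJK" or "K-J-I".
    """
    chosen = override if override is not None else backend_loop_order
    order = [axis for axis in chosen.upper() if axis in AXES]
    if sorted(order) != sorted(AXES):
        raise ValueError(f"merge order must be a permutation of IJK, got {chosen!r}")
    return order


if __name__ == "__main__":
    print(resolve_merge_order("IJK"))
    print(resolve_merge_order("IJK", override="K-J-I"))
```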
Revert "fixup: use normalized indices in debug message"

This reverts commit b24e1fc.

Revert "fix: account for map start in axis normalization"

This reverts commit de9763d.
romanc added 5 commits July 1, 2026 09:57
When we normalize cartesian indices, add support for plain numbers as
indices. Previously, we'd assume that each index is a symbolic
expression. Now we have support for plain numbers too (e.g. from
transient refinement).
This was causing issues because one symbol was contained in the other,
e.g. when replacing `__k` with `__k_123` you'd get things like
`__k_123_456` in case `__k` and `__k_456` were mixed.
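The failure mode described above can be reproduced with plain string replacement. The sketch below contrasts it with whole-identifier matching; the helper names are made up for the example, and this is not necessarily how the fix in this PR is implemented.

```python
import re

# Matches a whole Python-style identifier, so `__k` never matches inside `__k_456`.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def rename_naive(expr: str, mapping: dict) -> str:
    """Substring replacement: also rewrites symbols that merely contain `old`."""
    for old, new in mapping.items():
        expr = expr.replace(old, new)
    return expr


def rename_whole_symbols(expr: str, mapping: dict) -> str:
    """Single pass over identifiers; only exact symbol names are replaced."""
    return IDENTIFIER.sub(lambda m: mapping.get(m.group(0), m.group(0)), expr)


if __name__ == "__main__":
    expr = "__k + __k_456"
    mapping = {"__k": "__k_123"}
    print(rename_naive(expr, mapping))          # __k_123 + __k_123_456
    print(rename_whole_symbols(expr, mapping))  # __k_123 + __k_456
```

Doing the substitution in one pass also avoids re-replacing a name that an earlier replacement just introduced.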
romanc added 4 commits July 1, 2026 15:15
This is working to the best of my knowledge. Test case is a `D_SW` in
`orch:dace:cpu:IJK` with FvTp2d configured to have loops ordered as
`K-J-I`, i.e. the other way round, which allows merging K-loops.