Skip to content

Do not merge: working PR for NASA crew#2364

Draft
romanc wants to merge 16 commits into
spcl:mainfrom
romanc:romanc/math-functions
Draft

Do not merge: working PR for NASA crew#2364
romanc wants to merge 16 commits into
spcl:mainfrom
romanc:romanc/math-functions

Conversation

@romanc

@romanc romanc commented May 11, 2026

Copy link
Copy Markdown
Contributor

Description

dace/math.h re-exposes a couple of math functions from cmath, e.g. std::sin(x), which has overloads for e.g. float and double. Some functions, e.g. asin(x), were not exposed in this way. This leads to precision issues when asin(x) is called with a float because in the generated code, asin will be mapped to the C version, which is only defined for doubles. There is thus an implicit cast happening of the argument and the computation is done in double precision. Worse, the return type is always a double, which will force the rest of the calculation to be up-casted to double precision if asin() is used in an expression.

TODO list for later

  • make test case for math functions (e.g. compare against std::log10(x) in a cpp-tasklet)
  • make test case for symbol replacement (double-replace with strings, replace symbol with non-default dtype)
  • make test case for fix: stree -> sdfg, data descriptors of nested maps
  • make test cases for memlets_in_ast() (subscript access, scalar access, indirect array access)
  • add support for indirect array access in memlets_in_ast()
  • once memlets_in_ast() is reliable, we can inline conditionals and loop condition/update (in stree as well as in gt4py bridge)
  • refactor tn.IfScope / tn.ElifScope / tn.ElseScope to tn.ConditionalScope
  • validation succeeds if conditional uses a scalar that isn't defined (anymore); see image below; might be fixed by considering scalar-memlets (to be re-tested)
  • validation succeeds if mapped tasklets aren't fully connected; see image below
image image

@romanc romanc marked this pull request as ready for review May 11, 2026 13:26

@alexnick83 alexnick83 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine. I would add a test that demonstrates the issue described in the PR (up/downcasting of data) and confirms that it is now resolved. I guess just one test for, e.g., asin would be enough.

@romanc

romanc commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

Make sense, I'll add something tomorrow. I think I can build upon compare_numpy_output().

@romanc romanc marked this pull request as draft May 19, 2026 10:28
@romanc

romanc commented May 19, 2026

Copy link
Copy Markdown
Contributor Author

putting this back into a draft state because it is now our de-facto working branch and I've started to push more unrelated changes. I'll pull this apart later and make nice PRs with unrelated changes and tests.

@romanc romanc changed the title Expose more math functions in math.h Do not merge: workig PR for NASA crew May 20, 2026
@tbennun tbennun changed the title Do not merge: workig PR for NASA crew Do not merge: working PR for NASA crew May 22, 2026
romanc added 7 commits June 1, 2026 16:21
 Please enter the commit message for your changes. Lines starting
Names/Labels of LoopRegions need to be unique inside the CFG (validation
checks that). So far, we didn't have problems. However, with "no
simplify before stree", we are getting in places where this is an issue.

This commit deploys a simple fix to ensure unique names of LoopRegions.
Every tree node has functions to calculate input and output memlets (of
that node). Tree scopes have a default implementation to gather all
inputs/outputs of their children. That default implementation for scopes
didn't consider read after write (within the same scope). This caused "too
many inputs" to be returned, which - in turn - caused the dependency
analysis of the state boundary inserter to generate wrong results. This
could lead to **missing** state boundaries. We saw write/write races in
D2A2C_Vect of pyFV3.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants