Mooncake 0.5.25 compat #988

Draft
sunxd3 wants to merge 8 commits into JuliaDiff:main from sunxd3:sunxd/fix-mooncake-friendly-tangents

@sunxd3 sunxd3 commented Apr 4, 2026

An attempt at addressing #986.

Feel free to make any edits or take over!

@sunxd3 sunxd3 requested a review from gdalle as a code owner April 4, 2026 14:47

codecov bot commented Apr 4, 2026

Codecov Report

❌ Patch coverage is 90.47619% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.28%. Comparing base (a5ecbe0) to head (ce72baf).

Files with missing lines | Patch % | Lines
...e/ext/DifferentiationInterfaceMooncakeExt/utils.jl | 81.81% | 2 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #988      +/-   ##
==========================================
- Coverage   98.21%   97.28%   -0.94%     
==========================================
  Files         135      131       -4     
  Lines        8000     7984      -16     
==========================================
- Hits         7857     7767      -90     
- Misses        143      217      +74     
Flag | Coverage Δ
DI | 97.86% <90.47%> (-1.12%) ⬇️
DIT | 95.76% <ø> (-0.47%) ⬇️

Author

sunxd3 commented Apr 4, 2026

I think I understand the CI error: there is something I need to patch on the Mooncake side. I will come back to this.

@gdalle gdalle marked this pull request as draft April 5, 2026 08:36
Author

sunxd3 commented Apr 9, 2026

chalk-lab/Mooncake.jl#1129 should unblock this PR; we'll release it as soon as it's available.

Member

gdalle commented Apr 10, 2026

Thank you for taking a crack at this! I'll wait until the tests pass before reviewing if that's okay

Author

sunxd3 commented Apr 10, 2026

totally fine!

sunxd3 added 5 commits April 10, 2026 08:12
Mooncake returns raw Tangent objects instead of friendly arrays for StaticArrays on Julia 1.11. This is an upstream bug; skip the test until it is fixed.

On Julia 1.11, Mooncake may return raw Tangent objects instead of friendly arrays for StaticArrays even with friendly_tangents=true. Add _maybe_to_primal dispatch as a safety net that converts leaked Tangent objects to primal-shaped values, no-op otherwise.

Also convert leaked Mooncake.MutableTangent (e.g. MVector tangents) and apply _maybe_to_primal in forward mode (pushforward) paths.
Author

sunxd3 commented Apr 10, 2026

@gdalle the Mooncake CIs are passing (it probably requires Mooncake v0.5.26).

Could you take over? I also won't be offended if you want to start a new PR.

Member

gdalle commented Apr 10, 2026

I'll take a look when I can, thanks a bunch! DI's tests are failing on main too because of Mooncake's breaking release so this is a priority for me. Do you know why coverage is not complete?

Author

sunxd3 commented Apr 10, 2026

Thanks a lot.

> Do you know why coverage is not complete?

I am not certain. A bad guess is that some code changes are more defensive than necessary. Sorry!

Member

@gdalle gdalle left a comment

Thank you for trying to fix what others broke! I added a few remarks to understand the task a bit better, I'll wait for your answers

Comment on lines +30 to +38
@inline maybe_getfield(mod, name::Symbol) =
    isdefined(mod, name) ? getfield(mod, name) : nothing

const mooncake_tangent_to_friendly = maybe_getfield(
    Mooncake, Symbol("tangent_to_friendly!!")
)
const mooncake_friendly_tangent_cache = maybe_getfield(Mooncake, :FriendlyTangentCache)
const mooncake_as_primal = maybe_getfield(Mooncake, :AsPrimal)
const mooncake_no_cache = maybe_getfield(Mooncake, :NoCache)
Member

This doesn't seem to be very robust? I'd rather impose a lower bound for Mooncake at v0.5.25 in Project.toml (that way we're sure we can use all of these symbols)
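
For reference, a lower bound of this kind would be a one-line entry in the `[compat]` section of Project.toml. This is only a sketch of the suggestion above, not the change that was actually committed. Under Julia's default caret semantics for 0.x versions, the entry allows Mooncake 0.5.25 and later patch releases but excludes 0.6.

```toml
[compat]
# Assumed entry illustrating the reviewer's suggestion:
# allows versions >= 0.5.25 and < 0.6.0
Mooncake = "0.5.25"
```

With such a bound in place, the extension could refer to the Mooncake symbols directly instead of looking them up with isdefined at load time.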

  )
  y = first(y_and_dy)
- dy = _copy_output(last(y_and_dy))
+ dy = _maybe_to_primal(last(y_and_dy), y)
Member

Why do we need to ensure that primal conversion happens here? If friendly_tangents is set to true, won't Mooncake's pushforward and pullback already return a primal-like object?

backend = AutoMooncake(; config = Mooncake.Config(; friendly_tangents = true))
inputs = (
    Symmetric([2.0 1.0; 1.0 3.0]),
    Hermitian(ComplexF64[2 1 + im; 1 - im 3]),
Member

Do you know which convention Mooncake uses for gradients of functions with complex inputs and real outputs? There are two possible choices, see e.g. https://arxiv.org/abs/2409.06752

@testset "$(typeof(x))" for x in inputs
    grad = gradient(f, backend, x)
    y, grad2 = value_and_gradient(f, backend, x)
    pb = only(pullback(identity, backend, x, (x,)))
Member

This is not a strong enough test, the function is too simple

    @test grad isa Matrix
    @test grad2 isa Matrix
    @test pb isa Matrix
    @test grad == grad2
Member

grad and grad2 are never compared against the ground truth
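
One way to act on the two remarks above is to use a less trivial function and compare against a reference gradient. The sketch below uses only the standard library. The function `f`, the helper `fd_gradient`, and the hand-derived `expected` matrix are all hypothetical, since the `f` used in the PR's test is not shown here. It also assumes the gradient is taken with respect to every dense entry; whether a backend should return that for a Symmetric input is itself a convention question.

```julia
using LinearAlgebra, Test

# Hypothetical test function with a cross term, so the gradient is not just 2x.
f(x) = sum(abs2, x) + x[1, 2] * x[2, 1]

# Reference gradient by central finite differences on a dense copy of x.
function fd_gradient(f, x::AbstractMatrix{<:Real}; h = 1e-6)
    xd = Matrix{Float64}(x)
    g = similar(xd)
    for i in eachindex(xd)
        old = xd[i]
        xd[i] = old + h
        fp = f(xd)
        xd[i] = old - h
        fm = f(xd)
        xd[i] = old  # restore the entry before moving on
        g[i] = (fp - fm) / (2h)
    end
    return g
end

x = Symmetric([2.0 1.0; 1.0 3.0])

# Hand-derived dense gradient: 2x + [0 x21; x12 0].
expected = [4.0 3.0; 3.0 6.0]

@test fd_gradient(f, x) ≈ expected rtol = 1e-6

# In the real test, the backend results would be checked the same way, e.g.
#   @test grad ≈ expected
#   @test grad2 ≈ expected
```

The finite-difference helper is only a cross-check on the hand-derived value; the actual test could compare against either.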

!isnothing(mooncake_as_primal) &&
!isnothing(mooncake_no_cache)
dest = mooncake_friendly_tangent_cache{mooncake_as_primal}(_copy_output(x))
cache = isbitstype(typeof(x)) ? mooncake_no_cache() : IdDict{Any, Any}()
Member

Why is a type-unstable dictionary needed here?
Does this make every tangent-to-primal conversion outside of bitstypes slow?

Member

Regardless of why, we may want to allocate this dictionary in the preparation phase

Contributor

Yup, this will regress performance on hot paths. This should be allocated once during the prepare phase and stored in the extras cache.
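
A minimal sketch of the "allocate once, reuse per call" pattern discussed above, using a toy conversion rather than Mooncake's API. The names `ConversionPrep`, `prepare_conversion`, `to_plain!` and `convert_with_prep` are hypothetical. The sketch assumes the dictionary exists to track aliased objects by identity, which is a common reason for an IdDict but is not confirmed in this thread.

```julia
# Hypothetical preparation object that owns the identity-keyed cache.
struct ConversionPrep
    cache::IdDict{Any, Any}
end

# Called once, in the preparation phase.
prepare_conversion() = ConversionPrep(IdDict{Any, Any}())

# Toy conversion: rebuild nested vectors, returning the same output for an
# input object that was already visited, so aliasing is preserved.
function to_plain!(cache::IdDict{Any, Any}, x::Vector)
    haskey(cache, x) && return cache[x]
    out = Vector{Any}(undef, length(x))
    cache[x] = out
    for i in eachindex(x)
        out[i] = to_plain!(cache, x[i])
    end
    return out
end
to_plain!(::IdDict{Any, Any}, x) = x  # non-vector leaves pass through

# Called on the hot path: clears and reuses the dictionary created above
# instead of constructing a new IdDict on every call.
function convert_with_prep(prep::ConversionPrep, x)
    empty!(prep.cache)
    return to_plain!(prep.cache, x)
end

prep = prepare_conversion()
shared = [1.0, 2.0]
out = convert_with_prep(prep, Any[shared, shared])
```

Here `out[1]` and `out[2]` are the same object, mirroring the aliasing in the input, and repeated calls keep using `prep.cache`.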

Member

gdalle commented Apr 16, 2026

@AstitvaAggarwal @Technici4n could you maybe take a look too?

Contributor

@AstitvaAggarwal AstitvaAggarwal left a comment

Also, we might want to keep track of possible future tangent types: the fallback _maybe_to_primal(x, _) = x will silently pass through any tangent type not yet accounted for (e.g. a future Mooncake.SparseTangent), making failures invisible.
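
One possible way to make such leaks loud is for the fallback to reject any value whose type is defined in the AD package's module. The sketch below is self-contained and uses a stand-in module `FakeAD` in place of Mooncake; `FakeAD`, its types, and `to_primal` are hypothetical names for illustration and are not the PR's code.

```julia
module FakeAD  # hypothetical stand-in for Mooncake, for illustration only
struct Tangent{T}
    fields::T
end
struct SparseTangent end  # a tangent type the converter does not handle yet
end

# Known tangent type: convert to a plain vector of its field values.
to_primal(t::FakeAD.Tangent, _) = collect(values(t.fields))

# Fallback: pass ordinary values through, but refuse anything defined in
# FakeAD instead of returning it silently.
function to_primal(x, _)
    if parentmodule(typeof(x)) === FakeAD
        throw(ArgumentError("unhandled tangent type $(typeof(x))"))
    end
    return x
end
```

With this fallback, `to_primal([1.0], nothing)` returns the array unchanged, while `to_primal(FakeAD.SparseTangent(), nothing)` throws an ArgumentError. A module check is coarse, so an explicit list of handled types may be preferable in practice.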

4 participants