Skip to content

Regression tests for SingleMemoryStorageSchedule checkpoint clearing#5168

Open
sghelichkhani wants to merge 3 commits into
firedrakeproject:mainfrom
sghelichkhani:sghelichkhani/singlemem-long-range-test
Open

Regression tests for SingleMemoryStorageSchedule checkpoint clearing#5168
sghelichkhani wants to merge 3 commits into
firedrakeproject:mainfrom
sghelichkhani:sghelichkhani/singlemem-long-range-test

Conversation

@sghelichkhani

@sghelichkhani sghelichkhani commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Companion to dolfin-adjoint/pyadjoint#248, which fixes the SingleMemoryStorageSchedule checkpoint clearing reported in dolfin-adjoint/pyadjoint#211 and dolfin-adjoint/pyadjoint#260. Two regression tests, one per failure mode.

The first covers #211: a block variable reused only every third timestep is absent from the immediately preceding step's adjoint_dependencies, so the old condition cleared its checkpoint during forward replay and the reverse pass produced a wrong gradient (73765 instead of 3205 in the test). The test tapes ten such staggered timesteps and compares functional and gradient under SingleMemoryStorageSchedule against an unscheduled tape.

The second covers #260: a variable that is never redefined keeps the live block variable of its Function, and the clearing discarded its checkpoint at its last forward use even though the adjoint of that step still needs it as a linearisation point. The saved_output fallback to the live Function then supplies the taping-time value, so the gradient is wrong whenever the functional is re-evaluated at a new control first (140.5536 instead of 187.4048 in the test). The test runs two evaluate/derivative cycles at new controls, the second of which exercises the clearing that resumes once the adjoint dependencies have been revised by the first reverse pass.

In both tests the ordering is load-bearing and commented: the clearing only runs during the forward replay triggered by re-evaluating the reduced functional, so Jhat is called before derivative(). While test_burgers_newton parametrises over "SingleMemory", it never enables that schedule, so no SingleMemory validity comparison existed in the suite.

The TEMPORARY commit points pyadjoint-ad at the #248 branch so CI exercises the fix; it should be dropped, restoring the version pin, once #248 is merged and released. Since that branch sits on top of dolfin-adjoint/pyadjoint#257, CI will also show the pre-existing clear-down assertion failures that #5157 addresses, which are unrelated to these tests. As with the #257 companion, this should presumably end up on release once the ordering is settled.

…encies

A block variable reused only every third timestep is absent from the
immediately preceding step's adjoint_dependencies, so the checkpoint
clearing for SingleMemoryStorageSchedule discarded it during forward
replay and the reverse pass produced a wrong gradient (73765 instead of
3205 in this test). Companion to dolfin-adjoint/pyadjoint#248, which
fixes the clearing condition; see dolfin-adjoint/pyadjoint#211 for the
original report.
Drop this commit (restoring the pyadjoint-ad version pin) once
dolfin-adjoint/pyadjoint#248 is merged and released.
A variable that is never redefined keeps the live block variable of its
Function, and the clearing for SingleMemoryStorageSchedule discarded its
checkpoint at its last forward use even though the adjoint of that step
still needs it as a linearisation point. The saved_output fallback to
the live Function then supplies the taping-time value, corrupting the
gradient whenever the functional is re-evaluated at a new control first
(140.5536 instead of 187.4048 in this test). Companion to
dolfin-adjoint/pyadjoint#248; see dolfin-adjoint/pyadjoint#260 for the
report.
@sghelichkhani sghelichkhani changed the title Test that long-range dependencies survive SingleMemoryStorageSchedule clearing Regression tests for SingleMemoryStorageSchedule checkpoint clearing Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant