[Spark][CherryPick][3.3] Fix metadata cleanup by retaining files required for log reconstruction #7056

Open
dmattos-sap wants to merge 1 commit into delta-io:branch-3.3 from dmattos-sap:cherry-pick-4146-branch-3.3
@dmattos-sap dmattos-sap commented Jun 19, 2026

Which Delta project/connector is this regarding?

- [X] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

Description

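During metadata cleanup, delete an eligible delta log file only if a newer checkpoint exists that itself falls before the cutoff window. Otherwise the file is still required for log reconstruction and is retained.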
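
The rule above can be illustrated with a minimal sketch. This is not the code changed by this PR: `LogFile`, `files_to_delete`, and the integer timestamps are hypothetical names and values chosen for illustration. It also assumes that "newer" means a strictly higher version and that a file is eligible when its timestamp is before the cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogFile:
    """Hypothetical stand-in for one file in the _delta_log directory."""
    version: int
    timestamp: int  # modification time in arbitrary units
    is_checkpoint: bool = False


def files_to_delete(files, cutoff):
    """Return the expired files that are safe to delete.

    A file is expired when its timestamp is before `cutoff`. An expired file
    is deleted only if an expired checkpoint with a strictly higher version
    exists (assumption), so the log can still be rebuilt from that checkpoint.
    """
    expired = [f for f in files if f.timestamp < cutoff]
    checkpoint_versions = [f.version for f in expired if f.is_checkpoint]
    if not checkpoint_versions:
        # No checkpoint before the cutoff: every commit is still needed.
        return []
    newest_checkpoint = max(checkpoint_versions)
    return [f for f in expired if f.version < newest_checkpoint]


# Commits 0..4 plus one checkpoint at version 2.
commits = [LogFile(version=v, timestamp=10 * (v + 1)) for v in range(5)]
with_checkpoint = commits + [LogFile(version=2, timestamp=30, is_checkpoint=True)]

print([f.version for f in files_to_delete(with_checkpoint, cutoff=45)])  # [0, 1]
print(files_to_delete(commits, cutoff=45))  # []
```

In the second call no checkpoint precedes the cutoff, so nothing is deleted even though four commits have expired. This is the case where a table with few commits or checkpoints keeps log files past the cutoff window.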

Resolves #606 and #6718

How was this patch tested?

Unit tests based on #2673

Does this PR introduce any user-facing changes?

Yes. Tables with a low rate of commits or checkpoints will retain log files beyond the cutoff window.

(cherry picked from commit c4713ce)

…nstruction (delta-io#4146)

- [X] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

Delete eligible delta log files only if there's a checkpoint newer than
them before the cutoff window.

Resolves delta-io#606

Unit tests based on delta-io#2673

Unit Tests

Yes, tables with low rate of commit/checkpoints will have increased log
retention beyond the cutoff window

---------

Signed-off-by: Felipe Pessoto <fepessot@microsoft.com>
(cherry picked from commit c4713ce)
Signed-off-by: Daniel Mattos <d.mattos@sap.com>