Cooperative memory reclaim via async MemoryReclaimer #22043
Draft
JanKaul wants to merge 4 commits into apache:main
This was referenced May 6, 2026
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported that the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).
Contributor, Author
If I use this branch to query a larger-than-memory dataset, I get:

With vanilla DataFusion, the ExternalSorter fails to allocate memory, so this branch appears to solve the memory reclaim issue for a single operator. However, the next operator, ExternalSorterMerge, now fails, so this solution doesn't handle cross-operator reclamation. I think we need a hierarchical design with a tree of MemoryPools and Reclaimers so that we have full control. I think the Velox design would be a really good fit.
Adds an async hook that lets a `MemoryPool` ask other consumers to free memory before failing an allocation.

This PR is complementary to #21425; it is not a replacement. It exists to broaden the design discussion there with a concrete alternative, not to supersede that work.
Design
- `trait MemoryReclaimer` (async), attached to a `MemoryConsumer` via `with_reclaimer`. Implements: `reclaim(target)`, optional `reclaimable_bytes`, optional `priority`.
- `MemoryPool::try_grow_async`: the default delegates to sync `try_grow`.
- `TrackConsumersPool` overrides it to walk registered reclaimers (priority desc, size desc) on OOM, retry the grow after each, then fall through to `inner.try_grow_async` so a wrapped reclaim-aware pool isn't shadowed.
- `SortExec`: a channel-based `ExternalSorterReclaimer` hands a oneshot to the partition's stream loop; `tokio::select! biased { reclaim_rx.recv() … ; input.next() … }` spills end-to-end before replying with the freed-byte count. The stream loop is the sole owner of the sorter's batches, so the spill is ordered before the report: the bytes the pool sees are bytes already on disk.

How this differs from #21425
- `try_grow_async` instead of sync `pool.reclaim(...)`: this matches the channel hand-off pattern needed for cooperative spill inside DataFusion's async execution.
- Integrated into `SortExec` to demonstrate the full flow.
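To make the reclaim walk from the Design section concrete, here is a minimal, synchronous, std-only sketch of the ordering and retry logic (priority descending, then reclaimable size descending, retrying the grow after each reclaimer). It is not the code in this PR: `Pool`, `Reclaimer`, and `try_grow_with_reclaim` are hypothetical names, the real hook is async, and the fall-through to an inner pool is omitted.

```rust
// Simplified model only: every type and method here is hypothetical and
// synchronous; the PR's actual hook is async and lives on the pool trait.

/// A consumer-side hook that can give memory back on request.
struct Reclaimer {
    name: &'static str,
    priority: i32,
    reclaimable: usize, // bytes this consumer could free (e.g. by spilling)
}

impl Reclaimer {
    /// Frees up to `target` bytes and returns how many were actually freed.
    fn reclaim(&mut self, target: usize) -> usize {
        let freed = target.min(self.reclaimable);
        self.reclaimable -= freed;
        freed
    }
}

struct Pool {
    limit: usize,
    used: usize,
    reclaimers: Vec<Reclaimer>,
}

impl Pool {
    /// Plain grow: fails as soon as the limit would be exceeded.
    fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        if self.used + bytes > self.limit {
            return Err(format!("cannot reserve {bytes} bytes"));
        }
        self.used += bytes;
        Ok(())
    }

    /// Grow with reclaim: on failure, ask reclaimers in order and retry
    /// after each one. Returns the names of the reclaimers that were asked.
    fn try_grow_with_reclaim(&mut self, bytes: usize) -> Result<Vec<&'static str>, String> {
        let mut asked = Vec::new();
        if self.try_grow(bytes).is_ok() {
            return Ok(asked);
        }
        // Priority descending, then reclaimable size descending.
        self.reclaimers.sort_by(|a, b| {
            b.priority
                .cmp(&a.priority)
                .then(b.reclaimable.cmp(&a.reclaimable))
        });
        for i in 0..self.reclaimers.len() {
            // Only ask for what is still missing.
            let shortfall = (self.used + bytes).saturating_sub(self.limit);
            let freed = self.reclaimers[i].reclaim(shortfall);
            self.used = self.used.saturating_sub(freed);
            asked.push(self.reclaimers[i].name);
            if self.try_grow(bytes).is_ok() {
                return Ok(asked);
            }
        }
        Err(format!("cannot reserve {bytes} bytes even after reclaim"))
    }
}
```

In this model, a pool at 90 of 100 bytes that is asked for 60 more first asks the largest high-priority reclaimer, retries, and moves on to the next reclaimer only if the retry still fails.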
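The channel hand-off described for `SortExec` can also be modeled without an async runtime. The sketch below uses `std::sync::mpsc` in place of tokio channels and a manual "check reclaim first" poll in place of a biased `tokio::select!`; `StreamLoop`, `step`, and `request_reclaim` are hypothetical names, and "spilling" is reduced to moving a byte counter. What it illustrates is the ordering guarantee: the loop that owns the buffered data spills first and replies second.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for one partition's sort stream loop.
/// It is the only owner of `buffered`, so nobody else can spill it.
struct StreamLoop {
    buffered: usize, // bytes currently held in memory
    spilled: usize,  // bytes already written to "disk"
    // Each request is itself a one-shot-style reply sender.
    reclaim_rx: Receiver<Sender<usize>>,
}

impl StreamLoop {
    /// Runs one iteration. Returns false once the input is exhausted.
    fn step(&mut self, input: &mut impl Iterator<Item = usize>) -> bool {
        // Biased poll: a pending reclaim request wins over new input.
        if let Ok(reply) = self.reclaim_rx.try_recv() {
            let freed = self.buffered;
            self.spilled += freed; // spill first ...
            self.buffered = 0;
            let _ = reply.send(freed); // ... then report the freed bytes
            return true;
        }
        match input.next() {
            Some(bytes) => {
                self.buffered += bytes;
                true
            }
            None => false,
        }
    }
}

/// Reclaimer side: hand a reply channel to the stream loop and return the
/// receiving end, on which the freed-byte count will arrive.
fn request_reclaim(reclaim_tx: &Sender<Sender<usize>>) -> Receiver<usize> {
    let (reply_tx, reply_rx) = channel();
    reclaim_tx.send(reply_tx).expect("stream loop is gone");
    reply_rx
}
```

Because the reply is sent only after `buffered` has been moved to `spilled`, whoever receives the count can treat those bytes as already released, which is the property the PR description relies on.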