fix: apply fragment-bitmap allow-list to stable-row-id deletion mask#6965
Open
ragnorc wants to merge 3 commits into
Open
fix: apply fragment-bitmap allow-list to stable-row-id deletion mask#6965ragnorc wants to merge 3 commits into
ragnorc wants to merge 3 commits into
Conversation
`do_create_deletion_mask_row_id` previously iterated every fragment in the dataset, so the resulting allow-list included stable row ids whose *current* physical home was outside the index's `fragment_bitmap`. On stable-row-id datasets this bypasses the allow-list added in lance-format#6563: `MapIndexExec` requests `create_restricted_deletion_mask`, but the stable-row-id branch at `prefilter.rs:270` ignores the restriction. Effect on merge_insert: after a `merge_insert UpdateAll` rewrites a row into a new fragment, a second merge_insert against the same key sees that row twice — once via the BTREE (which holds the row's stable row id and resolves to its new fragment through TakeExec) and once via the unindexed-fragments scan that also covers the new fragment. Both branches emit the same `_rowid`, so the source-dedup HashSet trips and the merge fails with "Ambiguous merge inserts are prohibited". Effect on FTS: a `full_text_search` after a merge_insert returns each moved row twice, because the inverted index's hits go through the same prefilter and the post-filter scan picks up the new fragment. Fix: thread `restrict_to_fragments` through `do_create_deletion_mask_row_id` so it only iterates fragments in the bitmap. The resulting allow-list now means "stable row ids whose current home is inside the restriction" — semantically equivalent to the non-stable-row-id branch. The cache key gains an optional bitmap hash so two consumers requesting different fragment subsets do not poison each other's cached mask. Closes lance-format#6877.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #6877.
do_create_deletion_mask_row_idpreviously iterated every fragment in the dataset, so the resulting allow-list included stable row ids whose current physical home was outside the index'sfragment_bitmap. On stable-row-id datasets this bypassed the allow-list added in #6563:MapIndexExecrequestscreate_restricted_deletion_mask, but the stable-row-id branch atprefilter.rs:270ignored the restriction.Effect on merge_insert
After a
merge_insert UpdateAllrewrites a row into a new fragment, a second merge_insert against the same key saw that row twice — once via the BTREE (which holds the row's stable row id and resolves to its new fragment throughTakeExec) and once via the unindexed-fragments scan that also covered the new fragment. Both branches emitted the same_rowid, tripping the source-dedup HashSet:Effect on FTS
A
full_text_searchafter amerge_insertreturned each moved row twice, because the inverted index's hits went through the same prefilter and the post-filter scan picked up the new fragment. The new regression testtest_issue_6877_fts_no_duplicates_stable_row_idsempirically fails with "30 unique out of 40 total" before this fix.Fix
prefilter.rs: threadrestrict_to: Option<RoaringBitmap>throughdo_create_deletion_mask_row_id. When set, only iterate fragments in the bitmap. The resulting allow-list now means "stable row ids whose current home is inside the restriction" — semantically equivalent to the non-stable-row-id branch.prefilter.rs:270: passSome(fragments)whenrestrict_to_fragments=true,Noneotherwise.caches.rs: addrestrict_hash: Option<u64>toRowAddrMaskKeyso two consumers asking for different fragment subsets don't poison each other's cached mask.Mechanism, validated
Instrumentation on the unfixed code shows the BTREE returning stable row id
4for the matched row, the deletion-mask allow-list containing4(because the new unindexed fragment was iterated globally),mask & 4 = {4}passing the filter, both branches emitting_rowid=4, dedup HashSet tripping. After the fix,mask & 4 = {}correctly excludes the BTREE side and only the unindexed-fragment scan delivers the row.Test plan
cargo test -p lance --lib prefilter— 18/18 pass (17 existing + 1 new).cargo test -p lance --lib merge_insert— 145/145 pass (143 existing + 2 new).cargo fmt --allcargo clippy --all --tests --benches -- -D warningsNew regression tests:
test_restricted_deletion_mask_stable_row_id_honors_bitmap— pins the prefilter contract: full / restricted-subset / empty-bitmap cases.test_issue_6877_repeated_merge_insert_stable_row_ids— mirrors the issue's exact repro: two sequential merge_insert UpdateAll against the same key on a stable-row-id dataset with a BTREE scalar index.test_issue_6877_fts_no_duplicates_stable_row_ids— companion FTS regression: an inverted index + merge_insert sequence that previously returned duplicate hits.Notes
RowAddrMaskKeyusesDefaultHasherover the bitmap's sorted u32s. The cache is process-local in-memory, so cross-version stability isn't required.