fix(index): drop tombstoned rows from stable row-id optimize filter #6969
Open
kaan-simbe wants to merge 1 commit into
A merge_insert UPDATE on a stable-row-id dataset keeps the updated row's
stable ID in the old fragment's row-id sequence (only the deletion vector
marks the physical row stale), while a fresh copy of that same stable ID
arrives in the unindexed delta. build_stable_row_id_filter built its retain
allow-list from the raw row-id sequences without applying deletion vectors,
so the stale entry survived optimize_indices alongside the freshly-indexed
copy. The merged BTREE page then held two entries with the same stable ID,
tripping RowAddrTreeMap::from_sorted_iter ("non-sorted input") on the next
index-using read (lance-index/src/scalar/btree/flat.rs:58).
Mask each retained fragment's deletion vector against its row-id sequence
before building the allow-list so updated/deleted stable IDs are excluded.
Adds a regression test covering merge_insert UPDATE + optimize_indices with
no intervening compaction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collaborator
@kaan-simbe There is a file with a merge conflict.
Summary
optimize_indices on a stable-row-id dataset with a BTREE scalar index corrupts the index after a merge_insert UPDATE (no intervening compaction). The next index-using read panics (the commit message reports "non-sorted input" from RowAddrTreeMap::from_sorted_iter).
Root cause
With stable row IDs the BTREE stores stable IDs (not physical addresses), and build_stable_row_id_filter (rust/lance/src/index/append.rs) builds the "retain old rows" allow-list from each retained fragment's row-id sequence. However, it read the raw sequences without applying deletion vectors.
A merge_insert UPDATE keeps the updated row's stable ID in the old fragment's row-id sequence (the physical row is only tombstoned via the deletion vector), while a fresh copy of that same stable ID is written to a new fragment that lands in the unindexed delta. So during optimize:
- (old_value, S): kept, because S is still in the allow-list
- (new_value, S): added from the unindexed delta
Both survive the merge, the merged BTREE page holds two rows with stable ID S, and FlatIndex::try_new trips its strictly-sorted invariant on the next read.
Fix
Apply each retained fragment's deletion vector to its row-id sequence (RowIdSequence::mask) before building the allow-list, so updated/deleted stable IDs are excluded. The allow-list only shrinks and no new data is materialized in memory, which avoids the memory hazard raised in the discussion on #6041.
Relation to #6041
#6041 identified the same bug but stalled on its approach (materializing all new row IDs for the filter pass). This takes the opposite direction: prune the old side using deletion vectors that are already loaded per fragment.
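The difference between the raw and the masked allow-list can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyFragment, raw_allow_list, and masked_allow_list are hypothetical names invented for illustration, and the real RowIdSequence::mask and deletion-vector types in Lance are not reproduced here.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a retained fragment: stable row IDs in
/// physical order, plus a deletion vector of tombstoned physical offsets.
struct ToyFragment {
    row_ids: Vec<u64>,
    deleted_offsets: HashSet<usize>,
}

/// Allow-list built from the raw row-id sequences (the behavior the PR
/// describes as buggy): tombstoned rows still contribute their stable ID.
fn raw_allow_list(frags: &[ToyFragment]) -> HashSet<u64> {
    let mut allow = HashSet::new();
    for f in frags {
        allow.extend(f.row_ids.iter().copied());
    }
    allow
}

/// Allow-list built after masking each sequence with its deletion vector,
/// so stable IDs of updated or deleted rows are dropped.
fn masked_allow_list(frags: &[ToyFragment]) -> HashSet<u64> {
    let mut allow = HashSet::new();
    for f in frags {
        for (offset, id) in f.row_ids.iter().enumerate() {
            if !f.deleted_offsets.contains(&offset) {
                allow.insert(*id);
            }
        }
    }
    allow
}

fn main() {
    // Stable ID 11 (physical offset 1) was updated by a merge_insert-style
    // UPDATE: the old physical row is tombstoned, the ID stays in the sequence.
    let frags = vec![ToyFragment {
        row_ids: vec![10, 11, 12],
        deleted_offsets: HashSet::from([1usize]),
    }];

    println!("raw contains 11: {}", raw_allow_list(&frags).contains(&11));
    println!("masked contains 11: {}", masked_allow_list(&frags).contains(&11));
}
```

In this model the masked list is always a subset of the raw list, which matches the description's point that the allow-list only shrinks.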
Test plan
- test_optimize_btree_after_merge_insert_update_with_stable_row_ids reproduces the panic without the fix and passes with it (stable row IDs → BTREE → merge_insert UPDATE → optimize_indices, no compaction → index-using reads return correct counts and each id resolves to exactly one row).
- test_optimize_btree_keeps_rows_with_stable_row_ids_after_compaction still passes.
- cargo fmt --all --check, cargo clippy -p lance --tests -- -D warnings.
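The property the regression test guards (each stable ID resolves to exactly one row after the merge) can be sketched with a toy merge. This is an illustrative model only: Entry, merge_entries, and row_ids_strictly_sorted are hypothetical helpers, not the actual test or the FlatIndex code, and the allow-lists are hard-coded assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical toy index entry: (indexed value, stable row ID).
type Entry = (i64, u64);

/// Keep old entries whose stable ID is in `allow`, then append the
/// entries coming from the unindexed delta.
fn merge_entries(old: &[Entry], allow: &HashSet<u64>, new: &[Entry]) -> Vec<Entry> {
    old.iter()
        .copied()
        .filter(|(_, id)| allow.contains(id))
        .chain(new.iter().copied())
        .collect()
}

/// True when the row IDs, once sorted, are strictly increasing,
/// i.e. no stable ID appears twice.
fn row_ids_strictly_sorted(entries: &[Entry]) -> bool {
    let mut ids: Vec<u64> = entries.iter().map(|(_, id)| *id).collect();
    ids.sort_unstable();
    ids.windows(2).all(|w| w[0] < w[1])
}

fn main() {
    // Old index entries; stable ID 11 was later updated from 200 to 250.
    let old = [(100, 10), (200, 11), (300, 12)];
    let delta = [(250, 11)];

    // Raw allow-list still contains the tombstoned ID 11.
    let raw: HashSet<u64> = HashSet::from([10, 11, 12]);
    // Masked allow-list excludes it.
    let masked: HashSet<u64> = HashSet::from([10, 12]);

    let bad = merge_entries(&old, &raw, &delta);
    let good = merge_entries(&old, &masked, &delta);

    println!("raw merge strictly sorted: {}", row_ids_strictly_sorted(&bad));
    println!("masked merge strictly sorted: {}", row_ids_strictly_sorted(&good));
}
```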