fix: apply fragment-bitmap allow-list to stable-row-id deletion mask #6965

Open
ragnorc wants to merge 3 commits into lance-format:main from ragnorc:ragnorc/verify-issue-6877-in-6.0.0
Conversation

Contributor

@ragnorc ragnorc commented May 27, 2026

Summary

Closes #6877.

do_create_deletion_mask_row_id previously iterated every fragment in the dataset, so the resulting allow-list included stable row ids whose current physical home was outside the index's fragment_bitmap. On stable-row-id datasets this bypassed the allow-list added in #6563: MapIndexExec requests create_restricted_deletion_mask, but the stable-row-id branch at prefilter.rs:270 ignored the restriction.

Effect on merge_insert

After a merge_insert UpdateAll rewrites a row into a new fragment, a second merge_insert against the same key saw that row twice — once via the BTREE (which holds the row's stable row id and resolves to its new fragment through TakeExec) and once via the unindexed-fragments scan that also covered the new fragment. Both branches emitted the same _rowid, tripping the source-dedup HashSet:

Invalid user input: Ambiguous merge inserts are prohibited:
multiple source rows match the same target row on (id = "A").
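
To illustrate why two branches emitting the same `_rowid` is enough to trigger this error, here is a minimal sketch of a HashSet-based duplicate check. The function name `first_ambiguous_row_id` and its signature are hypothetical and are not taken from the Lance code base; only the idea of a dedup HashSet over row ids comes from the description above.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the source-dedup check: returns the first
/// target row id that is matched more than once, if any.
fn first_ambiguous_row_id(matched_row_ids: &[u64]) -> Option<u64> {
    let mut seen = HashSet::new();
    for &row_id in matched_row_ids {
        // `insert` returns false when the id was already in the set,
        // which is the condition that surfaces as "Ambiguous merge inserts".
        if !seen.insert(row_id) {
            return Some(row_id);
        }
    }
    None
}
```

With the index branch and the unindexed-fragments scan both contributing row id 4, the input `[4, 4]` is reported as ambiguous, while `[4]` alone is not.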

Effect on FTS

A full_text_search after a merge_insert returned each moved row twice, because the inverted index's hits went through the same prefilter and the post-filter scan picked up the new fragment. The new regression test test_issue_6877_fts_no_duplicates_stable_row_ids empirically fails with "30 unique out of 40 total" before this fix.

Fix

  • prefilter.rs: thread restrict_to: Option<RoaringBitmap> through do_create_deletion_mask_row_id. When set, only iterate fragments in the bitmap. The resulting allow-list now means "stable row ids whose current home is inside the restriction" — semantically equivalent to the non-stable-row-id branch.
  • prefilter.rs:270: pass Some(fragments) when restrict_to_fragments=true, None otherwise.
  • caches.rs: add restrict_hash: Option<u64> to RowAddrMaskKey so two consumers asking for different fragment subsets don't poison each other's cached mask.

Mechanism, validated

Instrumentation on the unfixed code shows the following chain: the BTREE returns stable row id 4 for the matched row; the deletion-mask allow-list contains 4 (because the new unindexed fragment was iterated globally); mask & {4} = {4} passes the filter; both branches emit _rowid=4; and the dedup HashSet trips. After the fix, mask & {4} = {} correctly excludes the BTREE side, and only the unindexed-fragment scan delivers the row.

Test plan

  • Reproduces pre-fix: stashed the prefilter + cache changes, re-ran the two new regression tests — both FAIL with the documented errors.
  • Fixed post-fix: restored the stash, both tests pass.
  • cargo test -p lance --lib prefilter — 18/18 pass (17 existing + 1 new).
  • cargo test -p lance --lib merge_insert — 145/145 pass (143 existing + 2 new).
  • cargo fmt --all
  • cargo clippy --all --tests --benches -- -D warnings

New regression tests:

  • test_restricted_deletion_mask_stable_row_id_honors_bitmap — pins the prefilter contract: full / restricted-subset / empty-bitmap cases.
  • test_issue_6877_repeated_merge_insert_stable_row_ids — mirrors the issue's exact repro: two sequential merge_insert UpdateAll against the same key on a stable-row-id dataset with a BTREE scalar index.
  • test_issue_6877_fts_no_duplicates_stable_row_ids — companion FTS regression: an inverted index + merge_insert sequence that previously returned duplicate hits.

Notes

  • The hash on RowAddrMaskKey uses DefaultHasher over the bitmap's sorted u32s. The cache is process-local in-memory, so cross-version stability isn't required.
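
A minimal sketch of how such a hash could be derived, assuming a `BTreeSet<u32>` as a stand-in for the bitmap (it also iterates in sorted order). The helper name `restrict_hash` mirrors the field name above but is hypothetical, and the real `RowAddrMaskKey` layout is not reproduced here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hashes the fragment ids in sorted order so that two
/// restrictions with the same members always produce the same key component.
/// `None` (no restriction) stays `None`, distinct from any hashed subset.
fn restrict_hash(restrict_to: Option<&BTreeSet<u32>>) -> Option<u64> {
    restrict_to.map(|fragments| {
        let mut hasher = DefaultHasher::new();
        for frag_id in fragments {
            frag_id.hash(&mut hasher);
        }
        hasher.finish()
    })
}
```

Because only a 64-bit digest is stored, two different subsets could in principle collide; this sketch does not address that case.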

`do_create_deletion_mask_row_id` previously iterated every fragment in
the dataset, so the resulting allow-list included stable row ids whose
*current* physical home was outside the index's `fragment_bitmap`. On
stable-row-id datasets this bypasses the allow-list added in lance-format#6563:
`MapIndexExec` requests `create_restricted_deletion_mask`, but the
stable-row-id branch at `prefilter.rs:270` ignores the restriction.

Effect on merge_insert: after a `merge_insert UpdateAll` rewrites a row
into a new fragment, a second merge_insert against the same key sees
that row twice — once via the BTREE (which holds the row's stable
row id and resolves to its new fragment through TakeExec) and once via
the unindexed-fragments scan that also covers the new fragment. Both
branches emit the same `_rowid`, so the source-dedup HashSet trips and
the merge fails with "Ambiguous merge inserts are prohibited".

Effect on FTS: a `full_text_search` after a merge_insert returns each
moved row twice, because the inverted index's hits go through the same
prefilter and the post-filter scan picks up the new fragment.

Fix: thread `restrict_to_fragments` through
`do_create_deletion_mask_row_id` so it only iterates fragments in the
bitmap. The resulting allow-list now means "stable row ids whose
current home is inside the restriction" — semantically equivalent to
the non-stable-row-id branch.

The cache key gains an optional bitmap hash so two consumers requesting
different fragment subsets do not poison each other's cached mask.

Closes lance-format#6877.
@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

github-actions Bot added the "bug" (Something isn't working) label May 27, 2026

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 97.91667% with 5 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/dataset/write/merge_insert.rs | 97.77% | 2 Missing and 2 partials ⚠️ |
| rust/lance/src/index/prefilter.rs | 98.24% | 0 Missing and 1 partial ⚠️ |

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

MergeInsertBuilder produces spurious "Ambiguous merge inserts" against rows previously rewritten by merge_insert