fix(search): use total_cmp for NaN-safe float ordering (#667)#778
Merged
Conversation
Search, scoring, ranking, and segment-maintenance code ordered floats with partial_cmp(...).unwrap_or(Ordering::Equal) (and a few .unwrap()). That makes the comparator non-total: a NaN compares Equal to everything, which sort_unstable_by / BinaryHeap forbid (silent reorder, or a panic on recent std), and the .unwrap() sites panic on NaN outright (e.g. deletion_ratio is 0/0 = NaN when total_docs == 0, in a background merge). Replace all NaN-unsafe float comparators with f32/f64::total_cmp (a real IEEE-754 total order) across 70 sites in 22 files, preserving each sort's direction and tie-break structure. For non-NaN values total_cmp matches partial_cmp exactly, so behaviour is unchanged except NaN is now ordered deterministically instead of reordering or panicking. The HNSW Candidate (min-heap) / ResultCandidate (max-heap) Ord impls — the "float heaps" the issue names — are covered; new tests assert the heaps handle a NaN distance without panic or loss. Closes #667
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Search, scoring, ranking, and segment-maintenance code ordered floats with
partial_cmp(...).unwrap_or(Ordering::Equal)(and a fewpartial_cmp(...).unwrap()). That makes the comparator non-total: aNaNcomparesEqualto everything, whichsort_unstable_by/BinaryHeapforbid (silent reorder, or a panic on recent std). The.unwrap()variants panic onNaNoutright — e.g.deletion_ratiois0/0 = NaNwhentotal_docs == 0, inside a background merge sort.This replaces all NaN-unsafe float comparators with
f32::total_cmp/f64::total_cmp(a real IEEE-754 total order) — 70 sites across 22 files.Approach
A balanced-paren transform rewrote
RECEIVER.partial_cmp(ARG).unwrap_or(...Equal)andRECEIVER.partial_cmp(ARG).unwrap()toRECEIVER.total_cmp(ARG), never reorderingARGor thea/boperands, so every sort's direction and tie-break (.then_with(...), the geo lat→lon two-stage compare) is preserved structurally.total_cmpis an inherent method onf32/f64, so any non-float receiver would have failed to compile — all 70 receivers were float. Three now-unusedOrderingimports were removed.bkd_tree.rsalready usedtotal_cmpas precedent.For non-NaN values
total_cmpmatchespartial_cmpexactly, so behaviour is unchanged except aNaNis now ordered deterministically instead of reordering or panicking. Public API is unchanged.The HNSW
Candidate(min-heap) /ResultCandidate(max-heap)Ord::cmpimpls — the "float heaps" the issue names — are covered, along with the collector sort-key structs, theFieldValue::Float64sort path, and the segment-maintenancepriority/deletion_ratiosorts.Tests
New
nan_ordering_testsinhnsw/searcher.rs: pushing aNaNdistance into theCandidate/ResultCandidateheaps must not panic, must not drop any element, and the finite distances must still pop nearest-first / furthest-first.Verification
cargo clippy -p laurus --all-targets -- -D warnings— cleancargo fmt --check— cleancargo test -p laurus --lib— 1110 passed (+2)cargo test -p laurus --tests(integration) — all 0 failed (no sort-order regression)Out of scope (follow-up)
OrderedFloatwrapper for floatBinaryHeaps, to prevent the footgun from recurring.Closes #667