fix(search): use total_cmp for NaN-safe float ordering (#667) #778

Merged
mosuka merged 1 commit into main from fix/667-nan-safe-total-cmp on Jun 4, 2026

@mosuka (Owner) commented Jun 4, 2026

Summary

Search, scoring, ranking, and segment-maintenance code ordered floats with partial_cmp(...).unwrap_or(Ordering::Equal) (and a few partial_cmp(...).unwrap()). That makes the comparator non-total: a NaN compares Equal to everything, which sort_unstable_by / BinaryHeap forbid (silent reorder, or a panic on recent std). The .unwrap() variants panic on NaN outright — e.g. deletion_ratio is 0/0 = NaN when total_docs == 0, inside a background merge sort.

This replaces all NaN-unsafe float comparators with f32::total_cmp / f64::total_cmp (a real IEEE-754 total order) — 70 sites across 22 files.
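As a minimal sketch of the problem described above (the helper names `lenient_cmp`, `total`, `breaks_transitivity`, and `sort_total` are illustrative and not taken from the laurus codebase), the old comparator shape reports `1.0 == NaN` and `NaN == 2.0` while still reporting `1.0 < 2.0`, so it is not a total order. `total_cmp` never returns `Equal` for a NaN against a finite value, so the inconsistency disappears:

```rust
use std::cmp::Ordering;

/// The old comparator shape: a NaN collapses to `Equal` against everything.
fn lenient_cmp(a: f32, b: f32) -> Ordering {
    a.partial_cmp(&b).unwrap_or(Ordering::Equal)
}

/// The replacement: the IEEE-754 total order, defined for every bit pattern.
fn total(a: f32, b: f32) -> Ordering {
    a.total_cmp(&b)
}

/// Returns true when `cmp` claims 1.0 == NaN and NaN == 2.0 but 1.0 != 2.0,
/// i.e. when "equality" under the comparator is not transitive.
fn breaks_transitivity(cmp: fn(f32, f32) -> Ordering) -> bool {
    let (a, nan, c) = (1.0_f32, f32::NAN, 2.0_f32);
    cmp(a, nan) == Ordering::Equal
        && cmp(nan, c) == Ordering::Equal
        && cmp(a, c) != Ordering::Equal
}

/// Sorts ascending with `total_cmp`; a NaN lands at one end of the slice
/// (which end depends on its sign bit) and no element is lost.
fn sort_total(mut values: Vec<f32>) -> Vec<f32> {
    values.sort_unstable_by(|a, b| a.total_cmp(b));
    values
}
```

Here `breaks_transitivity(lenient_cmp)` is true and `breaks_transitivity(total)` is false, which is the property `sort_unstable_by` and `BinaryHeap` rely on.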

Approach

A balanced-paren transform rewrote RECEIVER.partial_cmp(ARG).unwrap_or(...Equal) and RECEIVER.partial_cmp(ARG).unwrap() to RECEIVER.total_cmp(ARG), never reordering ARG or the a/b operands, so every sort's direction and tie-break (.then_with(...), the geo lat→lon two-stage compare) is preserved structurally. total_cmp is an inherent method on f32/f64, so any non-float receiver would have failed to compile — all 70 receivers were float. Three now-unused Ordering imports were removed. bkd_tree.rs already used total_cmp as precedent.
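The shape of the rewrite can be illustrated with a hypothetical collector key (the `Hit` struct, its fields, and the function names below are assumptions for illustration, not laurus types). Only the `partial_cmp(...).unwrap_or(...)` chain changes; the receiver, the argument, and the `.then_with(...)` tie-break stay exactly where they were:

```rust
use std::cmp::Ordering;

/// Hypothetical search hit used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    score: f32,
    doc_id: u32,
}

/// Before: descending by score (note `b` is the receiver), ties broken by
/// ascending doc_id. A NaN score silently compares `Equal`.
fn cmp_before(a: &Hit, b: &Hit) -> Ordering {
    b.score
        .partial_cmp(&a.score)
        .unwrap_or(Ordering::Equal)
        .then_with(|| a.doc_id.cmp(&b.doc_id))
}

/// After: same receiver, same argument, same tie-break; only the float
/// comparison itself is swapped for `total_cmp`.
fn cmp_after(a: &Hit, b: &Hit) -> Ordering {
    b.score
        .total_cmp(&a.score)
        .then_with(|| a.doc_id.cmp(&b.doc_id))
}

/// Ranks hits best-first using the rewritten comparator.
fn rank(mut hits: Vec<Hit>) -> Vec<Hit> {
    hits.sort_unstable_by(cmp_after);
    hits
}
```

Because the operands are not swapped, a descending sort stays descending and the tie-break keeps its original direction.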

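For non-NaN values total_cmp agrees with partial_cmp, with one edge case: total_cmp orders -0.0 before +0.0, whereas partial_cmp treats them as equal. Behaviour is otherwise unchanged, except that a NaN is now ordered deterministically instead of reordering or panicking. Public API is unchanged.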
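The relationship between the two comparisons can be checked directly. In this small sketch, `agree` is an illustrative helper rather than anything in the codebase. It returns true for ordinary finite and infinite values, and false for the two cases where the orders differ: a NaN operand (where `partial_cmp` returns `None`) and signed zeros (where `total_cmp` puts `-0.0` before `+0.0`):

```rust
/// True when `partial_cmp` yields an ordering and it matches `total_cmp`.
fn agree(a: f64, b: f64) -> bool {
    a.partial_cmp(&b) == Some(a.total_cmp(&b))
}
```

For scores and distances that are never negative zero, the signed-zero case is unlikely to matter, but it is worth knowing when comparing outputs bit for bit.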

The HNSW Candidate (min-heap) / ResultCandidate (max-heap) Ord::cmp impls — the "float heaps" the issue names — are covered, along with the collector sort-key structs, the FieldValue::Float64 sort path, and the segment-maintenance priority / deletion_ratio sorts.

Tests

New nan_ordering_tests in hnsw/searcher.rs: pushing a NaN distance into the Candidate / ResultCandidate heaps must not panic, must not drop any element, and the finite distances must still pop nearest-first / furthest-first.
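A rough sketch of the property those tests check, using a stand-in type (this `Candidate` struct, its fields, and `drain_nearest_first` are assumptions for illustration and are not the actual definitions in hnsw/searcher.rs). The `Ord` impl is reversed so that `BinaryHeap`, which is a max-heap, pops the smallest distance first:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Stand-in for a min-heap search candidate.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    distance: f32,
    id: u32,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reversed operands: the smallest distance becomes the heap maximum.
        other
            .distance
            .total_cmp(&self.distance)
            .then_with(|| other.id.cmp(&self.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Pushes every distance (NaN included) and pops until the heap is empty.
fn drain_nearest_first(distances: &[f32]) -> Vec<f32> {
    let mut heap: BinaryHeap<Candidate> = distances
        .iter()
        .enumerate()
        .map(|(i, &d)| Candidate { distance: d, id: i as u32 })
        .collect();
    let mut popped = Vec::with_capacity(heap.len());
    while let Some(c) = heap.pop() {
        popped.push(c.distance);
    }
    popped
}
```

A max-heap variant would use the same impls without reversing the operands in `cmp`.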

Verification

  • cargo clippy -p laurus --all-targets -- -D warnings: clean
  • cargo fmt --check: clean
  • cargo test -p laurus --lib: 1110 passed (+2)
  • cargo test -p laurus --tests (integration): 0 failed (no sort-order regression)

Out of scope (follow-up)

  • Type-level enforcement via an OrderedFloat wrapper for float BinaryHeaps, to prevent the footgun from recurring.
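One possible shape for such a wrapper, as a hand-rolled sketch only: `TotalF32` and `k_smallest` are hypothetical names, and this is neither the `ordered-float` crate's `OrderedFloat` nor a design the PR commits to. The newtype gets `Ord` from `total_cmp`, so a `BinaryHeap<TotalF32>` cannot be built on a non-total comparator:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical newtype: an f32 whose ordering is always `total_cmp`.
#[derive(Debug, Clone, Copy)]
struct TotalF32(f32);

impl Ord for TotalF32 {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

impl PartialOrd for TotalF32 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for TotalF32 {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for TotalF32 {}

/// Keeps the `k` smallest values using a bounded max-heap, then returns
/// them in ascending order.
fn k_smallest(values: &[f32], k: usize) -> Vec<f32> {
    let mut heap = BinaryHeap::with_capacity(k + 1);
    for &v in values {
        heap.push(TotalF32(v));
        if heap.len() > k {
            heap.pop(); // evict the current largest
        }
    }
    heap.into_sorted_vec().into_iter().map(|t| t.0).collect()
}
```

With this in place, a raw `f32` cannot be pushed into the heap at all, since `f32` does not implement `Ord`; the compiler enforces the wrapper.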

Closes #667

Search, scoring, ranking, and segment-maintenance code ordered floats
with partial_cmp(...).unwrap_or(Ordering::Equal) (and a few .unwrap()).
That makes the comparator non-total: a NaN compares Equal to everything,
which sort_unstable_by / BinaryHeap forbid (silent reorder, or a panic on
recent std), and the .unwrap() sites panic on NaN outright (e.g.
deletion_ratio is 0/0 = NaN when total_docs == 0, in a background merge).

Replace all NaN-unsafe float comparators with f32/f64::total_cmp (a real
IEEE-754 total order) across 70 sites in 22 files, preserving each sort's
direction and tie-break structure. For non-NaN values total_cmp matches
partial_cmp exactly, so behaviour is unchanged except NaN is now ordered
deterministically instead of reordering or panicking. The HNSW Candidate
(min-heap) / ResultCandidate (max-heap) Ord impls — the "float heaps" the
issue names — are covered; new tests assert the heaps handle a NaN
distance without panic or loss.

Closes #667
@mosuka mosuka merged commit 9c3a68e into main Jun 4, 2026
22 checks passed
@mosuka mosuka deleted the fix/667-nan-safe-total-cmp branch June 4, 2026 05:22
Development

Successfully merging this pull request may close these issues.

perf(vector/search): naN handling in float heaps via unwrap_or(Ordering::Equal) — silent reorder risk