
fix(vector-core): harden cosine similarity — input validation, zero-magnitude handling, AVX2 parity#1

Draft
Copilot wants to merge 4 commits into main from copilot/analyze-codebase-and-identify-weaknesses

Copilot AI commented Mar 7, 2026

Description

cosine_similarity silently returned Ok(-1.0) for zero-magnitude vectors instead of erroring, and the AVX2 path had no validation at all (wrong return type, no magnitude/NaN guards). Empty vectors could be inserted into the HNSW index, propagating NaN through downstream distance calculations.
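To illustrate the NaN propagation described above, here is a minimal sketch of an unguarded cosine computation. It is not the repository's code: `naive_cosine` and `compare_distances` are hypothetical names chosen for this example. With empty inputs both magnitudes are zero, so the division is 0/0 and yields NaN, and a NaN distance cannot be ordered against any other distance.

```rust
use std::cmp::Ordering;

/// Hypothetical, unguarded cosine similarity (illustration only).
/// No checks for empty input, zero magnitude, or non-finite results.
pub fn naive_cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let mag_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let mag_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    // With empty slices this is a zero divided by a zero, which is NaN.
    dot / (mag_a * mag_b)
}

/// Hypothetical helper: how a neighbor ranking might compare two distances.
/// Returns `None` whenever either side is NaN, so the ordering is undefined.
pub fn compare_distances(d1: f64, d2: f64) -> Option<Ordering> {
    d1.partial_cmp(&d2)
}
```

Because every comparison involving NaN is false, one bad distance can quietly corrupt any ranking that relies on `<` or `partial_cmp`, which is why rejecting the input up front is the safer choice.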

vector_distance.rs

  • Empty vectors → Err(InvalidVectorData) (was: proceed to divide-by-zero)
  • Zero/near-zero magnitude → Err(InvalidVectorData) via f64::EPSILON threshold (was: Ok(-1.0))
  • NaN/Infinity guard on computed similarity
  • Removed stale println!("mis-match in vector dimensions!")
  • cosine_similarity_avx2: changed return type f64 → Result<f64, VectorError>, added same magnitude + NaN guards as scalar path
// Before: zero-magnitude silently returns misleading value
cosine_similarity(&[0.0, 0.0], &[1.0, 2.0]) // => Ok(-1.0)

// After: properly errors
cosine_similarity(&[0.0, 0.0], &[1.0, 2.0]) // => Err(InvalidVectorData)
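The guards listed for vector_distance.rs can be sketched as a scalar function. This is a self-contained approximation under stated assumptions, not the code in this PR: the `VectorError` enum here is a stand-in that contains only the `InvalidVectorData` variant named above, a dimension mismatch is assumed to map to that same variant, and the `f64::EPSILON` threshold is assumed to apply to each vector's magnitude rather than to the squared magnitude or the product.

```rust
/// Stand-in error type; only the `InvalidVectorData` variant is taken from the PR text.
#[derive(Debug, PartialEq)]
pub enum VectorError {
    InvalidVectorData,
}

/// Sketch of a guarded scalar cosine similarity.
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> Result<f64, VectorError> {
    // Guard 1: reject empty inputs and (by assumption) mismatched dimensions.
    if a.is_empty() || b.is_empty() || a.len() != b.len() {
        return Err(VectorError::InvalidVectorData);
    }

    let (mut dot, mut norm_a, mut norm_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }

    // Guard 2: reject zero or near-zero magnitudes before dividing.
    let (mag_a, mag_b) = (norm_a.sqrt(), norm_b.sqrt());
    if mag_a < f64::EPSILON || mag_b < f64::EPSILON {
        return Err(VectorError::InvalidVectorData);
    }

    // Guard 3: reject NaN or infinite results (e.g. from non-finite inputs).
    let similarity = dot / (mag_a * mag_b);
    if !similarity.is_finite() {
        return Err(VectorError::InvalidVectorData);
    }
    Ok(similarity)
}
```

Returning `Result` from both the scalar and SIMD paths lets callers handle invalid input in one place instead of special-casing a sentinel value such as -1.0.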

vector_core.rs

  • Empty vector validation in insert() and search() entry points
  • Fixed get_all_vectors key length check: prefix + 16 → prefix + 16 + 8 (id + level)
  • filter.is_none() || filter.unwrap()... → filter.as_ref().map_or(true, ...)
  • Fixed typo "emtpy" → "empty" (2 occurrences)
  • Removed dead commented-out code in select_neighbors
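The optional-filter rewrite in the list above can be sketched as follows. The real filter type in vector_core.rs is not shown in this PR text, so `passes_filter` and its single-predicate `Option<F>` parameter are hypothetical; only the `as_ref().map_or(true, ...)` shape comes from the description.

```rust
/// Hypothetical helper showing the `as_ref().map_or(true, ...)` pattern.
/// A missing filter accepts every item; a present filter decides.
pub fn passes_filter<T, F>(item: &T, filter: &Option<F>) -> bool
where
    F: Fn(&T) -> bool,
{
    // `as_ref()` borrows the predicate instead of moving it out of the Option,
    // and `map_or` supplies the "no filter" default without any `unwrap()`.
    filter.as_ref().map_or(true, |predicate| predicate(item))
}
```

Compared with `filter.is_none() || filter.unwrap()...`, this form has no `unwrap()` call and does not consume the `Option`, so the same filter can be reused across candidates.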

Tests — 12 new edge case tests

  • Zero, near-zero, empty, both-empty, one-empty, dimension mismatch → error
  • Identical, opposite, orthogonal, single-element, 1024-d vectors → correct results
  • Updated test_hvector_distance_max to use opposite vectors (was zero-magnitude)
  • Updated upsert test to expect error on empty vector data

Related Issues

Checklist when merging to main

  • No compiler warnings (if applicable)
  • Code is formatted with rustfmt
  • No useless or dead code (if applicable)
  • Code is easy to understand
  • Doc comments are used for all functions, enums, structs, and fields (where appropriate)
  • All tests pass
  • Performance has not regressed (assuming change was not to fix a bug)
  • Version number has been updated in helix-cli/Cargo.toml and helixdb/Cargo.toml

Additional Notes

The Cleanup artifacts CI job fails with HTTP 403 — this is a workflow permissions issue in the repo infrastructure, unrelated to these code changes.

4 files changed, +125/−42. All 1411 lib tests pass.



Copilot AI and others added 2 commits March 7, 2026 13:43
…, fix typos, add edge case tests

Co-authored-by: bhaktofmahakal <113044681+bhaktofmahakal@users.noreply.github.com>
…es, add near-zero test

Co-authored-by: bhaktofmahakal <113044681+bhaktofmahakal@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Analyze codebase and identify system weaknesses" to "Harden vector core: validate empty/zero-magnitude vectors, fix NaN propagation" on Mar 7, 2026
Copilot AI changed the title from "Harden vector core: validate empty/zero-magnitude vectors, fix NaN propagation" to "fix(vector-core): harden cosine similarity against zero-magnitude and empty vectors" on Mar 7, 2026
…alidation with scalar path

Address review feedback from Copilot and Greptile reviewers on PR HelixDB#881:
- Change cosine_similarity_avx2 return type from f64 to Result<f64, VectorError>
- Add zero/near-zero magnitude check (f64::EPSILON threshold) in AVX2 path
- Add NaN/Infinity guard in AVX2 path
- Ensures consistent behavior across scalar and SIMD code paths

Co-authored-by: bhaktofmahakal <113044681+bhaktofmahakal@users.noreply.github.com>
Copilot AI changed the title from "fix(vector-core): harden cosine similarity against zero-magnitude and empty vectors" to "fix(vector-core): harden cosine similarity — input validation, zero-magnitude handling, AVX2 parity" on Mar 7, 2026