Rust: builder-based params, per-module errors#2284
Conversation
| /// Canberra distance. | ||
| Canberra, | ||
| /// Generalized Minkowski (Lp) distance with exponent `p`. | ||
| LpUnexpanded(f32), |
There was a problem hiding this comment.
This isn't a big deal, but might be worth noting -- the exponent p is not always honored. Over in the C code, this is only moved when metric_arg exists. Might be worth a comment...it's not a regression from the existing C code and I'm not sure how to address it without a major refactor.
There was a problem hiding this comment.
In the C++ implementation the exponent p is read only when the LpUnexpanded metric is requested, and the only two places that don't reject this metric are pairwise_distance and brute_force. Both of them do read the value of p. So in this sense, the p is always honored by algos that support the Lp metric, even though the C API obscures this fact.
|
|
||
| pub use index::Index; | ||
| pub use params::{ | ||
| IndexParams, SearchParams, cudaDataType_t, cuvsIvfPqCodebookGen, cuvsIvfPqListLayout, |
There was a problem hiding this comment.
You probably don't want to export these native cudaDataType_t, cuvsIvfPqCodebookGen, and cuvsIvfPqListLayout types publicly, since everything else is wrapped.
| vq_kmeans_trainset_fraction: Option<f64>, | ||
| pq_kmeans_trainset_fraction: Option<f64>, | ||
| ) -> Result<Self, CagraError> { | ||
| if let Some(bits) = pq_bits |
There was a problem hiding this comment.
This applies to other news in this file, too: Does the C library have a facility to handle validation of these FFI params for you? It might be worth using to avoid duplicating work.
There was a problem hiding this comment.
The internal C++ impl can validate some of those params. I'd rather not split the validation logic across FFI. It makes for cleaner errors and keeps it in one place. Ideally, I'd drop them all, but then we need to strengthen the C side validation.
| ) -> Result<Self, CagraError> { | ||
| let params = Self::try_new()?; | ||
|
|
||
| let effective_algo = algo.unwrap_or(unsafe { (*params.handle).algo.into() }); |
There was a problem hiding this comment.
Very minor, but prefer .unwrap_or_else(|| unsafe { (*params.handle).algo.into() }) so you only deref in the None case. unwrap_or eagerly evaluates even when algo is Some(_).
|
|
||
| impl IndexParams { | ||
| /// Returns a new IndexParams | ||
| pub fn new() -> Result<IndexParams> { |
There was a problem hiding this comment.
Since this is deleted, the example over in the README Rust API that uses IndexParams::new() won't work any more. For Rust, there's a clever trick you can use to make sure your READMEs always compile:
#[cfg(doctest)]
#[doc = include_str!("../../README.md")]
pub struct ReadmeDocTests;This picks up the triple-ticks and makes sure they compile (https://doc.rust-lang.org/rustdoc/write-documentation/documentation-tests.html). Rust is the assumed language, but the examples currently all specify c/python/rust/etc -- Rust doctest will ignore non-Rust languages.
There are some other things in examples/rust that use this same pattern that fail to compile. CI can catch this for you by running cargo build --examples
Here we further rework the Rust bindings API to be more idiomatic, and also align closer with the underlying C++ namespacing. More specifically:
new()+ chainedset_*. Unset fields keep the C-library defaults; out-of-range values are rejected bybuild().Thus, by the time the params instance is constructed, it is guaranteed to be valid.
unsafe { Resources::with_stream(stream) }. TheResources::new()keeps the default stream.cuvs::neighbors.cuvs::Error/cuvs::Resultare removed. Each module returns its ownthiserrorenum, e.g.CagraErrororIvfFlatError. This enables individual index errors to be more granular, instead of creating a single error type encompassing all failure modes.