feat(embeddings): added model2vec #778

Open
vrn21 wants to merge 6 commits into HelixDB:dev from vrn21:rust2vec

Conversation

@vrn21 vrn21 commented Dec 19, 2025

Description

Add model2vec-rs as 4th embedding provider

Closes #721

Summary

Adds model2vec-rs as a new embedding provider for free, local, offline embedding generation without API keys or external servers.

Changes

Dependencies

  • Added model2vec-rs = { version = "0.1", optional = true } to Cargo.toml
  • Added model2vec = ["model2vec-rs"] feature flag

Implementation (115 lines across 3 files)

helix-db/src/helix_gateway/embedding_providers/mod.rs:

  • Added Model2Vec { model_name: String } variant to EmbeddingProvider enum
  • Added model2vec: Option<StaticModel> field to EmbeddingModelImpl (feature-gated)
  • Implemented model loading in constructor via StaticModel::from_pretrained()
  • Implemented fetch_embedding_async() using tokio::task::spawn_blocking() for sync→async conversion
  • Added parser for "model2vec:{model}" prefix (default: minishlab/potion-base-32M)
  • Added comprehensive inline documentation (68 lines module docs + provider-specific comments)
  • f32→f64 conversion for HelixDB compatibility
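
The prefix parsing described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `parse_model2vec`, `DEFAULT_MODEL2VEC_MODEL`, and the single-variant enum are hypothetical stand-ins, and the fallback behaviour for a bare `model2vec` or an empty model name is an assumption based on the stated default.

```rust
/// Default model named in the PR description (the constant name is hypothetical).
const DEFAULT_MODEL2VEC_MODEL: &str = "minishlab/potion-base-32M";

/// Simplified stand-in for the provider enum; only the new variant is modeled.
#[derive(Debug, PartialEq)]
enum EmbeddingProvider {
    Model2Vec { model_name: String },
}

/// Parses a "model2vec:{model}" string and returns None for any other prefix.
/// A bare "model2vec" or an empty model part falls back to the default model.
fn parse_model2vec(spec: &str) -> Option<EmbeddingProvider> {
    let rest = spec.strip_prefix("model2vec")?;
    let model = match rest.strip_prefix(':') {
        // Explicit, non-empty model name after the colon.
        Some(name) if !name.trim().is_empty() => name.trim(),
        // "model2vec:" with nothing after the colon.
        Some(_) => DEFAULT_MODEL2VEC_MODEL,
        // Exactly "model2vec" with no colon.
        None if rest.is_empty() => DEFAULT_MODEL2VEC_MODEL,
        // Something like "model2vecx" is not this provider.
        None => return None,
    };
    Some(EmbeddingProvider::Model2Vec {
        model_name: model.to_string(),
    })
}
```

Returning the resolved model name inside the variant lets a test assert on the default directly, instead of only checking which provider was selected.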

helix-db/src/helix_gateway/tests/embedding_providers.rs:

  • Added test_parse_model2vec_provider - validates parsing with explicit model
  • Added test_parse_model2vec_default - validates default model fallback
  • Added test_model2vec_embedding (#[ignore]) - integration test requiring model download

Technical Details

Model Loading:

  • Models downloaded from HuggingFace Hub on first use
  • Cached in ~/.cache/huggingface/
  • Loaded once in constructor, reused for all embeddings
  • StaticModel is Clone (Arc-based, cheap)

Async Handling:

  • encode_single() is sync/CPU-bound
  • Wrapped in tokio::task::spawn_blocking() to avoid blocking async runtime
  • Returns Vec<f64> like other providers
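
As a rough illustration of the offloading and widening steps, here is a dependency-free sketch. It deliberately swaps `tokio::task::spawn_blocking()` for `std::thread::spawn` so it compiles without external crates; `StubModel`, `to_f64`, and `embed_off_thread` are hypothetical names, and the stub's `encode_single` only mimics the shape described above (text in, `Vec<f32>` out), not the model2vec-rs implementation.

```rust
use std::sync::Arc;
use std::thread;

/// Stub standing in for a loaded static embedding model (not the model2vec-rs type).
struct StubModel;

impl StubModel {
    /// Synchronous, CPU-bound encode that returns f32 values.
    fn encode_single(&self, text: &str) -> Vec<f32> {
        // Toy "embedding": one value per input byte, scaled into [0, 1].
        text.bytes().map(|b| f32::from(b) / 255.0).collect()
    }
}

/// Widens each f32 to f64; this conversion is lossless.
fn to_f64(embedding: Vec<f32>) -> Vec<f64> {
    embedding.into_iter().map(f64::from).collect()
}

/// Runs the encode on a separate thread and returns the widened vector.
/// With tokio, the equivalent step would await a `spawn_blocking` handle
/// instead of calling `join()`.
fn embed_off_thread(model: Arc<StubModel>, text: String) -> Result<Vec<f64>, String> {
    thread::spawn(move || to_f64(model.encode_single(&text)))
        .join()
        .map_err(|_| "embedding task panicked".to_string())
}
```

Both the model handle and the text are moved into the closure, which is why the sequence in this PR clones them before handing work to the blocking pool.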

Available Models:

  • minishlab/potion-base-2M (~2M parameters, 64 dims)
  • minishlab/potion-base-8M (~8M parameters, 256 dims)
  • minishlab/potion-base-32M (~32M parameters, 512 dims) [default]
  • minishlab/potion-retrieval-32M (~32M parameters, 512 dims)

Testing

# Unit tests
cargo test --lib --features model2vec embedding_providers
# Result: 17 passed, 0 failed, 5 ignored

# Build verification
cargo build --features server,model2vec
# Result: Success, no warnings

Usage

Feature flag:

cargo build --features server,model2vec

Configuration (config.hx.json):

{
  "embedding_model": "model2vec:minishlab/potion-base-32M"
} 

HelixQL:

QUERY search(query: String) =>
    results <- SearchV<Document>(Embed(query), 10)
    RETURN results

Breaking Changes

None. All changes are additive:

  • New feature flag (opt-in)
  • New enum variant (non-breaking)
  • New optional field (feature-gated)
  • Existing providers unchanged
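
To illustrate why a feature-gated field stays additive, here is a small sketch under stated assumptions. The type and field names follow the PR description, but `StaticModelStub`, the `OpenAI` variant, the error message, and the constructor signature are hypothetical simplifications; the real constructor takes different arguments and loads an actual model.

```rust
/// Simplified stand-in for the provider enum; the variant is always present.
#[derive(Debug, Clone, PartialEq)]
enum EmbeddingProvider {
    OpenAI,
    Model2Vec { model_name: String },
}

/// Placeholder for the model type; it only exists when the feature is enabled.
#[cfg(feature = "model2vec")]
#[derive(Clone)]
struct StaticModelStub;

struct EmbeddingModelImpl {
    provider: EmbeddingProvider,
    /// Compiled out entirely unless built with `--features model2vec`.
    #[cfg(feature = "model2vec")]
    model2vec: Option<StaticModelStub>,
}

impl EmbeddingModelImpl {
    fn new(provider: EmbeddingProvider) -> Result<Self, String> {
        let wants_model2vec = matches!(provider, EmbeddingProvider::Model2Vec { .. });
        // Without the feature, selecting the provider is a runtime error
        // rather than a compile error, so existing builds are unaffected.
        if wants_model2vec && !cfg!(feature = "model2vec") {
            return Err("rebuild with `--features model2vec` to use this provider".to_string());
        }
        Ok(Self {
            provider,
            #[cfg(feature = "model2vec")]
            model2vec: wants_model2vec.then_some(StaticModelStub),
        })
    }
}
```

Builds that do not enable the feature see the same struct layout as before, which is the sense in which the new field is non-breaking.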

Checklist when merging to main

  • No compiler warnings (if applicable)
  • Code is formatted with rustfmt
  • No useless or dead code (if applicable)
  • Code is easy to understand
  • Doc comments are used for all functions, enums, structs, and fields (where appropriate)
  • All tests pass
  • Performance has not regressed (assuming change was not to fix a bug)
  • Version number has been updated in helix-cli/Cargo.toml and helixdb/Cargo.toml

Additional Notes

Greptile Summary

This PR successfully adds model2vec-rs as a fourth embedding provider, enabling free, local, offline embedding generation without API keys. The implementation follows existing patterns for other providers with proper feature gating, comprehensive documentation, and async handling.

Key Changes:

  • Added optional model2vec-rs dependency with feature flag in helix-db/Cargo.toml
  • Implemented Model2Vec variant in EmbeddingProvider enum with model loading via StaticModel::from_pretrained()
  • Used tokio::task::spawn_blocking() to handle synchronous encode_single() without blocking async runtime
  • Added f32→f64 conversion for HelixDB compatibility
  • Created parsing logic for "model2vec:{model}" format with default minishlab/potion-base-32M
  • Added 68 lines of comprehensive module-level documentation explaining all four providers
  • Included three unit tests (two for parsing, one integration test marked #[ignore])

Minor Issues Found:

  • One test (test_parse_model2vec_default) doesn't fully assert the returned model value, though this is a minor style issue

Important Files Changed

| Filename | Overview |
|---|---|
| helix-db/Cargo.toml | Added optional model2vec-rs dependency and model2vec feature flag |
| helix-db/src/helix_gateway/embedding_providers/mod.rs | Implemented Model2Vec provider with comprehensive documentation, async handling, and proper feature gating |
| helix-db/src/helix_gateway/tests/embedding_providers.rs | Added three tests for Model2Vec provider parsing and embedding generation; one test has an incomplete assertion |

Sequence Diagram

sequenceDiagram
    participant User
    participant Config
    participant EmbeddingModelImpl
    participant StaticModel
    participant TokioRuntime
    participant ThreadPool

    User->>Config: configure model2vec provider
    User->>EmbeddingModelImpl: new(api_key, model, url)
    EmbeddingModelImpl->>EmbeddingModelImpl: parse_provider_and_model()
    EmbeddingModelImpl->>EmbeddingModelImpl: extract model name
    EmbeddingModelImpl->>StaticModel: from_pretrained(model_name)
    Note over StaticModel: Downloads from HuggingFace<br/>Cached locally
    StaticModel-->>EmbeddingModelImpl: StaticModel instance
    EmbeddingModelImpl-->>User: EmbeddingModelImpl ready

    User->>EmbeddingModelImpl: fetch_embedding_async(text)
    EmbeddingModelImpl->>EmbeddingModelImpl: match Model2Vec provider
    EmbeddingModelImpl->>EmbeddingModelImpl: clone text and model
    EmbeddingModelImpl->>TokioRuntime: spawn_blocking(encode_single)
    TokioRuntime->>ThreadPool: schedule blocking task
    ThreadPool->>StaticModel: encode_single(text)
    StaticModel-->>ThreadPool: Vec f32 embedding
    ThreadPool->>ThreadPool: convert f32 to f64
    ThreadPool-->>TokioRuntime: Vec f64 embedding
    TokioRuntime-->>EmbeddingModelImpl: Result with embedding
    EmbeddingModelImpl-->>User: Vec f64 embedding

Contributor

@greptile-apps greptile-apps bot left a comment

Additional Comments (1)

  1. helix-db/src/helix_gateway/tests/embedding_providers.rs, line 150

    style: the returned model value is not being asserted. The test should verify the model string matches the default

    then add after line 156:

3 files reviewed, 1 comment

@xav-db xav-db changed the base branch from main to dev December 22, 2025 14:42
@xav-db
Member

xav-db commented Dec 22, 2025

@vrn21 resolve conflicts please

@vrn21
Author

vrn21 commented Dec 25, 2025

Sorry for the delay, have resolved the conflicts!

Member

@xav-db xav-db left a comment

LGTM

@xav-db
Member

xav-db commented Jan 9, 2026

please fix clippy check @vrn21

@vrn21
Author

vrn21 commented Jan 9, 2026

Clippy fixed @xav-db

Development

Successfully merging this pull request may close these issues.

[Feature]: Add model2vec

2 participants