
feat: add MiniMax as embedding provider #894

Open
octo-patch wants to merge 2 commits into HelixDB:main from octo-patch:feature/add-minimax-embedding-provider

@octo-patch octo-patch commented Mar 22, 2026

Summary

  • Add MiniMax AI as a built-in embedding provider, supporting the embo-01 model (1536 dimensions)
  • Implement native MiniMax embedding API format (texts array + type field for db/query distinction, vectors response)
  • Handle MiniMax's HTTP 200 error responses by checking base_resp.status_code in the response body

Usage

// Storage embeddings (type defaults to "db")
embed!("Hello, world!", "minimax:embo-01");

// Search query embeddings
embed!("search query", "minimax:embo-01:query");

Requires the MINIMAX_API_KEY environment variable (or pass the key directly).
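
The key resolution described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `resolve_minimax_api_key` is a hypothetical helper name, and treating an empty key as "not provided" is an assumption.

```rust
use std::env;

/// Resolve the MiniMax API key (hypothetical helper, for illustration only).
/// An explicitly passed key wins; otherwise fall back to MINIMAX_API_KEY.
pub fn resolve_minimax_api_key(explicit: Option<&str>) -> Result<String, String> {
    // Assumption: an empty explicit key counts as "not provided".
    if let Some(key) = explicit.filter(|k| !k.is_empty()) {
        return Ok(key.to_string());
    }
    // Fall back to the environment variable named in the PR description.
    match env::var("MINIMAX_API_KEY") {
        Ok(key) if !key.is_empty() => Ok(key),
        _ => Err("MINIMAX_API_KEY is not set and no key was passed".to_string()),
    }
}
```

Calling `resolve_minimax_api_key(Some("my-key"))` returns the explicit key without consulting the environment, so a per-call key always takes precedence.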

Changes

| File | Change |
| --- | --- |
| helix-db/src/helix_gateway/embedding_providers/mod.rs | Add MiniMax variant to EmbeddingProvider enum, minimax: prefix parsing, API key resolution, and fetch_embedding_async() implementation |
| helix-db/src/helix_gateway/tests/embedding_providers.rs | Add 5 unit tests (parsing, API key validation) + 2 integration tests |
| helix-db/src/helix_engine/tests/README.md | Document new MiniMax embedding tests |

MiniMax API Details

  • Endpoint: POST https://api.minimax.io/v1/embeddings
  • Model: embo-01 (1536 dimensions)
  • Request: {"model": "embo-01", "texts": ["text"], "type": "db"}
  • Response: {"vectors": [[...]], "base_resp": {"status_code": 0}}
  • Type parameter: "db" for storage, "query" for search queries
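
Because errors also arrive with HTTP 200, the response body has to be inspected before the vectors are used. The sketch below illustrates that check on an already-deserialized response. The struct and function names are hypothetical, and the `status_msg` field is an assumption that does not appear in the response shape listed above.

```rust
/// Mirrors the `base_resp` object in the response body.
pub struct BaseResp {
    pub status_code: i64,
    /// Assumed field: a human-readable message accompanying the code.
    pub status_msg: String,
}

/// Hypothetical, already-deserialized view of the embeddings response.
pub struct MiniMaxEmbeddingResponse {
    /// May be absent when the API reports an error.
    pub vectors: Option<Vec<Vec<f64>>>,
    pub base_resp: BaseResp,
}

/// Return the first embedding, or an error if the body signals a failure.
pub fn extract_embedding(resp: MiniMaxEmbeddingResponse) -> Result<Vec<f64>, String> {
    // A non-zero status_code means failure even though HTTP status was 200.
    if resp.base_resp.status_code != 0 {
        return Err(format!(
            "MiniMax API error {}: {}",
            resp.base_resp.status_code, resp.base_resp.status_msg
        ));
    }
    // One input text was sent, so take the first vector if there is one.
    resp.vectors
        .and_then(|v| v.into_iter().next())
        .ok_or_else(|| "MiniMax response contained no vectors".to_string())
}
```

Checking `status_code` before touching `vectors` means a rate-limit response surfaces as a descriptive error rather than as a confusing "missing field" parse failure.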

Test plan

  • All 25 existing + new unit tests pass (cargo test embedding_providers)
  • New MiniMax parsing tests: model, type, empty model
  • New MiniMax initialization tests: missing key fails, explicit key succeeds
  • Integration tests added (ignored by default, require MINIMAX_API_KEY)
  • Full project build succeeds

Greptile Summary

This PR adds MiniMax as a new embedding provider following the existing pattern for OpenAI, Gemini, and Azure OpenAI, including proper API key resolution, a custom request/response format (texts array + type field + vectors response), and handling of MiniMax's HTTP-200-for-errors behaviour via base_resp.status_code.

Key points:

  • The implementation is structurally sound and correctly mirrors the conventions used by other providers.
  • Dead default fallback: parts.get(1).unwrap_or(&"embo-01") on the model-parsing path (line 173) is unreachable — splitn(2, ':') always produces Some("") when the user writes "minimax:", so the intended default "embo-01" is never applied and an empty model string is forwarded to the API. The accompanying test documents this broken behaviour by asserting model == "".
  • No embedding_type validation: The type field forwarded to MiniMax is not checked against the two accepted values ("db", "query"); invalid types will only fail at runtime with an opaque API error.
  • Integration test inconsistency: New MiniMax integration tests use #[tokio::test] + fetch_embedding_async, while all existing integration tests use plain #[test] + the synchronous fetch_embedding wrapper.
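
One way to address the missing embedding_type validation is to parse the type into a closed set at parse time. The following is only a sketch under that assumption: `MiniMaxEmbeddingType` and its methods are hypothetical names and are not part of this PR, which stores the type as a plain string.

```rust
/// Hypothetical closed set of the two type values MiniMax accepts.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MiniMaxEmbeddingType {
    Db,
    Query,
}

impl MiniMaxEmbeddingType {
    /// Reject anything other than "db" or "query" before any request is sent.
    pub fn parse(s: &str) -> Result<Self, String> {
        match s {
            "db" => Ok(Self::Db),
            "query" => Ok(Self::Query),
            other => Err(format!(
                "invalid MiniMax embedding type '{other}': expected \"db\" or \"query\""
            )),
        }
    }

    /// The string to place in the request's "type" field.
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Db => "db",
            Self::Query => "query",
        }
    }
}
```

With this shape, an input such as "minimax:embo-01:qurey" would fail during parsing with a message naming the accepted values, instead of failing later with an API error.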

Important Files Changed

| Filename | Overview |
| --- | --- |
| helix-db/src/helix_gateway/embedding_providers/mod.rs | Adds MiniMax to EmbeddingProvider enum and implements full parse + fetch logic; has a dead unwrap_or(&"embo-01") default that never fires, causing "minimax:" to silently send an empty model name to the API. |
| helix-db/src/helix_gateway/tests/embedding_providers.rs | Adds 5 unit tests and 2 integration tests for MiniMax; the empty-model unit test asserts the buggy behaviour ("" instead of "embo-01"), and integration tests use #[tokio::test] inconsistently with the rest of the test suite. |
| helix-db/src/helix_engine/tests/README.md | Documentation-only addition of two new MiniMax integration test names; no issues. |

Sequence Diagram

sequenceDiagram
    participant User
    participant embed_macro as embed! macro
    participant Parser as parse_provider_and_model
    participant EmbeddingModelImpl
    participant MiniMaxAPI as MiniMax API (api.minimax.io)

    User->>embed_macro: embed!(text, "minimax:embo-01[:query]")
    embed_macro->>Parser: parse_provider_and_model(Some("minimax:embo-01"))
    Parser-->>EmbeddingModelImpl: (MiniMax { embedding_type: "db" }, "embo-01")
    embed_macro->>EmbeddingModelImpl: fetch_embedding(text)
    EmbeddingModelImpl->>MiniMaxAPI: POST /v1/embeddings\n{model, texts:[text], type:"db"}
    MiniMaxAPI-->>EmbeddingModelImpl: HTTP 200 {vectors:[[...]], base_resp:{status_code:0}}
    Note over EmbeddingModelImpl: Check base_resp.status_code != 0\n(MiniMax returns HTTP 200 even for errors)
    EmbeddingModelImpl-->>embed_macro: Vec<f64> (1536 dims)
    embed_macro-->>User: embedding vector

Reviews (1): Last reviewed commit: "feat: add MiniMax as embedding provider"

Greptile also left 2 inline comments on this PR.

Add MiniMax AI as a built-in embedding provider alongside OpenAI,
Gemini, AzureOpenAI, and Local.

MiniMax's embo-01 model produces 1536-dimensional embeddings via a
native API at https://api.minimax.io/v1/embeddings. The API uses a
different request/response format from OpenAI (texts array + type
field for db/query distinction, vectors array in response).

Usage:
  embed!("text", "minimax:embo-01")           // storage embeddings
  embed!("text", "minimax:embo-01:query")     // search query embeddings

Configuration:
  - Set MINIMAX_API_KEY environment variable
  - Provider prefix: minimax:<model>[:<type>]
  - type defaults to "db" (use "query" for search queries)

Includes 5 unit tests (parsing, API key validation) and 2 integration
tests (ignored by default, require API key and network).
Comment on lines +171 to +185
Some(m) if m.starts_with("minimax:") => {
    let parts: Vec<&str> = m.splitn(2, ':').collect();
    let model_and_type = parts.get(1).unwrap_or(&"embo-01");
    let (model_name, embedding_type) = if model_and_type.contains(':') {
        let type_parts: Vec<&str> = model_and_type.splitn(2, ':').collect();
        (
            type_parts[0].to_string(),
            type_parts.get(1).unwrap_or(&"db").to_string(),
        )
    } else {
        (model_and_type.to_string(), "db".to_string())
    };

    Ok((EmbeddingProvider::MiniMax { embedding_type }, model_name))
}

P2 Dead unwrap_or fallback — default "embo-01" is never applied

Because the match guard already requires m.starts_with("minimax:"), calling m.splitn(2, ':') will always produce at least two parts: ["minimax", "<rest>"]. This means parts.get(1) always returns Some(...) — even Some("") when the input is "minimax:". The .unwrap_or(&"embo-01") fallback is therefore dead code and the intended default is never applied.

Concretely, "minimax:" results in model_name = "" (not "embo-01"), which will be sent verbatim to the MiniMax API and cause a server-side error rather than a clean default.
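
The behaviour is easy to reproduce in isolation. The sketch below copies only the model-extraction step into a standalone function; `model_via_splitn` is a hypothetical name used for this demonstration.

```rust
/// Standalone copy of the model-extraction step from the MiniMax arm.
/// `model_via_splitn` is a hypothetical name used only for this demo.
pub fn model_via_splitn(input: &str) -> String {
    // For any input containing ':', splitn(2, ':') yields exactly two parts,
    // so parts.get(1) is Some(..) and the fallback below is never used.
    let parts: Vec<&str> = input.splitn(2, ':').collect();
    parts.get(1).unwrap_or(&"embo-01").to_string()
}
```

For the input "minimax:" this returns an empty string. The fallback only fires for an input with no colon at all, such as "minimax", which the starts_with("minimax:") guard never lets through.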

The same dead-fallback pattern exists in the Gemini arm (line 125), but this PR introduces it again for MiniMax.

A clean fix is to use strip_prefix and treat an empty suffix as the default:

Suggested change
- Some(m) if m.starts_with("minimax:") => {
-     let parts: Vec<&str> = m.splitn(2, ':').collect();
-     let model_and_type = parts.get(1).unwrap_or(&"embo-01");
-     let (model_name, embedding_type) = if model_and_type.contains(':') {
-         let type_parts: Vec<&str> = model_and_type.splitn(2, ':').collect();
-         (
-             type_parts[0].to_string(),
-             type_parts.get(1).unwrap_or(&"db").to_string(),
-         )
-     } else {
-         (model_and_type.to_string(), "db".to_string())
-     };
-     Ok((EmbeddingProvider::MiniMax { embedding_type }, model_name))
- }
+ Some(m) if m.starts_with("minimax:") => {
+     let suffix = m.strip_prefix("minimax:").unwrap_or("embo-01");
+     let suffix = if suffix.is_empty() { "embo-01" } else { suffix };
+     let (model_name, embedding_type) = if suffix.contains(':') {
+         let type_parts: Vec<&str> = suffix.splitn(2, ':').collect();
+         (
+             type_parts[0].to_string(),
+             type_parts.get(1).unwrap_or(&"db").to_string(),
+         )
+     } else {
+         (suffix.to_string(), "db".to_string())
+     };
+     Ok((EmbeddingProvider::MiniMax { embedding_type }, model_name))
+ }

        }
        _ => panic!("Expected MiniMax provider"),
    }
    assert_eq!(model, "");

P2 test_parse_minimax_provider_empty_model asserts empty model rather than the documented default

This test asserts model == "" for input "minimax:". Combined with the dead unwrap_or(&"embo-01") in the parser, this test documents and entrenches the broken behaviour: a user who types "minimax:" expecting the default model will silently get an empty model string sent to the API. If the dead-fallback bug in the parser is fixed, this test should be updated to assert model == "embo-01".

Suggested change
- assert_eq!(model, "");
+ assert_eq!(model, "embo-01"); // defaults to embo-01 when no model specified

MiniMax embedding API returns HTTP 200 even for errors (e.g. rate limits)
with the actual error in base_resp.status_code. Check this field before
attempting to parse the embedding vectors.

Also fix integration tests to use #[tokio::test] for proper async runtime.