fix(mcp): fail loudly when embedder initialization fails#1414
chrisgscott wants to merge 2 commits into getzep:main
Conversation
Race condition: after laptop restart, Claude Code initializes MCP servers before Docker Desktop finishes starting FalkorDB. Server would throw RuntimeError and get silently dropped. Now retries 5 times with exponential backoff (2s, 4s, 8s, 16s, 32s) for connection refused errors. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
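The retry described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `connect_with_retry` and its `connect` parameter are hypothetical names, and the backoff schedule (2s, 4s, 8s, 16s, 32s) matches the commit message.

```python
import asyncio


async def connect_with_retry(connect, retries: int = 5, base_delay: float = 2.0):
    """Retry ``connect()`` on connection-refused errors with exponential backoff.

    With the defaults, the delays between attempts are 2s, 4s, 8s, 16s, 32s —
    enough time for Docker Desktop to finish starting FalkorDB after a reboot.
    """
    for attempt in range(retries):
        try:
            return await connect()
        except ConnectionRefusedError as exc:
            if attempt == retries - 1:
                # Fail loudly instead of being silently dropped by the client.
                raise RuntimeError("FalkorDB still unreachable after retries") from exc
            await asyncio.sleep(base_delay * (2 ** attempt))
```

Only `ConnectionRefusedError` is retried here; other startup errors still surface immediately, which keeps genuine misconfigurations loud.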
Previously, if the embedder client failed to initialize (e.g., invalid API key, network issue), the MCP server logged a warning and continued with `embedder=None`. This caused search methods (`search_nodes`, `search_memory_facts`) to silently return empty results instead of errors, since hybrid search requires embeddings to generate query vectors. The failure mode was especially confusing because writes (`add_memory`) continued to work normally — they use the LLM client, not the embedder. Users saw data being written successfully but could never read it back.

Changes:

- Embedder initialization now raises `RuntimeError` on failure instead of logging a warning and continuing
- Added search health check at startup that verifies the embedder can produce vectors end-to-end
- Added defensive checks in `search_nodes` and `search_memory_facts` that return a clear error if the embedder is unexpectedly `None`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
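A minimal sketch of the fail-loud initialization plus startup health check described in the changes above. The names (`init_embedder`, `EmbedderInitError`, a synchronous `create`) are illustrative stand-ins, not the PR's real identifiers:

```python
class EmbedderInitError(RuntimeError):
    """Raised at startup instead of continuing with embedder=None."""


def init_embedder(create_embedder):
    try:
        embedder = create_embedder()
    except Exception as exc:
        # Old behavior: log a warning and continue with embedder=None,
        # which made every search silently return empty results.
        raise EmbedderInitError(
            "embedder initialization failed; search would silently return no results"
        ) from exc

    # Startup health check: verify the embedder produces vectors end-to-end,
    # catching runtime failures (quota, auth, network) before first search.
    vectors = embedder.create(["health check"])
    if not vectors or not vectors[0]:
        raise EmbedderInitError("embedder health check returned no vectors")
    return embedder
```

The point is that both failure modes — construction and first runtime call — surface at startup rather than as mysteriously empty search results later.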
I have read the CLA Document and I hereby sign the CLA.

You can retrigger this bot by commenting `recheck` in this Pull Request. Posted by the CLA Assistant Lite bot.
xkonjin
left a comment
Good iterative improvements here. Some concerns before merging:
- **Critical: Silent fallback on embedder health check.** In `initialize()`, if `embedder_client.create(['health check'])` throws, you catch it and log a warning that "Search may not work correctly." But `embedder_client` was already created successfully above — this branch means the runtime embedder call failed (network, quota, auth). That is a real failure mode. Logging a warning and continuing means users will see confusing search errors later instead of a clear startup failure. Consider whether this should be a hard error, or at minimum expose a `/health` endpoint that reports it so operators can detect it.
- **Model string assumptions.** `model.startswith('gpt-5')` will match any future `gpt-5.x` model, which is okay today, but it is fragile if OpenAI releases a non-reasoning `gpt-5` variant or changes naming. No action is needed now, but consider centralizing reasoning-model detection into a shared helper so you don't have to update every `startswith` call later.
- **Missing tests for token param mapping.** `_create_completion` now switches between `max_tokens` and `max_completion_tokens` based on `is_reasoning_model`. Add a unit test that asserts the correct kwarg is sent for both reasoning and non-reasoning model strings.
- **Retry loop duplicates an import.** You import `asyncio as _asyncio` inside the retry loop body. Move that to the top of the file with the other imports.
- **Config YAMLs contain a hardcoded user.** `config-second-brain.yaml` and `config-second-brain-stdio.yaml` both set `user_id: chris`. If these are committed examples, they should either be parameterized (`${USER_ID:chris}` already exists in the same file) or the hardcoded default should be something neutral like `default_user`.
Fix the import placement and decide on the embedder health-check failure semantics before merging.
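The shared helper and the unit test asked for above could be sketched like this. `is_reasoning_model` and `completion_token_kwargs` are hypothetical names for illustration; the real switch lives inside `_create_completion`, and the prefix list is an assumption based on the review's `gpt-5` example:

```python
# Extend this tuple if more reasoning-model families appear, so every
# call site doesn't repeat its own startswith() check.
REASONING_MODEL_PREFIXES = ("gpt-5",)


def is_reasoning_model(model: str) -> bool:
    """Centralized reasoning-model detection (single place to update naming)."""
    return model.startswith(REASONING_MODEL_PREFIXES)


def completion_token_kwargs(model: str, limit: int) -> dict:
    """Return the token-limit kwarg to send for this model family."""
    key = "max_completion_tokens" if is_reasoning_model(model) else "max_tokens"
    return {key: limit}
```

A unit test then just asserts the kwarg for one model string on each side of the split, which is exactly the coverage the review asks for.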
Summary
- Embedder initialization now raises an error on failure instead of continuing with `embedder=None`
- Added defensive checks in `search_nodes` and `search_memory_facts` that return a clear error if embedder is unexpectedly `None`

Problem

When the embedder client fails to initialize (invalid API key, network issue, etc.), the MCP server logs a warning and continues with `embedder_client = None`. This causes `search_nodes` and `search_memory_facts` to silently return empty results — the hybrid search (`NODE_HYBRID_SEARCH_RRF`) needs the embedder to generate query vectors for the cosine similarity leg, and without it, RRF reranking produces nothing.

The failure mode is especially confusing because `add_memory` continues to work normally (it uses the LLM client, not the embedder). Users see data being written successfully but get empty results on every search, with no error indicating why.

Test plan

- Test `search_nodes` when embedder is available — verify results are returned normally
- Force the embedder to `None` at search time, verify the tool returns an explicit error message instead of empty results

🤖 Generated with Claude Code
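The two test-plan cases can be sketched against a hypothetical stand-in for the tool — `search_nodes(embedder, query)` here is an illustrative wrapper, not the real MCP tool signature:

```python
def search_nodes(embedder, query: str):
    """Stand-in for the MCP tool: explicit error when the embedder is None,
    instead of the old behavior of silently returning empty results."""
    if embedder is None:
        return {"error": "Embedder is not configured; search is unavailable."}
    vector = embedder.create([query])[0]
    # Real code would run hybrid search with this query vector; the sketch
    # just proves a vector was produced.
    return {"results": [{"query_vector_dims": len(vector)}]}


class FakeEmbedder:
    def create(self, texts):
        return [[0.0] * 8 for _ in texts]
```

Case one asserts results come back with a working embedder; case two asserts an explicit error payload (not an empty list) when the embedder is `None`.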