Skip to content

feat: retire Python embedder + gRPC backend (PR-1.J)#300

Merged
toddwbucy merged 4 commits into
mainfrom
feat/retire-python-embedder
May 8, 2026
Merged

feat: retire Python embedder + gRPC backend (PR-1.J)#300
toddwbucy merged 4 commits into
mainfrom
feat/retire-python-embedder

Conversation

@toddwbucy
Copy link
Copy Markdown
Owner

@toddwbucy toddwbucy commented May 8, 2026

Phase 1 cleanup. Now that the in-process candle `EmbedderClient` (PR #294), daemon backend selection (PR #296), herobench trait refactor (PR #298), and candle peer-access fix (#299) are all in main, the Python `weaver-embedder.service` and its gRPC client are redundant infrastructure. This PR removes them entirely.

Summary

Removed Reason
`services/embedder/` (3.1 GB Python tree) Python service no longer needed
`services/embedder/weaver-embedder.service` systemd unit retired
`crates/weaver-spu/src/encoder/grpc_client_legacy.rs` (~600 LOC) gRPC client to Python service
`proto/persephone/{common,embedding,extraction,training}/*.proto` No production callers
`crates/weaver-database/{build.rs, src/proto.rs, tests/proto_types.rs}` proto codegen + sole test
`crates/weaver-demo/src/bin/embedder_latency.rs` gRPC-stack latency benchmark
`crates/weaver-demo/tests/similarity_calibration.rs` gRPC similarity test
Tonic stack from `weaver-spu` and `weaver-database` Cargo.toml `tonic`, `prost`, `prost-types`, `hyper-util` (spu only), `tower` (spu only), `tonic-build`

Simplified

  • `weaver-spu/build.rs` — drops Persephone proto compile (CUDA kernels unchanged).
  • `weaver-spu/src/decoder/multi_model.rs` — `EmbedderBackend` enum gone. `EmbedderConfig.backend` field gone. `validate()` requires `snapshot` unconditionally; Python-only checks (locked-GPU, non-empty socket/model_name, batch_size>=1) retired with the rules they enforced. Four obsolete tests removed; the three snapshot-validation tests renamed.
  • `weaver-core/src/embedder.rs` — drops `Fromtonic::*` impls (no consumers).
  • `weaver-interface/src/serve.rs` — drops `probe_embedder_python`, `EmbedderBackend` match, and all gRPC imports. Boot path is now: parse config → embedder probe (rust only) → pin verify → build server state → listen.
  • `weaver-demo/src/herobench/benchmark.rs` — `try_construct_embedder` collapses to "in-process or None"; the gRPC fallback helper is gone.
  • `weaver-demo/tests/herobench_integration.rs` — `try_connect_embedder` rewritten to construct an in-process `EmbedderClient`. Three `integration_dedup_*` tests + their helpers cfg-gated under `feature = "embedder-rust"`.
  • `services/weaver-infer.service` — drops `Requires=`/`BindsTo=`/`After=weaver-embedder.service`.
  • `services/weaver.target` — drops `Wants=weaver-embedder.service`/`After=`.
  • `scripts/install.sh` — drops the `weaver-embedder.service` install step; explicitly removes a stale unit if a previous install left it behind.

EmbedderConfig back-compat

The `socket`, `model_name`, `batch_size`, `use_fp16`, `idle_timeout_seconds`, and `gpu` fields stay on `EmbedderConfig` (deprecated but preserved) so existing `server.toml` files still parse without erroring. They're no longer load-bearing — only `snapshot` is read.

Test plan

  • `cargo check --workspace` — clean
  • `cargo check -p weaver-interface --features inference,embedder-rust` — clean
  • `cargo test -p weaver-spu --lib -- test_validate_embedder` — 5 passed
  • CodeRabbit review clean
  • (manual, post-merge) `weaver serve` against the existing pin file at `/opt/weaver/state/embedder.pin.json` should still match identity (in-process EmbedderClient produces the same identity values per PR fix(spu): kernel restore + Jina V4 trained-context cap (PR-1.H) #297 / smoke evidence)

Known pre-existing failures (not introduced by this PR)

The 14 `weaver-core::engine::runtime::tests::spawn_*` failures (`ModelLoadFailed("Model 'test-model' is not loaded...")`) are unchanged from main. They require a live inference server with a `test-model` loaded — infra not provided by the test harness.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • In-process, snapshot-based embedder client is now the primary supported embedding backend.
  • Bug Fixes / Improvements

    • Startup now probes the in-process embedder and enforces snapshot/device configuration; improved startup guidance and tightened probing.
    • Installer and systemd units updated to remove an external embedder service.
  • Removed Features

    • Legacy gRPC/protobuf embedder APIs, generated bindings, standalone embedder service, related configs/tests, and legacy proto contracts.

Phase 1 cleanup. The in-process candle `EmbedderClient` (PR #294)
plus the daemon backend selection (PR #296) plus the herobench
trait refactor (PR #298) plus the candle peer-access fix (#299)
together made the Python `weaver-embedder.service` and its gRPC
client redundant. This PR removes them entirely.

## Removed

- `services/embedder/` — Python service tree (3.1 GB on disk
  including the .venv with PyTorch/transformers).
- `services/embedder/weaver-embedder.service` — systemd unit.
- `crates/weaver-spu/src/encoder/grpc_client_legacy.rs` (~600 LOC)
  and the matching test in `crates/weaver-spu/tests/`.
- `proto/persephone/{common,embedding,extraction,training}/*.proto`
  — Persephone proto definitions, no production callers.
- `crates/weaver-database/{build.rs,src/proto.rs,tests/proto_types.rs}`
  — proto codegen + the only test that referenced the proto types
  (proto-validation, no production caller).
- `crates/weaver-demo/src/bin/embedder_latency.rs` — gRPC-stack
  latency benchmark binary.
- `crates/weaver-demo/tests/similarity_calibration.rs` — gRPC
  similarity calibration test.
- Tonic stack from `weaver-spu/Cargo.toml` (`tonic`, `prost`,
  `prost-types`, `hyper-util`, `tower`, `tonic-build`).
- Tonic stack from `weaver-database/Cargo.toml` (same deps + the
  `tonic-build` build-dep).

## Simplified

- `weaver-spu/build.rs` — drops the Persephone proto compile step;
  CUDA-kernel compile path unchanged.
- `weaver-spu/src/encoder/mod.rs` — drops `grpc_client_legacy`
  module reference.
- `weaver-spu/src/lib.rs` — drops the `proto` module.
- `weaver-spu/src/decoder/multi_model.rs` — drops `EmbedderBackend`
  enum (only one backend now), drops `backend` field from
  `EmbedderConfig`. Validate() requires `[embedder].snapshot`
  unconditionally; the old Python-only checks (locked-GPU,
  non-empty socket, batch_size > 0, model_name non-empty)
  retired alongside the rules they enforced. Tests updated.
- `weaver-core/src/embedder.rs` — drops `From<tonic::transport::Error>`
  and `From<tonic::Status>` impls (no consumers now).
- `weaver-interface/src/serve.rs` — drops `probe_embedder_python`,
  `EmbedderBackend` match, and all gRPC client imports. Boot path
  is now: parse config → embedder probe (rust only) → pin verify
  → build server state → listen.
- `weaver-demo/src/herobench/benchmark.rs` — `try_construct_embedder`
  collapses to "in-process or None"; the gRPC fallback helper
  `try_construct_embedder_grpc` is gone.
- `weaver-demo/tests/herobench_integration.rs` — `try_connect_embedder`
  rewritten to construct an in-process `EmbedderClient`. The three
  `integration_dedup_*` tests + their helpers cfg-gated under
  `feature = "embedder-rust"` (they require a Jina V4 snapshot).
- `services/weaver-infer.service` — drops `Requires=`/`BindsTo=`/
  `After=weaver-embedder.service`; embedder is in-process now.
- `services/weaver.target` — drops `Wants=weaver-embedder.service`
  and `After=`.
- `scripts/install.sh` — drops the `weaver-embedder.service` install
  step; explicitly removes a stale unit if a previous install left
  it behind. Updates the operator-facing "next steps" message to
  reflect the two-unit chain.

## EmbedderConfig back-compat

The `socket`, `model_name`, `batch_size`, `use_fp16`,
`idle_timeout_seconds`, and `gpu` fields stay on `EmbedderConfig`
(deprecated but preserved) so existing `server.toml` files still
parse. They're no longer load-bearing — `snapshot` is the only
field the daemon actually reads.

## Validation

- `cargo check --workspace` — clean
- `cargo check -p weaver-interface --features inference,embedder-rust` — clean
- `cargo test -p weaver-spu --lib -- test_validate_embedder` — 5 passed (3 new + 2 sibling)

The 14 pre-existing `weaver-core::engine::runtime::tests::spawn_*`
failures are unchanged from main; they require a live inference
server with a "test-model" loaded and aren't introduced by this PR.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 8, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR removes the legacy external Python embedder service and Persephone gRPC infrastructure, consolidating all embedding operations onto an in-process Rust-backed client. The change eliminates tonic/protobuf dependencies across multiple crates, deletes the services/embedder directory entirely, removes the gRPC client implementation, and rewires all consumers to use snapshot-based embedder construction.

Changes

Retire Legacy gRPC Embedder, Migrate to In-Process Rust Backend

Layer / File(s) Summary
Proto File Definitions Removal
proto/persephone/common/common.proto, proto/persephone/embedding/embedding.proto, proto/persephone/extraction/extraction.proto, proto/persephone/training/training.proto
Deletes all Persephone gRPC/protobuf schema definitions including ChunkingStrategy, ChunkMetadata, EmbeddingService, ExtractionService, and TrainingService.
Tonic/Protobuf Dependencies & Build Scripts
crates/weaver-database/Cargo.toml, crates/weaver-spu/Cargo.toml, crates/weaver-database/build.rs, crates/weaver-spu/build.rs, services/embedder/build_proto.sh
Removes tonic/prost/prost-types and tonic-build entries; removes proto compilation from Rust build scripts and the Python stub regeneration script.
Proto Module Exports & Tests Removal
crates/weaver-database/src/lib.rs, crates/weaver-database/src/proto.rs, crates/weaver-database/tests/proto_types.rs, crates/weaver-spu/src/lib.rs
Removes public proto module declarations/re-exports and deletes tests that referenced generated protobuf types.
weaver-core: Tonic Error Conversion Removal
crates/weaver-core/src/embedder.rs
Removes From<tonic::transport::Error> and From<tonic::Status> trait implementations for EmbeddingError.
Legacy gRPC Client Implementation Removal
crates/weaver-spu/src/encoder/grpc_client_legacy.rs
Removes tonic-based EmbeddingClient implementation and its Embedder trait impl, config types, and public APIs.
Legacy gRPC Client Tests Removal
crates/weaver-spu/tests/grpc_client_legacy.rs
Deletes the comprehensive test suite including mock server fixtures and transport/error-mapping tests.
Encoder Module Cleanup
crates/weaver-spu/src/encoder/mod.rs
Removes pub mod grpc_client_legacy; export and adds retirement documentation.
weaver-spu: build & decoder changes
crates/weaver-spu/Cargo.toml, crates/weaver-spu/build.rs, crates/weaver-spu/src/decoder/multi_model.rs
Updates dependencies/build-deps and simplifies [embedder] config to require snapshot, removing backend enum and GPU policy validation; updates unit tests.
Demo: In-process Embedder Migration & Test Gating
crates/weaver-demo/src/herobench/benchmark.rs, crates/weaver-demo/tests/herobench_integration.rs
Rewrites embedder construction to use only in-process Rust EmbedderClient loaded from WEAVER_SPU_JINA_V4_SNAPSHOT; removes gRPC fallback; gates integration tests on embedder-rust.
Obsolete Benchmark and Test Removal
crates/weaver-demo/src/bin/embedder_latency.rs, crates/weaver-demo/tests/similarity_calibration.rs
Removes the latency benchmark binary and the live similarity calibration test.
Python Embedder Service Deletion
services/embedder/*
Removes entire embedder service directory: configuration, embedder implementations, HTTP/gRPC services, generated Python protos, tests, requirements, and systemd unit file.
Interface: serve.rs probe changes
crates/weaver-interface/src/serve.rs
Rewires startup to probe the in-process Rust embedder only (probe_embedder_rust), removes Python probe helper, and updates startup messages.
Installer & Systemd Unit Updates
scripts/install.sh, services/weaver-infer.service, services/weaver.target
Installer stops installing weaver-embedder.service, removes stale unit files, and systemd units no longer order or require the separate embedder service.
Misc / Small Files
services/embedder/tests/conftest.py, services/embedder/requirements.txt, module headers
Removes test path hack, requirements, and minor module header comments related to the retired embedder service.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

  • toddwbucy/WeaverTools#50: Overlaps on proto/build/Cargo items; both modify proto generation and workspace proto handling.
  • toddwbucy/WeaverTools#296: Related changes to embedder backend selection and probe flow; this PR removes Python/gRPC paths that #296 previously addressed.
  • toddwbucy/WeaverTools#294: Related to transport error mapping and in-process embedder error conversions; both touch EmbeddingError conversion surfaces.

Poem

🐇 Proto files rust away,
Legacy services fade to gray,
In-process embeddings bloom so bright,
Snapshots load with steady might,
One embedder, streamlined way!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: retire Python embedder + gRPC backend (PR-1.J)' clearly and specifically describes the main change: removing the Python embedder service and its gRPC backend infrastructure.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/weaver-interface/src/serve.rs (2)

77-82: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale error message references retired service.

The error message at line 80-81 tells operators to "Bring weaver-embedder.service up first" but this PR retires that service. Since the embedder is now in-process, this guidance is no longer applicable.

Suggested fix
                 Ok(_) => bail!(
-                    "Refuse to start: embedder probe failed and a pin exists at {}. \
-                     The cohort identity cannot be verified without the embedder. \
-                     Bring weaver-embedder.service up first, then retry.",
+                    "Refuse to start: embedder probe failed and a pin exists at {}. \
+                     The cohort identity cannot be verified without the embedder. \
+                     Check that [embedder].snapshot points to a valid Jina V4 HF \
+                     snapshot directory and that the GPU is available.",
                     pin_path.display()
                 ),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/weaver-interface/src/serve.rs` around lines 77 - 82, The error message
in the bail! call in serve.rs still tells operators to "Bring
weaver-embedder.service up first" even though the embedder is now in-process;
update the bail! message (the bail!(...) invocation) to reference the new
reality—e.g., instruct to restart or start the current Weaver process with the
embedder enabled or remove the pin—so operators know to restart the main Weaver
service (or clear the pin) rather than trying to start a retired
weaver-embedder.service. Ensure you edit the string inside the bail! invocation
in serve.rs accordingly.

158-167: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale error message suggests retired Python backend.

The error message at lines 162-166 suggests using backend = \"python\" as an alternative, but the Python embedder backend was retired in this PR. The message should be updated to reflect that the embedder-rust feature is now required.

Suggested fix
     #[cfg(not(feature = "embedder-rust"))]
     {
         let _ = embedder_config;
         bail!(
-            "[embedder].backend = \"rust\" requires the daemon to be built with \
+            "[embedder] requires the daemon to be built with \
              the `embedder-rust` feature; rebuild with \
-             `cargo build -p weaver-interface --features inference,embedder-rust` \
-             (or omit `embedder-rust` and use `backend = \"python\"`)"
+             `cargo build -p weaver-interface --features inference,embedder-rust`"
         );
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/weaver-interface/src/serve.rs` around lines 158 - 167, Update the
stale bail! error message in serve.rs (the block gated by #[cfg(not(feature =
"embedder-rust"))] that references embedder_config) to remove the suggestion to
use backend = "python" and clearly state that the daemon must be built with the
embedder-rust feature; keep the rebuild command (cargo build -p weaver-interface
--features inference,embedder-rust) and ensure the message only instructs to
rebuild with embedder-rust rather than offering the retired Python backend as an
alternative.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/weaver-demo/tests/herobench_integration.rs`:
- Around line 2297-2337: The helper try_connect_embedder currently swallows all
init errors and returns None; change it to return
Result<Option<weaver_spu::encoder::client::EmbedderClient>, anyhow::Error> so
that it only returns Ok(None) when WEAVER_SPU_JINA_V4_SNAPSHOT is unset
(preserve the early-return skip), but on any other failure (invalid
WEAVER_SPU_CUDA_DEVICE, CUDA init, EmbedderClient::from_snapshot, task panic)
return Err(...) with the underlying error instead of printing and returning
None; update callers to call try_connect_embedder().await and assert/expect on
Err to fail the test, only early-return on Ok(None). Ensure references to
WEAVER_SPU_JINA_V4_SNAPSHOT, WEAVER_SPU_CUDA_DEVICE,
candle_core::Device::new_cuda, and EmbedderClient::from_snapshot are used to
locate the code to change.

---

Outside diff comments:
In `@crates/weaver-interface/src/serve.rs`:
- Around line 77-82: The error message in the bail! call in serve.rs still tells
operators to "Bring weaver-embedder.service up first" even though the embedder
is now in-process; update the bail! message (the bail!(...) invocation) to
reference the new reality—e.g., instruct to restart or start the current Weaver
process with the embedder enabled or remove the pin—so operators know to restart
the main Weaver service (or clear the pin) rather than trying to start a retired
weaver-embedder.service. Ensure you edit the string inside the bail! invocation
in serve.rs accordingly.
- Around line 158-167: Update the stale bail! error message in serve.rs (the
block gated by #[cfg(not(feature = "embedder-rust"))] that references
embedder_config) to remove the suggestion to use backend = "python" and clearly
state that the daemon must be built with the embedder-rust feature; keep the
rebuild command (cargo build -p weaver-interface --features
inference,embedder-rust) and ensure the message only instructs to rebuild with
embedder-rust rather than offering the retired Python backend as an alternative.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 631daef0-e441-4fa0-b262-e4e2d96cead0

📥 Commits

Reviewing files that changed from the base of the PR and between 043ac10 and 3d3727a.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (46)
  • crates/weaver-core/src/embedder.rs
  • crates/weaver-database/Cargo.toml
  • crates/weaver-database/build.rs
  • crates/weaver-database/src/lib.rs
  • crates/weaver-database/src/proto.rs
  • crates/weaver-database/tests/proto_types.rs
  • crates/weaver-demo/src/bin/embedder_latency.rs
  • crates/weaver-demo/src/herobench/benchmark.rs
  • crates/weaver-demo/tests/herobench_integration.rs
  • crates/weaver-demo/tests/similarity_calibration.rs
  • crates/weaver-interface/src/serve.rs
  • crates/weaver-spu/Cargo.toml
  • crates/weaver-spu/build.rs
  • crates/weaver-spu/src/decoder/multi_model.rs
  • crates/weaver-spu/src/encoder/grpc_client_legacy.rs
  • crates/weaver-spu/src/encoder/mod.rs
  • crates/weaver-spu/src/lib.rs
  • crates/weaver-spu/tests/grpc_client_legacy.rs
  • proto/persephone/common/common.proto
  • proto/persephone/embedding/embedding.proto
  • proto/persephone/extraction/extraction.proto
  • proto/persephone/training/training.proto
  • scripts/install.sh
  • services/embedder/build_proto.sh
  • services/embedder/core/__init__.py
  • services/embedder/core/cli/__init__.py
  • services/embedder/core/cli/config.py
  • services/embedder/core/config/weaver.yaml
  • services/embedder/core/embedders/__init__.py
  • services/embedder/core/embedders/embedders_base.py
  • services/embedder/core/embedders/embedders_jina.py
  • services/embedder/core/proto/__init__.py
  • services/embedder/core/proto/persephone/__init__.py
  • services/embedder/core/proto/persephone/embedding/__init__.py
  • services/embedder/core/proto/persephone/embedding/embedding_pb2.py
  • services/embedder/core/proto/persephone/embedding/embedding_pb2_grpc.py
  • services/embedder/core/services/__init__.py
  • services/embedder/core/services/embedder_grpc.py
  • services/embedder/core/services/embedder_service.py
  • services/embedder/requirements.txt
  • services/embedder/tests/__init__.py
  • services/embedder/tests/conftest.py
  • services/embedder/tests/test_embedder_grpc.py
  • services/embedder/weaver-embedder.service
  • services/weaver-infer.service
  • services/weaver.target
💤 Files with no reviewable changes (29)
  • services/embedder/core/config/weaver.yaml
  • crates/weaver-database/src/proto.rs
  • services/embedder/core/init.py
  • services/embedder/requirements.txt
  • services/embedder/weaver-embedder.service
  • proto/persephone/training/training.proto
  • services/embedder/core/proto/persephone/embedding/embedding_pb2_grpc.py
  • services/embedder/tests/conftest.py
  • crates/weaver-database/build.rs
  • crates/weaver-database/src/lib.rs
  • crates/weaver-spu/tests/grpc_client_legacy.rs
  • services/embedder/core/services/embedder_service.py
  • proto/persephone/common/common.proto
  • crates/weaver-spu/src/lib.rs
  • services/embedder/core/services/embedder_grpc.py
  • proto/persephone/extraction/extraction.proto
  • services/embedder/core/embedders/embedders_base.py
  • services/embedder/build_proto.sh
  • services/embedder/tests/test_embedder_grpc.py
  • crates/weaver-demo/src/bin/embedder_latency.rs
  • crates/weaver-demo/tests/similarity_calibration.rs
  • crates/weaver-core/src/embedder.rs
  • crates/weaver-database/Cargo.toml
  • services/embedder/core/cli/config.py
  • services/embedder/core/embedders/embedders_jina.py
  • proto/persephone/embedding/embedding.proto
  • crates/weaver-database/tests/proto_types.rs
  • services/embedder/core/proto/persephone/embedding/embedding_pb2.py
  • crates/weaver-spu/src/encoder/grpc_client_legacy.rs

Comment thread crates/weaver-demo/tests/herobench_integration.rs
…300 review)

CR round 1 on PR #300 — three valid stale-text findings, all fixed.

## 1. try_connect_embedder swallowed errors (silent test skip)

Returned `Option<EmbedderClient>` and printed-and-returned-None on
any failure (env parse, CUDA init, snapshot construct, task panic).
Net effect: a broken snapshot or busted GPU would make the
integration tests pass quietly with their work skipped. Bad signal
hygiene.

Now returns `Result<Option<EmbedderClient>>`:
  - `Ok(None)` only when `WEAVER_SPU_JINA_V4_SNAPSHOT` is unset —
    the legitimate "no test infrastructure available" skip.
  - `Err(...)` on every other failure path (env parse error, CUDA
    init failure, `EmbedderClient::from_snapshot` failure, task
    panic). Callers `.expect()` on the Result so the test fails
    loudly instead of skipping silently.

The three `integration_dedup_*` callers were updated; if the env
var is set but anything downstream is busted, the test now panics
with the specific underlying error.

## 2. Stale "Bring weaver-embedder.service up" message in serve.rs

The pin-mismatch refusal in `handle_serve` told operators to start
a service that no longer exists. Updated to point at the actual
recovery actions: check snapshot path, GPU availability, daemon
build features. Added the canonical pin-clear command for the case
where the pin is intentionally stale (corpus re-embed planned).

## 3. Stale "use backend = python" message in probe_embedder_rust

The `cfg(not(feature = "embedder-rust"))` bail offered the Python
backend as a fallback. Python is gone post-PR-1.J. Updated to just
say "rebuild with the feature enabled" — no false alternative.

Validates: `cargo check -p weaver-interface --features inference,embedder-rust` and `cargo check -p weaver-demo --features embedder-rust --tests` both clean (no new warnings).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/weaver-interface/src/serve.rs (1)

53-56: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale comment references retired Python service.

Line 55 mentions "a wedged Python service" but this PR retires the Python embedder entirely. The comment should reflect that the only failure modes are now snapshot/CUDA/Rust embedder issues.

Suggested fix
     // Embedder probe + pin verify runs BEFORE model loading so the daemon
     // fails fast on a missing snapshot, busted CUDA init, or a wedged
-    // Python service rather than burning seconds on `build_server_state()`
+    // embedder rather than burning seconds on `build_server_state()`
     // first. (PR `#296` review.)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/weaver-interface/src/serve.rs` around lines 53 - 56, Update the stale
comment that begins "Embedder probe + pin verify..." to remove the reference to
"a wedged Python service" and instead state that failures can come from missing
snapshot, busted CUDA init, or the Rust embedder; keep the rest of the sentence
and the reference to build_server_state() intact so readers know why the probe
runs before build_server_state().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/weaver-interface/src/serve.rs`:
- Around line 53-56: Update the stale comment that begins "Embedder probe + pin
verify..." to remove the reference to "a wedged Python service" and instead
state that failures can come from missing snapshot, busted CUDA init, or the
Rust embedder; keep the rest of the sentence and the reference to
build_server_state() intact so readers know why the probe runs before
build_server_state().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 76a07135-b8bc-4544-bede-330d70491f2e

📥 Commits

Reviewing files that changed from the base of the PR and between 3d3727a and 6ecc293.

📒 Files selected for processing (2)
  • crates/weaver-demo/tests/herobench_integration.rs
  • crates/weaver-interface/src/serve.rs

…#300 review)

The comment justifying why the embedder probe runs before `build_server_state()` still listed "a wedged Python service" as a failure mode the early probe protects against. Python service retired in PR-1.J; the failure modes are now: missing snapshot, busted CUDA init, or Rust EmbedderClient construction failure. Comment text updated, structure preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/weaver-interface/src/serve.rs (1)

178-190: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Error messages reference non-existent [embedder].backend config field.

The EmbedderConfig struct in weaver-spu/src/decoder/multi_model.rs (lines 120–168) does not have a backend field. However, these error messages reference [embedder].backend = "rust", which will confuse operators looking for a configuration option that doesn't exist.

In the embedder-rust feature context, Rust is the only embedder backend. The error messages should be updated to reflect this and remove the reference to the nonexistent field:

Proposed fix
         let embedder_config = embedder_config.ok_or_else(|| {
             anyhow::anyhow!(
-                "[embedder].backend = \"rust\" but no [embedder] block in server.toml; \
+                "No [embedder] block in server.toml; \
                  add the block with `snapshot = \"/path/to/jina-v4/snapshot/<sha>/\"`"
             )
         })?;
         let snapshot = embedder_config.snapshot.as_ref().ok_or_else(|| {
             anyhow::anyhow!(
-                "[embedder].backend = \"rust\" requires [embedder].snapshot to point at \
+                "[embedder].snapshot is required and must point at \
                  the Jina V4 HF snapshot directory (e.g. \
                  `snapshot = \"/opt/weaver/huggingface/.../snapshots/<sha>/\"`)"
             )
         })?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/weaver-interface/src/serve.rs` around lines 178 - 190, Update the
error messages that currently reference a non-existent `[embedder].backend`
field: change the first anyhow!() (constructed for embedder_config) to refer
only to a missing `[embedder]` block and instruct operators to add `snapshot =
"/path/to/jina-v4/snapshot/<sha>/"`; change the second anyhow!() (for snapshot)
to drop the `backend = "rust"` mention and state that `[embedder].snapshot` must
point to the Jina V4 HF snapshot directory (e.g. `snapshot =
"/opt/weaver/huggingface/.../snapshots/<sha>/"`); modify the messages where
embedder_config and snapshot are created to remove any reference to
`[embedder].backend`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/weaver-interface/src/serve.rs`:
- Around line 178-190: Update the error messages that currently reference a
non-existent `[embedder].backend` field: change the first anyhow!() (constructed
for embedder_config) to refer only to a missing `[embedder]` block and instruct
operators to add `snapshot = "/path/to/jina-v4/snapshot/<sha>/"`; change the
second anyhow!() (for snapshot) to drop the `backend = "rust"` mention and state
that `[embedder].snapshot` must point to the Jina V4 HF snapshot directory (e.g.
`snapshot = "/opt/weaver/huggingface/.../snapshots/<sha>/"`); modify the
messages where embedder_config and snapshot are created to remove any reference
to `[embedder].backend`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: deff92d2-0187-4212-9ddf-f8d5a62ff6b3

📥 Commits

Reviewing files that changed from the base of the PR and between 6ecc293 and 2eb3638.

📒 Files selected for processing (1)
  • crates/weaver-interface/src/serve.rs

…essages (PR #300 review)

Two anyhow! sites in `probe_embedder_rust` still referenced the
`[embedder].backend = "rust"` field that retired in PR-1.J along
with the `EmbedderBackend` enum. Updated to point at the actual
configuration the operator needs: presence of the `[embedder]` block
+ a snapshot path. Same recovery instructions, no false references
to removed fields.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@toddwbucy toddwbucy merged commit f44d4ac into main May 8, 2026
@toddwbucy toddwbucy deleted the feat/retire-python-embedder branch May 8, 2026 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants