refactor(spu): delete weaver-inference + weaver-embedding shell crates (PR-0.5.E)#278
Conversation
…s (PR-0.5.E)
Per docs/specs/weaver-spu-Spec.md §10 PR-0.5.E. Final Phase 0.5 PR.
Removes the deprecated re-export shells and consolidates the
workspace from 9 → 7 crates per spec §6.
What's deleted:
- crates/weaver-inference/ (entire directory; was a 17-export
deprecated shell after PR-0.5.C)
- crates/weaver-embedding/ (entire directory; was a deprecated
shell after PR-0.5.B + PR-0.5.D)
Workspace Cargo.toml:
- Members list contracted from 9 to 7.
- weaver-embedding + weaver-inference workspace deps removed
(replaced by a comment pointing at the consolidation history).
Consumer-side import rewrites (bulk sed across non-shell crates):
- weaver_inference::backend → weaver_spu::decoder::llama_cpp_backend
- weaver_inference::{engine,family,gguf,multi_model,model,
model_profile,config,download,runtime_config,
startup,suggest,server}
→ weaver_spu::decoder::*
- weaver_inference::{vram,nvlink,probe,gpu_orchestrator,gpu}
→ weaver_spu::core::*
- weaver_embedding::{Embedder,EmbedderInfo}
→ weaver_core::embedder::*
- weaver_embedding::{late_chunk,LateChunkConfig,LateChunkResult,
late_chunk_embeddings}
→ weaver_spu::encoder::*
- weaver_embedding::pin → weaver_spu::core::pin
- weaver_embedding::grpc_client → weaver_spu::encoder::grpc_client_legacy
- weaver_embedding::gguf_backend → weaver_spu::encoder::gguf_backend
- weaver_embedding::proto → weaver_spu::proto
Cargo.toml updates:
- weaver-interface: dropped optional `weaver-inference` + direct
`weaver-embedding` deps; switched to `weaver-spu` dep with
`inference` feature now forwarding `weaver-spu/cuda` +
`weaver-spu/gguf`.
- weaver-demo: dropped `weaver-embedding` dep, added `weaver-spu`.
Out-of-scope clippy suppression (pre-existing in moved code):
- crates/weaver-spu/src/decoder/server/prompt.rs +
crates/weaver-demo/src/bin/embedder_latency.rs gain
`#![allow(elided_lifetimes_in_paths)]` at file scope. The
hidden-lifetime patterns date back to weaver-inference's pre-deny
state; fixing them is a future cleanup PR. One bounded clippy fix
in embedder_latency.rs (.iter().nth(2) → .get(2)) was also made
inline since it was a one-line change.
Acceptance criteria from spec §10 PR-0.5.E:
- cargo build --workspace clean ✓
- cargo test --workspace clean (all tests pass — 588 weaver-core,
110 weaver-database, plus the moved weaver-spu integration tests
for grpc_client_legacy and mutex_resilience) ✓
- cargo clippy --workspace --all-targets -- -D warnings clean ✓
- Workspace member count = 7 (verified via cargo metadata):
weaver-analysis, weaver-core, weaver-database, weaver-demo,
weaver-interface, weaver-spu, weaver-trace ✓
Phase 0.5 (crate consolidation) is complete. Phase 1 (encoder
candle port) commits next per spec §10 sequencing.
Dependencies: PR-0.5.A, PR-0.5.B, PR-0.5.C, PR-0.5.D (all merged).
Companion upstream PR: none.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR removes the deprecated weaver-embedding and weaver-inference crate shells, folds consumers to weaver_spu, updates workspace members/dependencies, migrates import paths across crates, and adds a new embedder latency benchmark binary. ChangesWorkspace & Crate Consolidation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/weaver-spu/src/encoder/gguf_backend.rs (1)
237-247:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale test-module comment still describes removed crates
Lines 237-245 still document
weaver_inference/weaver_embeddingdependency flow, but Line 247 now imports fromweaver_spu::decoder::gguf. Please update or remove that block to avoid misleading future edits.Suggested comment update
- // Tests load a real GGUF via `weaver_inference`'s decoder - // model-loader because that's where `init_backend` / `load_model` - // / `GgufModelParams` live (per the post-Phase-0 ontology: - // model-runtime primitives stay in `weaver_inference`; - // embedding-pipeline composition lives here in - // `weaver_embedding`). `weaver-inference` is therefore a - // dev-dependency of `weaver-embedding`. A later PR (Phase 1 - // — the `EmbedderClient` impl) makes it a regular dependency - // when the production embedder code calls `load_model` itself. + // Tests load a real GGUF via `weaver_spu::decoder::gguf`, + // which exposes `init_backend`, `load_model`, and `GgufModelParams`.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/weaver-spu/src/encoder/gguf_backend.rs` around lines 237 - 247, The block comment describing `weaver_inference`/`weaver_embedding` dependency flow is stale now that the code imports directly from weaver_spu (see the `use std::sync::Mutex;` and `use weaver_spu::decoder::gguf::{GgufModelParams, init_backend, load_model};` lines); remove or replace the paragraph with a brief, accurate note explaining current dependency/usage (or delete it entirely) so it no longer references removed crates (`weaver_inference`/`weaver_embedding`) and reflects that `GgufModelParams`, `init_backend`, and `load_model` are imported from `weaver_spu::decoder::gguf`.crates/weaver-interface/src/lib.rs (1)
9-13:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUnify the pin-path migration comment block.
Line 13 points to
weaver_spu::core::pin, but Lines 9-12 still referenceweaver-embedding::pin. Keep one canonical path to avoid confusion.💡 Suggested comment cleanup
-// `embedder_pin` moved to `weaver-embedding::pin` per -// `embedder-oxidization-Spec.md` §5.1 (issue `#166` / sprint -// Block A.4). Cohort-pin is an embedding-substrate concern, not -// a CLI concern; it lives alongside the loader paths it guards. -// External consumers import from `weaver_spu::core::pin` now. +// `embedder_pin` moved to `weaver_spu::core::pin` per +// `embedder-oxidization-Spec.md` §5.1 (issue `#166` / sprint +// Block A.4). Cohort-pin is an embedding-substrate concern, not +// a CLI concern; it lives alongside the loader paths it guards. +// External consumers import from `weaver_spu::core::pin`.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/weaver-interface/src/lib.rs` around lines 9 - 13, Update the migration comment to use one canonical import path: replace the reference to weaver-embedding::pin with the canonical weaver_spu::core::pin mentioned on line 13 and remove the duplicate/contradictory phrasing around embedder_pin; keep the note that embedder_pin moved per the Spec and that cohort-pin is an embedding-substrate concern, but reference only weaver_spu::core::pin (mentioning embedder_pin, weaver-embedding::pin, and weaver_spu::core::pin to locate the text).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/weaver-core/src/embedder.rs`:
- Around line 154-156: Update the doc comments in this file to remove references
to the removed weaver-embedding crate and point them to the new symbol
weaver_spu::encoder::grpc_client_legacy; specifically search for any lingering
"weaver-embedding" paths (noted around the earlier comments at the previous Line
10, Line 42, Line 110) and replace them with
weaver_spu::encoder::grpc_client_legacy so the two bulleted items (and any other
mentions) consistently refer to the new gRPC client symbol rather than the
deleted crate.
---
Outside diff comments:
In `@crates/weaver-interface/src/lib.rs`:
- Around line 9-13: Update the migration comment to use one canonical import
path: replace the reference to weaver-embedding::pin with the canonical
weaver_spu::core::pin mentioned on line 13 and remove the
duplicate/contradictory phrasing around embedder_pin; keep the note that
embedder_pin moved per the Spec and that cohort-pin is an embedding-substrate
concern, but reference only weaver_spu::core::pin (mentioning embedder_pin,
weaver-embedding::pin, and weaver_spu::core::pin to locate the text).
In `@crates/weaver-spu/src/encoder/gguf_backend.rs`:
- Around line 237-247: The block comment describing
`weaver_inference`/`weaver_embedding` dependency flow is stale now that the code
imports directly from weaver_spu (see the `use std::sync::Mutex;` and `use
weaver_spu::decoder::gguf::{GgufModelParams, init_backend, load_model};` lines);
remove or replace the paragraph with a brief, accurate note explaining current
dependency/usage (or delete it entirely) so it no longer references removed
crates (`weaver_inference`/`weaver_embedding`) and reflects that
`GgufModelParams`, `init_backend`, and `load_model` are imported from
`weaver_spu::decoder::gguf`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 97ed7667-a93a-4ddf-88af-3c1eb5ae6466
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (24)
Cargo.tomlcrates/weaver-core/src/embedder.rscrates/weaver-database/src/chunking/mod.rscrates/weaver-database/src/lib.rscrates/weaver-demo/Cargo.tomlcrates/weaver-demo/src/bin/embedder_latency.rscrates/weaver-demo/src/herobench/belief.rscrates/weaver-demo/src/herobench/benchmark.rscrates/weaver-demo/tests/herobench_integration.rscrates/weaver-demo/tests/similarity_calibration.rscrates/weaver-embedding/Cargo.tomlcrates/weaver-embedding/src/lib.rscrates/weaver-inference/Cargo.tomlcrates/weaver-inference/examples/inference.tomlcrates/weaver-inference/kernels/transformer.cucrates/weaver-inference/src/lib.rscrates/weaver-interface/Cargo.tomlcrates/weaver-interface/src/harness.rscrates/weaver-interface/src/lib.rscrates/weaver-interface/src/model.rscrates/weaver-interface/src/serve.rscrates/weaver-interface/src/toml_edit.rscrates/weaver-spu/src/decoder/server/prompt.rscrates/weaver-spu/src/encoder/gguf_backend.rs
💤 Files with no reviewable changes (6)
- crates/weaver-embedding/Cargo.toml
- crates/weaver-inference/Cargo.toml
- crates/weaver-embedding/src/lib.rs
- crates/weaver-inference/examples/inference.toml
- crates/weaver-inference/kernels/transformer.cu
- crates/weaver-inference/src/lib.rs
…g doc refs Three CR findings on PR #278 commit 1d36b3c, all valid (doc-only): 1. weaver-core/src/embedder.rs (lines 10, 42, 110, 133, 138): five mentions of `weaver-embedding::grpc_client` / `weaver-embedding::grpc_client::EmbeddingClient` / "gRPC client in weaver-embedding" rewritten to point at `weaver_spu::encoder::grpc_client_legacy` (the post-PR-0.5.D home). The "tonic dep removed when last consumer goes away" note now correctly names PR-3.A. 2. weaver-interface/src/lib.rs (lines 9-13): the embedder_pin migration comment had two stale destinations (weaver-embedding::pin and weaver_spu::core::pin). Reduced to the canonical weaver_spu::core::pin path with a one-line reference to PR-0.5.D as the relocation event. 3. weaver-spu/src/encoder/gguf_backend.rs tests block comment: the paragraph described the test cross-crate dev-dep wiring to weaver_inference. Now that both halves live in weaver_spu, the import is a sibling-module use; comment replaced with a brief accurate note pointing at weaver_spu::decoder::gguf. Validated: cargo check --workspace clean.
Summary
Per
docs/specs/weaver-spu-Spec.md§10 PR-0.5.E. Final Phase 0.5 PR. Removes the deprecated re-export shells and consolidates the workspace from 9 → 7 crates per spec §6.What's deleted
crates/weaver-inference/(entire directory; was a 17-export deprecated shell after PR-0.5.C)crates/weaver-embedding/(entire directory; was a deprecated shell after PR-0.5.B + PR-0.5.D)Workspace Cargo.toml
Members list contracted from 9 → 7. Workspace deps removed (replaced with a comment pointing at the consolidation history).
Consumer-side import rewrites (bulk sed)
weaver_inference::backendweaver_spu::decoder::llama_cpp_backendweaver_inference::{engine,family,gguf,multi_model,model,model_profile,config,download,runtime_config,startup,suggest,server}weaver_spu::decoder::*weaver_inference::{vram,nvlink,probe,gpu_orchestrator,gpu}weaver_spu::core::*weaver_embedding::{Embedder,EmbedderInfo}weaver_core::embedder::*weaver_embedding::{late_chunk,LateChunkConfig,LateChunkResult,late_chunk_embeddings}weaver_spu::encoder::*weaver_embedding::pinweaver_spu::core::pinweaver_embedding::grpc_clientweaver_spu::encoder::grpc_client_legacyweaver_embedding::gguf_backendweaver_spu::encoder::gguf_backendweaver_embedding::protoweaver_spu::protoCargo.toml updates
weaver-interface: dropped optionalweaver-inference+ directweaver-embeddingdeps; switched toweaver-spudep withinferencefeature now forwardingweaver-spu/cuda+weaver-spu/gguf.weaver-demo: droppedweaver-embeddingdep, addedweaver-spu.Out-of-scope clippy suppression
weaver-spu/src/decoder/server/prompt.rs+weaver-demo/src/bin/embedder_latency.rsgain#![allow(elided_lifetimes_in_paths)]at file scope. The hidden-lifetime patterns are pre-existing in moved code; fixing them is a future cleanup PR. One bounded clippy fix inembedder_latency.rs(.iter().nth(2)→.get(2)) was also made inline.Test plan
cargo build --workspacecleancargo test --workspaceclean — all tests pass (588 weaver-core, 110 weaver-database, plus moved weaver-spu integration testsgrpc_client_legacyandmutex_resilience)cargo clippy --workspace --all-targets -- -D warningscleancargo metadata):weaver-analysis, weaver-core, weaver-database, weaver-demo, weaver-interface, weaver-spu, weaver-traceSequencing
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Refactor
Documentation