Refactor to use Merkle{Packer,Unpacker} traits throughout#506
Refactor to use Merkle{Packer,Unpacker} traits throughout#506malcolmgreaves wants to merge 1 commit intomg/merkle_pack_implsfrom
Conversation
📝 WalkthroughSummary by CodeRabbit
WalkthroughStreaming-based merkle node transport replaces in-memory tar+gzip buffers for uploads and downloads. Client-side Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant PackTask as Pack (spawn_blocking)
participant Duplex as DuplexWriter
participant HTTP as HTTP POST
participant Server
participant UnpackTask as Unpack (spawn_blocking)
participant Merkle as MerkleStore
Client->>PackTask: request pack_nodes / pack_all
PackTask->>Duplex: write tar.gz bytes into duplex writer
Duplex->>HTTP: stream body chunks (ReaderStream)
HTTP->>Server: POST streamed tar.gz
Server-->>HTTP: response (streamed body)
HTTP->>UnpackTask: bridge response to sync reader
UnpackTask->>Merkle: merkle_store().unpack(reader, UnpackOptions)
Merkle-->>UnpackTask: unpack result (hashes / error)
UnpackTask-->>Client: return result or propagate error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 2/5 reviews remaining, refill in 26 minutes and 11 seconds. Comment |
|
STACKED PR: Do not merge until #504 has been merged. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/lib/src/core/db/merkle_node/file_backend.rs (1)
349-373:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winReject absolute and prefixed tar paths before
join.This guard only blocks
... An entry like/tmp/evil(or a Windows drive/prefix path) will causejointo drop the.oxen/...prefix and write outside the repo. The laterextract_hash_from_entry_pathcheck happens afterfile.unpack, so the escape already occurred.Suggested fix
let mut file = entry.map_err(MerkleDbError::CannotReadMerkle)?; let path = file.path()?.into_owned(); - // Path-traversal guard: refuse any entry whose path resolves above its container. - if path.components().any(|c| matches!(c, Component::ParentDir)) { + // Path-traversal guard: refuse any entry that can escape its container. + if path.components().any(|c| { + matches!( + c, + Component::ParentDir | Component::RootDir | Component::Prefix(_) + ) + }) || path.is_absolute() + { return Err(MerkleDbError::PathTraversal(path.display().to_string())); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/core/db/merkle_node/file_backend.rs` around lines 349 - 373, Reject absolute/prefixed paths before performing any join/unpack: in the block that currently inspects path (the local variable `path` in file_backend.rs) add a guard that returns `MerkleDbError::PathTraversal` if `path.is_absolute()` or if any component matches `Component::RootDir` or `Component::Prefix(_)` (to cover Windows drive prefixes), in addition to the existing ParentDir check; ensure this check is executed before computing `dst_path`, before `file.unpack` and before calling `extract_hash_from_entry_path`, so `oxen_hidden.join(...)` cannot be bypassed by absolute/prefixed entries (refer to `path`, `dst_path`, `tree_nodes_prefix`, `oxen_hidden`, and `extract_hash_from_entry_path`).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@crates/lib/src/core/db/merkle_node/file_backend.rs`:
- Around line 349-373: Reject absolute/prefixed paths before performing any
join/unpack: in the block that currently inspects path (the local variable
`path` in file_backend.rs) add a guard that returns
`MerkleDbError::PathTraversal` if `path.is_absolute()` or if any component
matches `Component::RootDir` or `Component::Prefix(_)` (to cover Windows drive
prefixes), in addition to the existing ParentDir check; ensure this check is
executed before computing `dst_path`, before `file.unpack` and before calling
`extract_hash_from_entry_path`, so `oxen_hidden.join(...)` cannot be bypassed by
absolute/prefixed entries (refer to `path`, `dst_path`, `tree_nodes_prefix`,
`oxen_hidden`, and `extract_hash_from_entry_path`).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7131f0d1-9ef9-4c07-bb2b-441382028cb4
📒 Files selected for processing (9)
crates/lib/src/api/client/tree.rscrates/lib/src/core/db/merkle_node/file_backend.rscrates/lib/src/core/progress/push_progress.rscrates/lib/src/core/progress/sync_progress.rscrates/lib/src/model/merkle_tree.rscrates/lib/src/model/merkle_tree/merkle_transport.rscrates/lib/src/model/repository/local_repository.rscrates/lib/src/model/repository/local_repository/merkle_store_dispatch.rscrates/lib/src/repositories/tree.rs
💤 Files with no reviewable changes (1)
- crates/lib/src/model/merkle_tree/merkle_transport.rs
f7406e4 to
7d78e5c
Compare
Refactors oxen's Merkle tree node transport between clients and servers to use the new `MerklePacker` and `MerkleUnpacker` traits. The `LocalRepository::merkle_store()` method has been updated to return a `impl TransportableMerkleStore + '_`. And the private `merkle_store_dispatch::StoreEnum` has been updated to include dispatch to the `MerkleTransport`'s methods.
8a84c72 to
91b63cb
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/lib/src/core/db/merkle_node/file_backend.rs (1)
355-373:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winReject absolute and prefixed tar paths here.
The new guard only blocks
... An entry like/tmp/pwnedorC:\temp\pwnedstill passes, andPathBuf::joinwill ignoreoxen_hidden, sofile.unpack()can write outside the repo before the later structure check runs.Suggested hardening
- // Path-traversal guard: refuse any entry whose path resolves above its container. - if path.components().any(|c| matches!(c, Component::ParentDir)) { + // Path-traversal guard: refuse any entry that can escape its container. + if path.components().any(|c| { + matches!( + c, + Component::ParentDir | Component::RootDir | Component::Prefix(_) + ) + }) { return Err(MerkleDbError::PathTraversal(path.display().to_string())); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/core/db/merkle_node/file_backend.rs` around lines 355 - 373, The current path-traversal guard only rejects ParentDir (..), but absolute or prefixed archive paths (e.g. starting with Component::RootDir or Component::Prefix like "C:") still pass and will cause PathBuf::join to ignore oxen_hidden; update the validation in the unpack path by rejecting any path whose components contain Component::RootDir or Component::Prefix (in addition to Component::ParentDir) or where path.is_absolute() is true, returning MerkleDbError::PathTraversal (or same error type) for those cases before computing dst_path; keep the existing checks for entry_type and the existing dst_path logic otherwise so server-style tree_nodes_prefix handling remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@crates/lib/src/core/db/merkle_node/file_backend.rs`:
- Around line 355-373: The current path-traversal guard only rejects ParentDir
(..), but absolute or prefixed archive paths (e.g. starting with
Component::RootDir or Component::Prefix like "C:") still pass and will cause
PathBuf::join to ignore oxen_hidden; update the validation in the unpack path by
rejecting any path whose components contain Component::RootDir or
Component::Prefix (in addition to Component::ParentDir) or where
path.is_absolute() is true, returning MerkleDbError::PathTraversal (or same
error type) for those cases before computing dst_path; keep the existing checks
for entry_type and the existing dst_path logic otherwise so server-style
tree_nodes_prefix handling remains unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7a98c6c0-1797-4a1f-bddc-d2f99e356f0d
📒 Files selected for processing (9)
crates/lib/src/api/client/tree.rscrates/lib/src/core/db/merkle_node/file_backend.rscrates/lib/src/core/progress/push_progress.rscrates/lib/src/core/progress/sync_progress.rscrates/lib/src/model/merkle_tree.rscrates/lib/src/model/merkle_tree/merkle_transport.rscrates/lib/src/model/repository/local_repository.rscrates/lib/src/model/repository/local_repository/merkle_store_dispatch.rscrates/lib/src/repositories/tree.rs
💤 Files with no reviewable changes (1)
- crates/lib/src/model/merkle_tree/merkle_transport.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/lib/src/model/merkle_tree.rs
Refactors oxen's Merkle tree node transport between clients
and servers to use the new
MerklePackerandMerkleUnpackertraits. The
LocalRepository::merkle_store()method has beenupdated to return a
impl TransportableMerkleStore + '_. Andthe private
merkle_store_dispatch::StoreEnumhas been updatedto include dispatch to the
MerkleTransport's methods.