MerkleTransport trait for client <> server#514
MerkleTransport trait for client <> server#514malcolmgreaves wants to merge 1 commit intomg/merkle_dyn_use_file_backendfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a merkle-tree wire-transport abstraction: new ChangesMerkle Tree Transport Abstraction
Sequence DiagramsequenceDiagram
participant Client as Client
participant Packer as MerklePacker
participant Transport as Stream
participant Unpacker as MerkleUnpacker
participant Store as MerkleStore
Client->>Packer: request pack_nodes(hashes, PackOptions)
Packer->>Transport: write tar-gz stream (nodes)
Transport->>Unpacker: provide stream
Unpacker->>Store: install nodes (UnpackOptions)
Unpacker-->>Client: return HashSet<MerkleHash>
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
291ef08 to
844e985
Compare
b38fcd2 to
c5c8270
Compare
844e985 to
460483e
Compare
c5c8270 to
f12af68
Compare
f9502a5 to
7ecfb7f
Compare
c65ffd7 to
5813fbf
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/lib/src/model/merkle_tree/merkle_transport.rs`:
- Around line 60-71: The trait doc for pack_nodes must require a deterministic,
stable ordering when emitting entries because hashes is a HashSet; update the
doc for fn pack_nodes (and mention PackOptions) to state implementations must
sort the provided HashSet<MerkleHash> into a defined order (e.g., lexicographic
on the MerkleHash bytes or hex) before writing to out: &mut dyn Write so the
tar-gz wire output is stable across runs; also update any existing
implementations to perform that sort when iterating and packing, and mention
OxenError remains the error type for failures.
- Around line 91-101: The unpack contract (fn unpack(&self, reader: &mut dyn
Read, opts: UnpackOptions) -> Result<HashSet<MerkleHash>, OxenError>) must
explicitly require archive-safety: update the doc comment to state that
implementations MUST reject or fail on unsafe tar entries (absolute paths, any
path containing .. that would escape the intended extraction root, and symlinks
that point outside the target store) and MUST sanitize relative paths and
enforce extraction into the store root; then update all implementations to
validate each tar header/path, canonicalize and check it remains within the
target store root, and return an Err(OxenError) when encountering an unsafe
entry rather than writing outside the store (reference UnpackOptions, MerkleHash
and OxenError in the doc so callers and implementers know the expected
behavior).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 668c28ee-ced8-4a14-9554-c98f5d0fa598
📒 Files selected for processing (2)
crates/lib/src/model/merkle_tree.rscrates/lib/src/model/merkle_tree/merkle_transport.rs
✅ Files skipped from review due to trivial changes (1)
- crates/lib/src/model/merkle_tree.rs
| /// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout | ||
| /// selected by `opts`. | ||
| /// | ||
| /// Hashes not present in the store are silently skipped, and an empty `hashes` | ||
| /// produces a valid but empty tarball. See [`PackOptions`] for the per-variant | ||
| /// wire-format details. | ||
| fn pack_nodes( | ||
| &self, | ||
| hashes: &HashSet<MerkleHash>, | ||
| opts: PackOptions, | ||
| out: &mut dyn Write, | ||
| ) -> Result<(), OxenError>; |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Specify deterministic packing order for stable wire output.
Because input is a HashSet, iteration order is non-deterministic unless implementations sort before emitting entries. Please codify deterministic ordering in the trait docs to avoid unstable tar-gz bytes across runs/processes.
Suggested doc-level clarification
/// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout
/// selected by `opts`.
+ ///
+ /// Ordering contract: output must be deterministic for the same logical input
+ /// set (e.g., sort hashes before emitting) rather than relying on `HashSet`
+ /// iteration order.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout | |
| /// selected by `opts`. | |
| /// | |
| /// Hashes not present in the store are silently skipped, and an empty `hashes` | |
| /// produces a valid but empty tarball. See [`PackOptions`] for the per-variant | |
| /// wire-format details. | |
| fn pack_nodes( | |
| &self, | |
| hashes: &HashSet<MerkleHash>, | |
| opts: PackOptions, | |
| out: &mut dyn Write, | |
| ) -> Result<(), OxenError>; | |
| /// Pack the given node `hashes` into `out` as a tar-gz stream, in the layout | |
| /// selected by `opts`. | |
| /// | |
| /// Ordering contract: output must be deterministic for the same logical input | |
| /// set (e.g., sort hashes before emitting) rather than relying on `HashSet` | |
| /// iteration order. | |
| /// | |
| /// Hashes not present in the store are silently skipped, and an empty `hashes` | |
| /// produces a valid but empty tarball. See [`PackOptions`] for the per-variant | |
| /// wire-format details. | |
| fn pack_nodes( | |
| &self, | |
| hashes: &HashSet<MerkleHash>, | |
| opts: PackOptions, | |
| out: &mut dyn Write, | |
| ) -> Result<(), OxenError>; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/lib/src/model/merkle_tree/merkle_transport.rs` around lines 60 - 71,
The trait doc for pack_nodes must require a deterministic, stable ordering when
emitting entries because hashes is a HashSet; update the doc for fn pack_nodes
(and mention PackOptions) to state implementations must sort the provided
HashSet<MerkleHash> into a defined order (e.g., lexicographic on the MerkleHash
bytes or hex) before writing to out: &mut dyn Write so the tar-gz wire output is
stable across runs; also update any existing implementations to perform that
sort when iterating and packing, and mention OxenError remains the error type
for failures.
| /// Unpack the tar-gz stream from `reader` into the store, applying the existing-file | ||
| /// policy in `opts`. | ||
| /// | ||
| /// Returns the set of hashes parsed from the tarball (not necessarily only those | ||
| /// newly installed — entries skipped per [`UnpackOptions::SkipExisting`] still appear | ||
| /// in the result, matching `main`'s `repositories::tree::unpack_nodes` behaviour). | ||
| fn unpack( | ||
| &self, | ||
| reader: &mut dyn Read, | ||
| opts: UnpackOptions, | ||
| ) -> Result<HashSet<MerkleHash>, OxenError>; |
There was a problem hiding this comment.
Define archive-safety requirements in the unpack contract.
unpack installs filesystem-backed content but the trait contract doesn’t require rejecting unsafe tar entries (absolute paths, .. traversal, symlink escapes). Please make this explicit so all implementations enforce the same security baseline.
Suggested doc-level contract addition
pub trait MerkleUnpacker: Send + Sync {
/// Unpack the tar-gz stream from `reader` into the store, applying the existing-file
/// policy in `opts`.
+ ///
+ /// Security contract:
+ /// - Reject entries that resolve outside the intended node root (absolute paths,
+ /// `..` traversal, or symlink-based escapes).
+ /// - Reject malformed archive entries whose on-wire path cannot be mapped to a
+ /// valid `MerkleHash`.
///
/// Returns the set of hashes parsed from the tarball (not necessarily only those🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/lib/src/model/merkle_tree/merkle_transport.rs` around lines 91 - 101,
The unpack contract (fn unpack(&self, reader: &mut dyn Read, opts:
UnpackOptions) -> Result<HashSet<MerkleHash>, OxenError>) must explicitly
require archive-safety: update the doc comment to state that implementations
MUST reject or fail on unsafe tar entries (absolute paths, any path containing
.. that would escape the intended extraction root, and symlinks that point
outside the target store) and MUST sanitize relative paths and enforce
extraction into the store root; then update all implementations to validate each
tar header/path, canonicalize and check it remains within the target store root,
and return an Err(OxenError) when encountering an unsafe entry rather than
writing outside the store (reference UnpackOptions, MerkleHash and OxenError in
the doc so callers and implementers know the expected behavior).
|
NOTE: Stacked PR! Must merge #513 before merging. |
b36f043 to
30c2b85
Compare
5813fbf to
11eacec
Compare
30c2b85 to
4f65285
Compare
11eacec to
d4d0762
Compare
4f65285 to
3239a5d
Compare
d4d0762 to
234e31d
Compare
3239a5d to
e74a7e6
Compare
234e31d to
029e820
Compare
e74a7e6 to
5070203
Compare
5070203 to
ba031b3
Compare
2bd348f to
cbf9b0a
Compare
ba031b3 to
ff359e0
Compare
cbf9b0a to
df8d6fb
Compare
ff359e0 to
d152216
Compare
Adds new traits for defining how `MerkleStore`s can encode Merkle tree nodes to send and receive between the `oxen-cli` client and `oxen-server`. The `MerklePacker` trait collects a subset of nodes and allows a writer to encode them. The `MerkleUnpacker` trait uses a provided reader to decode Merkle tree nodes and update the underlying physical store. Provided that they have the same `Error` type, combining these two traits is a `MerkleTransport`. A blanket `impl` for any such unified traits is included here too. Also defines a `TransportableMerkleStore`, which is a `MerkleStore` + a `MerkleTransport`. A blanket impl. is provided to unify, so long as the type's implementation's `Error`s are all the same.
df8d6fb to
f7950a6
Compare
Adds new traits for defining how
MerkleStores can encode Merkle treenodes to send and receive between the
oxen-cliclient andoxen-server.The
MerklePackertrait collects a subset of nodes and allows a writer toencode them.
The
MerkleUnpackertrait uses a provided reader to decode Merkle treenodes and update the underlying physical store. Provided that they have the
same
Errortype, combining these two traits is aMerkleTransport. Ablanket
implfor any such unified traits is included here too.Also defines a
TransportableMerkleStore, which is aMerkleStore+ aMerkleTransport. A blanket impl. is provided to unify, so long as thetype's implementation's
Errors are all the same.