Make TMerkleTreeNode dyn compatible + rm from_u8 panic!#406
Make TMerkleTreeNode dyn compatible + rm from_u8 panic!#406malcolmgreaves merged 11 commits intomainfrom
TMerkleTreeNode dyn compatible + rm from_u8 panic!#406Conversation
- modified trait to use associated type for error - to_msgpack_bytes returns result w/ that error or vec<u8> - blanket impl uses something that also has serialize as to_msgpack_bytes implmentation - made `from_u8` return error & added it to the OxenError heirarchy WIP propigate changes
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRefactors Merkle node serialization and error handling: makes node-type parsing fallible, replaces serde panics with propagated rmp_serde encode/decode Results, adds an object-safe Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(200,200,255,0.5)
participant Repo
participant p_write_tree
participant Node
participant MerkleNodeDB
participant Storage
end
Repo->>p_write_tree: write tree (root commit)
p_write_tree->>Node: request to_msgpack_bytes()
Node-->>p_write_tree: Vec<u8> or EncodeError
p_write_tree->>MerkleNodeDB: write_node(bytes)
MerkleNodeDB->>Storage: persist bytes
MerkleNodeDB-->>p_write_tree: Ok or Decode/IO Error
p_write_tree-->>Repo: final Result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
crates/lib/src/error.rs (1)
315-317: Consider addingRmpEncodeErrorto the internal error hints.The
RmpDecodeErroris included in thehint()method (Line 381) as an internal error, but the newRmpEncodeErroris not. For consistency, encoding errors should likely receive the same hint.🔧 Suggested fix
DB(_) | ArrowError(_) | BinCodeError(_) | RedisError(_) | R2D2Error(_) - | RmpDecodeError(_) => { + | RmpDecodeError(_) | RmpEncodeError(_) => { "This is an internal error. Run with RUST_LOG=debug for more details." }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/error.rs` around lines 315 - 317, Add RmpEncodeError to the internal error hints returned by the hint() method so encoding errors get the same diagnostic guidance as decoding errors; locate the hint() implementation and where it currently matches RmpDecodeError and include RmpEncodeError in that same arm (or add a separate arm that returns the same internal error hint) so both rmp_serde::encode::Error and rmp_serde::decode::Error map to the internal error hint.crates/lib/src/repositories/tree.rs (1)
1126-1128: Consider returning an error instead of panicking.The
panic!on unexpected node types could be replaced with anErrreturn for more graceful error handling, though this represents a programming error rather than a runtime condition.🔧 Optional fix
EMerkleTreeNode::File(file_node) => { db.add_child(file_node)?; } - node => { - panic!("p_write_tree Unexpected node type: {node:?}"); - } + EMerkleTreeNode::Commit(_) | EMerkleTreeNode::FileChunk(_) => { + return Err(OxenError::basic_str(format!( + "p_write_tree unexpected node type: {:?}", + child.node.node_type() + ))); + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/repositories/tree.rs` around lines 1126 - 1128, Replace the panic in the catch-all match arm inside p_write_tree with a propagated error return: instead of panic!("p_write_tree Unexpected node type: {node:?}"), return an Err variant (using the crate's repository error type or anyhow::Error) that includes a descriptive message and the debug of node so callers can handle it; update the function signature to return Result if needed and adjust call sites to propagate the error via ? or map_err so this programming-error case is reported without aborting the process.crates/lib/src/core/db/merkle_node/merkle_node_db.rs (1)
257-264:?Sizedbounds needed only if trait objects are intended for these entry points.The current code's implicit
Sizedbounds onNwould reject&dyn TMerkleTreeNode<...>at these call sites. However, all current callers inp_write_treepass concrete types (vnode,dir_node,file_node), so this is not a current issue.If the design goal is to allow trait-object dispatch through these helpers, add
?Sized:Suggested changes (if needed)
- pub fn open_read_write_if_not_exists<N: TMerkleTreeNode>( + pub fn open_read_write_if_not_exists<N: TMerkleTreeNode + ?Sized>(- pub fn open_read_write<N: TMerkleTreeNode>( + pub fn open_read_write<N: TMerkleTreeNode + ?Sized>(- fn write_node<N: TMerkleTreeNode>( + fn write_node<N: TMerkleTreeNode + ?Sized>(- pub fn add_child<N: TMerkleTreeNode>(&mut self, item: &N) -> Result<(), OxenError> + pub fn add_child<N: TMerkleTreeNode + ?Sized>(&mut self, item: &N) -> Result<(), OxenError>Applies to: lines 257–264, 277–284, 373–380, 424–427.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs` around lines 257 - 264, Update the generic bounds to allow trait-object dispatch by adding ?Sized to the N type parameter where needed: change the signatures of open_read_write_if_not_exists, (and the other helper functions flagged in the review at the same pattern) so the generic bound becomes N: TMerkleTreeNode + ?Sized and keep the parameter as node: &N; leave the existing where clauses (e.g., OxenError: From<N::SerializationError>) intact so SerializationError still resolves for unsized types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 331-334: The code currently uses
MerkleTreeNodeType::from_u8_unwrap on bytes read from disk (e.g., in the lookup
mapping shown) which will panic on invalid bytes; change these sites (the
occurrence in the lookup mapping and the similar uses at the other noted
locations) to call the fallible MerkleTreeNodeType::from_u8 and propagate the
error instead of unwrapping so open()/map() return Err for invalid node-type
bytes; update the surrounding logic in the functions handling disk reads (e.g.,
open(), map()) to propagate the Result from from_u8 rather than assuming a valid
value.
---
Nitpick comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 257-264: Update the generic bounds to allow trait-object dispatch
by adding ?Sized to the N type parameter where needed: change the signatures of
open_read_write_if_not_exists, (and the other helper functions flagged in the
review at the same pattern) so the generic bound becomes N: TMerkleTreeNode +
?Sized and keep the parameter as node: &N; leave the existing where clauses
(e.g., OxenError: From<N::SerializationError>) intact so SerializationError
still resolves for unsized types.
In `@crates/lib/src/error.rs`:
- Around line 315-317: Add RmpEncodeError to the internal error hints returned
by the hint() method so encoding errors get the same diagnostic guidance as
decoding errors; locate the hint() implementation and where it currently matches
RmpDecodeError and include RmpEncodeError in that same arm (or add a separate
arm that returns the same internal error hint) so both rmp_serde::encode::Error
and rmp_serde::decode::Error map to the internal error hint.
In `@crates/lib/src/repositories/tree.rs`:
- Around line 1126-1128: Replace the panic in the catch-all match arm inside
p_write_tree with a propagated error return: instead of panic!("p_write_tree
Unexpected node type: {node:?}"), return an Err variant (using the crate's
repository error type or anyhow::Error) that includes a descriptive message and
the debug of node so callers can handle it; update the function signature to
return Result if needed and adjust call sites to propagate the error via ? or
map_err so this programming-error case is reported without aborting the process.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 823320ab-2b3e-4dfe-81e9-95cc6d71f313
📒 Files selected for processing (10)
crates/lib/src/core/db/merkle_node/merkle_node_db.rscrates/lib/src/error.rscrates/lib/src/model/merkle_tree/node/commit_node.rscrates/lib/src/model/merkle_tree/node/dir_node.rscrates/lib/src/model/merkle_tree/node/file_chunk_node.rscrates/lib/src/model/merkle_tree/node/file_node.rscrates/lib/src/model/merkle_tree/node/merkle_tree_node.rscrates/lib/src/model/merkle_tree/node/vnode.rscrates/lib/src/model/merkle_tree/node_type.rscrates/lib/src/repositories/tree.rs
CleanCut
left a comment
There was a problem hiding this comment.
We can reduce the complexity here quite a bit and still accomplish the goal!
| // we make sure we can convert the node's serialization error into an OxenError | ||
| OxenError: From<N::SerializationError>, | ||
| // we make sure that the node type implements TMerkleTreeNode | ||
| N: TMerkleTreeNode<SerializationError = S>, | ||
| // and we make sure that the nodes we're serializing all have the same serialization error type | ||
| VNode: TMerkleTreeNode<SerializationError = S>, | ||
| DirNode: TMerkleTreeNode<SerializationError = S>, | ||
| FileNode: TMerkleTreeNode<SerializationError = S>, | ||
| // NOTE: | ||
| // We could have dropped everything here, have no `S` generic, and have this in the where: | ||
| // N: TMerkleTreeNode<SerializationError = rmp_serde::encode::Error>, | ||
| // But, by doing it this way, we make it so that we can change the actual SerializationError | ||
| // type without needing to change this function's type. It works so long as the error type | ||
| // for all of the node's trait implementations align. | ||
| { |
There was a problem hiding this comment.
The bounds complexity here should mostly (completely?) go away with the elimination of SerializationError.
We wouldn't need the `S` generic anyway, but that's irrelevant now.
My original comment before I saw that we could eliminate SerializationError:
Go ahead and drop the S generic and simplify this.
The extra bounds complexity is unnecessary. All nodes share the same SerializationError type already, and if we change it, we'll change it everywhere, including here.
There was a problem hiding this comment.
This is an intentional part of the design. I don't want to overly rely on using OxenError, because that is a large enum where nearly all of the variants are not applicable to the code. There's only a single error -- serialization problem.
An explicit design I want to do more of is keep code as locally scoped as possible. Here, that meant only using a single serialization error type.
There's one way to simplify this, and that's to hardcode the serialization error type as a serde error. However, given the way I've documented this and explained it, and used names like "SerializationError", I feel that this is clear. The benefit is it gives the code the ability to switch to a new SerializationError without requiring this code to change. Which is good, since since this code is agnostic to the serialization details! :)
There was a problem hiding this comment.
I understand this was a deliberate design choice. My feedback is that it is not a good direction to go in.
I don't want to overly rely on using OxenError, because that is a large enum where nearly all of the variants are not applicable to the code.
OxenError exists so we have a consistent error type to use throughout the liboxen codebase. In other words, it is intended that we use OxenError throughout liboxen. In this context, there is no overly relying on it. I'd encourage us to lean into using it rather than working around it. There is nothing wrong with the error enum having a large number of variants. Fragmenting our error approach at this point is the wrong direction to go in.
This code will still be agnostic to serialization details using OxenError.
Hardcoding an error type specific to a single implementation into a trait is not a simplification.
| /// The error type that can occur during serialization. | ||
| type SerializationError: std::error::Error + Send + Sync + 'static; | ||
|
|
||
| /// Serialize this node to a MsgPack byte array. | ||
| fn to_msgpack_bytes(&self) -> Result<Vec<u8>, Self::SerializationError>; |
There was a problem hiding this comment.
We don't need an associated type here. Use OxenError. That should help eliminate a lot of the extra complexity!
| /// The error type that can occur during serialization. | |
| type SerializationError: std::error::Error + Send + Sync + 'static; | |
| /// Serialize this node to a MsgPack byte array. | |
| fn to_msgpack_bytes(&self) -> Result<Vec<u8>, Self::SerializationError>; | |
| /// Serialize this node to a MsgPack byte array. | |
| fn to_msgpack_bytes(&self) -> Result<Vec<u8>, OxenError>; |
There was a problem hiding this comment.
Explicitly not using OxenError. (I explained this in an earlier comment above, but I'll reiterate a bit here) That is too wide of a type -- it includes too many variants that should never be used here. Using it as the error type is confusing and allows the code to return invalid errors. It's only ever valid for it to return an error related to serialization.
This design is actually simpler if we stop to think about what all of this syntax is saying -- it's saying "this function only returns Vec<u8> or some single kind of error."
One issue with the oxen code today is it's very hard to understand what exact kind of error some piece of code returns if it has a Result<_, OxenError>. With this design, it's straightforward to see what specific error type the code returns by only looking at the signature (instead of reading through the entire definition and tracking down all of the possible variants of OxenError that are returned).
In turn, this design makes all of the concrete implementation code simpler: they all use rmp_serde::decode::Error instead of having OxenError. So we can look at those signatures and see "ok, the only thing that can go wrong is a serialization error" instead of "it can be any variant of an OxenError, which one(s) is it actually going to be?"
There was a problem hiding this comment.
Having a large number of variants in the error enum isn't a problem. It's pretty normal. It isn't clear to me why you find this confusing. On the other hand, having to search for scattered errors around the codebase and figure out when and where to use them would quickly become confusing.
The argument for dividing our single error enum into two error enums would be that there are two disjoint sets of errors that are used in mutually exclusive call stacks. We don't even have that situation.
One issue with the oxen code today is it's very hard to understand what exact kind of error some piece of code returns if it has a Result<_, OxenError>.
I don't think that is an issue. My observation has been that almost nothing that handles an error cares which specific error it was. In the few places where we do care, we match on the OxenError variant that we care about. Where's the issue? It's already straightforward to see the variant type. Isn't that why we've been replacing the BasicString variants with more specific variants, at your request--just in case someone cares in the future and wants to match on the variant?
In turn, this design makes all of the concrete implementation code simpler: they all use
rmp_serde::decode::Errorinstead of havingOxenError.
Your design makes the concrete implementation more complex. All the bounds stuff goes away when you switch to OxenError.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/lib/src/model/merkle_tree/node_type.rs (1)
157-182: The new test still doesn't pin the two regressions this refactor cares about.Right now it only proves concrete impls exist and their errors convert into
OxenError. It won't fail ifTMerkleTreeNodestops beingdyn-compatible again, and it doesn't lock the repository-criticalu8mapping / invalid-byte path described above.Proposed test additions
#[test] fn test_nodes_implement_trait() { /// this only exists so we can check that all node types implement `TMerkleTreeNode` /// it will be a compile-time error if a node type does not implement the trait - fn is_tmerkletreenode<T: TMerkleTreeNode>(_: T) + fn is_tmerkletreenode<T: TMerkleTreeNode>(node: &T) where OxenError: From<T::SerializationError>, { + let _: &dyn TMerkleTreeNode<SerializationError = T::SerializationError> = node; } - is_tmerkletreenode(CommitNode::default()); - is_tmerkletreenode(DirNode::default()); - is_tmerkletreenode(VNode::default()); - is_tmerkletreenode(FileNode::default()); - is_tmerkletreenode(FileChunkNode::default()); + is_tmerkletreenode(&CommitNode::default()); + is_tmerkletreenode(&DirNode::default()); + is_tmerkletreenode(&VNode::default()); + is_tmerkletreenode(&FileNode::default()); + is_tmerkletreenode(&FileChunkNode::default()); } + + #[test] + fn test_node_type_u8_mapping_is_stable() { + for (node_type, value) in [ + (MerkleTreeNodeType::Commit, 0u8), + (MerkleTreeNodeType::Dir, 1u8), + (MerkleTreeNodeType::VNode, 2u8), + (MerkleTreeNodeType::File, 3u8), + (MerkleTreeNodeType::FileChunk, 4u8), + ] { + assert_eq!(node_type.to_u8(), value); + assert_eq!( + MerkleTreeNodeType::from_u8(value).expect("valid node type"), + node_type + ); + } + } + + #[test] + fn test_invalid_node_type_returns_error() { + assert!(matches!( + MerkleTreeNodeType::from_u8(u8::MAX), + Err(InvalidMerkleTreeNodeType(_)) + )); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 157 - 182, Add two stronger tests: (1) verify TMerkleTreeNode is object-safe by defining a helper that accepts &dyn TMerkleTreeNode (e.g., fn takes_dyn(_: &dyn TMerkleTreeNode) {}) and passing references like &CommitNode::default(), &DirNode::default(), &VNode::default(), &FileNode::default(), &FileChunkNode::default() to it; (2) pin the u8 mapping by exercising the NodeType (or equivalent) conversion from bytes (e.g., TryFrom<u8> / from_u8) and asserting each concrete node variant maps to the expected byte value and that out-of-range/invalid bytes produce an Err or InvalidByte result. Reference TMerkleTreeNode, the existing is_tmerkletreenode test, and the NodeType/from-u8 conversion functions to locate where to add these checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 136-155: TMerkleTreeNode now adds an associated type
SerializationError and a method to_msgpack_bytes which breaks any manual impls;
update the crate by (1) adding clear migration docs and CHANGELOG entry
describing how to change any manual impls to declare type SerializationError =
rmp_serde::encode::Error and implement to_msgpack_bytes (or rely on the provided
blanket impl for types implementing Serialize + MerkleTreeNodeIdType + Debug +
Display), (2) mark this change as a major-version bump / deprecation period for
users with custom impls, and (3) optionally provide a short compatibility
note/example showing the exact replacement impl for TMerkleTreeNode (using type
SerializationError and the to_msgpack_bytes body from the blanket impl) so
external consumers can update easily.
---
Nitpick comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 157-182: Add two stronger tests: (1) verify TMerkleTreeNode is
object-safe by defining a helper that accepts &dyn TMerkleTreeNode (e.g., fn
takes_dyn(_: &dyn TMerkleTreeNode) {}) and passing references like
&CommitNode::default(), &DirNode::default(), &VNode::default(),
&FileNode::default(), &FileChunkNode::default() to it; (2) pin the u8 mapping by
exercising the NodeType (or equivalent) conversion from bytes (e.g., TryFrom<u8>
/ from_u8) and asserting each concrete node variant maps to the expected byte
value and that out-of-range/invalid bytes produce an Err or InvalidByte result.
Reference TMerkleTreeNode, the existing is_tmerkletreenode test, and the
NodeType/from-u8 conversion functions to locate where to add these checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 80e1e045-7d88-49f3-91f8-12027386cb23
📒 Files selected for processing (1)
crates/lib/src/model/merkle_tree/node_type.rs
72fc293 to
4b3d9a4
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
crates/lib/src/model/merkle_tree/node_type.rs (2)
48-51: Expose the invalid byte on this public error.
from_u8is public, but callers outside this module cannot inspect which tag failed becauseInvalidMerkleTreeNodeTypehides its only payload. If you want to keep construction private, a small accessor is enough.♻️ Proposed refactor
pub struct InvalidMerkleTreeNodeType(u8); + +impl InvalidMerkleTreeNodeType { + pub fn invalid_byte(&self) -> u8 { + self.0 + } +}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 48 - 51, The public error type InvalidMerkleTreeNodeType currently hides its payload so callers of from_u8 cannot see the invalid byte; add a public accessor on InvalidMerkleTreeNodeType (e.g. impl InvalidMerkleTreeNodeType { pub fn invalid_byte(&self) -> u8 { self.0 } }) so external code can inspect the offending u8 while keeping the tuple field private and preserving the existing error message and derives; update any callsites that need the byte to use invalid_byte().
166-181: Add behavioral tests for the new on-disk mapping helpers.
test_nodes_implement_traitonly checks compile-time impl coverage. It doesn't protect theto_u8/from_u8round-trip or the invalid-byte path, which are the actual compatibility contract in this file.🧪 Proposed tests
#[test] fn test_nodes_implement_trait() { /// this only exists so we can check that all node types implement `TMerkleTreeNode` /// it will be a compile-time error if a node type does not implement the trait fn is_tmerkletreenode<T: TMerkleTreeNode>(_: T) where OxenError: From<T::SerializationError>, { } is_tmerkletreenode(CommitNode::default()); is_tmerkletreenode(DirNode::default()); is_tmerkletreenode(VNode::default()); is_tmerkletreenode(FileNode::default()); is_tmerkletreenode(FileChunkNode::default()); } + + #[test] + fn test_node_type_round_trip() { + for node_type in [ + MerkleTreeNodeType::Commit, + MerkleTreeNodeType::Dir, + MerkleTreeNodeType::VNode, + MerkleTreeNodeType::File, + MerkleTreeNodeType::FileChunk, + ] { + assert_eq!(MerkleTreeNodeType::from_u8(node_type.to_u8()).unwrap(), node_type); + } + } + + #[test] + fn test_invalid_node_type_byte() { + assert!(matches!( + MerkleTreeNodeType::from_u8(255), + Err(InvalidMerkleTreeNodeType(_)) + )); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 166 - 181, Extend the tests to verify runtime behavior of the on-disk mapping helpers: for each node type used in test_nodes_implement_trait (CommitNode, DirNode, VNode, FileNode, FileChunkNode) call the node's to_u8 (or equivalent mapping function) and then from_u8 to assert the round-trip yields the original enum variant (or an equivalent equality/assertion), and add a separate test that passes an invalid/unknown byte value into from_u8 to assert it returns the expected error/None result; update or add tests near test_nodes_implement_trait and reference TMerkleTreeNode, to_u8, from_u8 to ensure coverage of both valid round-trips and invalid-byte handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 94-96: The panic message in MerkleTreeNodeType::from_u8_unwrap
uses a literal "{val}" instead of the actual value and loses caller location;
annotate from_u8_unwrap with #[track_caller] and change the error to include the
value (e.g., call from_u8(val).unwrap_or_else(|| panic!("Invalid
MerkleTreeNodeType: {}", val)) or use expect with format!) so the actual u8 is
printed and the caller location is preserved; reference functions
MerkleTreeNodeType::from_u8_unwrap and MerkleTreeNodeType::from_u8 when applying
this change.
---
Nitpick comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 48-51: The public error type InvalidMerkleTreeNodeType currently
hides its payload so callers of from_u8 cannot see the invalid byte; add a
public accessor on InvalidMerkleTreeNodeType (e.g. impl
InvalidMerkleTreeNodeType { pub fn invalid_byte(&self) -> u8 { self.0 } }) so
external code can inspect the offending u8 while keeping the tuple field private
and preserving the existing error message and derives; update any callsites that
need the byte to use invalid_byte().
- Around line 166-181: Extend the tests to verify runtime behavior of the
on-disk mapping helpers: for each node type used in test_nodes_implement_trait
(CommitNode, DirNode, VNode, FileNode, FileChunkNode) call the node's to_u8 (or
equivalent mapping function) and then from_u8 to assert the round-trip yields
the original enum variant (or an equivalent equality/assertion), and add a
separate test that passes an invalid/unknown byte value into from_u8 to assert
it returns the expected error/None result; update or add tests near
test_nodes_implement_trait and reference TMerkleTreeNode, to_u8, from_u8 to
ensure coverage of both valid round-trips and invalid-byte handling.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1e3e88d7-ac2e-487d-9896-76f4d9ab3dd4
📒 Files selected for processing (1)
crates/lib/src/model/merkle_tree/node_type.rs
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
crates/lib/src/model/merkle_tree/node_type.rs (1)
94-96:⚠️ Potential issue | 🟡 MinorInterpolate the invalid byte and preserve the caller location.
Line 95's
expect("Invalid MerkleTreeNodeType: {val}")prints{val}literally, and without#[track_caller]the panic is reported at this helper instead of the bad call site.🐛 Proposed fix
+ #[track_caller] pub fn from_u8_unwrap(val: u8) -> MerkleTreeNodeType { - Self::from_u8(val).expect("Invalid MerkleTreeNodeType: {val}") + match Self::from_u8(val) { + Ok(node_type) => node_type, + Err(_) => panic!("Invalid MerkleTreeNodeType: {}", val), + } }In Rust, does `Result::expect("Invalid MerkleTreeNodeType: {val}")` interpolate `{val}`? Also, does annotating a wrapper like `from_u8_unwrap` with `#[track_caller]` preserve the original caller location for the panic?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/model/merkle_tree/node_type.rs` around lines 94 - 96, The helper from_u8_unwrap currently passes a literal "{val}" to expect and doesn't preserve the caller location; update the function MerkleTreeNodeType::from_u8_unwrap to (1) interpolate the byte into the panic message by constructing the string (e.g., using format! with the val) instead of a raw literal, and (2) add #[track_caller] to the from_u8_unwrap definition so any panic from expect reports the original caller site rather than inside this helper.crates/lib/src/core/db/merkle_node/merkle_node_db.rs (1)
325-328:⚠️ Potential issue | 🟠 MajorReturn invalid node-type bytes as
Err, not panics.These bytes are loaded straight from the node files, so
from_u8_unwrap()means one corrupt record—or a repo written with a newer node variant—can abortopen()/map()instead of surfacing anOxenError.🐛 Proposed fix
let dtype = lookup .as_ref() - .map(|l| MerkleTreeNodeType::from_u8_unwrap(l.data_type)) + .map(|l| MerkleTreeNodeType::from_u8(l.data_type)) + .transpose()? .unwrap_or(MerkleTreeNodeType::Commit);- let data_type = MerkleTreeNodeType::from_u8_unwrap(lookup.data_type); + let data_type = MerkleTreeNodeType::from_u8(lookup.data_type)?;- let dtype = MerkleTreeNodeType::from_u8_unwrap(*dtype); + let dtype = MerkleTreeNodeType::from_u8(*dtype)?;Also applies to: 498-499, 517-517
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs` around lines 325 - 328, Replace uses of MerkleTreeNodeType::from_u8_unwrap (which can panic) with a safe conversion that returns an Err when the byte is invalid: call the non-panicking conversion (e.g., MerkleTreeNodeType::from_u8 or equivalent) and if it returns None/Err, return an OxenError (or map into the function's Result) with context including the invalid byte value and node id. Update the code paths where lookup.as_ref().map(...).unwrap_or(...) is used (the snippet using lookup and MerkleTreeNodeType::from_u8_unwrap and the other occurrences around the other two locations) to propagate a proper Err instead of panicking so open()/map() surfaces an OxenError for corrupt/unknown node types.
🧹 Nitpick comments (1)
crates/lib/src/core/db/merkle_node/merkle_node_db.rs (1)
257-261: Add?Sizedif these helpers are supposed to acceptdyn TMerkleTreeNode.Lines 257, 274, 367, and 415 use
N: TMerkleTreeNode, which implicitly requiresSized. This prevents passing&dyn TMerkleTreeNodetrait objects. Adding?Sizedto these generic bounds would allow dynamic dispatch if needed.♻️ Proposed fix
- pub fn open_read_write_if_not_exists<N: TMerkleTreeNode>( + pub fn open_read_write_if_not_exists<N: TMerkleTreeNode + ?Sized>( repo: &LocalRepository, node: &N, parent_id: Option<MerkleHash>, ) -> Result<Option<Self>, OxenError> { @@ - pub fn open_read_write<N: TMerkleTreeNode>( + pub fn open_read_write<N: TMerkleTreeNode + ?Sized>( repo: &LocalRepository, node: &N, parent_id: Option<MerkleHash>, ) -> Result<Self, OxenError> { @@ - fn write_node<N: TMerkleTreeNode>( + fn write_node<N: TMerkleTreeNode + ?Sized>( &mut self, node: &N, parent_id: Option<MerkleHash>, ) -> Result<(), OxenError> { @@ - pub fn add_child<N: TMerkleTreeNode>(&mut self, item: &N) -> Result<(), OxenError> { + pub fn add_child<N: TMerkleTreeNode + ?Sized>(&mut self, item: &N) -> Result<(), OxenError> {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs` around lines 257 - 261, Change the generic bound from N: TMerkleTreeNode to N: TMerkleTreeNode + ?Sized for the helper functions that need to accept trait objects (e.g., open_read_write_if_not_exists and the other methods at the same sites). Update the generic declarations for those functions (the ones currently declared with N: TMerkleTreeNode) to use N: TMerkleTreeNode + ?Sized so callers can pass &dyn TMerkleTreeNode; keep parameters as &N and no other API changes are required.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/lib/src/repositories/tree.rs`:
- Around line 1081-1088: The doc comment for the tree-writing function still
references a generic serialization error type `S` and the requirement that all
node types share the same `S`; update the comment to reflect the current
implementation where `TMerkleTreeNode` uses the concrete
rmp_serde::encode::Error (see crates/lib/src/model/merkle_tree/node_type.rs) or
simply remove the stale note about `S` so the comment no longer claims a generic
serialization type requirement; ensure references to the function that writes
the full tree (the doc block in crates/lib/src/repositories/tree.rs) mention
that serialization errors are now fixed to rmp_serde::encode::Error or omit the
`S` discussion entirely.
---
Duplicate comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 325-328: Replace uses of MerkleTreeNodeType::from_u8_unwrap (which
can panic) with a safe conversion that returns an Err when the byte is invalid:
call the non-panicking conversion (e.g., MerkleTreeNodeType::from_u8 or
equivalent) and if it returns None/Err, return an OxenError (or map into the
function's Result) with context including the invalid byte value and node id.
Update the code paths where lookup.as_ref().map(...).unwrap_or(...) is used (the
snippet using lookup and MerkleTreeNodeType::from_u8_unwrap and the other
occurrences around the other two locations) to propagate a proper Err instead of
panicking so open()/map() surfaces an OxenError for corrupt/unknown node types.
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Around line 94-96: The helper from_u8_unwrap currently passes a literal
"{val}" to expect and doesn't preserve the caller location; update the function
MerkleTreeNodeType::from_u8_unwrap to (1) interpolate the byte into the panic
message by constructing the string (e.g., using format! with the val) instead of
a raw literal, and (2) add #[track_caller] to the from_u8_unwrap definition so
any panic from expect reports the original caller site rather than inside this
helper.
---
Nitpick comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 257-261: Change the generic bound from N: TMerkleTreeNode to N:
TMerkleTreeNode + ?Sized for the helper functions that need to accept trait
objects (e.g., open_read_write_if_not_exists and the other methods at the same
sites). Update the generic declarations for those functions (the ones currently
declared with N: TMerkleTreeNode) to use N: TMerkleTreeNode + ?Sized so callers
can pass &dyn TMerkleTreeNode; keep parameters as &N and no other API changes
are required.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 042f2c23-3d27-4244-a7fc-7c6128a4ccc0
📒 Files selected for processing (3)
crates/lib/src/core/db/merkle_node/merkle_node_db.rscrates/lib/src/model/merkle_tree/node_type.rscrates/lib/src/repositories/tree.rs
|
Made one sweeping change that hopefully simplifies things a bit for this review. Instead of having a generic This cuts down on a lot of the syntax -- there's no more I also made the change to use the new |
TMerkleTreeNode dyn compatible (fka object safe)TMerkleTreeNode dyn compatible + rm from_u8 panic!
TMerkleTreeNode dyn compatible + rm from_u8 panic!TMerkleTreeNode dyn compatible + rm from_u8 panic!
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/lib/src/model/merkle_tree/node_type.rs`:
- Line 70: In node_type.rs replace the misspelled word "conflcit" with
"conflict" in the comment text (the sentence containing "value to not conflcit
with any existing variants") so the comment reads "value to not conflict with
any existing variants"; ensure only the typo is corrected and surrounding
wording/formatting is preserved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d0bb04b8-4470-422f-9948-0e7cae43b2dd
📒 Files selected for processing (3)
crates/lib/src/core/db/merkle_node/merkle_node_db.rscrates/lib/src/model/merkle_tree/node_type.rscrates/lib/src/repositories/tree.rs
✅ Files skipped from review due to trivial changes (1)
- crates/lib/src/repositories/tree.rs
| changed for existing `MerkleTreeNode` variants. | ||
|
|
||
| **NEW** variants can be added without a migration. New variants must have an incremented `u8` | ||
| value to not conflcit with any existing variants. Older repositories will still be able to be |
There was a problem hiding this comment.
Typo: "conflcit" → "conflict".
📝 Proposed fix
- **NEW** variants can be added without a migration. New variants must have an incremented `u8`
- value to not conflcit with any existing variants. Older repositories will still be able to be
+ **NEW** variants can be added without a migration. New variants must have an incremented `u8`
+ value to not conflict with any existing variants. Older repositories will still be able to be📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| value to not conflcit with any existing variants. Older repositories will still be able to be | |
| value to not conflict with any existing variants. Older repositories will still be able to be |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/lib/src/model/merkle_tree/node_type.rs` at line 70, In node_type.rs
replace the misspelled word "conflcit" with "conflict" in the comment text (the
sentence containing "value to not conflcit with any existing variants") so the
comment reads "value to not conflict with any existing variants"; ensure only
the typo is corrected and surrounding wording/formatting is preserved.
trait TMerkleTreeNodeChangesFirst in a series of PRs to abstract the Merkle tree store so that we can provide different
backing implementations other than the existing custom file format based implementation.
This PR changes drops the
+ Serializeconstraint totrait TMerkleTreeNodeto make itdynCompatible (this was formally known asobject safe).Serializehas a methodserialize<S: Serializer>that prevents it from being compiled as a virtual table lookup,hence it's not
dynCompatible. [1]The change here drops the
+ Serializeconstraint from the trait and instead adds a newmethod
to_msgpack_bytes() -> Result<Vec<u8>, ...>, which performs the serialization.There's a blanket implementation that implements this updated trait for all existing concrete
node implementations: it uses their
Serializeimpl. to provide a generic implementation forto_msgpack_byteswith a shared serialization error ofrmp_serde::encode::Error.Sites that were manually calling
serializeon the node now callto_msgpack_bytes().There's a new conversion of
encode::Errorinto anOxenErrorwrapper.[1] https://doc.rust-lang.org/reference/items/traits.html#r-items.traits.dyn-compatible
Return Serde Error More Often
Several functions were returning an
OxenErroreven though they were only ever returninga serde serialization error (
rmp_serde::encode::Error). Since there's anOxenErrorvariant +
Fromimplementation to convert these, the functions' signatures have beenmodified to return the more tightly scoped serde error and have callers rely on using
a conversion via the
?operator into anOxenError.from_u8changesliboxen::model::merkle_tree::node_type::from_u8has a breaking change: instead of apanic!when an invalidu8value is supplied, it returns anErr(.). A new error typehas been added
InvalidMerkleTreeNodeTypefor this case and a new variant ofOxenError::MerkleTreeErrorhas been added that wraps this error. Call sites have beenupdated to propagate this error up to an
OxenError.