Reorder tree traversal during writing #531

malcolmgreaves wants to merge 1 commit into mg/config_merkle_store from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough
This PR refactors the merkle directory writer from a recursive pattern into a phased implementation. The caller now writes the root directory hash and invokes the refactored writer.
Changes: Merkle Directory Writer Phased Refactoring
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Force-pushed bd9e34b to 154ba43
NOTE: Stacked PR! #526 must be merged beforehand.
Force-pushed 154ba43 to 7285f50
🧹 Nitpick comments (1)
crates/lib/src/repositories/commits/commit_writer.rs (1)
2176-2302: 💤 Low value | LGTM - solid regression test with deterministic inputs.
The test correctly:
- Pins the timestamp (line 2205) to ensure both patterns get the same `commit_id`
- Uses empty `existing_nodes` to simulate an initial commit scenario
- Validates both the hash key set and per-node structural equivalence
One optional enhancement: this test covers the initial commit case where `existing_nodes` is empty. A follow-up test exercising a subsequent commit (where `existing_nodes` contains prior directory nodes and some subdirs use the "look up old dir node" path at line 951) would strengthen coverage of the refactored code paths.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/lib/src/repositories/commits/commit_writer.rs` around lines 2176-2302: add a second test variant that mirrors `pattern_c_byte_equivalent_to_pattern_a` but primes `existing_nodes` with prior directory nodes to exercise the "look up old dir node" path: create a prior commit (use `create_commit_data`, `compute_commit_id`, `CommitNode::new`, and `CommitMerkleTree::dir_hash_db_path_from_commit_id`), populate a `HashMap<PathBuf, MerkleTreeNode>` named `existing_nodes` with the prior `MerkleTreeNode` entries for relevant subdirs, then call `split_into_vnodes(&repo, &dir_entries, &existing_nodes, &new_commit)` and run both `write_commit_entries_pattern_a_legacy` and `write_commit_entries` comparisons as in the original test; ensure the timestamp is pinned again so `commit_id` is deterministic and assert snapshot equality exactly like the existing test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 07bf3888-41e2-4d6e-8b35-eb8c1b4144e7
📒 Files selected for processing (1)
crates/lib/src/repositories/commits/commit_writer.rs
Force-pushed 7285f50 to c55b4e4
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/lib/src/repositories/commits/commit_writer.rs (1)
836-840: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Remove the legacy caller-side dir-hash invalidation.
Now that `write_commit_entries` invalidates from the normalized `removed_children` list, the old post-write `cache_invalidate_dir_hash_db(dir_entries.values())` calls in `commit_dir_entries_new` and `commit_dir_entries` become dangerous. If staging emits a removed directory as a leaf name, those old calls can delete `dir_0` instead of `files/dir_0` and clobber an unrelated cache entry after the correct normalized delete already ran.
Suggested cleanup

```diff
@@
-    // Remove all the directories that are staged for removal
-    cache_invalidate_dir_hash_db(&dir_hash_db, dir_entries.values())?;
     Ok(node.to_commit())
@@
-    // Remove all the directories that are staged for removal
-    cache_invalidate_dir_hash_db(&dir_hash_db, dir_entries.values())?;
     Ok(node.to_commit())
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/lib/src/repositories/commits/commit_writer.rs` around lines 836 - 840, Remove the legacy caller-side dir-hash invalidation that is now harmful: delete the calls to cache_invalidate_dir_hash_db(...) that pass dir_entries.values() from the commit_dir_entries_new and commit_dir_entries functions because write_commit_entries already invalidates using the normalized removed_children list; ensure no other code paths call cache_invalidate_dir_hash_db with raw dir entry names (e.g., dir_entries.values()) so we only invalidate using the normalized removed_children provided by write_commit_entries.
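The clobbering hazard described above is easy to model with a plain map standing in for the dir-hash DB. The `invalidate` helper and the keys below are illustrative, not Oxen's API:

```rust
use std::collections::HashMap;
use std::path::PathBuf;

// Toy stand-in for the dir-hash cache, keyed by repo-relative path.
// Invalidating with a bare leaf name ("dir_0") can remove an unrelated
// top-level entry, while the normalized key ("files/dir_0") removes the
// intended one.
fn invalidate(cache: &mut HashMap<PathBuf, String>, key: &str) -> Option<String> {
    cache.remove(&PathBuf::from(key))
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert(PathBuf::from("dir_0"), "hash-a".to_string()); // unrelated entry
    cache.insert(PathBuf::from("files/dir_0"), "hash-b".to_string()); // removed dir

    // Leaf-name invalidation clobbers the wrong entry:
    assert_eq!(invalidate(&mut cache, "dir_0"), Some("hash-a".to_string()));
    assert!(cache.contains_key(&PathBuf::from("files/dir_0")));

    // Normalized invalidation hits the intended entry:
    assert_eq!(invalidate(&mut cache, "files/dir_0"), Some("hash-b".to_string()));
}
```

This is why only the normalized `removed_children` delete should survive: the leaf-name form is ambiguous whenever the same basename appears at two depths.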
🧹 Nitpick comments (1)
crates/lib/src/repositories/commits/commit_writer.rs (1)
2279-2283: ⚡ Quick win
Wipe or assert `dir_hash_db` between the two runs.
Leaving the commit's dir-hash DB in place means run #2 can inherit stale keys from run #1, and this test never compares that DB anyway. A missed or mis-keyed `str_val_db::put` in the updated path would still pass here as long as the legacy run populated the key first.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/lib/src/repositories/commits/commit_writer.rs` around lines 2279 - 2283, The wipe_tree_nodes function currently only removes .oxen/tree/nodes but leaves the dir_hash_db in place; change wipe_tree_nodes (or add a companion helper invoked between runs) to either delete/clear the commit's dir_hash_db (the on-disk DB under .oxen/tree/dir_hash_db) or assert that it is empty before the second run so keys from the first run cannot leak; reference the dir_hash_db instance used by the code that calls str_val_db::put and ensure you call the DB's clear/remove operation (or perform an explicit emptiness check/assert) on the same LocalRepository dir_hash_db used by the commit writer.
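One way to act on this suggestion, sketched with hypothetical helpers against a plain on-disk directory rather than the real `dir_hash_db` (names and layout are assumptions, not the in-repo code):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical test helper: clear the DB directory between the two runs.
fn reset_dir(path: &Path) -> io::Result<()> {
    if path.exists() {
        fs::remove_dir_all(path)?;
    }
    fs::create_dir_all(path)
}

// Alternative: fail loudly if anything leaked in from the previous run.
fn assert_dir_empty(path: &Path) -> io::Result<()> {
    let leftover = fs::read_dir(path)?.count();
    assert_eq!(leftover, 0, "stale entries leaked from the previous run");
    Ok(())
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("dir_hash_db_demo");
    reset_dir(&dir)?;
    fs::write(dir.join("stale_key"), b"hash")?; // simulate run #1 output
    reset_dir(&dir)?; // wipe between runs
    assert_dir_empty(&dir)?; // run #2 now starts clean
    Ok(())
}
```

Either variant closes the gap the comment describes: a `put` that the updated path forgot to perform can no longer be masked by a key the legacy run wrote first.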
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3a43757a-a676-41f3-bc89-1494f2e6031b
📒 Files selected for processing (1)
crates/lib/src/repositories/commits/commit_writer.rs
```rust
for entry in walkdir::WalkDir::new(&nodes_dir)
    .follow_links(false)
    .min_depth(2)
    .max_depth(2)
    .into_iter()
    .filter_map(Result::ok)
{
    let path = entry.path();
    let Ok(meta) = path.metadata() else { continue };
    if !meta.is_dir() {
        continue;
    }
    let rel = match path.strip_prefix(&nodes_dir) {
        Ok(p) => p,
        Err(_) => continue,
    };
    let components: Vec<&str> = rel
        .components()
        .filter_map(|c| match c {
            std::path::Component::Normal(s) => s.to_str(),
            _ => None,
        })
        .collect();
    if components.len() != 2 {
        continue;
    }
    let hex = format!("{}{}", components[0], components[1]);
    let Ok(hash_value) = u128::from_str_radix(&hex, 16) else {
        continue;
    };
    let hash = MerkleHash::new(hash_value);

    let Some(record) = store.get_node(&hash)? else {
        continue;
    };
```
Fail the snapshot on unreadable node directories.
This helper currently drops walk errors and skips hashes that cannot be reopened from the store. That can hide exactly the partial-write or malformed-node regressions this test is supposed to catch; a directory under tree/nodes/<prefix>/<suffix> should be treated as required, not optional.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/lib/src/repositories/commits/commit_writer.rs` around lines 2232 -
2266, The loop over walkdir currently swallows errors and skips problematic node
dirs (using filter_map(Result::ok) and many `continue`s), which hides
partial-write/malformed-node failures; update the code in commit_writer.rs (the
WalkDir iteration and the processing of each entry, e.g. where nodes_dir,
MerkleHash::new, and store.get_node are used) to propagate errors instead of
continuing: stop using filter_map(Result::ok) and handle Err from the iterator
by returning an Err, and replace the `continue` branches for metadata failure,
strip_prefix failure, unexpected component count, hex parse failure, and missing
store.get_node (None) with appropriate Err returns (with context) so the
snapshot fails on unreadable or missing node directories.
Force-pushed 5c4ad32 to 9e10a92
Force-pushed f22e66f to 8d71608
Force-pushed fd61e09 to 64e1254
Force-pushed 8d71608 to d2b7fde
Force-pushed 64e1254 to 4886e86
Force-pushed d2b7fde to 3feb38a
Force-pushed 4886e86 to 3d3b1fd
Force-pushed 3feb38a to 7dc9c8b
Reorders the tree traversal when writing a new Merkle tree during a commit from depth-first to a layerwise depth-first approach. Before, `r_create_dir_node` would start writing the `node` & `children` files for a directory and keep open file handles while it descended into each child. It would only close the file handles once it recursed back up the call stack. Since each recursive descent created a _new_ node write session (for the node being recursed on), the current stack frame's session was never modified by a recursive call: any modification to the `node` or `children` files was occurring only within a single stack frame.

In the new approach, a directory's vnodes are staged in a `Vec` in the same stack frame. They're written directly into the `children` file (with their offsets stored in the `node` file) in one iteration. Any files in the directory (via a vnode) are included in the `node` and `children` files in this same pass. The dir node's write session is then finished, and the collected vnodes are recursed one at a time.

One performance benefit of this change is that there is only a constant number of open file handles for the `FileBackend`: a `node` and a `children` file per call to `r_create_dir_node`. The file handles are no longer held across recursive calls, so the number of open file handles no longer grows linearly with the Merkle tree depth.

Regression tests have been added to show that the `node` and `children` file contents are byte-identical to before this change.
Force-pushed 7dc9c8b to d1ae637