Refactor MerkleNodeDb to use repo's Path directly #523

Merged
malcolmgreaves merged 1 commit into main from mg/refactor_use_path_merklenodedb
May 5, 2026

@malcolmgreaves
Collaborator

The MerkleNodeDb only uses the LocalRepository's path. All functions
have changed to borrow the path instead of an entire LocalRepository instance.

Additionally, since this type is only used in liboxen and it is not an
explicit public API of oxen, it and its associated functions have been
changed to the pub(crate) visibility level. This identified some dead
code, which, along with some commented-out code, has been removed.
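
As a rough illustration of the change described above, the sketch below contrasts a helper that borrows a whole repository with one that borrows only its path. `LocalRepository` here is a simplified stand-in, and the `.oxen/tree/nodes` layout, the `&str` hash, and the name `node_db_path_old` are assumptions made for the example, not the actual liboxen code.

```rust
use std::path::{Path, PathBuf};

// Simplified stand-in for liboxen's LocalRepository (assumption: only `path` matters here).
pub struct LocalRepository {
    pub path: PathBuf,
}

// Before (hypothetical): the helper borrows the whole repository to read one field.
#[allow(dead_code)]
fn node_db_path_old(repo: &LocalRepository, node_hash: &str) -> PathBuf {
    repo.path.join(".oxen").join("tree").join("nodes").join(node_hash)
}

// After (hypothetical): the helper borrows only the path and is crate-visible.
// The directory layout is assumed for illustration.
pub(crate) fn node_db_path(repo_path: &Path, node_hash: &str) -> PathBuf {
    repo_path.join(".oxen").join("tree").join("nodes").join(node_hash)
}
```

A caller that still holds a `LocalRepository` would pass `&repo.path`, so call sites change by one argument only.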

@coderabbitai
Contributor

coderabbitai Bot commented May 4, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Refactor
    • Internal node storage now relies on repository path references and has a smaller public surface, improving internal modularity and safety.
  • Bug Fixes
    • Node lookups and traversals are more resilient to missing or unreadable node data, preventing cascade failures.
  • User impact
    • No UI changes; expect improved reliability and fewer unexpected read/write errors.

Walkthrough

Refactors MerkleNodeDB to use repository filesystem paths (&Path) instead of LocalRepository objects, reduces MerkleNodeDB/Lookup visibility to crate-private, introduces an internal open routine, and updates all call sites to pass &repo.path or &repository.path accordingly.

Changes

MerkleNodeDB API Refactor & Call-site updates

| Layer / File(s) | Summary |
| --- | --- |
| Data Shape / API surface: crates/lib/src/core/db/merkle_node/merkle_node_db.rs | node_db_path now takes repo_path: &Path and visibility of MerkleNodeDB, MerkleNodeLookup, and several APIs (exists, open_read_only, open_read_write, close, add_child, map, to_node) reduced to crate-private or made non-public. |
| Core implementation: crates/lib/src/core/db/merkle_node/merkle_node_db.rs | Introduces fn open(path: PathBuf, read_only: bool, node_id: MerkleHash); open_read_only/open_read_write now call into it and set node_id from the caller; internal deserialization helper to_node made private; file/lookup handling preserved. |
| Wiring / Call sites - writes & commits: crates/lib/src/core/v_latest/commits.rs, crates/lib/src/repositories/commits/commit_writer.rs, crates/lib/src/repositories/tree.rs, crates/lib/src/command/migrate/...m20250111083535_add_child_counts_to_nodes.rs, crates/lib/src/core/v_latest/merge.rs | All open_read_write call sites updated to pass &repo.path (or &repository.path) as the first argument; node_db_path call sites updated to use &repository.path. |
| Wiring / Call sites - reads & indexes: crates/lib/src/core/v_latest/index/commit_merkle_tree.rs, crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs, crates/lib/src/model/merkle_tree/node/merkle_tree_node.rs, crates/lib/src/core/v_latest/entries.rs | exists and open_read_only call sites updated to use &repo.path; legacy reader traversal updated to guard/open child DBs with &repo.path and skip recursion on open failures where applicable. |
| Module re-export: crates/lib/src/core/db/merkle_node.rs | Top-level re-export changed from pub use to pub(crate) use, restricting MerkleNodeDB visibility to the crate. |
| Tests / Documentation: (none changed) | No tests or docs modified in this diff. |
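
The "Core implementation" row above describes two crate-visible entry points delegating to one private `open` routine. A minimal sketch of that shape follows; the struct fields, the `u128`-backed `MerkleHash`, the `nodes/<hex>` layout, and the infallible return type are all assumptions for illustration (the real routine opens files and would return a `Result`).

```rust
use std::path::{Path, PathBuf};

// Hypothetical stand-in for liboxen's MerkleHash.
#[derive(Clone, Copy, Debug, PartialEq)]
pub(crate) struct MerkleHash(pub u128);

// Trimmed-down handle; the real type also holds file handles and a lookup table.
#[derive(Debug)]
pub(crate) struct MerkleNodeDB {
    pub(crate) path: PathBuf,
    pub(crate) read_only: bool,
    pub(crate) node_id: MerkleHash,
}

// Assumed directory layout, for illustration only.
pub(crate) fn node_db_path(repo_path: &Path, hash: &MerkleHash) -> PathBuf {
    repo_path.join("nodes").join(format!("{:x}", hash.0))
}

impl MerkleNodeDB {
    // Both wrappers resolve the node path from the borrowed repo path,
    // then delegate to the shared private routine.
    pub(crate) fn open_read_only(repo_path: &Path, hash: &MerkleHash) -> MerkleNodeDB {
        Self::open(node_db_path(repo_path, hash), true, *hash)
    }

    pub(crate) fn open_read_write(repo_path: &Path, hash: &MerkleHash) -> MerkleNodeDB {
        Self::open(node_db_path(repo_path, hash), false, *hash)
    }

    // Private: callers outside this impl cannot pick an arbitrary path or mode.
    fn open(path: PathBuf, read_only: bool, node_id: MerkleHash) -> MerkleNodeDB {
        MerkleNodeDB { path, read_only, node_id }
    }
}
```

Keeping `open` private means the read-only flag and `node_id` are always set consistently by the two wrappers.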

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Oxen-AI/Oxen#470: Refactors MerkleNodeDB API and node path/signature changes that callers must adapt to.
  • Oxen-AI/Oxen#506: Modifies the same migration and touches the same MerkleNodeDB call sites.
  • Oxen-AI/Oxen#634: Also changes node DB path construction and overlaps with node path/API adjustments.

Suggested reviewers

  • CleanCut
  • rpschoenburg

"I hopped through hashes, light on my paws,
Replaced big repo objects with filesystem laws.
Quieted the DB, kept children in line,
Paths now lead home where the node files shine. 🐇"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: refactoring MerkleNodeDb to accept repository paths instead of LocalRepository objects. |
| Description check | ✅ Passed | The description clearly explains the refactoring changes to MerkleNodeDb, the visibility changes to pub(crate), and mentions removal of dead/commented code. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mg/refactor_use_path_merklenodedb

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs (1)

538-542: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not abort sibling traversal when one child DB is missing.

At Line 541, return Ok(()) exits read_children_from_node for the entire current node. If one child is missing, remaining siblings are skipped, producing incomplete trees.

💡 Suggested fix
-                        let Ok(mut node_db) = MerkleNodeDB::open_read_only(&repo.path, &child.hash)
-                        else {
-                            log::warn!("no child node db: {:?}", child.hash);
-                            return Ok(());
-                        };
+                        let Ok(mut node_db) = MerkleNodeDB::open_read_only(&repo.path, &child.hash)
+                        else {
+                            log::warn!("no child node db: {:?}", child.hash);
+                            continue;
+                        };
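
To make the difference between the two variants in this suggestion concrete, here is a small self-contained sketch. `open_child` and the integer ids are hypothetical stand-ins for `MerkleNodeDB::open_read_only` and child hashes; only the control flow mirrors the diff.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for opening a child node DB: fails when the id is missing.
fn open_child(existing: &HashSet<u32>, id: u32) -> Result<u32, String> {
    if existing.contains(&id) {
        Ok(id)
    } else {
        Err(format!("no child node db: {id}"))
    }
}

// Early return: the first missing child ends the whole loop.
fn visit_with_return(existing: &HashSet<u32>, children: &[u32]) -> Vec<u32> {
    let mut visited = Vec::new();
    for &id in children {
        let Ok(db) = open_child(existing, id) else {
            return visited;
        };
        visited.push(db);
    }
    visited
}

// `continue`: only the missing child is skipped; later siblings are still visited.
fn visit_with_continue(existing: &HashSet<u32>, children: &[u32]) -> Vec<u32> {
    let mut visited = Vec::new();
    for &id in children {
        let Ok(db) = open_child(existing, id) else {
            continue;
        };
        visited.push(db);
    }
    visited
}
```

With children `[1, 2, 3]` and child 2 missing, the first function visits only child 1, while the second visits children 1 and 3.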
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs` around lines
538 - 542, In read_children_from_node, don’t abort processing the rest of the
siblings when MerkleNodeDB::open_read_only(&repo.path, &child.hash) fails for
one child; instead change the pattern that currently does `let Ok(mut node_db) =
... else { log::warn!(...); return Ok(()); }` to either a match or if-let that
logs the error and uses `continue` to skip the missing child (so subsequent
siblings are still processed), ensuring any later code that uses node_db is only
executed in the success branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/lib/src/core/db/merkle_node/merkle_node_db.rs`:
- Around line 305-309: The open() implementation for MerkleNodeDb is creating
directories unconditionally which performs writes even when called for read-only
probes; modify open(path: PathBuf, read_only: bool, node_id: MerkleHash) so that
util::fs::create_dir_all(&path) is only invoked when read_only is false (i.e.,
skip directory creation for read-only opens), leaving missing node paths
untouched; keep the rest of open() behavior unchanged and ensure callers like
open_read_only() continue to pass read_only=true so probes do not mutate the
filesystem.

---

Outside diff comments:
In `@crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs`:
- Around line 538-542: In read_children_from_node, don’t abort processing the
rest of the siblings when MerkleNodeDB::open_read_only(&repo.path, &child.hash)
fails for one child; instead change the pattern that currently does `let Ok(mut
node_db) = ... else { log::warn!(...); return Ok(()); }` to either a match or
if-let that logs the error and uses `continue` to skip the missing child (so
subsequent siblings are still processed), ensuring any later code that uses
node_db is only executed in the success branch.
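
The inline comment on `open()` in the prompt above asks that read-only opens never create directories. A minimal sketch of that guard, using only the standard library, might look as follows. `NodeDb` and this two-argument `open` are hypothetical simplifications, and returning `NotFound` for a missing directory is an assumption standing in for whatever error the real file-opening code would produce.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

// Hypothetical, trimmed-down handle; the real MerkleNodeDB holds more state.
pub(crate) struct NodeDb {
    pub(crate) path: PathBuf,
    pub(crate) read_only: bool,
}

// Sketch of an `open` that writes to the filesystem only for writers.
pub(crate) fn open(path: PathBuf, read_only: bool) -> io::Result<NodeDb> {
    if read_only {
        // A read-only probe must not mutate the filesystem: fail if the node is absent.
        if !path.is_dir() {
            return Err(io::Error::new(io::ErrorKind::NotFound, "node db not found"));
        }
    } else {
        // Writers may create the node directory on demand.
        fs::create_dir_all(&path)?;
    }
    Ok(NodeDb { path, read_only })
}
```

Probing a missing node with `read_only = true` then leaves the directory tree untouched, which is the property the review comment is after.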
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 22b3b8dd-74da-4ddc-9974-a0338c891371

📥 Commits

Reviewing files that changed from the base of the PR and between 695f078 and 8948e3f.

📒 Files selected for processing (9)
  • crates/lib/src/command/migrate/m20250111083535_add_child_counts_to_nodes.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs
  • crates/lib/src/model/merkle_tree/node/merkle_tree_node.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
  • crates/lib/src/repositories/tree.rs

Comment thread crates/lib/src/core/db/merkle_node/merkle_node_db.rs
@malcolmgreaves malcolmgreaves force-pushed the mg/refactor_use_path_merklenodedb branch from 8948e3f to b009a7f Compare May 4, 2026 22:01
@malcolmgreaves malcolmgreaves force-pushed the mg/refactor_use_path_merklenodedb branch from b009a7f to 6736d09 Compare May 5, 2026 00:53
Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs (1)

538-542: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid aborting sibling traversal when one child DB is missing.

At Line 541, return Ok(()) exits read_children_from_node for the entire current node. A single missing child DB can silently drop the current child and all remaining siblings.

Suggested fix
-                        let Ok(mut node_db) = MerkleNodeDB::open_read_only(&repo.path, &child.hash)
-                        else {
-                            log::warn!("no child node db: {:?}", child.hash);
-                            return Ok(());
-                        };
+                        let Ok(mut node_db) = MerkleNodeDB::open_read_only(&repo.path, &child.hash)
+                        else {
+                            log::warn!("no child node db: {:?}", child.hash);
+                            node.children.push(child);
+                            continue;
+                        };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs` around lines
538 - 542, In read_children_from_node, do not return early when
MerkleNodeDB::open_read_only(&repo.path, &child.hash) fails; instead log the
missing child (you already call log::warn!("no child node db: {:?}",
child.hash)) and continue iterating siblings so a single missing child doesn’t
abort traversal. Replace the current early return that follows the
MerkleNodeDB::open_read_only failure with logic to skip this child and proceed
(keep using node_db/child.hash to locate the failing child).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs`:
- Around line 538-542: In read_children_from_node, do not return early when
MerkleNodeDB::open_read_only(&repo.path, &child.hash) fails; instead log the
missing child (you already call log::warn!("no child node db: {:?}",
child.hash)) and continue iterating siblings so a single missing child doesn’t
abort traversal. Replace the current early return that follows the
MerkleNodeDB::open_read_only failure with logic to skip this child and proceed
(keep using node_db/child.hash to locate the failing child).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e171a8bb-f9fa-4a57-b11d-ddcec91c463d

📥 Commits

Reviewing files that changed from the base of the PR and between b009a7f and 6736d09.

📒 Files selected for processing (10)
  • crates/lib/src/command/migrate/m20250111083535_add_child_counts_to_nodes.rs
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs
  • crates/lib/src/model/merkle_tree/node/merkle_tree_node.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
  • crates/lib/src/repositories/tree.rs
✅ Files skipped from review due to trivial changes (5)
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/command/migrate/m20250111083535_add_child_counts_to_nodes.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/model/merkle_tree/node/merkle_tree_node.rs
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/repositories/tree.rs

The `MerkleNodeDb` only uses the `LocalRepository`'s path. All functions
have changed to borrow the path instead of an entire `LocalRepository` instance.

Additionally, since this type is only used in `liboxen` and it is not an
explicit public API of oxen, it and its associated functions have been
changed to the `pub(crate)` visibility level.

This refactor identified some dead code, which, along with some commented-out
code, has been removed.
@malcolmgreaves malcolmgreaves force-pushed the mg/refactor_use_path_merklenodedb branch from 6736d09 to c53b23d Compare May 5, 2026 01:17
Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs (1)

538-542: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid aborting traversal when one child DB is missing.

On Line 541, return Ok(()) exits read_children_from_node for the first missing child DB, so remaining siblings are skipped and results can be partially truncated.

Suggested fix
-                        let Ok(mut node_db) = MerkleNodeDB::open_read_only(&repo.path, &child.hash)
-                        else {
-                            log::warn!("no child node db: {:?}", child.hash);
-                            return Ok(());
-                        };
-                        // log::debug!("read_children_from_node opened node_db: {:?}", child.hash);
-                        CommitMerkleTree::read_children_from_node(
-                            repo,
-                            &mut node_db,
-                            &mut child,
-                            recurse,
-                        )?;
+                        if let Ok(mut node_db) =
+                            MerkleNodeDB::open_read_only(&repo.path, &child.hash)
+                        {
+                            // log::debug!("read_children_from_node opened node_db: {:?}", child.hash);
+                            CommitMerkleTree::read_children_from_node(
+                                repo,
+                                &mut node_db,
+                                &mut child,
+                                recurse,
+                            )?;
+                        } else {
+                            log::warn!("no child node db: {:?}", child.hash);
+                        }
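
The `if let` form of the suggestion scopes recursion to the success branch. The sketch below models that on an in-memory tree: `Node`, the `HashMap` store, and this `open_read_only` are hypothetical stand-ins for the on-disk node DBs, and keeping an unreadable child as a leaf is an assumption of the sketch rather than confirmed behavior of the real function.

```rust
use std::collections::HashMap;

// Hypothetical in-memory tree node; the real nodes are loaded from on-disk DBs.
#[derive(Debug)]
pub struct Node {
    pub id: u32,
    pub children: Vec<Node>,
}

// `store` maps a node id to its child ids; a missing key models an unreadable node DB.
fn open_read_only(store: &HashMap<u32, Vec<u32>>, id: u32) -> Result<&Vec<u32>, String> {
    store.get(&id).ok_or_else(|| format!("no node db: {id}"))
}

pub fn read_children(store: &HashMap<u32, Vec<u32>>, child_ids: &[u32], node: &mut Node) {
    for &child_id in child_ids {
        let mut child = Node { id: child_id, children: Vec::new() };
        if let Ok(grandchild_ids) = open_read_only(store, child_id) {
            // Recurse only when the child's DB opened successfully.
            read_children(store, grandchild_ids, &mut child);
        } else {
            // Stand-in for log::warn!; traversal moves on to the next sibling.
            eprintln!("no child node db: {child_id}");
        }
        node.children.push(child);
    }
}
```

Because the failure branch neither returns nor propagates an error, a sibling that follows an unreadable child is still read along with its own descendants.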
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs` around lines
538 - 542, The code in read_children_from_node currently returns early when
MerkleNodeDB::open_read_only(&repo.path, &child.hash) fails, which aborts
traversal and skips remaining siblings; change the error path to log the missing
child (keep log::warn!("no child node db: {:?}", child.hash)) and continue the
loop instead of returning Ok(()), so subsequent children are processed; ensure
any variables that depend on node_db are only used when open_read_only succeeded
(i.e., scope node_db use inside the success branch) and preserve the function's
overall return behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs`:
- Around line 538-542: The code in read_children_from_node currently returns
early when MerkleNodeDB::open_read_only(&repo.path, &child.hash) fails, which
aborts traversal and skips remaining siblings; change the error path to log the
missing child (keep log::warn!("no child node db: {:?}", child.hash)) and
continue the loop instead of returning Ok(()), so subsequent children are
processed; ensure any variables that depend on node_db are only used when
open_read_only succeeded (i.e., scope node_db use inside the success branch) and
preserve the function's overall return behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ef8a89b6-f8c1-4398-8b62-ebe150454639

📥 Commits

Reviewing files that changed from the base of the PR and between 6736d09 and c53b23d.

📒 Files selected for processing (11)
  • crates/lib/src/command/migrate/m20250111083535_add_child_counts_to_nodes.rs
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/core/v_latest/commits.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/core/v_latest/merge.rs
  • crates/lib/src/core/v_old/v0_19_0/index/commit_merkle_tree.rs
  • crates/lib/src/model/merkle_tree/node/merkle_tree_node.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
  • crates/lib/src/repositories/tree.rs
✅ Files skipped from review due to trivial changes (7)
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/repositories/tree.rs
  • crates/lib/src/core/v_latest/entries.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
  • crates/lib/src/model/merkle_tree/node/merkle_tree_node.rs
  • crates/lib/src/command/migrate/m20250111083535_add_child_counts_to_nodes.rs
  • crates/lib/src/core/v_latest/commits.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs

Contributor

@CleanCut CleanCut left a comment

❤️ Excellent!

@malcolmgreaves malcolmgreaves merged commit 083cf6b into main May 5, 2026
9 checks passed
@malcolmgreaves malcolmgreaves deleted the mg/refactor_use_path_merklenodedb branch May 5, 2026 17:08

2 participants