Problem
Three functions independently call doc.diff(before, after) and walk the same patches:
| Function | Crate | Calls doc.diff()? | Produces |
|---|---|---|---|
| diff_cells() | notebook-doc | Yes | CellChangeset |
| diff_metadata_touched() | notebook-doc | Yes | bool |
| compute_text_attributions() | runtimed-wasm (local fn) | Yes | Vec<TextAttribution> |
The WASM receive_frame() path calls both diff_cells and compute_text_attributions, so it runs doc.diff() twice over identical patches. The daemon calls diff_metadata_touched separately.
Additionally, the WASM path invalidates metadata_fingerprint_cache on every doc change (including cell source edits and output streaming), forcing re-serialization ~30/sec during execution. See the TODO at runtimed-wasm/src/lib.rs:1440.
Proposal
One function in notebook-doc that diffs once, walks patches once, classifies everything:
    pub struct DocChangeset {
        pub cells: CellChangeset,
        pub metadata_changed: bool,
        pub text_patches: Vec<TextPatch>, // raw splice/delete per cell source
    }

    pub fn diff_doc(
        doc: &mut AutoCommit,
        before: &[ChangeHash],
        after: &[ChangeHash],
    ) -> DocChangeset
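As a rough illustration of "diff once, walk once, classify everything", the sketch below runs a single loop over a simplified patch list. It does not use the Automerge API: SimplePatch, PatchKind, classify_patches and the path layout (cells/<id>/source, metadata/...) are all hypothetical stand-ins, and changed_cells stands in for CellChangeset.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for the kind of change a patch carries.
#[derive(Debug, Clone, PartialEq)]
pub enum PatchKind {
    /// Text inserted at `index` in a text object.
    Splice { index: usize, text: String },
    /// `count` characters removed starting at `index`.
    Delete { index: usize, count: usize },
    /// Any other change (map put, list insert, ...).
    Other,
}

/// Hypothetical patch: a path of keys from the document root plus a kind.
#[derive(Debug, Clone, PartialEq)]
pub struct SimplePatch {
    pub path: Vec<String>,
    pub kind: PatchKind,
}

#[derive(Debug, Clone, PartialEq)]
pub struct TextPatch {
    pub cell_id: String,
    pub kind: PatchKind,
}

/// Simplified changeset; `changed_cells` stands in for CellChangeset.
#[derive(Debug, Default, PartialEq)]
pub struct DocChangeset {
    pub changed_cells: BTreeSet<String>,
    pub metadata_changed: bool,
    pub text_patches: Vec<TextPatch>,
}

/// Walk the patches exactly once and fill every category in the same pass.
pub fn classify_patches(patches: &[SimplePatch]) -> DocChangeset {
    let mut out = DocChangeset::default();
    for patch in patches {
        let path: Vec<&str> = patch.path.iter().map(String::as_str).collect();
        match path.as_slice() {
            // Assumed layout: anything under "metadata" is a metadata change.
            ["metadata", ..] => out.metadata_changed = true,
            // Assumed layout: cells/<cell_id>/<field>/...
            ["cells", cell_id, rest @ ..] => {
                out.changed_cells.insert(cell_id.to_string());
                let is_text_edit = matches!(
                    patch.kind,
                    PatchKind::Splice { .. } | PatchKind::Delete { .. }
                );
                if rest.first() == Some(&"source") && is_text_edit {
                    out.text_patches.push(TextPatch {
                        cell_id: cell_id.to_string(),
                        kind: patch.kind.clone(),
                    });
                }
            }
            // Patches outside the assumed layout are ignored in this sketch.
            _ => {}
        }
    }
    out
}
```

Adding a new patch category in this shape means one more field on the changeset and one more match arm, with no extra diff or second walk.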
text_patches
Captures the raw splice/delete operations on cell source text — the same data compute_text_attributions currently extracts, but without actor labels (those come from extract_change_actors via get_changes, a separate query). The WASM consumer combines text patches + actors into TextAttribution.
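A minimal sketch of the combiner on the WASM side, assuming the structured TextPatch shape (cell_id + index + text + deleted) and assuming every patch in a frame is attributed to the full set of actors for that frame. Both the field names and that attribution rule are assumptions for illustration; the real TextAttribution and build_attributions may differ.

```rust
/// Hypothetical structured text patch, as it might come out of diff_doc().
#[derive(Debug, Clone, PartialEq)]
pub struct TextPatch {
    pub cell_id: String,
    pub index: usize,
    pub inserted: String,
    pub deleted: usize,
}

/// Hypothetical attribution record: a text patch plus actor labels.
#[derive(Debug, Clone, PartialEq)]
pub struct TextAttribution {
    pub cell_id: String,
    pub index: usize,
    pub inserted: String,
    pub deleted: usize,
    pub actors: Vec<String>,
}

/// Combine patches (from the diff) with actors (from a separate query).
/// Assumption: each patch is labeled with all actors seen in the frame.
pub fn build_attributions(patches: &[TextPatch], actors: &[String]) -> Vec<TextAttribution> {
    patches
        .iter()
        .map(|p| TextAttribution {
            cell_id: p.cell_id.clone(),
            index: p.index,
            inserted: p.inserted.clone(),
            deleted: p.deleted,
            actors: actors.to_vec(),
        })
        .collect()
}
```

The combiner does no diffing and no patch walking of its own, which is what lets it stay in runtimed-wasm while the walk lives in notebook-doc.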
Consumer changes
WASM (runtimed-wasm):
    let changeset = diff_doc(doc, &before, &after);
    let actors = extract_change_actors(doc, &before);
    let attributions = build_attributions(&changeset.text_patches, &actors);

    if changeset.metadata_changed {
        self.metadata_fingerprint_cache = None; // fixes the TODO
    }
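To make the cache behaviour concrete, here is a small self-contained sketch of invalidation gated on metadata_changed. FingerprintCache, its recomputes counter and the hash-based fingerprint are hypothetical; the actual metadata_fingerprint_cache in runtimed-wasm may store something different.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the WASM handle's metadata fingerprint cache.
pub struct FingerprintCache {
    cached: Option<String>,
    /// Counts how often the fingerprint had to be recomputed.
    pub recomputes: usize,
}

impl FingerprintCache {
    pub fn new() -> Self {
        FingerprintCache { cached: None, recomputes: 0 }
    }

    /// Called after every sync frame; drops the cache only if metadata changed.
    pub fn on_doc_change(&mut self, metadata_changed: bool) {
        if metadata_changed {
            self.cached = None;
        }
    }

    /// Return the cached fingerprint, recomputing it only after invalidation.
    pub fn fingerprint(&mut self, metadata_json: &str) -> &str {
        if self.cached.is_none() {
            self.recomputes += 1;
            // Placeholder fingerprint: a hash of the serialized metadata.
            let mut hasher = DefaultHasher::new();
            metadata_json.hash(&mut hasher);
            self.cached = Some(format!("{:016x}", hasher.finish()));
        }
        self.cached.as_deref().unwrap()
    }
}
```

With this gating, frames that only touch cell source or outputs leave the cached value in place, so the re-serialization cost is paid only on real metadata edits.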
Daemon (runtimed):
    let changeset = diff_doc(doc, &before, &after);

    if changeset.metadata_changed {
        check_and_broadcast_sync_state(room).await;
    }
What moves where
- Patch-walking guts of compute_text_attributions move into diff_doc() in notebook-doc
- compute_text_attributions stays in runtimed-wasm as a thin combiner (text patches + actors → TextAttribution)
- diff_cells() becomes a convenience wrapper: diff_doc(...).cells
- diff_metadata_touched() becomes: diff_doc(...).metadata_changed
- extract_change_actors() stays as-is (uses get_changes, not diff)
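The wrapper relationship in the list above could look roughly like the following. StubDoc is a hypothetical stand-in for AutoCommit that only counts diff calls, u64 stands in for ChangeHash, and the CellChangeset fields are invented for the example.

```rust
/// Hypothetical stand-in for CellChangeset.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CellChangeset {
    pub changed: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DocChangeset {
    pub cells: CellChangeset,
    pub metadata_changed: bool,
}

/// Hypothetical stand-in for AutoCommit: canned results plus a diff counter.
pub struct StubDoc {
    pub diff_calls: usize,
    pub touched_cells: Vec<String>,
    pub metadata_touched: bool,
}

/// Single entry point: one diff, one walk, every category filled.
pub fn diff_doc(doc: &mut StubDoc, _before: &[u64], _after: &[u64]) -> DocChangeset {
    doc.diff_calls += 1;
    DocChangeset {
        cells: CellChangeset { changed: doc.touched_cells.clone() },
        metadata_changed: doc.metadata_touched,
    }
}

/// Convenience wrapper kept for callers that only need cell changes.
pub fn diff_cells(doc: &mut StubDoc, before: &[u64], after: &[u64]) -> CellChangeset {
    diff_doc(doc, before, after).cells
}

/// Convenience wrapper kept for callers that only need the metadata flag.
pub fn diff_metadata_touched(doc: &mut StubDoc, before: &[u64], after: &[u64]) -> bool {
    diff_doc(doc, before, after).metadata_changed
}
```

Each wrapper still costs one full diff, so a caller that needs more than one field should call diff_doc() directly and read the fields off the result.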
Benefits
- WASM: one doc.diff() per sync frame instead of two
- WASM: metadata fingerprint cache only invalidated when metadata actually changed
- Daemon: one doc.diff() instead of separate metadata check
- Extension point: adding new patch categories is just a new field on DocChangeset
Design question
Should text_patches surface raw Automerge patch data or a more structured type (cell_id + index + text + deleted)? The structured approach is cleaner for consumers but couples notebook-doc to the text attribution concept. Raw patches keep notebook-doc focused on classification only.
Context
- diff_cells and diff_metadata_touched added in refactor(runtimed): patch-based metadata detection and typed EnvKind #1663
- compute_text_attributions is in runtimed-wasm/src/lib.rs:1647
- The cache-invalidation TODO is at runtimed-wasm/src/lib.rs:1440
- extract_change_actors is in notebook-doc/src/diff.rs:328 (uses get_changes, not diff)