feat(weave): dataset_sources provenance endpoints #7199
Draft
jwlee64 wants to merge 1 commit into
Four endpoints over the dataset_sources table (migration 034):

- dataset_sources_link: batch idempotent write; validates source existence (calls via calls_merged, spans via the agents module); deterministic UUIDv5 link ids over the logical key; optional include_created_status (skips the pre-insert lookup when False)
- dataset_sources_link_delete: soft delete via tombstone versions; fail-fast on unknown ids; per-id deleted flags
- dataset_sources_query: forward lookup (dataset -> sources), prefix scan + argMax collapse, HAVING on collapsed deleted_at
- source_datasets_query: reverse lookup (sources -> datasets), bloom-assisted, server-side aggregation with capped row_digests

ClickHouse impl is insert-only (no mutations); SQLite impl uses logical-key upserts and supports call sources (spans are ClickHouse-only). Nullable columns are tuple-wrapped in argMax to avoid ClickHouse NULL-skipping; aggregate args are table-qualified to avoid alias shadowing.

Tested on both backends: 27 passed (ClickHouse), 25 passed + 2 expected span skips (SQLite).
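For reviewers, here is a minimal sketch of how deterministic UUIDv5 link ids make the SQLite-style logical-key upsert idempotent. It is not the code in this PR. The namespace, the logical-key fields (project_id, dataset_digest, source_type, source_id), the table layout, and the helper names link_id, init_schema, and link are all assumptions made for illustration.

```python
import sqlite3
import uuid
from typing import Optional, Tuple

# Hypothetical namespace; the PR does not state which namespace it uses.
LINK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example:dataset_sources")


def link_id(project_id: str, dataset_digest: str, source_type: str, source_id: str) -> str:
    """Derive a stable id from the (assumed) logical key of a link."""
    # A unit separator keeps ("ab", "c") and ("a", "bc") from colliding.
    key = "\x1f".join((project_id, dataset_digest, source_type, source_id))
    return str(uuid.uuid5(LINK_NAMESPACE, key))


def init_schema(conn: sqlite3.Connection) -> None:
    """Create an illustrative table; the real migration 034 schema may differ."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dataset_sources (
            id TEXT PRIMARY KEY,
            project_id TEXT NOT NULL,
            dataset_digest TEXT NOT NULL,
            source_type TEXT NOT NULL,
            source_id TEXT NOT NULL
        )
        """
    )


def link(
    conn: sqlite3.Connection,
    project_id: str,
    dataset_digest: str,
    source_type: str,
    source_id: str,
    include_created_status: bool = True,
) -> Tuple[str, Optional[bool]]:
    """Idempotently write one link; return (id, created) where created may be None."""
    lid = link_id(project_id, dataset_digest, source_type, source_id)
    created: Optional[bool] = None
    if include_created_status:
        # Pre-insert lookup, skipped entirely when the caller does not need the flag.
        row = conn.execute("SELECT 1 FROM dataset_sources WHERE id = ?", (lid,)).fetchone()
        created = row is None
    # Same logical key -> same id, so a repeated write is a no-op.
    conn.execute(
        """
        INSERT INTO dataset_sources (id, project_id, dataset_digest, source_type, source_id)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT(id) DO NOTHING
        """,
        (lid, project_id, dataset_digest, source_type, source_id),
    )
    return lid, created
```

Because the id is a pure function of the logical key, retried or duplicated batches converge on one row without a read-modify-write cycle. The same property would let an insert-only backend collapse duplicate versions at query time.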
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=64b1bfaaa32c37baa5f8cc19140bc70618afc88d
