[Spark] Run DeltaSinkSuite against both path-based and name-based access #7081
Open
PorridgeSwim wants to merge 2 commits into
murali-db
approved these changes
Jun 24, 2026
BrooksWalls
reviewed
Jun 24, 2026
.trigger(org.apache.spark.sql.streaming.Trigger.Once)
.option("checkpointLocation", chkDir)
.start(tableDir)
withTable("test_delta_sink") {
Collaborator
Why does this test use withTable and not withSinkTarget like the rest?
bb9fff5 to 5acab6c
BrooksWalls
approved these changes
Jun 24, 2026
zikangh
reviewed
Jun 24, 2026
/**
 * Run a sink test against a name-based (catalog) target table.
 */
protected def withSinkTarget(f: (String, File) => Unit): Unit = {
Collaborator
I'm concerned we are losing test coverage for path-based tables.
Collaborator
Author
Preserved the path-based tests.
Add a useDsv2 flag so the suite keeps its original path-based coverage (useDsv2 = false) while also exercising the name-based write seam that the DSv2 sink requires (useDsv2 = true, via DeltaSinkNameBasedSuite).
Co-authored-by: Isaac
Which Delta project/connector is this regarding?
Description
Test-only change that makes `DeltaSinkSuite` run against both path-based and name-based table access, rather than migrating it wholesale to name-based.

An earlier revision of this PR replaced path-based access with name-based access, which dropped the original path-based coverage. This revision parameterizes the suite instead:

- A `useDsv2` flag selects how the target is addressed:
  - `useDsv2 = false` (default), path-based: `writeStream...start(path)`, `spark.read.format("delta").load(path)`, `DeltaLog.forTable(spark, path)`. This preserves the suite's original coverage.
  - `useDsv2 = true`, name-based: `writeStream...toTable(name)`, `spark.read.table(name)`, `DeltaLog.forTable(spark, TableIdentifier(name))`. This exercises the name-based write seam, which is the only path available to the DSv2 (Kernel) sink; `toTable` rejects path identifiers.
- A `withSinkTarget` helper plus a few small companion helpers (`startStream`, `readTarget`, `deltaLogForTarget`, `deltaTableForTarget`) interpret the `target` identifier per mode, so each test body is written once and runs both ways.
- `DeltaSinkSuite` (and its existing `CatalogManagedBatch1/2` subclasses) keep `useDsv2 = false`; a new `DeltaSinkNameBasedSuite` overrides it to `true` so both modes actually run in CI.

Two tests genuinely behave differently between the two access paths and branch internally on `useDsv2`:

- Path-based throws `AnalysisException("Partition columns do not match")`; name-based throws `IllegalArgumentException("The provided partitioning or clustering columns do not match the existing table's")`.
- Path-based: `StreamingQueryException`; name-based rejects the all-columns partitioning up front at `toTable` with an `AnalysisException`.

A few tests remain intrinsically path-specific (e.g. `path not specified`, `incompatible schema merging ...`) and are unchanged.

Motivation (same as #7058 for the type-widening tests): unblock running these tests against the DSv2 (Kernel) connector, which has no path-based streaming write, while retaining the existing path-based coverage.
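The dual-mode dispatch described above could look roughly like the sketch below. It is an illustration rather than the code in this PR: the helper names and the `(String, File)` callback shape follow the description and the review diff, but the trait names `SinkTargetHelpers` and `NameBasedMode`, the helper bodies, the temp-directory handling, and the public visibility of `useDsv2` are all assumptions. It uses only public Spark APIs (`DataStreamWriter.start(path)`, `DataStreamWriter.toTable(name)`, `DataFrameReader.load(path)`, `DataFrameReader.table(name)`) and leaves out the `DeltaLog`/`DeltaTable` helpers.

```scala
import java.io.File
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical stand-in for the suite's mode-aware helpers. Names follow the
// PR description; bodies and signatures are assumptions, not the PR's code.
trait SinkTargetHelpers {
  def spark: SparkSession

  // false = path-based (default), true = name-based.
  def useDsv2: Boolean = false

  // Hands the test body a target (a directory path or a table name, depending
  // on the mode) plus a checkpoint directory. Temp directories are not
  // cleaned up in this sketch.
  def withSinkTarget(f: (String, File) => Unit): Unit = {
    val checkpointDir = Files.createTempDirectory("chk").toFile
    if (useDsv2) {
      val name = "test_delta_sink"
      try f(name, checkpointDir)
      finally spark.sql(s"DROP TABLE IF EXISTS $name")
    } else {
      val tableDir = Files.createTempDirectory("delta").toFile
      f(tableDir.getCanonicalPath, checkpointDir)
    }
  }

  // Starts the streaming write via toTable(name) or start(path).
  def startStream(df: DataFrame, target: String, checkpointDir: File): StreamingQuery = {
    val writer = df.writeStream
      .format("delta")
      .option("checkpointLocation", checkpointDir.getCanonicalPath)
    if (useDsv2) writer.toTable(target) else writer.start(target)
  }

  // Reads the sink back through the same access mode it was written with.
  def readTarget(target: String): DataFrame =
    if (useDsv2) spark.read.table(target)
    else spark.read.format("delta").load(target)
}

// Assumed shape of how a name-based subclass flips the mode.
trait NameBasedMode extends SinkTargetHelpers {
  override def useDsv2: Boolean = true
}
```

With helpers of this shape, a test body would call `withSinkTarget { (target, chkDir) => ... }` and go through `startStream` and `readTarget` only, so the same body runs unchanged in either mode.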
How was this patch tested?
Test-only change. Both `DeltaSinkSuite` (path-based) and `DeltaSinkNameBasedSuite` (name-based) pass.
Does this PR introduce any user-facing changes?
No.