[Spark] Run DeltaSinkSuite against both path-based and name-based access by PorridgeSwim · Pull Request #7081 · delta-io/delta

PorridgeSwim · 2026-06-23T23:52:40Z

Which Delta project/connector is this regarding?

Description

Test-only change that makes DeltaSinkSuite run against both path-based and name-based table access, rather than migrating it wholesale to name-based.

An earlier revision of this PR replaced path-based access with name-based access, which dropped the original path-based coverage. This revision parameterizes the suite instead:

- A useDsv2 flag selects how the target is addressed:
  - useDsv2 = false (default), path-based: writeStream...start(path), spark.read.format("delta").load(path), DeltaLog.forTable(spark, path). This preserves the suite's original coverage.
  - useDsv2 = true, name-based: writeStream...toTable(name), spark.read.table(name), DeltaLog.forTable(spark, TableIdentifier(name)). This exercises the name-based write seam, which is the only path available to the DSv2 (Kernel) sink; toTable rejects path identifiers.
- A withSinkTarget helper plus a few small companion helpers (startStream, readTarget, deltaLogForTarget, deltaTableForTarget) interpret the target identifier per mode, so each test body is written once and runs both ways.
- DeltaSinkSuite and its existing CatalogManagedBatch1/2 subclasses keep useDsv2 = false; a new DeltaSinkNameBasedSuite overrides it to true so both modes actually run in CI.
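As a rough illustration of the parameterization described above, the helpers might look something like the sketch below. This is not the code from the PR: the trait names, the meaning of the File argument (assumed here to be a checkpoint directory), the temp-directory handling, and the cleanup via DROP TABLE are all assumptions made for the example. It needs Spark on the classpath, and the path-based branch assumes the caller has already set format("delta") on the writer.

```scala
import java.io.File
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.streaming.{DataStreamWriter, StreamingQuery}

// Hypothetical trait; the real helpers live in DeltaSinkSuite and may differ.
trait SinkTargetHelpers {
  protected def spark: SparkSession

  // false = path-based (default), true = name-based.
  protected def useDsv2: Boolean = false

  // Assumed contract: `f` receives the target identifier (a path or a table
  // name, depending on the mode) and a checkpoint directory.
  // Temp directories are not deleted here; a real suite would clean them up.
  protected def withSinkTarget(f: (String, File) => Unit): Unit = {
    val checkpointDir = Files.createTempDirectory("checkpoint").toFile
    if (useDsv2) {
      val tableName = "test_delta_sink"
      try f(tableName, checkpointDir)
      finally spark.sql(s"DROP TABLE IF EXISTS $tableName")
    } else {
      val tableDir = Files.createTempDirectory("delta_sink").toFile
      f(tableDir.getCanonicalPath, checkpointDir)
    }
  }

  // Name-based mode goes through toTable; path-based mode through start(path).
  protected def startStream(writer: DataStreamWriter[Row], target: String): StreamingQuery =
    if (useDsv2) writer.toTable(target) else writer.start(target)

  // Reads the target back using the same addressing mode it was written with.
  protected def readTarget(target: String): DataFrame =
    if (useDsv2) spark.read.table(target)
    else spark.read.format("delta").load(target)
}

// Flipping the flag in a subclass reruns every inherited test name-based.
trait NameBasedSinkTargetHelpers extends SinkTargetHelpers {
  override protected def useDsv2: Boolean = true
}
```

A test body would then call withSinkTarget { (target, chkDir) => ... } and use startStream and readTarget inside it, without knowing which mode is active.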

Two tests genuinely behave differently between the two access paths and branch internally on useDsv2:

- "throw exception ... write in batch with different partitioning": path-based throws AnalysisException ("Partition columns do not match"); name-based throws IllegalArgumentException ("The provided partitioning or clustering columns do not match the existing table's").
- "can't write out with all columns being partition columns": path-based starts the stream and then surfaces a StreamingQueryException; name-based rejects the all-columns partitioning up front at toTable with an AnalysisException.
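The branching for the first of these tests can be modeled with a small Spark-free sketch. Everything here is illustrative: the AnalysisException class is a local stand-in for org.apache.spark.sql.AnalysisException, mismatchedBatchWrite only simulates the failing write, and failsWith is a hypothetical helper (the real suite would more likely use ScalaTest's intercept). Only the exception types and messages come from the description above.

```scala
import scala.reflect.ClassTag
import scala.util.control.NonFatal

object PartitionMismatchSketch {
  // Stand-in for org.apache.spark.sql.AnalysisException so the sketch runs without Spark.
  final class AnalysisException(message: String) extends Exception(message)

  // Runs `body`; true only if it throws an E whose message contains `fragment`.
  def failsWith[E <: Throwable](fragment: String)(body: => Unit)(
      implicit tag: ClassTag[E]): Boolean =
    try { body; false }
    catch {
      case NonFatal(e) if tag.runtimeClass.isInstance(e) =>
        Option(e.getMessage).exists(_.contains(fragment))
      case NonFatal(_) => false
    }

  // Simulates the batch write with mismatched partitioning in each mode.
  def mismatchedBatchWrite(useDsv2: Boolean): Unit =
    if (useDsv2) {
      throw new IllegalArgumentException(
        "The provided partitioning or clustering columns do not match the existing table's")
    } else {
      throw new AnalysisException("Partition columns do not match")
    }

  // The test body branches on the flag to pick the expected exception type and message.
  def checkPartitionMismatch(useDsv2: Boolean): Boolean =
    if (useDsv2) {
      failsWith[IllegalArgumentException]("do not match the existing table's") {
        mismatchedBatchWrite(useDsv2)
      }
    } else {
      failsWith[AnalysisException]("Partition columns do not match") {
        mismatchedBatchWrite(useDsv2)
      }
    }
}
```

The second test differs in where the failure surfaces rather than only in its type, so its name-based branch would wrap the toTable call itself while the path-based branch would wait on the started query.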

A few tests remain intrinsically path-specific (e.g. path not specified, incompatible schema merging ...) and are unchanged.

Motivation (same as #7058 for the type-widening tests): unblock running these tests against the DSv2 (Kernel) connector, which has no path-based streaming write, while retaining the existing path-based coverage.

How was this patch tested?

Test-only change. Both DeltaSinkSuite (path-based) and DeltaSinkNameBasedSuite (name-based) pass.

Does this PR introduce any user-facing changes?

No.

xzhseh

Thanks for the effort!

BrooksWalls · 2026-06-24T17:01:39Z

-        .trigger(org.apache.spark.sql.streaming.Trigger.Once)
-        .option("checkpointLocation", chkDir)
-        .start(tableDir)
+      withTable("test_delta_sink") {


Why does this test use withTable and not withSinkTarget like the rest?

zikangh · 2026-06-24T18:45:32Z

+  /**
+   * Run a sink test against a name-based (catalog) target table.
+   */
+  protected def withSinkTarget(f: (String, File) => Unit): Unit = {


I'm concerned we are losing test coverage for path-based tables.

preserved the path-based tests

Add a useDsv2 flag so the suite keeps its original path-based coverage (useDsv2 = false) while also exercising the name-based write seam that the DSv2 sink requires (useDsv2 = true, via DeltaSinkNameBasedSuite). Co-authored-by: Isaac

PorridgeSwim self-assigned this Jun 23, 2026

xzhseh approved these changes Jun 24, 2026

murali-db approved these changes Jun 24, 2026

BrooksWalls reviewed Jun 24, 2026

Migrate DeltaSinkSuite from path-based to name-based access

5acab6c

PorridgeSwim force-pushed the nameBasedDeltaSinkSuite branch from bb9fff5 to 5acab6c Compare June 24, 2026 18:06

PorridgeSwim requested a review from BrooksWalls June 24, 2026 18:10

BrooksWalls approved these changes Jun 24, 2026

zikangh reviewed Jun 24, 2026

PorridgeSwim changed the title ~~[Spark] Migrate DeltaSinkSuite from path-based to name-based access~~ [Spark] Run DeltaSinkSuite against both path-based and name-based access Jun 26, 2026

PorridgeSwim wants to merge 2 commits into delta-io:master from PorridgeSwim:nameBasedDeltaSinkSuite

5 participants
