Skip to content

[Spark] Migrate DeltaSinkSuite from path-based to name-based access#7081

Open
PorridgeSwim wants to merge 1 commit into
delta-io:masterfrom
PorridgeSwim:nameBasedDeltaSinkSuite
Open

[Spark] Migrate DeltaSinkSuite from path-based to name-based access#7081
PorridgeSwim wants to merge 1 commit into
delta-io:masterfrom
PorridgeSwim:nameBasedDeltaSinkSuite

Conversation

@PorridgeSwim

@PorridgeSwim PorridgeSwim commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Test-only refactor that migrates DeltaSinkSuite from path-based to name-based table access:

  • writes go through writeStream...toTable(name) instead of ...start(path)
  • reads / log lookups go through spark.read.table(name) / DeltaLog.forTable(spark, TableIdentifier(name)) instead of ...load(path) / forTable(spark, path)

A new withSinkTarget helper runs each migrated test against a uniquely-named managed table.

Motivation (same as #7058 for the type-widening tests):

  • Name-based access is the more common, idiomatic way to address Delta tables.
  • It unblocks running these tests against the DSv2 (Kernel) connector, which has no path-based streaming write.

A few tests are intentionally left path-based because their behavior is intrinsically path-specific — e.g. path not specified (asserts start() with no path errors) and incompatible schema merging ... (relies on save(path)'s schema-merge-and-reject; catalog writes resolve-and-cast instead, so the merge conflict never occurs).

How was this patch tested?

Test-only change. The existing DeltaSinkSuite tests continue to pass with name-based access.

Does this PR introduce any user-facing changes?

No.

@PorridgeSwim PorridgeSwim self-assigned this Jun 23, 2026

@xzhseh xzhseh left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the effort!

.trigger(org.apache.spark.sql.streaming.Trigger.Once)
.option("checkpointLocation", chkDir)
.start(tableDir)
withTable("test_delta_sink") {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this test use withTable and not withSinkTarget like the rest?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

@PorridgeSwim PorridgeSwim force-pushed the nameBasedDeltaSinkSuite branch from bb9fff5 to 5acab6c Compare June 24, 2026 18:06
@PorridgeSwim PorridgeSwim requested a review from BrooksWalls June 24, 2026 18:10
/**
* Run a sink test against a name-based (catalog) target table.
*/
protected def withSinkTarget(f: (String, File) => Unit): Unit = {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned we are losing test coverage for path-based tables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants