
[Spark] Support CHANGES (Auto-CDF) reads in AUTO v2 enable mode #7031

Draft
gengliangwang wants to merge 2 commits into delta-io:master from gengliangwang:cdcAutoMode


@gengliangwang

Collaborator

Description

A SELECT ... CHANGES FROM VERSION ... TO VERSION ... query routes through TableCatalog.loadChangelog -> ChangelogSupport.loadChangelog (spark-4.2 shim). That path resolves the table via loadTable, which in AUTO mode returns the V1 connector (DeltaTableV2), since DeltaV2Mode.shouldCatalogReturnV2Tables() is only true for STRICT. Auto-CDF is implemented only by the V2 connector, so the read failed with:

[DELTA_CHANGELOG_REQUIRES_V2_TABLE] Auto-CDF reads on `default.orders` require the V2 Delta
connector, but the catalog resolved the table to `...DeltaTableV2`.

As a result, CHANGES queries only worked when the session forced STRICT mode.

Changes

  • DeltaV2Mode.shouldRouteChangelogToV2() — new predicate, true for AUTO and STRICT, false for NONE. Intentionally distinct from shouldCatalogReturnV2Tables(): AUTO keeps general batch reads/writes on the V1 connector (full feature support) and only opts into V2 for V2-supported operations like CHANGES.
  • ChangelogSupport.loadChangelog (spark-4.2) — when loadTable returns a V1 DeltaTableV2 but the mode routes CHANGES to V2, re-resolve the same table as a DeltaV2Table (via catalog table or path). NONE mode still throws the existing error.
  • delta-error-classes.json / DeltaErrors.scala — updated the error message and docs to reflect that AUTO (now) or STRICT enable V2 CHANGES reads.
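
The routing rule described in the list above can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the code in this PR: `Mode`, `Connector`, `loadTable`, and `resolveForChanges` are hypothetical stand-ins, and only the two predicate names and the error class name are taken from the description.

```java
public class ChangelogRoutingSketch {
    /** Simplified stand-in for DeltaV2Mode (assumption: the real class reads a session conf). */
    enum Mode {
        NONE, AUTO, STRICT;

        /** Per the description, only STRICT makes the catalog return V2 tables in general. */
        boolean shouldCatalogReturnV2Tables() {
            return this == STRICT;
        }

        /** New predicate from this PR: true for AUTO and STRICT, false for NONE. */
        boolean shouldRouteChangelogToV2() {
            return this == AUTO || this == STRICT;
        }
    }

    /** Hypothetical labels for the two connectors (DeltaTableV2 is V1, DeltaV2Table is V2). */
    enum Connector { V1, V2 }

    /** Models what loadTable hands back for general reads and writes. */
    static Connector loadTable(Mode mode) {
        return mode.shouldCatalogReturnV2Tables() ? Connector.V2 : Connector.V1;
    }

    /** Decides which connector serves a CHANGES read, given what loadTable returned. */
    static Connector resolveForChanges(Mode mode, Connector loaded) {
        if (loaded == Connector.V2) {
            return Connector.V2; // already a V2 table, nothing to do
        }
        if (mode.shouldRouteChangelogToV2()) {
            return Connector.V2; // stands in for re-resolving the same table as V2
        }
        // NONE keeps the existing failure.
        throw new IllegalStateException("DELTA_CHANGELOG_REQUIRES_V2_TABLE");
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            try {
                System.out.println(mode + " -> " + resolveForChanges(mode, loadTable(mode)));
            } catch (IllegalStateException e) {
                System.out.println(mode + " -> rejected: " + e.getMessage());
            }
        }
    }
}
```

In this model AUTO still loads the V1 connector for general operations and switches to V2 only inside `resolveForChanges`, which is the distinction between the two predicates.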

Testing

Added to DeltaChangelogCatalogIntegrationTest:

  • testAutoModeRoutesChangesToV2: CHANGES FROM VERSION 1 TO VERSION 3 succeeds under AUTO without forcing STRICT.
  • testNoneModeRejectsChanges: NONE is still rejected with DELTA_CHANGELOG_REQUIRES_V2_TABLE.

Verified with -DsparkVersion=4.2 (the Auto-CDF code only exists in the spark-4.2 shim, per SPARK-56685): Java + Scala + scalastyle + Checkstyle clean, and the full DeltaChangelogCatalogIntegrationTest passes 19/19 (including pre-existing STRICT-mode tests).

This pull request and its description were written by Isaac.

A `SELECT ... CHANGES FROM VERSION ... TO VERSION ...` query routes
through `TableCatalog.loadChangelog` -> `ChangelogSupport.loadChangelog`.
That path resolved the table via `loadTable`, which in AUTO mode returns
the V1 connector (`DeltaTableV2`). Auto-CDF is only implemented by the V2
connector, so the read failed with `DELTA_CHANGELOG_REQUIRES_V2_TABLE`
and only worked when the session forced STRICT mode.

This adds `DeltaV2Mode.shouldRouteChangelogToV2()` (true for AUTO and
STRICT) and updates `ChangelogSupport.loadChangelog` to re-resolve a V1
`DeltaTableV2` as a `DeltaV2Table` for the CHANGES read when the mode
permits. AUTO keeps general batch reads/writes on the V1 connector and
only opts into V2 for the V2-supported CHANGES operation; NONE still
rejects with the existing error.

Tests: added `testAutoModeRoutesChangesToV2` (CHANGES succeeds under
AUTO without STRICT) and `testNoneModeRejectsChanges` (NONE still
rejected) to `DeltaChangelogCatalogIntegrationTest`.

Co-authored-by: Isaac
…logTable

Extract the V1 DeltaTableV2 -> V2 DeltaV2Table re-resolution out of the
ChangelogSupport trait and into DeltaCatalog, where all V1/V2 connector
construction already lives. The trait now delegates via the abstract
asV2ChangelogTable method instead of reaching into DeltaTableV2 internals
(catalogTable/path), keeping the changelog trait thin and
connector-construction-agnostic.

The DeltaCatalog method is intentionally not annotated @Override: it
satisfies the abstract declaration only in the Spark 4.2 ChangelogSupport
shim; the 4.0/4.1 shims use an empty trait where it is an unused helper.

Co-authored-by: Isaac
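
The delegation described in this commit message can be illustrated with a rough Java analogue. The actual code is a Scala trait mixed into DeltaCatalog; everything here except the method names `loadChangelog` and `asV2ChangelogTable` is a hypothetical stand-in (`V1Table`, `V2Table`, `Catalog`, and the `routeToV2` flag are assumptions made for the sketch).

```java
public class ChangelogDelegationSketch {
    /** Hypothetical stand-in for the V1 table class. */
    static final class V1Table {
        final String name;
        V1Table(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the V2 table class. */
    static final class V2Table {
        final String name;
        V2Table(String name) { this.name = name; }
    }

    /** Thin changelog interface: it decides when to convert, not how. */
    interface ChangelogSupport {
        /** Abstract hook; connector construction is left to the implementer. */
        V2Table asV2ChangelogTable(V1Table v1);

        default V2Table loadChangelog(V1Table loaded, boolean routeToV2) {
            if (!routeToV2) {
                throw new IllegalStateException("DELTA_CHANGELOG_REQUIRES_V2_TABLE");
            }
            return asV2ChangelogTable(loaded);
        }
    }

    /** The catalog owns construction of both connectors. */
    static final class Catalog implements ChangelogSupport {
        // No @Override annotation here: the method then also compiles against an
        // interface that does not declare it, where it is just an unused helper.
        public V2Table asV2ChangelogTable(V1Table v1) {
            return new V2Table(v1.name);
        }
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        V2Table table = catalog.loadChangelog(new V1Table("default.orders"), true);
        System.out.println("resolved V2 table for " + table.name);
    }
}
```

The interface never touches the internals of `V1Table`; only `Catalog` knows how to build a `V2Table` from it.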
@gengliangwang gengliangwang marked this pull request as draft June 14, 2026 00:34
@gengliangwang gengliangwang requested a review from johanl-db June 15, 2026 18:58