[Spark] support identity creation via sparksql by mwc360 · Pull Request #7062 · delta-io/delta

mwc360 · 2026-06-19T19:19:17Z

Resolves #7061

Which Delta project/connector is this regarding?

Spark

Description

Adds support for the Spark 4.0 SQL DDL identity-column syntax in Delta, e.g.:

CREATE TABLE t (
   id1 BIGINT GENERATED ALWAYS AS IDENTITY,
   id2 BIGINT GENERATED ALWAYS AS IDENTITY (START WITH -1 INCREMENT BY 1),
   id3 BIGINT GENERATED BY DEFAULT AS IDENTITY,
   id4 BIGINT GENERATED BY DEFAULT AS IDENTITY (START WITH -1 INCREMENT BY 1)
)

Previously, the syntax parsed, but Spark refused to dispatch the statement to the catalog because the required capability was not advertised. Even if Spark had dispatched it, the default CatalogV2Util.v2ColumnsToStructType conversion dropped Spark's V2 Column.identityColumnSpec() before it reached Delta's create path, so the resulting tables had no identity metadata.

How it works

Spark 4.0 stores identity info on the V2 Column object (Column.identityColumnSpec()), not in StructField metadata. The default fallback in StagingTableCatalog converts Column[] → StructType via CatalogV2Util.v2ColumnsToStructType, which drops the IdentityColumnSpec. This PR overrides the Column[] overloads in AbstractDeltaCatalog and uses a Delta-aware converter, so the identity info is written into Delta's metadata keys before the schema reaches createDeltaTable. From there, the existing identity codepath (ColumnWithDefaultExprUtils.isIdentityColumn, IdentityColumnsTableFeature auto-enable, write-time generation, admission checks) takes over unchanged.
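The metadata loss described above can be modeled outside Spark. The following is a minimal sketch in Python, not Delta's Scala implementation: the `IdentitySpec`, `Column`, and `StructField` classes and both converter functions are hypothetical stand-ins for Spark's `IdentityColumnSpec`, V2 `Column`, `StructField`, and the two conversion paths. The `delta.identity.*` key names follow the identity-column section of the Delta protocol; the exact mapping the PR's converter performs, and the default of start 1 and step 1, are assumptions here.

```python
from dataclasses import dataclass, field
from typing import Optional

# Key names as listed in the Delta protocol for identity columns.
START = "delta.identity.start"
STEP = "delta.identity.step"
ALLOW_EXPLICIT = "delta.identity.allowExplicitInsert"


@dataclass(frozen=True)
class IdentitySpec:
    """Hypothetical stand-in for Spark's IdentityColumnSpec."""
    start: int = 1                       # assumed default
    step: int = 1                        # assumed default
    allow_explicit_insert: bool = False  # True models GENERATED BY DEFAULT


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a Spark V2 Column."""
    name: str
    data_type: str
    identity: Optional[IdentitySpec] = None


@dataclass
class StructField:
    """Hypothetical stand-in for a StructField with a metadata map."""
    name: str
    data_type: str
    metadata: dict = field(default_factory=dict)


def lossy_convert(columns):
    """Models the default conversion: the identity spec is not copied."""
    return [StructField(c.name, c.data_type) for c in columns]


def identity_aware_convert(columns):
    """Models a Delta-aware conversion: the spec becomes field metadata."""
    fields = []
    for c in columns:
        f = StructField(c.name, c.data_type)
        if c.identity is not None:
            f.metadata[START] = c.identity.start
            f.metadata[STEP] = c.identity.step
            f.metadata[ALLOW_EXPLICIT] = c.identity.allow_explicit_insert
        fields.append(f)
    return fields


def is_identity_column(f):
    """Assumed check: a field is an identity column if all three keys exist."""
    return all(k in f.metadata for k in (START, STEP, ALLOW_EXPLICIT))


# Columns loosely mirroring id1, id2, and id4 from the CREATE TABLE example,
# plus one ordinary column for contrast.
columns = [
    Column("id1", "BIGINT", IdentitySpec()),
    Column("id2", "BIGINT", IdentitySpec(start=-1, step=1)),
    Column("id4", "BIGINT", IdentitySpec(start=-1, step=1, allow_explicit_insert=True)),
    Column("name", "STRING"),
]

lossy = lossy_convert(columns)
preserved = identity_aware_convert(columns)

print([is_identity_column(f) for f in lossy])      # every entry is False
print([is_identity_column(f) for f in preserved])  # True for id1, id2, id4 only
```

In this model, a downstream check such as `is_identity_column` never fires on the lossy path, which is why a table created that way would carry no identity metadata even though the DDL was accepted.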

How was this patch tested?

A new test suite plus regression testing of existing identity suites.

Does this PR introduce any user-facing changes?

No changes to existing behavior; this is new functionality.

felipepessoto · 2026-06-20T01:17:03Z

@newfront @timothyw553 could you trigger CI and help find a reviewer, please? This solves an important issue that is part of the roadmap: Linux Foundation Delta Lake Roadmap

timothyw553 · 2026-06-21T16:45:45Z

hi @mwc360 the CI is failing, could you take a look?

mwc360 · 2026-06-22T14:39:20Z

@timothyw553 - scalastyle issue is fixed, can you retrigger CI?

mwc360 · 2026-06-22T15:27:47Z

@timothyw553 - sorry, missed something. Just tested that it compiles successfully. pls trigger CI again.

timothyw553 · 2026-06-22T17:22:50Z

triggered

mwc360 · 2026-06-22T19:04:39Z

@timothyw553 - fyi, everything is passing

support identity via sparksql

f2f2b28

mwc360 mentioned this pull request Jun 19, 2026

[Feature Request] SQL syntax for GENERATED columns in OSS #1100

Open

fix scalasytle issue

b8e5916

mwc360 added 2 commits June 22, 2026 09:16

fix missing import

b7aa402

Merge remote-tracking branch 'origin/mcole_identity_sql' into mcole_i…

f82eaa3

…dentity_sql

mwc360 wants to merge 4 commits into delta-io:master from mwc360:mcole_identity_sql

3 participants
