[WIP][Spark][kernel] Translate Kernel exceptions to DSv1 Spark/Delta exceptions #5900 #7025

Draft
sotikoug83 wants to merge 1 commit into delta-io:master from sotikoug83:kernel-exception-mismatches-5900

@sotikoug83 sotikoug83 commented Jun 12, 2026

Contributor

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other

Description

(This PR was originally meant to address #5900 but was scoped down to #7071. It is left as a potential general prototype/PoC solution to the Kernel exception mismatches.)

The Kernel-backed DSv2 connector currently surfaces raw io.delta.kernel.exceptions.* exceptions where the DSv1 connector throws Spark/Delta exceptions. Two known mismatches:

  • "no schema should throw an exception":
    • An empty _delta_log streaming read throws Kernel's TableNotFoundException instead of AnalysisException ("Table schema is not set ... CREATE TABLE").
  • "Delta sources should verify the protocol reader version":
    • An unsupported protocol reader version throws Kernel's UnsupportedProtocolVersionException instead of org.apache.spark.sql.delta.InvalidProtocolVersionException.

This PR adds an extensible translation layer in a new io.delta.spark.internal.v2.exception package:

  • KernelExceptionConverter: a registry (Map<Class, Handler>). Mapping another KernelException subclass is a one-line HANDLERS.put(...) addition.
  • Operation: handlers receive the table path and the operation, so one Kernel exception can translate differently per context.
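As a rough illustration of the registry idea described above, here is a minimal, self-contained sketch. It is not the code in this PR: every class below is a local stand-in for the real Kernel or Spark type, the Handler signature and the STREAMING_READ value are assumptions, and the message text is illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only: every type below is a local stand-in, not the real Kernel or Spark class. */
final class KernelExceptionConverterSketch {

  /** Stand-in for the io.delta.kernel.exceptions.KernelException base type. */
  static class KernelException extends RuntimeException {
    KernelException(String message) {
      super(message);
    }
  }

  /** Stand-in for Kernel's TableNotFoundException. */
  static class TableNotFoundException extends KernelException {
    TableNotFoundException(String tablePath) {
      super("Table not found: " + tablePath);
    }
  }

  /** Stand-in for Spark's AnalysisException; keeps the Kernel exception as the cause. */
  static class AnalysisException extends RuntimeException {
    AnalysisException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Context of the failing call; STREAMING_READ is a hypothetical second value. */
  enum Operation { TABLE_RESOLUTION, STREAMING_READ }

  /** Assumed handler shape: original exception, table path, and operation in; translated exception out. */
  @FunctionalInterface
  interface Handler {
    RuntimeException convert(KernelException e, String tablePath, Operation operation);
  }

  private static final Map<Class<? extends KernelException>, Handler> HANDLERS = new HashMap<>();

  static {
    // Supporting another KernelException subclass is one more put(...) here.
    HANDLERS.put(
        TableNotFoundException.class,
        (e, tablePath, operation) ->
            new AnalysisException(
                "Table schema is not set (path=" + tablePath + ", operation=" + operation + ")", e));
  }

  /** Returns the translated exception, or the original one when no handler is registered. */
  static RuntimeException convert(KernelException e, String tablePath, Operation operation) {
    // Exact-class lookup: a subclass of a registered type needs its own entry.
    Handler handler = HANDLERS.get(e.getClass());
    return handler == null ? e : handler.convert(e, tablePath, operation);
  }

  public static void main(String[] args) {
    RuntimeException translated =
        convert(new TableNotFoundException("/tmp/t"), "/tmp/t", Operation.TABLE_RESOLUTION);
    System.out.println(translated.getClass().getSimpleName() + ": " + translated.getMessage());
  }
}
```

Falling back to the original exception when no handler matches keeps unmapped Kernel errors visible rather than hiding them behind a generic wrapper.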

How was this patch tested?

New unit tests for the converter:

build/sbt "sparkV2/testOnly io.delta.spark.internal.v2.exception.KernelExceptionConverterTest"

Moved the two previously failing tests ("no schema should throw an exception" and "Delta sources should verify the protocol reader version") from FailingTests to PassingTests in DeltaV2SourceSuite and ran the full V2 suite to check for regressions.

Does this PR introduce any user-facing changes?

Only under STRICT mode: exception types for the two scenarios above change from Kernel exceptions to the same Spark/Delta exceptions the DSv1 connector throws.

…tions

Signed-off-by: Sotirios Kougiouris <sotirisjr@kougiouris.org>
@sotikoug83 sotikoug83 force-pushed the kernel-exception-mismatches-5900 branch from 44d7a82 to 8625c28 Compare June 13, 2026 01:07

@huan233usc huan233usc left a comment

Collaborator

[AI Generated] Left two correctness comments around preserving DSv1 exception behavior for missing paths and protocol-version errors.

UnsupportedProtocolVersionException source = (UnsupportedProtocolVersionException) e;
boolean isReader =
source.getVersionType() == UnsupportedProtocolVersionException.ProtocolVersionType.READER;
return new InvalidProtocolVersionException(

Collaborator

[AI Generated] Passing 0 for whichever protocol side Kernel did not flag makes the Spark/Delta error parameters inaccurate. DSv1 constructs InvalidProtocolVersionException from the full table protocol, so an unsupported reader protocol still reports the table's actual minWriterVersion in the user-facing DELTA_INVALID_PROTOCOL_VERSION message. Can we carry both required protocol versions through Kernel, or avoid this translation until the converter can preserve both values? The unit tests should also assert both readerRequiredVersion and writerRequiredVersion so this does not regress silently.
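One way to read the suggestion above is sketched below, assuming the table's full protocol is available at translation time. TableProtocol and translate are hypothetical names, and InvalidProtocolVersionException here is a local stand-in whose fields only mirror the readerRequiredVersion and writerRequiredVersion names mentioned in the comment. It does not reproduce the real constructor.

```java
/** Illustrative sketch only: local stand-ins, not the real Delta or Kernel classes. */
final class ProtocolTranslationSketch {

  /** Hypothetical holder for the table's full protocol (both minimum versions). */
  static final class TableProtocol {
    final int minReaderVersion;
    final int minWriterVersion;

    TableProtocol(int minReaderVersion, int minWriterVersion) {
      this.minReaderVersion = minReaderVersion;
      this.minWriterVersion = minWriterVersion;
    }
  }

  /** Stand-in for org.apache.spark.sql.delta.InvalidProtocolVersionException. */
  static final class InvalidProtocolVersionException extends RuntimeException {
    final int readerRequiredVersion;
    final int writerRequiredVersion;

    InvalidProtocolVersionException(
        String tablePath, int readerRequiredVersion, int writerRequiredVersion, Throwable cause) {
      super(
          "Unsupported protocol for " + tablePath
              + " (reader=" + readerRequiredVersion
              + ", writer=" + writerRequiredVersion + ")",
          cause);
      this.readerRequiredVersion = readerRequiredVersion;
      this.writerRequiredVersion = writerRequiredVersion;
    }
  }

  /** Copies both versions from the table protocol instead of substituting 0 for the unflagged side. */
  static InvalidProtocolVersionException translate(
      String tablePath, TableProtocol protocol, RuntimeException kernelError) {
    return new InvalidProtocolVersionException(
        tablePath, protocol.minReaderVersion, protocol.minWriterVersion, kernelError);
  }

  public static void main(String[] args) {
    InvalidProtocolVersionException e =
        translate("/tmp/t", new TableProtocol(3, 7), new RuntimeException("kernel error"));
    // A regression test would assert both fields, as the review comment asks.
    System.out.println(e.readerRequiredVersion + " / " + e.writerRequiredVersion);
  }
}
```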

this.initialSnapshot = snapshotManager.loadLatestSnapshot();
Snapshot loadedSnapshot;
try {
loadedSnapshot = snapshotManager.loadLatestSnapshot();

Collaborator

[AI Generated] DeltaV2Table is shared by batch and streaming reads, but this TABLE_RESOLUTION path maps every Kernel TableNotFoundException to schemaNotSetException. Kernel also throws TableNotFoundException when _delta_log is missing, while DSv1 batch reads still surface DELTA_PATH_DOES_NOT_EXIST for a missing path. Can we distinguish the batch vs streaming/table-state cases here, or add a missing-path batch regression test under STRICT mode? Otherwise this fixes the empty-log streaming test by changing the missing-path batch error.
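The distinction requested above could look roughly like the following sketch. It maps only the two cases named in this thread and leaves every other combination untranslated, since the expected DSv1 behavior for those is not stated here. ReadKind, Translated, classify, and the tablePathExists flag are all hypothetical. In the real connector the existence check would presumably go through the table's file system rather than being passed in as a boolean.

```java
/** Illustrative sketch only: decides how a Kernel "table not found" error might be translated. */
final class TableNotFoundClassifierSketch {

  /** Hypothetical split of the shared table-resolution path. */
  enum ReadKind { BATCH, STREAMING }

  /** Target error for the translation; UNTRANSLATED means rethrow the Kernel exception as-is. */
  enum Translated { PATH_DOES_NOT_EXIST, SCHEMA_NOT_SET, UNTRANSLATED }

  static Translated classify(ReadKind kind, boolean tablePathExists) {
    if (kind == ReadKind.BATCH && !tablePathExists) {
      // Batch read of a missing path: keep the DELTA_PATH_DOES_NOT_EXIST behavior.
      return Translated.PATH_DOES_NOT_EXIST;
    }
    if (kind == ReadKind.STREAMING && tablePathExists) {
      // Streaming read where the path exists but the log holds no table state.
      return Translated.SCHEMA_NOT_SET;
    }
    // Cases not discussed in the thread are deliberately left alone.
    return Translated.UNTRANSLATED;
  }

  public static void main(String[] args) {
    for (ReadKind kind : ReadKind.values()) {
      for (boolean exists : new boolean[] {false, true}) {
        System.out.println(kind + ", pathExists=" + exists + " -> " + classify(kind, exists));
      }
    }
  }
}
```

A missing-path batch regression test under STRICT mode, as suggested in the comment, would pin the first branch.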

@sotikoug83 sotikoug83 closed this Jun 22, 2026
@sotikoug83 sotikoug83 deleted the kernel-exception-mismatches-5900 branch June 22, 2026 18:34
@sotikoug83 sotikoug83 restored the kernel-exception-mismatches-5900 branch June 22, 2026 18:51
@sotikoug83 sotikoug83 reopened this Jun 22, 2026
@sotikoug83 sotikoug83 marked this pull request as draft June 22, 2026 18:53