[WIP][Spark][kernel] Translate Kernel exceptions to DSv1 Spark/Delta exceptions #5900 #7025
sotikoug83 wants to merge 1 commit
…tions Signed-off-by: Sotirios Kougiouris <sotirisjr@kougiouris.org>
44d7a82 to
8625c28
huan233usc left a comment
[AI Generated] Left two correctness comments around preserving DSv1 exception behavior for missing paths and protocol-version errors.
    UnsupportedProtocolVersionException source = (UnsupportedProtocolVersionException) e;
    boolean isReader =
        source.getVersionType() == UnsupportedProtocolVersionException.ProtocolVersionType.READER;
    return new InvalidProtocolVersionException(
[AI Generated] Passing 0 for whichever protocol side Kernel did not flag makes the Spark/Delta error parameters inaccurate. DSv1 constructs InvalidProtocolVersionException from the full table protocol, so an unsupported reader protocol still reports the table's actual minWriterVersion in the user-facing DELTA_INVALID_PROTOCOL_VERSION message. Can we carry both required protocol versions through Kernel, or avoid this translation until the converter can preserve both values? The unit tests should also assert both readerRequiredVersion and writerRequiredVersion so this does not regress silently.
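One way to read the first option above, as a minimal sketch: the translation takes the table's full protocol instead of defaulting the unflagged side to 0. Every type here (`ProtocolTranslationSketch`, `TableProtocol`, `TranslatedProtocolError`) is a hypothetical stand-in, not a Kernel or Delta class, and whether Kernel can expose both versions at the failure site is exactly the open question.

```java
// Hypothetical stand-ins only; none of these are real Kernel or Delta classes.
final class ProtocolTranslationSketch {

  // Assumed carrier for the table's full protocol (both sides).
  static final class TableProtocol {
    final int minReaderVersion;
    final int minWriterVersion;

    TableProtocol(int minReaderVersion, int minWriterVersion) {
      this.minReaderVersion = minReaderVersion;
      this.minWriterVersion = minWriterVersion;
    }
  }

  // Stand-in for the translated, user-facing error; it keeps both required versions.
  static final class TranslatedProtocolError extends RuntimeException {
    final int readerRequiredVersion;
    final int writerRequiredVersion;

    TranslatedProtocolError(String tablePath, int readerRequired, int writerRequired) {
      super("Unsupported protocol for " + tablePath);
      this.readerRequiredVersion = readerRequired;
      this.writerRequiredVersion = writerRequired;
    }
  }

  // Build the error from the full protocol so neither side is reported as 0.
  static TranslatedProtocolError translate(String tablePath, TableProtocol protocol) {
    return new TranslatedProtocolError(
        tablePath, protocol.minReaderVersion, protocol.minWriterVersion);
  }
}
```

A unit test in this shape would assert `readerRequiredVersion` and `writerRequiredVersion` together, which is the regression guard the comment asks for.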
    this.initialSnapshot = snapshotManager.loadLatestSnapshot();
    Snapshot loadedSnapshot;
    try {
      loadedSnapshot = snapshotManager.loadLatestSnapshot();
[AI Generated] DeltaV2Table is shared by batch and streaming reads, but this TABLE_RESOLUTION path maps every Kernel TableNotFoundException to schemaNotSetException. Kernel also throws TableNotFoundException when _delta_log is missing, while DSv1 batch reads still surface DELTA_PATH_DOES_NOT_EXIST for a missing path. Can we distinguish the batch vs streaming/table-state cases here, or add a missing-path batch regression test under STRICT mode? Otherwise this fixes the empty-log streaming test by changing the missing-path batch error.
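A minimal sketch of the distinction asked for above, assuming the caller can tell the translation which kind of read triggered the load. The enum values and the class name `TableNotFoundTranslationSketch` are hypothetical; the returned strings are only labels for the two DSv1 errors named in the comment, not real exception objects.

```java
// Hypothetical sketch; the PR's real Operation values may differ.
final class TableNotFoundTranslationSketch {

  // Assumed read kinds that share the same table-loading path.
  enum Operation { BATCH_READ, STREAMING_READ }

  // Pick the DSv1 error that a Kernel "table not found" should map to.
  static String errorFor(Operation op) {
    switch (op) {
      case BATCH_READ:
        // DSv1 batch reads report a missing path.
        return "DELTA_PATH_DOES_NOT_EXIST";
      case STREAMING_READ:
        // DSv1 streaming reads report that the table schema is not set.
        return "schemaNotSetException";
      default:
        throw new IllegalArgumentException("Unhandled operation: " + op);
    }
  }
}
```

With a split like this, a missing-path batch regression test and the empty-log streaming test can each pin their own expected error.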
Which Delta project/connector is this regarding?
Description
(This PR was originally meant to address #5900, but was scoped down to #7071. This PR is left as a potential general prototype/PoC solution to the Kernel exception mismatches.)
The Kernel-backed DSv2 connector currently surfaces raw `io.delta.kernel.exceptions.*` exceptions where the DSv1 connector throws Spark/Delta exceptions. Two known mismatches:
- `_delta_log` streaming read throws Kernel's `TableNotFoundException` instead of `AnalysisException("Table schema is not set ... CREATE TABLE")`.
- `UnsupportedProtocolVersionException` instead of `org.apache.spark.sql.delta.InvalidProtocolVersionException`.

This PR adds an extensible translation layer in a new `io.delta.spark.internal.v2.exception` package:
- `KernelExceptionConverter`: a registry (`Map<Class, Handler>`); mapping another `KernelException` subclass is a one-line `HANDLERS.put(...)` addition.
- `Operation`: handlers receive the table path and the operation, so one Kernel exception can translate differently per context.

How was this patch tested?
New unit tests for the converter:
Moved the two previously failing tests ("no schema should throw an exception" and "Delta sources should verify the protocol reader version") from `FailingTests` to `PassingTests` in `DeltaV2SourceSuite` and ran the full V2 suite to check for regressions.

Does this PR introduce any user-facing changes?
Only under `STRICT` mode: exception types for the two scenarios above change from Kernel exceptions to the same Spark/Delta exceptions the DSv1 connector throws.
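For readers who want a concrete picture of the registry described in the Description, here is a minimal sketch of a class-keyed handler map. It is not the PR's `KernelExceptionConverter`: the class name, the `Handler` signature, the stand-in exception types, and the `SCAN` value are all assumptions made for illustration, and the lookup here is by exact class only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a registry-style converter; not the PR's actual code.
final class ConverterRegistrySketch {

  // TABLE_RESOLUTION is named in the review; SCAN is an assumed second value.
  enum Operation { TABLE_RESOLUTION, SCAN }

  // Assumed handler shape: source exception, table path, and operation.
  interface Handler {
    RuntimeException translate(RuntimeException source, String tablePath, Operation op);
  }

  // Stand-in for a Kernel exception type.
  static final class TableNotFoundLike extends RuntimeException {
    TableNotFoundLike(String message) { super(message); }
  }

  // Stand-in for the Spark/Delta exception the DSv1 connector would throw.
  static final class SparkLikeException extends RuntimeException {
    SparkLikeException(String message, Throwable cause) { super(message, cause); }
  }

  private static final Map<Class<? extends RuntimeException>, Handler> HANDLERS =
      new LinkedHashMap<>();

  static {
    // Supporting another exception type is one more put(...) call like this one.
    HANDLERS.put(
        TableNotFoundLike.class,
        (e, path, op) -> new SparkLikeException(op + ": no table at " + path, e));
  }

  // Translate when a handler is registered; otherwise return the input unchanged.
  static RuntimeException convert(RuntimeException e, String tablePath, Operation op) {
    Handler handler = HANDLERS.get(e.getClass());
    return handler == null ? e : handler.translate(e, tablePath, op);
  }
}
```

Returning unmapped exceptions unchanged keeps the converter safe to wrap around any call site; a real implementation might also walk the superclass chain so that a handler registered for a base type covers its subclasses.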