
[AURON #2273] Fallback Hudi incremental queries from native scan conversion #2274

Open
weimingdiit wants to merge 1 commit into apache:master from weimingdiit:feat/hudi-incremental-query-fallback

@weimingdiit
Contributor

Which issue does this PR close?

Closes #2273

Rationale for this change

Hudi incremental queries depend on Hudi timeline and instant filtering semantics. The native Hudi scan currently converts supported Hudi file scans to native Parquet/ORC scans, but it does not implement incremental query semantics. These queries should fall back to Spark/Hudi until incremental scan support is explicitly added.

What changes are included in this PR?

  • Detect incremental Hudi query options in HudiScanSupport.
  • Fall back from native scan conversion when any of the following options is set:
    • hoodie.datasource.query.type=incremental
    • hoodie.datasource.read.begin.instanttime
    • hoodie.datasource.read.end.instanttime
  • Add unit coverage for incremental query options, including case-insensitive option keys.
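
The detection described above could be sketched roughly as follows. This is a minimal illustration, not the actual HudiScanSupport code: the object name `IncrementalOptionSketch` and the method `isIncremental` are hypothetical. It assumes that the mere presence of a begin or end instant option is enough to trigger the fallback.

```scala
import java.util.Locale

// Hypothetical sketch; names here are illustrative and not taken from the PR.
object IncrementalOptionSketch {
  private val QueryTypeKey = "hoodie.datasource.query.type"
  private val BeginInstantKey = "hoodie.datasource.read.begin.instanttime"
  private val EndInstantKey = "hoodie.datasource.read.end.instanttime"

  // Lower-case every key so that lookups are case-insensitive.
  private def normalize(options: Map[String, String]): Map[String, String] =
    options.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  // True when the options describe an incremental read
  // (assumption: any begin/end instant option counts on its own).
  def isIncremental(options: Map[String, String]): Boolean = {
    val opts = normalize(options)
    val incrementalType =
      opts.get(QueryTypeKey).exists(_.trim.equalsIgnoreCase("incremental"))
    incrementalType || opts.contains(BeginInstantKey) || opts.contains(EndInstantKey)
  }
}
```

A caller deciding whether to convert a scan would then decline conversion whenever `isIncremental(options)` returns true.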

Are there any user-facing changes?

Yes. Hudi incremental queries will stay on Spark/Hudi scan instead of being converted to native scan.

How was this patch tested?

  • Added unit coverage in HudiScanSupportSuite.

@weimingdiit marked this pull request as ready for review on May 16, 2026 04:58
assert(
  !HudiScanSupport.isSupported(
    fileFormatName,
    cowOptions + ("Hoodie.DataSource.Read.Begin.InstantTime" -> "20240101010101")))
Contributor

Consider adding a case for the legacy hoodie.datasource.view.type=incremental key — queryTypeFromOptions checks both keys, so it already works, but an explicit assertion would document the behavior:
assert(!HudiScanSupport.isSupported(fileFormatName, cowOptions + ("hoodie.datasource.view.type" -> "incremental")))
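
For illustration, a lookup that honors both the current and the legacy key might look like the sketch below. It is not the real `queryTypeFromOptions`: the object and method names are hypothetical, and the precedence (current key first, then the legacy key) is an assumption.

```scala
import java.util.Locale

// Hypothetical sketch; not the actual queryTypeFromOptions implementation.
object QueryTypeSketch {
  // Assumed precedence: the current key wins over the legacy one.
  private val QueryTypeKeys =
    Seq("hoodie.datasource.query.type", "hoodie.datasource.view.type")

  // Returns the normalized query type, if either key is present.
  def queryType(options: Map[String, String]): Option[String] = {
    val opts = options.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }
    QueryTypeKeys
      .flatMap(key => opts.get(key))
      .headOption
      .map(_.trim.toLowerCase(Locale.ROOT))
  }
}
```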

Contributor Author

Thanks for the suggestion. I added an explicit assertion for the legacy hoodie.datasource.view.type=incremental option to document that it also falls back from native scan conversion.

Contributor

Copilot AI left a comment

Pull request overview

This PR prevents Hudi incremental queries from being converted to native file scans, preserving Hudi timeline and instant filtering semantics until native support exists.

Changes:

  • Adds detection for Hudi incremental query type and begin/end instant options.
  • Falls back from native scan support when incremental options are present.
  • Adds unit coverage for incremental fallback, including case-insensitive option keys.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| thirdparty/auron-hudi/src/main/scala/org/apache/spark/sql/auron/hudi/HudiScanSupport.scala | Adds incremental query option detection and rejects native scan support for those reads. |
| thirdparty/auron-hudi/src/test/scala/org/apache/spark/sql/auron/hudi/HudiScanSupportSuite.scala | Adds assertions covering incremental query and instant option fallback behavior. |

…n conversion

Signed-off-by: weimingdiit <weimingdiit@gmail.com>
@weimingdiit force-pushed the feat/hudi-incremental-query-fallback branch from 52b175d to 8893f6b on May 18, 2026 12:34