Skip to content

feat(external): support Hive-style partitioned Parquet external tables#24321

Open
iamlinjunhong wants to merge 4 commits intomatrixorigin:mainfrom
iamlinjunhong:mp-hive
Open

feat(external): support Hive-style partitioned Parquet external tables#24321
iamlinjunhong wants to merge 4 commits intomatrixorigin:mainfrom
iamlinjunhong:mp-hive

Conversation

@iamlinjunhong
Copy link
Copy Markdown
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #24320

What this PR does / why we need it:

support Hive-style partitioned Parquet external tables

@qodo-code-review
Copy link
Copy Markdown

ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one.

@matrix-meow matrix-meow added the size/XXL Denotes a PR that changes 2000+ lines label May 9, 2026
@iamlinjunhong
Copy link
Copy Markdown
Contributor Author

Solid feature — the test coverage (hive_partition_test.go, hive_partition_coverage_test.go, BVT) is unusually thorough and the partition-pruning hookup into the optimizer is well-isolated. A few things worth addressing before merge; none are blockers.

Correctness / UX

1. __HIVE_DEFAULT_PARTITION__ unconditionally mapped to NULL (hive_partition_fill.go, the if strVal == HiveDefaultPartition block). The check fires for every column type, including VARCHAR / TEXT. A user whose string partition legitimately contains that literal token (e.g., data imported from a system that didn't treat it as a sentinel) will see those rows silently NULL-ified. Hive/Spark do the same thing globally, so this is defensible — but worth calling out in user docs and/or restricting the sentinel semantics to non-string types where the literal string could never otherwise be the partition value.

2. Zero-padded numeric partitions don't equate to unpadded predicates (matchPartitionValue in hive_partition.go). Directory month=01 compared against WHERE month=1 — if the partition column is typed INT, both coerce to 1 and prune correctly. But if typed VARCHAR, "01" != "1" and pruning misses. Test fixture zero_pad/ uses INT so this is fine in practice; worth a test case with VARCHAR to lock the behavior in both directions.

3. NOT NULL DEFAULT + __HIVE_DEFAULT_PARTITION__ error message shows the relative path but not the column default. Minor UX: the user sees "NOT NULL but directory has HIVE_DEFAULT_PARTITION" and has to guess whether to add NULL default or rename the directory. A short suggestion in the error message would help.

Hardening

4. URL-decoding path is dead code. hive_partition.go:~265 hard-rejects any directory name containing % before listing descends into it. ParseHivePartitionSegment calls url.PathUnescape on the value portion, but the guard above ensures no % ever reaches that code. Either remove the guard and let the URL-decoding actually work (the url_encoded fixture is labeled "P0 known limitation" — escalating it to P1 would be a nice follow-up), or remove the PathUnescape and comment that URL-encoding is unsupported. Having both is confusing.

5. Partition pruning only handles = and IN (tryExtractPredicate, hive_partition.go). <, <=, >, >=, BETWEEN, NOT IN all fall through to row-level filtering. Not a correctness issue (the predicate is also applied row-side), but the perf benefit of partitioning vanishes for range queries — which is exactly what partitions are usually for on date/year columns. Worth tracking as a follow-up.

6. collectBareColNames only recurses into Expr_F (hive_partition.go). Expressions wrapped in CAST(...) or inside CASE / subqueries don't surface their referenced columns, so pruning is silently skipped. A predicate like CAST(year AS INT) = 2024 won't prune. The row-level filter still catches it, so it's conservative — but the pruning surface is narrower than the design doc implies. Consider walking through CAST and CASE-WHEN nodes.

7. Serial directory listing, 10k list-call cap (DiscoverHivePartitions). On S3 at ~100ms/list a table with 5000 partitions takes ~8 min to plan. Not a bug — safe by design — but worth a bounded concurrent list (e.g., errgroup.Group with SetLimit(8)) as a follow-up. The TODO at hive_partition.go:~91 about re-invoking GetForETLWithType per recursion compounds this; caching the FS client across the recursion would help.

8. validateLiteralVecBinary hand-parses vector.Vec.MarshalBinary layout (hive_partition.go:~731). The format isn't versioned. If the vector binary layout changes, this silently returns false and partition pruning stops working with no observable error. Replace with vec.UnmarshalBinary + Length check.

9. parseHiveOptionKV defensive skip-guards against legacy JSON — if a legacy JSON has hive_partitioning=false but hive_partition_columns=year, the table ends up with columns but pruning disabled. No consumer guards against this inconsistent state (refreshPartitionValues checks HivePartitioning only). Consider rejecting the combination in validateAndSetHivePartitionOptions to fail fast.

Verified correct

  • Parser change (update.go, 14 lines): ExternParam gets HivePartitioning, HivePartitionCols, HivePartitionColTypes (with a local HivePartColType to avoid importing pkg/pb/plan). Plumbed end-to-end through build/compile/exec. Reasonable boundary.
  • Concurrency (types.go, 11 lines): new per-ExternalParam state is read/written by a single reader goroutine because param.Parallel = false is forced for hive in compile.go. Safe as long as that enforcement holds — worth a comment on each new field noting the Parallel=false dependency.
  • Hidden files: IsHiddenFile correctly skips _SUCCESS and .crc; non-parquet regular files are silently filtered. Consistent with Hive convention.
  • Path boundary check: ExtractPartitionValues normalizes via path.Clean then enforces HasPrefix(filePath, basePath+"/"). path.Clean collapses .. before the check, so basePath/../etc/passwd/etc/passwd fails the prefix test. Consistent with non-hive external-table path handling.

done

Copy link
Copy Markdown
Contributor

@aunjgr aunjgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review. Several of my earlier concerns are resolved; a few remain.

Resolved (both #24321 and #24329)

  • URL-decoding dead code — PathUnescape removed; % in directory names rejected with an explicit error message.
  • collectBareColNames — CAST/CASE are Expr_F nodes, so they're now reached via the existing walk. Subqueries (Expr_Sub) still aren't walked but fall through to row filters (conservatively safe).
  • validateLiteralVecBinary replaced with vectorBinaryEnvelopeInBounds — bounds-only check, actual decode delegates to vector.UnmarshalBinary with Oid/Length post-check.
  • parseHiveOptionKV legacy-JSON — now documented with per-key skip-guards and regression comment.
  • NOT NULL DEFAULT + __HIVE_DEFAULT_PARTITION__ error message — now actionable.

Still pending

  • __HIVE_DEFAULT_PARTITION__ for VARCHAR columns (hive_partition_fill.go) — still unconditionally NULLs regardless of column type. String values that legitimately contain that literal are silently lost.
  • Zero-padded non-INT — INT works via numeric compare (tested); VARCHAR month='01' vs WHERE month='1' still won't prune. No test or note for the VARCHAR case.
  • Range predicate pruning (<, <=, >, >=, BETWEEN, NOT IN) — still not implemented; falls through to row-level filters. Acknowledged in SQL test comments as P0 scope.
  • Serial directory listingdiscoverRecursive single-threaded, maxListCalls=10000 cap preserved. Acceptable for P0.

Important: #24321 vs #24329

I notice you opened #24329 with the same feature and a different file split (adds pkg/fileservice/path.go, drops pkg/sql/colexec/external/reader_parquet.go).

#24329 will regress behavior on the current tree. On main since commit a2ac9f22f (Feb 2026), scanParquetFile no longer exists; Parquet is dispatched through NewParquetReader in reader_parquet.go (external.go:153). The #24329 diff:

  • Places refreshPartitionValues / fillVirtualColumns / rowCountOnly logic inside scanParquetFile, which is gone post-rebase.
  • Deletes reader_parquet.go, which breaks external.go:153 and parquet_test.go:895,1572.

Net: after rebase, Hive virtual-column fill would never execute, and the build would break.

#24321 avoids this by editing both scanParquetFile and reader_parquet.go; only its reader_parquet.go edits are load-bearing post-rebase.

The fileservice/path.go change in #24329 (adding = to allowed chars) is already on main via #24021 (8bb4c4e05), so that edit is a no-op conflict.

Recommendation: merge this PR (#24321) and close #24329. If you want the path_test.go additions from #24329, cherry-pick the test into this PR.

Copy link
Copy Markdown
Contributor

@aunjgr aunjgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upgrading my prior comment to approve. On reflection the three pending items don't warrant blocking:

  • __HIVE_DEFAULT_PARTITION__ for VARCHAR — silently NULLs a literal that happens to match the Hive sentinel. Consistent with Hive/Spark semantics. Please document this in user-facing docs and file a follow-up to scope the sentinel to non-string types only.
  • Zero-padded VARCHAR — conservative miss in pruning, not a correctness issue. Row-level filter still runs. Follow-up.
  • Range-predicate pruning (<, <=, >, >=, BETWEEN, NOT IN) — perf only. Follow-up.

Please still address my comment on #24329 before landing this one — same feature shouldn't ship twice.

LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XXL Denotes a PR that changes 2000+ lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants