[refine](column) enforce nullable nested types for array and map#62491

Open
Mryange wants to merge 2 commits into apache:master from Mryange:enforce-nullable-nested-types-for-array-map

Contributor

Mryange commented Apr 14, 2026

What problem does this PR solve?

Problem Summary:

This PR makes the nested types inside Array and Map explicitly nullable in BE type implementations, instead of relying on implicit caller-side conventions.

  • DataTypeArray now always stores a nullable nested element type
  • DataTypeMap now always stores nullable key and value types
  • DataTypeArraySerDe and DataTypeMapSerDe are updated to follow the same invariant
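
The invariant described above can be illustrated with a minimal, self-contained sketch. The classes below are simplified stand-ins, not the actual Doris BE definitions: `IDataType`, `DataTypeInt32`, `DataTypeNullable`, and the getter names are assumptions made for illustration, and `assert` stands in for whatever invariant check the real code uses. Only `make_nullable`, `DataTypeArray`, and `DataTypeMap` are names taken from this PR.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Simplified stand-ins for the BE type hierarchy (assumed shapes, not the real classes).
struct IDataType {
    virtual ~IDataType() = default;
    virtual bool is_nullable() const { return false; }
};
using DataTypePtr = std::shared_ptr<const IDataType>;

struct DataTypeInt32 : IDataType {};

struct DataTypeNullable : IDataType {
    explicit DataTypeNullable(DataTypePtr nested) : nested_(std::move(nested)) {}
    bool is_nullable() const override { return true; }
    DataTypePtr nested_;
};

// Wraps a type in Nullable unless it is already nullable, so repeated calls are harmless.
inline DataTypePtr make_nullable(const DataTypePtr& type) {
    if (type->is_nullable()) return type;
    return std::make_shared<DataTypeNullable>(type);
}

// The array type normalizes its element type in the constructor,
// so callers no longer need to remember to pass a nullable child.
class DataTypeArray : public IDataType {
public:
    explicit DataTypeArray(const DataTypePtr& nested) : nested_(make_nullable(nested)) {
        assert(nested_->is_nullable());  // invariant: element type is always nullable
    }
    const DataTypePtr& get_nested_type() const { return nested_; }

private:
    DataTypePtr nested_;
};

// The map type applies the same normalization to both key and value types.
class DataTypeMap : public IDataType {
public:
    DataTypeMap(const DataTypePtr& key, const DataTypePtr& value)
            : key_(make_nullable(key)), value_(make_nullable(value)) {
        assert(key_->is_nullable() && value_->is_nullable());
    }
    const DataTypePtr& get_key_type() const { return key_; }
    const DataTypePtr& get_value_type() const { return value_; }

private:
    DataTypePtr key_;
    DataTypePtr value_;
};
```

Under these assumptions, constructing `DataTypeArray` from a plain `DataTypeInt32` yields an array whose element type reports `is_nullable() == true`, regardless of what the caller passed in.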

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Contributor

Thearas commented Apr 14, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Contributor Author

Mryange commented Apr 14, 2026

run buildall

Contributor Author

Mryange commented Apr 14, 2026

/review

Contributor

github-actions bot left a comment

I found one blocking issue.

  1. be/src/core/data_type/data_type_array.cpp: forcing every array child through make_nullable() changes the FE-visible type metadata that BE exports. DataTypeArray::to_protobuf() now always reports contains_null=true, and FE's FoldConstantRuleOnBE.convertToNereidsType() reconstructs Nereids ArrayType from that flag. That means BE constant folding can no longer round-trip ARRAY<... NOT NULL> precisely. A concrete case is CAST([1] AS ARRAY<INT NOT NULL>): after folding, FE will see ARRAY<INT> instead of the original non-null-child type. FE still treats ArrayType.containsNull as semantically significant in exact matching (fe/fe-type/.../ArrayType.java, Type.matchExactType()), so this is a real behavior regression, not just an internal refactor.
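
The round-trip loss described in this item can be reproduced with a toy model. Everything in the sketch below is a hypothetical simplification: `ElementType`, `ExportedArrayDesc`, and the four functions are invented for illustration and do not mirror the real `PTypeDesc` layout or the FE classes. The only behavior carried over from the review is that the exported `contains_null` flag is derived from the stored child's nullability.

```cpp
#include <string>

// Toy model of an array element type: a name plus its declared nullability.
struct ElementType {
    std::string name;
    bool nullable;
};

// Toy model of the exported array descriptor (the real one carries far more detail).
struct ExportedArrayDesc {
    std::string element_name;
    bool contains_null;
};

// Behavior before the PR: the child type is stored exactly as declared.
ElementType store_as_declared(ElementType declared) {
    return declared;
}

// Behavior under the PR: the child type is always stored as nullable.
ElementType store_forced_nullable(ElementType declared) {
    declared.nullable = true;
    return declared;
}

// Export derives contains_null from the stored child, as the review describes.
ExportedArrayDesc export_desc(const ElementType& stored) {
    return {stored.name, stored.nullable};
}

// FE-like reconstruction of the array type from the exported flag.
std::string reconstruct(const ExportedArrayDesc& desc) {
    return "ARRAY<" + desc.element_name + (desc.contains_null ? "" : " NOT NULL") + ">";
}
```

With a declared element of `{"INT", false}`, the store-as-declared path reconstructs `ARRAY<INT NOT NULL>`, while the forced-nullable path reconstructs `ARRAY<INT>`. That difference is the lost information the review flags.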

Critical checkpoint conclusions:

  • Goal of the task: standardize BE array/map nested types as nullable. The code mostly does that, but it does not preserve the existing FE-visible array child nullability contract on the BE->FE constant-folding path, so the goal is not achieved safely end-to-end.
  • Minimality/focus: the patch is small and focused, but it misses one required follow-through path (PTypeDesc export / FE reconstruction).
  • Concurrency: no concurrency concerns in the touched code.
  • Lifecycle/static init: no special lifecycle concerns.
  • Config changes: none.
  • Compatibility: this changes transmitted complex-type metadata behavior for arrays and is not compatibility-safe as written.
  • Parallel code paths: I checked the related SerDe/cast/type-factory paths. The array protobuf path is the one that remains inconsistent with FE expectations.
  • Special condition checks: the new DORIS_CHECK assertions are reasonable for invariant enforcement.
  • Test coverage: insufficient. There is no regression test covering FE<->BE round-trip of ARRAY<... NOT NULL> through BE constant folding or protobuf type export.
  • Test result changes: none.
  • Observability: not needed for this refactor.
  • Transaction/persistence/data-write concerns: none.
  • FE/BE variable passing: no new fields, but existing transmitted type metadata is affected and not preserved correctly.
  • Performance: no material issue identified.
  • Other issues: none beyond the blocking metadata regression above.

Because of the array type-metadata regression, I am requesting changes rather than approving.


- DataTypeArray::DataTypeArray(const DataTypePtr& nested_) : nested {nested_} {}
+ DataTypeArray::DataTypeArray(const DataTypePtr& nested_) {
+     DataTypePtr nullable_nested = make_nullable(nested_);
Contributor

Wrapping every array child in make_nullable() changes the type metadata you export later. DataTypeArray::to_protobuf() uses nested->is_nullable(), and FE's FoldConstantRuleOnBE.convertToNereidsType() reconstructs ArrayType from that flag. After this change, a folded expression like CAST([1] AS ARRAY<INT NOT NULL>) will round-trip back to FE as ARRAY<INT> because BE now always reports contains_null=true. FE still treats ArrayType.containsNull as part of exact type matching, so this is a real regression in the constant-folding path, not just an internal invariant cleanup. Please preserve the original child-nullability flag on the exported type metadata path or add a compatibility fix/test for it.
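
One possible way to preserve the original flag, offered only as a sketch and not as the approach this PR takes, is to record the declared child nullability before normalizing and to export that recorded value. The class name `ArrayTypeWithDeclaredFlag`, the member `declared_contains_null_`, and the accessor `exported_contains_null()` are all hypothetical, and the surrounding type classes are simplified stand-ins rather than the real Doris BE definitions.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Simplified stand-ins for the BE type hierarchy (assumed shapes, not the real classes).
struct IDataType {
    virtual ~IDataType() = default;
    virtual bool is_nullable() const { return false; }
};
using DataTypePtr = std::shared_ptr<const IDataType>;

struct DataTypeInt32 : IDataType {};

struct DataTypeNullable : IDataType {
    explicit DataTypeNullable(DataTypePtr nested) : nested_(std::move(nested)) {}
    bool is_nullable() const override { return true; }
    DataTypePtr nested_;
};

inline DataTypePtr make_nullable(const DataTypePtr& type) {
    if (type->is_nullable()) return type;
    return std::make_shared<DataTypeNullable>(type);
}

// Hypothetical array type: the stored child is always nullable,
// but the nullability the caller declared is remembered separately.
class ArrayTypeWithDeclaredFlag : public IDataType {
public:
    explicit ArrayTypeWithDeclaredFlag(const DataTypePtr& nested)
            : declared_contains_null_(nested->is_nullable()),
              nested_(make_nullable(nested)) {
        assert(nested_->is_nullable());  // internal invariant still holds
    }

    const DataTypePtr& get_nested_type() const { return nested_; }

    // Value an export path could write into contains_null instead of
    // reading nested_->is_nullable(), which is now always true.
    bool exported_contains_null() const { return declared_contains_null_; }

private:
    bool declared_contains_null_;
    DataTypePtr nested_;
};
```

The trade-off in this sketch is an extra piece of state on the type, which any equality or hashing logic would need to account for. A regression test for the constant-folding round trip would still be needed either way.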

Contributor Author

Mryange commented Apr 14, 2026

run buildall
