Skip to content

fix: refine data order semantics#35296

Open
facetosea wants to merge 1 commit into
3.0from
feat/data-order-level-v1
Open

fix: refine data order semantics#35296
facetosea wants to merge 1 commit into
3.0from
feat/data-order-level-v1

Conversation

@facetosea
Copy link
Copy Markdown
Contributor

  • propagate data order level into physical plan and executor
  • align interval and table merge scan order semantics
  • add interval and external regression cases

Description

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

- propagate data order level into physical plan and executor
- align interval and table merge scan order semantics
- add interval and external regression cases

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 8, 2026 09:02
@facetosea facetosea requested review from a team, dapan1121 and guanshengliang as code owners May 8, 2026 09:02
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces granular data ordering tracking by adding requireDataOrder and resultDataOrder fields to physical and logical nodes. It refactors the executor to use helper functions like setOptrBasicInfoOrder and optrHasOrderedInput for managing these order levels, replacing manual assignments across various operators. Additionally, it implements logic to promote scans to table merges based on required data order levels and includes regression tests for interval and external windows. Feedback was provided regarding the inconsistent use of ordering constants (ORDER_ASC/ORDER_DESC vs TSDB_ORDER_ASC) within the new helper functions.

Comment on lines +535 to +543
static FORCE_INLINE bool optrHasOrderedInput(const SOptrBasicInfo* pInfo, EDataOrderLevel minLevel) {
return (pInfo->inputTsOrder == ORDER_ASC || pInfo->inputTsOrder == ORDER_DESC) &&
pInfo->inputDataOrder >= minLevel;
}

static FORCE_INLINE bool optrHasOrderedOutput(const SOptrBasicInfo* pInfo, EDataOrderLevel minLevel) {
return (pInfo->outputTsOrder == ORDER_ASC || pInfo->outputTsOrder == ORDER_DESC) &&
pInfo->outputDataOrder >= minLevel;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The helper functions optrHasOrderedInput and optrHasOrderedOutput use ORDER_ASC and ORDER_DESC (likely from the EOrder enum), while other parts of the executor (e.g., timewindowoperator.c:787, externalwindowoperator.c:1647) use TSDB_ORDER_ASC. Mixing these different constant sets for the same semantic meaning can lead to subtle bugs if their values ever diverge and makes the code harder to maintain. It is recommended to use a consistent set of constants throughout the executor layer.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refines “data order level” semantics across planning and execution by propagating requireDataOrder/resultDataOrder into the physical plan and using those properties in executor ordering decisions, aiming to align interval and external-window behaviors under multi-table / partially-ordered inputs.

Changes:

  • Propagate data-order level fields from logic nodes into physical nodes, including clone/JSON/TLV serialization support.
  • Centralize “promote scan to table-merge scan” behavior and align scan/window order semantics (notably interval + external window).
  • Add regression tests covering interval nesting and external_window multi-table unordered fallback paths.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
test/cases/13-TimeSeriesExt/08-ExternalWindow/test_external.py Adds regression ensuring multi-table external_window remains correct when upstream isn’t treated as globally ordered.
test/cases/13-TimeSeriesExt/03-TimeWindow/test_interval_bugfix.py Adds regression for nested interval windows over multi-table/group-ordered inner results.
source/libs/planner/src/planUtil.c Adds planPromoteScanToTableMerge() helper and uses it when adjusting scan requirements.
source/libs/planner/src/planSpliter.c Uses planPromoteScanToTableMerge() when constructing/marking merge scans.
source/libs/planner/src/planPhysiCreater.c Copies requireDataOrder/resultDataOrder from logic nodes into physical nodes.
source/libs/planner/src/planOptimizer.c Uses planPromoteScanToTableMerge() in optimizations that force merge scans.
source/libs/planner/src/planLogicCreater.c Refines interval initial order semantics and recalculates external-window inputHasOrder from upstream properties.
source/libs/planner/inc/planInt.h Exposes planPromoteScanToTableMerge() to planner modules.
source/libs/nodes/src/nodesMsgFuncs.c Adds TLV encode/decode for physical node data-order fields.
source/libs/nodes/src/nodesCodeFuncs.c Adds JSON encode/decode for physical node data-order fields.
source/libs/nodes/src/nodesCloneFuncs.c Ensures physical node cloning copies new data-order fields.
source/libs/executor/src/virtualtablescanoperator.c Initializes operator basic info order via a shared helper.
source/libs/executor/src/timewindowoperator.c Uses data-order-aware predicate (optrHasOrderedInput) and shared order init helper.
source/libs/executor/src/streamexternalwindowoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/sortoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/projectoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/mergeoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/groupoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/externalwindowoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/eventwindowoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/countwindowoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/anomalywindowoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/src/aggregateoperator.c Uses shared helper to initialize operator order + data-order fields.
source/libs/executor/inc/executorInt.h Extends SOptrBasicInfo with data-order levels and adds shared helper predicates/utilities.
include/libs/nodes/plannodes.h Extends SPhysiNode with requireDataOrder/resultDataOrder.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 2327 to 2339
enum {
PHY_NODE_CODE_OUTPUT_DESC = 1,
PHY_NODE_CODE_CONDITIONS,
PHY_NODE_CODE_CHILDREN,
PHY_NODE_CODE_LIMIT,
PHY_NODE_CODE_SLIMIT,
PHY_NODE_CODE_INPUT_TS_ORDER,
PHY_NODE_CODE_OUTPUT_TS_ORDER,
PHY_NODE_CODE_REQUIRE_DATA_ORDER,
PHY_NODE_CODE_RESULT_DATA_ORDER,
PHY_NODE_CODE_DYNAMIC_OP,
PHY_NODE_CODE_FORCE_NONBLOCKING_OPTR
};
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants