fix: refine data order semantics#35296
Conversation
- propagate data order level into physical plan and executor - align interval and table merge scan order semantics - add interval and external regression cases Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Code Review
This pull request introduces granular data ordering tracking by adding requireDataOrder and resultDataOrder fields to physical and logical nodes. It refactors the executor to use helper functions like setOptrBasicInfoOrder and optrHasOrderedInput for managing these order levels, replacing manual assignments across various operators. Additionally, it implements logic to promote scans to table merges based on required data order levels and includes regression tests for interval and external windows. Feedback was provided regarding the inconsistent use of ordering constants (ORDER_ASC/ORDER_DESC vs TSDB_ORDER_ASC) within the new helper functions.
| static FORCE_INLINE bool optrHasOrderedInput(const SOptrBasicInfo* pInfo, EDataOrderLevel minLevel) { | ||
| return (pInfo->inputTsOrder == ORDER_ASC || pInfo->inputTsOrder == ORDER_DESC) && | ||
| pInfo->inputDataOrder >= minLevel; | ||
| } | ||
|
|
||
| static FORCE_INLINE bool optrHasOrderedOutput(const SOptrBasicInfo* pInfo, EDataOrderLevel minLevel) { | ||
| return (pInfo->outputTsOrder == ORDER_ASC || pInfo->outputTsOrder == ORDER_DESC) && | ||
| pInfo->outputDataOrder >= minLevel; | ||
| } |
There was a problem hiding this comment.
The helper functions optrHasOrderedInput and optrHasOrderedOutput use ORDER_ASC and ORDER_DESC (likely from the EOrder enum), while other parts of the executor (e.g., timewindowoperator.c:787, externalwindowoperator.c:1647) use TSDB_ORDER_ASC. Mixing these different constant sets for the same semantic meaning can lead to subtle bugs if their values ever diverge and makes the code harder to maintain. It is recommended to use a consistent set of constants throughout the executor layer.
There was a problem hiding this comment.
Pull request overview
This PR refines “data order level” semantics across planning and execution by propagating requireDataOrder/resultDataOrder into the physical plan and using those properties in executor ordering decisions, aiming to align interval and external-window behaviors under multi-table / partially-ordered inputs.
Changes:
- Propagate data-order level fields from logic nodes into physical nodes, including clone/JSON/TLV serialization support.
- Centralize “promote scan to table-merge scan” behavior and align scan/window order semantics (notably interval + external window).
- Add regression tests covering interval nesting and external_window multi-table unordered fallback paths.
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/cases/13-TimeSeriesExt/08-ExternalWindow/test_external.py | Adds regression ensuring multi-table external_window remains correct when upstream isn’t treated as globally ordered. |
| test/cases/13-TimeSeriesExt/03-TimeWindow/test_interval_bugfix.py | Adds regression for nested interval windows over multi-table/group-ordered inner results. |
| source/libs/planner/src/planUtil.c | Adds planPromoteScanToTableMerge() helper and uses it when adjusting scan requirements. |
| source/libs/planner/src/planSpliter.c | Uses planPromoteScanToTableMerge() when constructing/marking merge scans. |
| source/libs/planner/src/planPhysiCreater.c | Copies requireDataOrder/resultDataOrder from logic nodes into physical nodes. |
| source/libs/planner/src/planOptimizer.c | Uses planPromoteScanToTableMerge() in optimizations that force merge scans. |
| source/libs/planner/src/planLogicCreater.c | Refines interval initial order semantics and recalculates external-window inputHasOrder from upstream properties. |
| source/libs/planner/inc/planInt.h | Exposes planPromoteScanToTableMerge() to planner modules. |
| source/libs/nodes/src/nodesMsgFuncs.c | Adds TLV encode/decode for physical node data-order fields. |
| source/libs/nodes/src/nodesCodeFuncs.c | Adds JSON encode/decode for physical node data-order fields. |
| source/libs/nodes/src/nodesCloneFuncs.c | Ensures physical node cloning copies new data-order fields. |
| source/libs/executor/src/virtualtablescanoperator.c | Initializes operator basic info order via a shared helper. |
| source/libs/executor/src/timewindowoperator.c | Uses data-order-aware predicate (optrHasOrderedInput) and shared order init helper. |
| source/libs/executor/src/streamexternalwindowoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/sortoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/projectoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/mergeoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/groupoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/externalwindowoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/eventwindowoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/countwindowoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/anomalywindowoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/src/aggregateoperator.c | Uses shared helper to initialize operator order + data-order fields. |
| source/libs/executor/inc/executorInt.h | Extends SOptrBasicInfo with data-order levels and adds shared helper predicates/utilities. |
| include/libs/nodes/plannodes.h | Extends SPhysiNode with requireDataOrder/resultDataOrder. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| enum { | ||
| PHY_NODE_CODE_OUTPUT_DESC = 1, | ||
| PHY_NODE_CODE_CONDITIONS, | ||
| PHY_NODE_CODE_CHILDREN, | ||
| PHY_NODE_CODE_LIMIT, | ||
| PHY_NODE_CODE_SLIMIT, | ||
| PHY_NODE_CODE_INPUT_TS_ORDER, | ||
| PHY_NODE_CODE_OUTPUT_TS_ORDER, | ||
| PHY_NODE_CODE_REQUIRE_DATA_ORDER, | ||
| PHY_NODE_CODE_RESULT_DATA_ORDER, | ||
| PHY_NODE_CODE_DYNAMIC_OP, | ||
| PHY_NODE_CODE_FORCE_NONBLOCKING_OPTR | ||
| }; |
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.