diff --git a/docs-mintlify/docs/data-modeling/multi-fact-views.mdx b/docs-mintlify/docs/data-modeling/multi-fact-views.mdx index bd9634615ab87..d957b1d535f27 100644 --- a/docs-mintlify/docs/data-modeling/multi-fact-views.mdx +++ b/docs-mintlify/docs/data-modeling/multi-fact-views.mdx @@ -310,6 +310,130 @@ The combined result shows measures from each fact table side by side: Charlie has no orders and Diana has no returns — both are still included with `NULL` values for the missing fact table. +## Joining views in the SQL API + +You don't have to define a dedicated multi-fact view to get multi-fact +behavior. The [SQL API][ref-sql-api] produces the same query when you **join +two or more views on a dimension they share** and group by that dimension. + +Suppose `orders_view` and `returns_view` are two separate views that each +expose the customer's `name` (both backed by the same underlying +`customers.name` member). Joining them on `name` and grouping by it triggers a +multi-fact query: + +```sql +SELECT + o.name, + MEASURE(o.total_amount), + MEASURE(r.total_refund) +FROM orders_view o +LEFT JOIN returns_view r ON r.name = o.name +GROUP BY 1 +``` + +Cube recognizes that both `name` columns resolve to the same cube member, +merges the two view scans into a single multi-fact query, and runs it with the +separate-subquery-then-join strategy described +[above](#what-cube-does-under-the-hood). + +This rewrite applies only when: + +- The Tesseract SQL planner is enabled via + [`CUBEJS_TESSERACT_SQL_PLANNER`][ref-tesseract-env]. +- Both sides of the join condition resolve to the **same underlying cube + member** (a shared dimension), and the join key is composed only of + dimensions. +- The query is **grouped by the join key** — every grouped dimension is the + shared join key. Ungrouped joins (such as `SELECT *`) and queries that group + by a different dimension are not merged and fall back to standard join + handling. 
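The shared-member condition in the list above can be sketched in Rust. This is a simplified illustration under stated assumptions, not the planner's code: the `HashMap` from view member to underlying cube member, the helper `sample_alias_members`, and the `orders_view.status` member are hypothetical, and `resolve_underlying` here only loosely mirrors the closure of the same name in the rewrite rule, which reads view metadata instead.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for view metadata: maps a view member to the
/// underlying cube member it aliases (an assumption for this sketch).
fn sample_alias_members() -> HashMap<String, String> {
    [
        ("orders_view.name", "customers.name"),
        ("returns_view.name", "customers.name"),
        ("orders_view.status", "orders.status"),
    ]
    .iter()
    .map(|(view_member, cube_member)| (view_member.to_string(), cube_member.to_string()))
    .collect()
}

/// Resolve a member to the cube member it aliases; a member with no alias
/// resolves to itself.
fn resolve_underlying(alias_members: &HashMap<String, String>, member: &str) -> String {
    alias_members
        .get(member)
        .cloned()
        .unwrap_or_else(|| member.to_string())
}

/// A join key counts as shared when both sides resolve to the same
/// underlying cube member.
fn is_shared_key(alias_members: &HashMap<String, String>, left: &str, right: &str) -> bool {
    resolve_underlying(alias_members, left) == resolve_underlying(alias_members, right)
}
```

With this mapping, `orders_view.name` and `returns_view.name` form a shared key because both resolve to `customers.name`, while `orders_view.status` and `returns_view.name` do not.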
+ +### Joining three or more views + +The rewrite is not limited to two views. Chained joins on the same shared key +are merged into a single multi-fact query, with each view contributing its own +aggregating subquery: + +```sql +SELECT + o.name, + MEASURE(o.total_amount), + MEASURE(r.total_refund), + MEASURE(p.total_paid) +FROM orders_view o +FULL JOIN returns_view r ON r.name = o.name +FULL JOIN payments_view p ON p.name = o.name +GROUP BY 1 +``` + +### Joining on a time dimension + +A common multi-fact pattern joins facts on a shared time dimension and groups by +a truncated grain. **Join on `DATE_TRUNC` at the same granularity you group by:** + +```sql +SELECT DATE_TRUNC('day', o.created_at), MEASURE(o.total_amount), MEASURE(r.total_refund) +FROM orders_view o +JOIN returns_view r ON DATE_TRUNC('day', r.created_at) = DATE_TRUNC('day', o.created_at) +GROUP BY 1 +``` + +The grouped column is emitted as a time dimension with its granularity. A join +written on `DATE_TRUNC` is an `INNER` join (the SQL planner expresses it as a +filtered cross join), so both sides must share a key; both truncated columns +must resolve to the same underlying time member at the same granularity. + +The join-key granularity must match the `GROUP BY` granularity, because the +facts are stitched together at the grain you group by. This has two +consequences: + +- Joining on `DATE_TRUNC('month', …)` while grouping by `DATE_TRUNC('day', …)` + is not merged (it would silently stitch at day grain, diverging from the + month-grain join). +- Joining on the **raw** time column (`ON r.created_at = o.created_at`, an + exact-timestamp join) while grouping by `DATE_TRUNC('day', …)` is likewise not + merged — the row-grain join doesn't match the day-grain group-by. Truncate the + join key to the grain you group by instead. + +In both cases the query falls back to standard join handling. 
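The grain-matching rule above can be modeled as a small check. The sketch below is an assumption about the shape of that check, not the actual finalize rule: it represents each key column as an underlying member paired with an optional granularity (the same pairing the wrapper node records) and treats the query as mergeable only when the grouped columns equal the join key. The names `Key`, `key`, and `group_by_matches_join_key`, and the member `orders.created_at`, are hypothetical.

```rust
use std::collections::BTreeSet;

/// A key column: the underlying member plus its `DATE_TRUNC` granularity
/// (`None` for a plain dimension or an untruncated time column).
type Key = (String, Option<String>);

/// Small helper to build a key for the examples.
fn key(member: &str, granularity: Option<&str>) -> Key {
    (member.to_string(), granularity.map(|g| g.to_string()))
}

/// Assumed check: merge only when the set of grouped columns equals the
/// join key, member for member and grain for grain.
fn group_by_matches_join_key(join_key: &[Key], group_by: &[Key]) -> bool {
    let join: BTreeSet<&Key> = join_key.iter().collect();
    let group: BTreeSet<&Key> = group_by.iter().collect();
    !join.is_empty() && join == group
}
```

Under this model, a `day` join key with a `day` group-by matches, while a `month` join key or a raw (untruncated) join key paired with a `day` group-by does not, which corresponds to the two fallback cases listed above.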
+ +You can also combine a `DATE_TRUNC` equality with a plain dimension equality in +the same join (a composite key), and group by both: + +```sql +SELECT DATE_TRUNC('day', o.created_at), o.name, MEASURE(o.total_amount), MEASURE(r.total_refund) +FROM orders_view o +JOIN returns_view r + ON DATE_TRUNC('day', r.created_at) = DATE_TRUNC('day', o.created_at) + AND r.name = o.name +GROUP BY 1, 2 +``` + +### Filtering the join + +Filters on top of the join are supported and are applied to the merged query: + +- A `WHERE` clause is pushed into the merged scan. A predicate on a dimension + shared by all facts filters the whole result; a predicate on a fact-specific + dimension filters only that fact's subquery. +- A predicate in the `ON` clause that the planner can attach to a single side + (for example, a condition on the optional side of a `LEFT JOIN`) becomes a + filter on that fact. Predicates that the SQL planner can't push to one side + of an outer join (such as a left-table condition in a `LEFT JOIN ON`) aren't + supported by the planner and will raise an error. + +### Join type + +The facts are stitched together with a `FULL JOIN` on the shared key, and the +`JOIN` type in your SQL controls which rows are kept: + +| SQL join | Result | +| --- | --- | +| `FULL [OUTER] JOIN` | every key from either view (default multi-fact behavior) | +| `INNER JOIN` | only keys present in **both** views | +| `LEFT JOIN` | every key from the left view; right-side measures are `NULL` when missing | +| `RIGHT JOIN` | every key from the right view; left-side measures are `NULL` when missing | + ## Common patterns ### Time as the shared dimension @@ -417,5 +541,6 @@ to that fact's subquery. 
[ref-views]: /docs/data-modeling/views [ref-view-ref]: /reference/data-modeling/view [ref-segments]: /reference/data-modeling/segments +[ref-sql-api]: /reference/core-data-apis/sql-api [ref-tesseract-env]: /reference/configuration/environment-variables#cubejs_tesseract_sql_planner [link-tesseract]: https://cube.dev/blog/introducing-tesseract diff --git a/docs-mintlify/reference/core-data-apis/sql-api/joins.mdx b/docs-mintlify/reference/core-data-apis/sql-api/joins.mdx index 042c0e575dc10..48ae8a7b4935e 100644 --- a/docs-mintlify/reference/core-data-apis/sql-api/joins.mdx +++ b/docs-mintlify/reference/core-data-apis/sql-api/joins.mdx @@ -207,7 +207,33 @@ LIMIT 5; Please note that, even if `product_description` is in the inner selection, it isn't evaluated in the final query as it isn't used in any way. +## Joining views on a shared dimension + +When you join two views on a dimension that resolves to the **same underlying +cube member** and group by that dimension, Cube doesn't perform a row-level +join. Instead it merges them into a single +[multi-fact query][ref-multi-fact-views]: each view becomes its own +aggregating subquery and the results are stitched together on the shared key, +so measures from both views are combined without fan-out. + +```sql +SELECT + o.name, + MEASURE(o.total_amount), + MEASURE(r.total_refund) +FROM orders_view o +LEFT JOIN returns_view r ON r.name = o.name +GROUP BY 1 +``` + +The `JOIN` type (`INNER`, `LEFT`, `RIGHT`, `FULL`) controls which keys are +kept. This requires the [Tesseract SQL planner][ref-tesseract-env] and only +applies to grouped queries whose `GROUP BY` is the join key. See +[multi-fact views][ref-multi-fact-views] for the full explanation. 
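One way to read the effect of the `JOIN` type: the facts are stitched with a full outer join on the shared key, and the join type then decides which side's key must be present for a row to survive. The Rust sketch below models that reading. The `JoinType` enum and the functions `required_sides` and `keeps_key` are illustrative names defined here, not part of the planner's API.

```rust
/// Join types as written in the SQL query (local enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JoinType {
    Inner,
    Left,
    Right,
    Full,
}

/// Returns (left key required, right key required) for a join type.
/// FULL requires neither side, so it adds no extra filtering.
fn required_sides(join_type: JoinType) -> (bool, bool) {
    match join_type {
        JoinType::Inner => (true, true),
        JoinType::Left => (true, false),
        JoinType::Right => (false, true),
        JoinType::Full => (false, false),
    }
}

/// Whether a key that appears in the left and/or right fact is kept
/// after the full-outer stitch is filtered by the join type.
fn keeps_key(join_type: JoinType, in_left: bool, in_right: bool) -> bool {
    let (need_left, need_right) = required_sides(join_type);
    (in_left || in_right) && (!need_left || in_left) && (!need_right || in_right)
}
```

For example, a customer with returns but no orders is kept by `FULL JOIN` and by `RIGHT JOIN`, and dropped by `LEFT JOIN` and `INNER JOIN` when `orders_view` is the left side.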
+ [ref-views]: /docs/data-modeling/views [ref-join-paths]: /docs/data-modeling/joins#join-paths -[ref-join-hints]: /docs/data-modeling/joins#join-hints \ No newline at end of file +[ref-join-hints]: /docs/data-modeling/joins#join-hints +[ref-multi-fact-views]: /docs/data-modeling/multi-fact-views +[ref-tesseract-env]: /reference/configuration/environment-variables#cubejs_tesseract_sql_planner \ No newline at end of file diff --git a/rust/cubesql/cubesql/src/compile/rewrite/cost.rs b/rust/cubesql/cubesql/src/compile/rewrite/cost.rs index 16ffed9991a53..1245ad047d31c 100644 --- a/rust/cubesql/cubesql/src/compile/rewrite/cost.rs +++ b/rust/cubesql/cubesql/src/compile/rewrite/cost.rs @@ -127,6 +127,7 @@ impl BestCubePlan { LogicalPlanLanguage::JoinCheckStage(_) => 1, LogicalPlanLanguage::JoinCheckPushDown(_) => 1, LogicalPlanLanguage::JoinCheckPullUp(_) => 1, + LogicalPlanLanguage::MultiFactJoinWrapper(_) => 1, LogicalPlanLanguage::SortProjectionPushdownReplacer(_) => 1, LogicalPlanLanguage::SortProjectionPullupReplacer(_) => 1, // Not really replacers but those should be deemed as mandatory rewrites and as soon as diff --git a/rust/cubesql/cubesql/src/compile/rewrite/mod.rs b/rust/cubesql/cubesql/src/compile/rewrite/mod.rs index d888ad5a9ac05..eef32945cece1 100644 --- a/rust/cubesql/cubesql/src/compile/rewrite/mod.rs +++ b/rust/cubesql/cubesql/src/compile/rewrite/mod.rs @@ -541,6 +541,18 @@ crate::plan_to_language! { left_input: Arc, right_input: Arc, }, + // Intermediate node produced while merging a join of two (view) + // CubeScans on a shared cube member into a single multi-fact CubeScan. + // `input` is the merged CubeScan; `join_members` holds the underlying + // cube members the scans were joined on, each paired with the join-key + // granularity (`Some` for a `DATE_TRUNC` time key, `None` for a plain + // dimension), so the aggregate finalize rule can verify the GROUP BY + // matches the join key at the same grain. 
Rewrite-only: it must be + // eliminated (unwrapped at the aggregate) before extraction. + MultiFactJoinWrapper { + input: Arc, + join_members: Vec<(String, Option)>, + }, } } @@ -2266,6 +2278,10 @@ fn cube_scan_wrapper(input: impl Display, finalized: impl Display) -> String { format!("(CubeScanWrapper {} {})", input, finalized) } +fn multi_fact_join_wrapper(input: impl Display, join_members: impl Display) -> String { + format!("(MultiFactJoinWrapper {} {})", input, join_members) +} + fn distinct(input: impl Display) -> String { format!("(Distinct {})", input) } diff --git a/rust/cubesql/cubesql/src/compile/rewrite/rules/members.rs b/rust/cubesql/cubesql/src/compile/rewrite/rules/members.rs index 1429f99e5a7f8..ebccb095aec62 100644 --- a/rust/cubesql/cubesql/src/compile/rewrite/rules/members.rs +++ b/rust/cubesql/cubesql/src/compile/rewrite/rules/members.rs @@ -7,10 +7,10 @@ use crate::{ binary_expr, cast_expr, change_user_expr, column_expr, cross_join, cube_scan, cube_scan_filters, cube_scan_filters_empty_tail, cube_scan_members, cube_scan_members_empty_tail, cube_scan_order_empty_tail, dimension_expr, distinct, - expr_column_name, fun_expr, join, like_expr, limit, list_concat_pushdown_replacer, - list_concat_pushup_replacer, literal_expr, literal_member, measure_expr, - member_pushdown_replacer, member_replacer, original_expr_name, projection, - referenced_columns, rewrite, + expr_column_name, filter, fun_expr, join, like_expr, limit, + list_concat_pushdown_replacer, list_concat_pushup_replacer, literal_expr, + literal_member, measure_expr, member_pushdown_replacer, member_replacer, + multi_fact_join_wrapper, original_expr_name, projection, referenced_columns, rewrite, rewriter::{CubeEGraph, CubeRewrite, RewriteRules}, rules::{ replacer_flat_push_down_node_substitute_rules, replacer_push_down_node, @@ -26,9 +26,10 @@ use crate::{ LikeExprLikeType, LikeExprNegated, LikeType, LimitFetch, LimitSkip, ListType, LiteralExprValue, LiteralMemberRelation, 
LiteralMemberValue, LogicalPlanLanguage, MeasureName, MemberErrorAliasToCube, MemberErrorError, MemberErrorPriority, - MemberPushdownReplacerAliasToCube, MemberReplacerAliasToCube, ProjectionAlias, - TableScanFetch, TableScanProjection, TableScanSourceTableName, TableScanTableName, - TimeDimensionDateRange, TimeDimensionGranularity, TimeDimensionName, + MemberPushdownReplacerAliasToCube, MemberReplacerAliasToCube, + MultiFactJoinWrapperJoinMembers, ProjectionAlias, TableScanFetch, TableScanProjection, + TableScanSourceTableName, TableScanTableName, TimeDimensionDateRange, + TimeDimensionGranularity, TimeDimensionName, }, }, config::ConfigObj, @@ -41,7 +42,7 @@ use cubeclient::models::V1CubeMetaMeasure; use datafusion::{ arrow::datatypes::DataType, logical_plan::{Column, DFSchema, Expr, Operator}, - physical_plan::aggregates::AggregateFunction, + physical_plan::{aggregates::AggregateFunction, functions::BuiltinScalarFunction}, scalar::ScalarValue, }; use egg::{Id, Subst, Var}; @@ -434,6 +435,348 @@ impl RewriteRules for MemberRules { "?out_join_hints", ), ), + // Merge a join of two (view) CubeScans on a dimension that resolves + // to the same underlying cube member into a single CubeScan wrapped + // in a MultiFactJoinWrapper. The wrapper records the join key (as + // underlying cube members) so the aggregate finalize rule can later + // require the GROUP BY to match it, and so additional joins (3+ + // views) and WHERE filters can compose before finalization. 
+ transforming_rewrite( + "shared-member-join-to-wrapper", + join( + cube_scan( + "?left_alias_to_cube", + "?left_members", + "?left_filters", + "?left_orders", + "CubeScanLimit:None", + "CubeScanOffset:None", + "?left_split", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?left_join_hints", + ), + cube_scan( + "?right_alias_to_cube", + "?right_members", + "?right_filters", + "?right_orders", + "CubeScanLimit:None", + "CubeScanOffset:None", + "?right_split", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?right_join_hints", + ), + "?left_on", + "?right_on", + "?join_type", + "?join_constraint", + "?null_equals_null", + ), + multi_fact_join_wrapper( + cube_scan( + "?out_alias_to_cube", + cube_scan_members("?left_members", "?right_members"), + cube_scan_filters("?left_filters", "?right_filters"), + cube_scan_order_empty_tail(), + "CubeScanLimit:None", + "CubeScanOffset:None", + "CubeScanSplit:false", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?out_join_hints", + ), + "?join_members", + ), + self.merge_shared_member_join( + "?left_alias_to_cube", + "?right_alias_to_cube", + "?out_alias_to_cube", + "?left_members", + "?right_members", + "?left_on", + "?right_on", + "?join_type", + "?left_join_hints", + "?right_join_hints", + "?out_join_hints", + "?left_filters", + "?join_members", + None, + ), + ), + // Extend a MultiFactJoinWrapper with another joined (view) CubeScan, + // supporting joins of 3+ views. The new join must again be on a + // dimension resolving to the same underlying member; its key is + // unioned into the wrapper's recorded join members. + // + // Only the wrapper-on-the-left shape is matched, which is the + // left-deep tree SQL parsers produce for `a JOIN b JOIN c`. 
A + // right-associative `a JOIN (b JOIN c)` (explicit parentheses) keeps + // the wrapper on the right and is not chained; it falls back to + // standard join handling. + transforming_rewrite( + "shared-member-join-extend-wrapper", + join( + multi_fact_join_wrapper( + cube_scan( + "?left_alias_to_cube", + "?left_members", + "?left_filters", + "?left_orders", + "CubeScanLimit:None", + "CubeScanOffset:None", + "?left_split", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?left_join_hints", + ), + "?prev_join_members", + ), + cube_scan( + "?right_alias_to_cube", + "?right_members", + "?right_filters", + "?right_orders", + "CubeScanLimit:None", + "CubeScanOffset:None", + "?right_split", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?right_join_hints", + ), + "?left_on", + "?right_on", + "?join_type", + "?join_constraint", + "?null_equals_null", + ), + multi_fact_join_wrapper( + cube_scan( + "?out_alias_to_cube", + cube_scan_members("?left_members", "?right_members"), + cube_scan_filters("?left_filters", "?right_filters"), + cube_scan_order_empty_tail(), + "CubeScanLimit:None", + "CubeScanOffset:None", + "CubeScanSplit:false", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?out_join_hints", + ), + "?join_members", + ), + self.merge_shared_member_join( + "?left_alias_to_cube", + "?right_alias_to_cube", + "?out_alias_to_cube", + "?left_members", + "?right_members", + "?left_on", + "?right_on", + "?join_type", + "?left_join_hints", + "?right_join_hints", + "?out_join_hints", + "?left_filters", + "?join_members", + Some("?prev_join_members"), + ), + ), + // Merge an INNER join expressed as a date-truncated equality + // (`ON DATE_TRUNC(g, a.ts) = DATE_TRUNC(g, b.ts)`), which the SQL + // planner lowers to Filter(, CrossJoin(...)) rather than a column + // equi-join, into a single multi-fact CubeScan. 
Both truncated + // columns must resolve to the same underlying time member at the + // same granularity. A filtered cross join is INNER, so both keys are + // marked present. + transforming_rewrite( + "shared-time-member-cross-join-to-wrapper", + filter( + binary_expr( + self.fun_expr( + "DateTrunc", + vec![ + literal_expr("?left_granularity"), + column_expr("?left_column"), + ], + ), + "=", + self.fun_expr( + "DateTrunc", + vec![ + literal_expr("?right_granularity"), + column_expr("?right_column"), + ], + ), + ), + cross_join( + cube_scan( + "?left_alias_to_cube", + "?left_members", + "?left_filters", + "?left_orders", + "CubeScanLimit:None", + "CubeScanOffset:None", + "?left_split", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?left_join_hints", + ), + cube_scan( + "?right_alias_to_cube", + "?right_members", + "?right_filters", + "?right_orders", + "CubeScanLimit:None", + "CubeScanOffset:None", + "?right_split", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?right_join_hints", + ), + ), + ), + multi_fact_join_wrapper( + cube_scan( + "?out_alias_to_cube", + cube_scan_members("?left_members", "?right_members"), + cube_scan_filters("?left_filters", "?right_filters"), + cube_scan_order_empty_tail(), + "CubeScanLimit:None", + "CubeScanOffset:None", + "CubeScanSplit:false", + "CubeScanCanPushdownJoin:true", + "CubeScanWrapped:false", + "CubeScanUngrouped:true", + "?out_join_hints", + ), + "?join_members", + ), + self.merge_shared_time_cross_join( + "?left_alias_to_cube", + "?right_alias_to_cube", + "?out_alias_to_cube", + "?left_members", + "?right_members", + "?left_column", + "?left_granularity", + "?right_column", + "?right_granularity", + "?left_join_hints", + "?right_join_hints", + "?out_join_hints", + "?left_filters", + "?join_members", + ), + ), + // Absorb a date-truncated equality filter sitting on top of a + // MultiFactJoinWrapper as an additional (time) join key. 
This covers + // joins on a mix of a plain dimension and a DATE_TRUNC: the planner + // turns `ON a.dim = b.dim AND DATE_TRUNC(g, a.ts) = DATE_TRUNC(g, b.ts)` + // into Filter(, Join(a.dim = b.dim, ...)). The column join + // becomes the wrapper; this rule folds the truncated time member into + // the recorded join key (and marks both time columns present, since a + // post-join equality is effectively INNER on that key). + transforming_rewrite( + "multi-fact-join-wrapper-absorb-time-key", + filter( + binary_expr( + self.fun_expr( + "DateTrunc", + vec![ + literal_expr("?abs_left_granularity"), + column_expr("?abs_left_column"), + ], + ), + "=", + self.fun_expr( + "DateTrunc", + vec![ + literal_expr("?abs_right_granularity"), + column_expr("?abs_right_column"), + ], + ), + ), + multi_fact_join_wrapper( + cube_scan( + "?abs_alias_to_cube", + "?abs_members", + "?abs_filters", + "?abs_orders", + "?abs_limit", + "?abs_offset", + "?abs_split", + "?abs_can_pushdown_join", + "?abs_wrapped", + "?abs_ungrouped", + "?abs_join_hints", + ), + "?abs_prev_join_members", + ), + ), + multi_fact_join_wrapper( + cube_scan( + "?abs_alias_to_cube", + "?abs_members", + "?abs_out_filters", + "?abs_orders", + "?abs_limit", + "?abs_offset", + "?abs_split", + "?abs_can_pushdown_join", + "?abs_wrapped", + "?abs_ungrouped", + "?abs_join_hints", + ), + "?abs_join_members", + ), + self.absorb_time_key_into_wrapper( + "?abs_members", + "?abs_left_column", + "?abs_left_granularity", + "?abs_right_column", + "?abs_right_granularity", + "?abs_filters", + "?abs_out_filters", + "?abs_prev_join_members", + "?abs_join_members", + ), + ), + // Push a Filter (e.g. a WHERE on top of the join) down through the + // wrapper into the merged CubeScan, where the standard filter + // push-down rules turn it into a Cube query filter. 
+ rewrite( + "multi-fact-join-wrapper-filter-push-down", + filter( + "?filter_expr", + multi_fact_join_wrapper("?wrapped_input", "?join_members"), + ), + multi_fact_join_wrapper(filter("?filter_expr", "?wrapped_input"), "?join_members"), + ), + // Finalize: once the query is grouped and the GROUP BY matches the + // recorded join key, drop the wrapper so the standard aggregate + // push-down turns the merged scan into a (multi-fact) CubeScan. + transforming_rewrite( + "aggregate-multi-fact-join-wrapper", + aggregate( + multi_fact_join_wrapper("?scan", "?join_members"), + "?group_expr", + "?aggr_expr", + "?agg_split", + ), + aggregate("?scan", "?group_expr", "?aggr_expr", "?agg_split"), + self.finalize_shared_member_join("?scan", "?join_members", "?group_expr"), + ), ]; rules.extend(self.member_pushdown_rules()); @@ -2866,6 +3209,729 @@ impl MemberRules { } } + #[allow(clippy::too_many_arguments)] + fn merge_shared_member_join( + &self, + left_alias_to_cube_var: &'static str, + right_alias_to_cube_var: &'static str, + out_alias_to_cube_var: &'static str, + left_members_var: &'static str, + right_members_var: &'static str, + left_on_var: &'static str, + right_on_var: &'static str, + join_type_var: &'static str, + left_join_hints_var: &'static str, + right_join_hints_var: &'static str, + out_join_hints_var: &'static str, + left_filters_var: &'static str, + join_members_var: &'static str, + prev_join_members_var: Option<&'static str>, + ) -> impl Fn(&mut CubeEGraph, &mut Subst) -> bool { + let left_alias_to_cube_var = var!(left_alias_to_cube_var); + let right_alias_to_cube_var = var!(right_alias_to_cube_var); + let out_alias_to_cube_var = var!(out_alias_to_cube_var); + let left_members_var = var!(left_members_var); + let right_members_var = var!(right_members_var); + let left_on_var = var!(left_on_var); + let right_on_var = var!(right_on_var); + let join_type_var = var!(join_type_var); + let left_join_hints_var = var!(left_join_hints_var); + let right_join_hints_var = 
var!(right_join_hints_var); + let out_join_hints_var = var!(out_join_hints_var); + let left_filters_var = var!(left_filters_var); + let join_members_var = var!(join_members_var); + let prev_join_members_var = prev_join_members_var.map(|v| var!(v)); + let meta_context = self.meta_context.clone(); + // Merging a view join into a single multi-fact CubeScan relies on the + // Tesseract SQL planner (it stitches the fact groups with a FULL OUTER + // JOIN over the shared key). Only enable this rewrite when Tesseract is + // enabled; the legacy planner would mis-handle the resulting query. + let enable_tesseract_sql_planner = self.config_obj.enable_tesseract_sql_planner(); + move |egraph, subst| { + if !enable_tesseract_sql_planner { + return false; + } + fn dimension_member_name( + egraph: &mut CubeEGraph, + members_id: Id, + column: &Column, + ) -> Option { + match egraph[members_id].data.find_member_by_column(column) { + Some(((_, Member::Dimension { name, .. }, _), _)) + | Some(((_, Member::TimeDimension { name, .. }, _), _)) => Some(name.clone()), + _ => None, + } + } + + let resolve_underlying = |member_name: &str| -> String { + meta_context + .find_dimension_with_name(member_name) + .and_then(|dim| dim.alias_member.clone()) + .unwrap_or_else(|| member_name.to_string()) + }; + + // The join must be on dimensions that resolve to the same + // underlying cube member on both sides (a shared key). 
+ let left_join_ons = var_iter!(egraph[subst[left_on_var]], JoinLeftOn) + .cloned() + .collect::>(); + let right_join_ons = var_iter!(egraph[subst[right_on_var]], JoinRightOn) + .cloned() + .collect::>(); + + let mut matched: Option<(String, String, Vec, Vec)> = None; + 'pairs: for left_on in left_join_ons.iter() { + for right_on in right_join_ons.iter() { + if left_on.is_empty() || left_on.len() != right_on.len() { + continue; + } + let mut left_cube_name: Option = None; + let mut right_cube_name: Option = None; + let mut left_keys: Vec = vec![]; + let mut right_keys: Vec = vec![]; + let mut all_match = true; + for (left_column, right_column) in left_on.iter().zip(right_on.iter()) { + let Some(left_name) = + dimension_member_name(egraph, subst[left_members_var], left_column) + else { + all_match = false; + break; + }; + let Some(right_name) = + dimension_member_name(egraph, subst[right_members_var], right_column) + else { + all_match = false; + break; + }; + if resolve_underlying(&left_name) != resolve_underlying(&right_name) { + all_match = false; + break; + } + // A CubeScan can expose members from multiple cubes/views, + // so every join-key column on a given side must resolve to + // the same cube/view. Otherwise the merged join hint would + // be ambiguous and the merge is not a single shared-member + // join we can represent. 
+ let this_left_cube = left_name.split('.').next().map(|s| s.to_string()); + let this_right_cube = right_name.split('.').next().map(|s| s.to_string()); + if left_cube_name.is_some() && left_cube_name != this_left_cube { + all_match = false; + break; + } + if right_cube_name.is_some() && right_cube_name != this_right_cube { + all_match = false; + break; + } + left_cube_name = this_left_cube; + right_cube_name = this_right_cube; + left_keys.push(left_name); + right_keys.push(right_name); + } + if all_match { + if let (Some(left_cube_name), Some(right_cube_name)) = + (left_cube_name, right_cube_name) + { + matched = + Some((left_cube_name, right_cube_name, left_keys, right_keys)); + break 'pairs; + } + } + } + } + + let Some((left_cube, right_cube, shared_left_keys, shared_right_keys)) = matched else { + return false; + }; + + // Re-introduce INNER/LEFT/RIGHT semantics on top of the FULL OUTER + // multi-fact stitch by requiring the join key of each "must be + // present" side to be set (FULL adds nothing). + let mut require_left = false; + let mut require_right = false; + if let Some(join_type) = var_list_iter!(egraph[subst[join_type_var]], JoinJoinType) + .cloned() + .next() + { + match join_type.0 { + datafusion::prelude::JoinType::Inner => { + require_left = true; + require_right = true; + } + datafusion::prelude::JoinType::Left => require_left = true, + datafusion::prelude::JoinType::Right => require_right = true, + _ => {} + } + } + + let mut presence_members: Vec = vec![]; + if require_left { + presence_members.extend(shared_left_keys.iter().cloned()); + } + if require_right { + presence_members.extend(shared_right_keys.iter().cloned()); + } + + // The join key as underlying cube members, unioned with any keys + // already recorded on the left wrapper (for chained 3+ view joins). 
+ let mut join_member_names: Vec<(String, Option)> = shared_left_keys + .iter() + .map(|k| (resolve_underlying(k), None)) + .collect(); + if let Some(prev_var) = prev_join_members_var { + if let Some(prev) = + var_iter!(egraph[subst[prev_var]], MultiFactJoinWrapperJoinMembers).next() + { + join_member_names.extend(prev.iter().cloned()); + } + } + join_member_names.sort(); + join_member_names.dedup(); + + for left_alias_to_cube in + var_iter!(egraph[subst[left_alias_to_cube_var]], CubeScanAliasToCube) + { + for right_alias_to_cube in + var_iter!(egraph[subst[right_alias_to_cube_var]], CubeScanAliasToCube) + { + for left_join_hints in + var_iter!(egraph[subst[left_join_hints_var]], CubeScanJoinHints) + { + for right_join_hints in + var_iter!(egraph[subst[right_join_hints_var]], CubeScanJoinHints) + { + let out_alias_to_cube = CubeScanAliasToCube( + left_alias_to_cube + .iter() + .chain(right_alias_to_cube.iter()) + .cloned() + .collect(), + ); + + let out_join_hints = CubeScanJoinHints( + left_join_hints + .iter() + .chain(right_join_hints.iter()) + .cloned() + .chain(iter::once(vec![left_cube.clone(), right_cube.clone()])) + .collect(), + ); + + subst.insert( + out_alias_to_cube_var, + egraph.add(LogicalPlanLanguage::CubeScanAliasToCube( + out_alias_to_cube, + )), + ); + + subst.insert( + out_join_hints_var, + egraph.add(LogicalPlanLanguage::CubeScanJoinHints(out_join_hints)), + ); + + let join_members_id = + egraph.add(LogicalPlanLanguage::MultiFactJoinWrapperJoinMembers( + MultiFactJoinWrapperJoinMembers(join_member_names.clone()), + )); + subst.insert(join_members_var, join_members_id); + + // Add the join-semantics presence filters only once a + // concrete merge is being produced, so a `false` return + // never leaves a stale `subst` entry behind. 
+ if !presence_members.is_empty() { + let mut acc = subst[left_filters_var]; + for name in &presence_members { + let member = + egraph.add(LogicalPlanLanguage::FilterMemberMember( + crate::compile::rewrite::FilterMemberMember( + name.clone(), + ), + )); + let op = egraph.add(LogicalPlanLanguage::FilterMemberOp( + crate::compile::rewrite::FilterMemberOp("set".to_string()), + )); + let values = + egraph.add(LogicalPlanLanguage::FilterMemberValues( + crate::compile::rewrite::FilterMemberValues(vec![]), + )); + let filter_member = + egraph.add(LogicalPlanLanguage::FilterMember([ + member, op, values, + ])); + acc = egraph.add(LogicalPlanLanguage::CubeScanFilters(vec![ + filter_member, + acc, + ])); + } + subst.insert(left_filters_var, acc); + } + + return true; + } + } + } + } + + false + } + } + + // Same merge as `merge_shared_member_join`, but the join is a date-truncated + // equality the planner lowered to Filter(CrossJoin(...)). Resolves the two + // truncated columns to time-dimension members on each side, requires the + // same underlying member at the same granularity, and produces an INNER + // multi-fact CubeScan wrapped in a MultiFactJoinWrapper. 
+ #[allow(clippy::too_many_arguments)] + fn merge_shared_time_cross_join( + &self, + left_alias_to_cube_var: &'static str, + right_alias_to_cube_var: &'static str, + out_alias_to_cube_var: &'static str, + left_members_var: &'static str, + right_members_var: &'static str, + left_column_var: &'static str, + left_granularity_var: &'static str, + right_column_var: &'static str, + right_granularity_var: &'static str, + left_join_hints_var: &'static str, + right_join_hints_var: &'static str, + out_join_hints_var: &'static str, + left_filters_var: &'static str, + join_members_var: &'static str, + ) -> impl Fn(&mut CubeEGraph, &mut Subst) -> bool { + let left_alias_to_cube_var = var!(left_alias_to_cube_var); + let right_alias_to_cube_var = var!(right_alias_to_cube_var); + let out_alias_to_cube_var = var!(out_alias_to_cube_var); + let left_members_var = var!(left_members_var); + let right_members_var = var!(right_members_var); + let left_column_var = var!(left_column_var); + let left_granularity_var = var!(left_granularity_var); + let right_column_var = var!(right_column_var); + let right_granularity_var = var!(right_granularity_var); + let left_join_hints_var = var!(left_join_hints_var); + let right_join_hints_var = var!(right_join_hints_var); + let out_join_hints_var = var!(out_join_hints_var); + let left_filters_var = var!(left_filters_var); + let join_members_var = var!(join_members_var); + let meta_context = self.meta_context.clone(); + let enable_tesseract_sql_planner = self.config_obj.enable_tesseract_sql_planner(); + move |egraph, subst| { + if !enable_tesseract_sql_planner { + return false; + } + fn dimension_member_name( + egraph: &mut CubeEGraph, + members_id: Id, + column: &Column, + ) -> Option { + match egraph[members_id].data.find_member_by_column(column) { + Some(((_, Member::Dimension { name, .. }, _), _)) + | Some(((_, Member::TimeDimension { name, .. 
}, _), _)) => Some(name.clone()), + _ => None, + } + } + + let resolve_underlying = |member_name: &str| -> String { + meta_context + .find_dimension_with_name(member_name) + .and_then(|dim| dim.alias_member.clone()) + .unwrap_or_else(|| member_name.to_string()) + }; + + // Both sides must be truncated to the same granularity for the + // stitch key to line up. + let Some(left_granularity) = + var_iter!(egraph[subst[left_granularity_var]], LiteralExprValue) + .find_map(|v| utils::parse_granularity(v, false)) + else { + return false; + }; + let Some(right_granularity) = + var_iter!(egraph[subst[right_granularity_var]], LiteralExprValue) + .find_map(|v| utils::parse_granularity(v, false)) + else { + return false; + }; + if left_granularity != right_granularity { + return false; + } + + let Some(binary_left_col) = var_iter!(egraph[subst[left_column_var]], ColumnExprColumn) + .next() + .cloned() + else { + return false; + }; + let Some(binary_right_col) = + var_iter!(egraph[subst[right_column_var]], ColumnExprColumn) + .next() + .cloned() + else { + return false; + }; + + // The equality columns may be written in either order relative to + // the cross-join sides, so resolve each against both scans and pick + // the assignment where one column belongs to the left scan and the + // other to the right. 
+ let bl_on_left = + dimension_member_name(egraph, subst[left_members_var], &binary_left_col); + let br_on_right = + dimension_member_name(egraph, subst[right_members_var], &binary_right_col); + let br_on_left = + dimension_member_name(egraph, subst[left_members_var], &binary_right_col); + let bl_on_right = + dimension_member_name(egraph, subst[right_members_var], &binary_left_col); + let (left_key, right_key) = if let (Some(l), Some(r)) = (bl_on_left, br_on_right) { + (l, r) + } else if let (Some(l), Some(r)) = (br_on_left, bl_on_right) { + (l, r) + } else { + return false; + }; + + if resolve_underlying(&left_key) != resolve_underlying(&right_key) { + return false; + } + let Some(left_cube) = left_key.split('.').next().map(|s| s.to_string()) else { + return false; + }; + let Some(right_cube) = right_key.split('.').next().map(|s| s.to_string()) else { + return false; + }; + + // A filtered cross join is an INNER join: require both keys present. + let presence_members: Vec<String> = vec![left_key.clone(), right_key.clone()]; + let mut join_member_names: Vec<(String, Option<String>)> = vec![( + resolve_underlying(&left_key), + Some(left_granularity.clone()), + )]; + join_member_names.sort(); + join_member_names.dedup(); + + for left_alias_to_cube in + var_iter!(egraph[subst[left_alias_to_cube_var]], CubeScanAliasToCube) + { + for right_alias_to_cube in + var_iter!(egraph[subst[right_alias_to_cube_var]], CubeScanAliasToCube) + { + for left_join_hints in + var_iter!(egraph[subst[left_join_hints_var]], CubeScanJoinHints) + { + for right_join_hints in + var_iter!(egraph[subst[right_join_hints_var]], CubeScanJoinHints) + { + let out_alias_to_cube = CubeScanAliasToCube( + left_alias_to_cube + .iter() + .chain(right_alias_to_cube.iter()) + .cloned() + .collect(), + ); + + let out_join_hints = CubeScanJoinHints( + left_join_hints + .iter() + .chain(right_join_hints.iter()) + .cloned() + .chain(iter::once(vec![left_cube.clone(), right_cube.clone()])) + .collect(), + ); + + subst.insert( + 
out_alias_to_cube_var, + egraph.add(LogicalPlanLanguage::CubeScanAliasToCube( + out_alias_to_cube, + )), + ); + + subst.insert( + out_join_hints_var, + egraph.add(LogicalPlanLanguage::CubeScanJoinHints(out_join_hints)), + ); + + let join_members_id = + egraph.add(LogicalPlanLanguage::MultiFactJoinWrapperJoinMembers( + MultiFactJoinWrapperJoinMembers(join_member_names.clone()), + )); + subst.insert(join_members_var, join_members_id); + + let mut acc = subst[left_filters_var]; + for name in &presence_members { + let member = egraph.add(LogicalPlanLanguage::FilterMemberMember( + crate::compile::rewrite::FilterMemberMember(name.clone()), + )); + let op = egraph.add(LogicalPlanLanguage::FilterMemberOp( + crate::compile::rewrite::FilterMemberOp("set".to_string()), + )); + let values = egraph.add(LogicalPlanLanguage::FilterMemberValues( + crate::compile::rewrite::FilterMemberValues(vec![]), + )); + let filter_member = egraph + .add(LogicalPlanLanguage::FilterMember([member, op, values])); + acc = egraph.add(LogicalPlanLanguage::CubeScanFilters(vec![ + filter_member, + acc, + ])); + } + subst.insert(left_filters_var, acc); + + return true; + } + } + } + } + + false + } + } + + // Fold a date-truncated equality (DATE_TRUNC(g, a.ts) = DATE_TRUNC(g, b.ts)) + // that sits on top of a MultiFactJoinWrapper into the wrapper's recorded join + // key. Both truncated columns must resolve to the same underlying time member + // (at the same granularity) on the merged scan; both are marked present. 
+ #[allow(clippy::too_many_arguments)] + fn absorb_time_key_into_wrapper( + &self, + members_var: &'static str, + left_column_var: &'static str, + left_granularity_var: &'static str, + right_column_var: &'static str, + right_granularity_var: &'static str, + filters_var: &'static str, + out_filters_var: &'static str, + prev_join_members_var: &'static str, + join_members_var: &'static str, + ) -> impl Fn(&mut CubeEGraph, &mut Subst) -> bool { + let members_var = var!(members_var); + let left_column_var = var!(left_column_var); + let left_granularity_var = var!(left_granularity_var); + let right_column_var = var!(right_column_var); + let right_granularity_var = var!(right_granularity_var); + let filters_var = var!(filters_var); + let out_filters_var = var!(out_filters_var); + let prev_join_members_var = var!(prev_join_members_var); + let join_members_var = var!(join_members_var); + let meta_context = self.meta_context.clone(); + let enable_tesseract_sql_planner = self.config_obj.enable_tesseract_sql_planner(); + move |egraph, subst| { + if !enable_tesseract_sql_planner { + return false; + } + fn dimension_member_name( + egraph: &mut CubeEGraph, + members_id: Id, + column: &Column, + ) -> Option<String> { + match egraph[members_id].data.find_member_by_column(column) { + Some(((_, Member::Dimension { name, .. }, _), _)) + | Some(((_, Member::TimeDimension { name, .. 
}, _), _)) => Some(name.clone()), + _ => None, + } + } + + let resolve_underlying = |member_name: &str| -> String { + meta_context + .find_dimension_with_name(member_name) + .and_then(|dim| dim.alias_member.clone()) + .unwrap_or_else(|| member_name.to_string()) + }; + + let Some(left_granularity) = + var_iter!(egraph[subst[left_granularity_var]], LiteralExprValue) + .find_map(|v| utils::parse_granularity(v, false)) + else { + return false; + }; + let Some(right_granularity) = + var_iter!(egraph[subst[right_granularity_var]], LiteralExprValue) + .find_map(|v| utils::parse_granularity(v, false)) + else { + return false; + }; + if left_granularity != right_granularity { + return false; + } + + let Some(left_col) = var_iter!(egraph[subst[left_column_var]], ColumnExprColumn) + .next() + .cloned() + else { + return false; + }; + let Some(right_col) = var_iter!(egraph[subst[right_column_var]], ColumnExprColumn) + .next() + .cloned() + else { + return false; + }; + + // Both columns live on the merged scan; resolve them to time members. + let Some(left_key) = dimension_member_name(egraph, subst[members_var], &left_col) + else { + return false; + }; + let Some(right_key) = dimension_member_name(egraph, subst[members_var], &right_col) + else { + return false; + }; + if resolve_underlying(&left_key) != resolve_underlying(&right_key) { + return false; + } + + // Time key recorded for the GROUP BY check at finalize, unioned with + // the keys already on the wrapper. 
+ let mut join_member_names: Vec<(String, Option<String>)> = vec![( + resolve_underlying(&left_key), + Some(left_granularity.clone()), + )]; + if let Some(prev) = var_iter!( + egraph[subst[prev_join_members_var]], + MultiFactJoinWrapperJoinMembers + ) + .next() + { + join_member_names.extend(prev.iter().cloned()); + } + join_member_names.sort(); + join_member_names.dedup(); + + let join_members_id = egraph.add(LogicalPlanLanguage::MultiFactJoinWrapperJoinMembers( + MultiFactJoinWrapperJoinMembers(join_member_names), + )); + subst.insert(join_members_var, join_members_id); + + // INNER on the time key: both columns must be present. + let presence_members = [left_key, right_key]; + let mut acc = subst[filters_var]; + for name in &presence_members { + let member = egraph.add(LogicalPlanLanguage::FilterMemberMember( + crate::compile::rewrite::FilterMemberMember(name.clone()), + )); + let op = egraph.add(LogicalPlanLanguage::FilterMemberOp( + crate::compile::rewrite::FilterMemberOp("set".to_string()), + )); + let values = egraph.add(LogicalPlanLanguage::FilterMemberValues( + crate::compile::rewrite::FilterMemberValues(vec![]), + )); + let filter_member = + egraph.add(LogicalPlanLanguage::FilterMember([member, op, values])); + acc = egraph.add(LogicalPlanLanguage::CubeScanFilters(vec![ + filter_member, + acc, + ])); + } + subst.insert(out_filters_var, acc); + + true + } + } + + // Finalize a MultiFactJoinWrapper: only unwrap it (letting the standard + // aggregate push-down produce the merged multi-fact CubeScan) when the + // query's GROUP BY is exactly the recorded shared join key. This rejects + // ungrouped queries (no aggregate matches) and queries grouping by a + // non-join-key dimension. 
+ fn finalize_shared_member_join( + &self, + scan_var: &'static str, + join_members_var: &'static str, + group_expr_var: &'static str, + ) -> impl Fn(&mut CubeEGraph, &mut Subst) -> bool { + let scan_var = var!(scan_var); + let join_members_var = var!(join_members_var); + let group_expr_var = var!(group_expr_var); + let meta_context = self.meta_context.clone(); + move |egraph, subst| { + fn dimension_member_name( + egraph: &mut CubeEGraph, + scan_id: Id, + column: &Column, + ) -> Option<String> { + match egraph[scan_id].data.find_member_by_column(column) { + Some(((_, Member::Dimension { name, .. }, _), _)) + | Some(((_, Member::TimeDimension { name, .. }, _), _)) => Some(name.clone()), + _ => None, + } + } + + let resolve_underlying = |member_name: &str| -> String { + meta_context + .find_dimension_with_name(member_name) + .and_then(|dim| dim.alias_member.clone()) + .unwrap_or_else(|| member_name.to_string()) + }; + + let join_members: Vec<(String, Option<String>)> = match var_iter!( + egraph[subst[join_members_var]], + MultiFactJoinWrapperJoinMembers + ) + .next() + { + Some(jm) => jm.clone(), + None => return false, + }; + if join_members.is_empty() { + return false; + } + let join_set: HashSet<(String, Option<String>)> = join_members.into_iter().collect(); + + // The actual GROUP BY expressions (with their full structure, so a + // `DATE_TRUNC(g, col)` keeps its granularity). `referenced_expr` + // can't be used here because it collapses a wrapped expression to + // its inner column and would drop the grain. + let group_child_ids: Vec<Id> = + match var_list_iter!(egraph[subst[group_expr_var]], AggregateGroupExpr).next() { + Some(ids) => ids.clone(), + None => return false, + }; + if group_child_ids.is_empty() { + return false; + } + + // Every GROUP BY expression must be either a plain dimension column + // (no granularity) or a `DATE_TRUNC(g, col)` over one, and the + // resulting (underlying member, granularity) pair must be part of + // the recorded join key. 
A join on `DATE_TRUNC('month', ...)` paired + // with `GROUP BY DATE_TRUNC('day', ...)` therefore won't merge: the + // multi-fact stitch happens at the GROUP BY grain, which must match + // the grain the user joined on. + let mut group_set: HashSet<(String, Option<String>)> = HashSet::new(); + for child_id in &group_child_ids { + let Some(OriginalExpr::Expr(expr)) = egraph[*child_id].data.original_expr.clone() + else { + return false; + }; + let (column, granularity) = match &expr { + Expr::Column(col) => (col.clone(), None), + Expr::ScalarFunction { + fun: BuiltinScalarFunction::DateTrunc, + args, + } if args.len() == 2 => { + let Expr::Literal(scalar) = &args[0] else { + return false; + }; + let Some(granularity) = utils::parse_granularity(scalar, false) else { + return false; + }; + let Expr::Column(col) = &args[1] else { + return false; + }; + (col.clone(), Some(granularity)) + } + _ => return false, + }; + let Some(member_name) = dimension_member_name(egraph, subst[scan_var], &column) + else { + return false; + }; + group_set.insert((resolve_underlying(&member_name), granularity)); + } + + // GROUP BY must match the join key exactly, member and grain. 
+ group_set == join_set + } + } + fn push_down_cross_join_to_cubescan_rewrite( &self, name: &str, diff --git a/rust/cubesql/cubesql/src/compile/test/mod.rs b/rust/cubesql/cubesql/src/compile/test/mod.rs index 20e63e584b5f8..cfefbc79da11d 100644 --- a/rust/cubesql/cubesql/src/compile/test/mod.rs +++ b/rust/cubesql/cubesql/src/compile/test/mod.rs @@ -33,6 +33,8 @@ pub mod test_cube_join; #[cfg(test)] pub mod test_cube_join_grouped; #[cfg(test)] +pub mod test_cube_join_views; +#[cfg(test)] pub mod test_cube_scan; #[cfg(test)] pub mod test_df_execution; diff --git a/rust/cubesql/cubesql/src/compile/test/test_cube_join_views.rs b/rust/cubesql/cubesql/src/compile/test/test_cube_join_views.rs new file mode 100644 index 0000000000000..ee72cca37f4a0 --- /dev/null +++ b/rust/cubesql/cubesql/src/compile/test/test_cube_join_views.rs @@ -0,0 +1,897 @@ +use std::sync::Arc; + +use cubeclient::models::{ + V1CubeMetaType, V1LoadRequestQuery, V1LoadRequestQueryFilterItem, + V1LoadRequestQueryTimeDimension, +}; +use pretty_assertions::assert_eq; + +use crate::{ + compile::{ + rewrite::rewriter::Rewriter, + test::{ + convert_sql_to_cube_query, get_test_session_with_config, get_test_tenant_ctx_with_meta, + init_testing_logger, utils::LogicalPlanTestUtils, + }, + CompilationError, DatabaseProtocol, QueryPlan, + }, + config::{ConfigObj, ConfigObjImpl}, + transport::{CubeMeta, CubeMetaDimension, CubeMetaMeasure}, +}; + +/// Two views that both expose the same underlying `customers.customer_city` +/// dimension (via `aliasMember`). `orders_view` carries an `orders` measure +/// while `customers_view` carries a `customers` measure, so a query that +/// touches both is a multi-fact query joined on the shared key. 
+fn views_meta() -> Vec<CubeMeta> { + let dimension = |name: &str, alias: &str| CubeMetaDimension { + name: name.to_string(), + r#type: "string".to_string(), + alias_member: Some(alias.to_string()), + ..CubeMetaDimension::default() + }; + let time_dimension = |name: &str, alias: &str| CubeMetaDimension { + name: name.to_string(), + r#type: "time".to_string(), + alias_member: Some(alias.to_string()), + ..CubeMetaDimension::default() + }; + let measure = |name: &str, alias: &str, agg: &str| CubeMetaMeasure { + name: name.to_string(), + title: None, + short_title: None, + description: None, + r#type: "number".to_string(), + agg_type: Some(agg.to_string()), + meta: None, + alias_member: Some(alias.to_string()), + format: None, + format_description: None, + currency: None, + }; + + vec![ + CubeMeta { + name: "customers_view".to_string(), + description: None, + title: None, + r#type: V1CubeMetaType::View, + dimensions: vec![ + dimension("customers_view.customer_city", "customers.customer_city"), + dimension("customers_view.customer_state", "customers.customer_state"), + // A second dimension that is NOT a join key, used to test that a + // query grouping by it (instead of the join key) is not merged. 
+ dimension("customers_view.status", "customers.status"), + time_dimension("customers_view.created_at", "customers.created_at"), + ], + measures: vec![measure( + "customers_view.avg_age", + "customers.avg_age", + "avg", + )], + segments: vec![], + joins: None, + folders: None, + nested_folders: None, + hierarchies: None, + meta: None, + }, + CubeMeta { + name: "orders_view".to_string(), + description: None, + title: None, + r#type: V1CubeMetaType::View, + dimensions: vec![ + dimension("orders_view.customer_city", "customers.customer_city"), + dimension("orders_view.customer_state", "customers.customer_state"), + dimension("orders_view.status", "orders.status"), + time_dimension("orders_view.created_at", "customers.created_at"), + ], + measures: vec![measure("orders_view.revenue", "orders.revenue", "sum")], + segments: vec![], + joins: None, + folders: None, + nested_folders: None, + hierarchies: None, + meta: None, + }, + CubeMeta { + name: "returns_view".to_string(), + description: None, + title: None, + r#type: V1CubeMetaType::View, + dimensions: vec![dimension( + "returns_view.customer_city", + "customers.customer_city", + )], + measures: vec![measure("returns_view.refunds", "returns.refunds", "sum")], + segments: vec![], + joins: None, + folders: None, + nested_folders: None, + hierarchies: None, + meta: None, + }, + CubeMeta { + name: "payments_view".to_string(), + description: None, + title: None, + r#type: V1CubeMetaType::View, + dimensions: vec![dimension( + "payments_view.customer_city", + "customers.customer_city", + )], + measures: vec![measure("payments_view.paid", "payments.paid", "sum")], + segments: vec![], + joins: None, + folders: None, + nested_folders: None, + hierarchies: None, + meta: None, + }, + ] +} + +fn set_filter(member: &str) -> V1LoadRequestQueryFilterItem { + V1LoadRequestQueryFilterItem { + member: Some(member.to_string()), + operator: Some("set".to_string()), + values: None, + or: None, + and: None, + } +} + +/// Plans `sql` against 
the two views, with the Tesseract SQL planner enabled or +/// disabled. The shared-member view-join merge only fires when Tesseract is +/// enabled. +async fn plan_view_join(sql: &str, tesseract: bool) -> Result<QueryPlan, CompilationError> { + let meta = get_test_tenant_ctx_with_meta(views_meta()); + let mut config = ConfigObjImpl::default(); + config.tesseract_sql_planner = tesseract; + let config: Arc<dyn ConfigObj> = Arc::new(config); + let session = + get_test_session_with_config(DatabaseProtocol::PostgreSQL, config, meta.clone()).await; + convert_sql_to_cube_query(&sql.to_string(), meta, session).await +} + +const GROUPED_LEFT_JOIN: &str = r#" + SELECT c.customer_city, measure(o.revenue), measure(c.avg_age) + FROM customers_view c + LEFT JOIN orders_view o ON o.customer_city = c.customer_city + GROUP BY 1 +"#; + +/// The motivating query: a grouped (multi-fact) `LEFT JOIN` selecting a +/// dimension and measures from each view, joined on the shared `customer_city` +/// which is also the GROUP BY key. The two view scans are merged into a single +/// grouped CubeScan, and the left join key gets a `set` filter to recover +/// LEFT-join semantics on top of the FULL OUTER multi-fact stitch. 
+#[tokio::test] +async fn test_group_by_left_join_two_views_on_shared_member() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join(GROUPED_LEFT_JOIN, true) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec![ + "orders_view.revenue".to_string(), + "customers_view.avg_age".to_string(), + ]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + filters: Some(vec![set_filter("customers_view.customer_city")]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// Same shape but `INNER JOIN`: both sides must be present, so the merged scan +/// carries a `set` filter on the join key of each side. +#[tokio::test] +async fn test_group_by_inner_join_two_views_on_shared_member() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue), measure(c.avg_age) + FROM customers_view c + INNER JOIN orders_view o ON o.customer_city = c.customer_city + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec![ + "orders_view.revenue".to_string(), + "customers_view.avg_age".to_string(), + ]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + filters: Some(vec![ + set_filter("orders_view.customer_city"), + set_filter("customers_view.customer_city"), + ]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// The merge relies on the Tesseract SQL planner; with it disabled the join is +/// not 
merged and the query is rejected like any other unsupported cube join. +#[tokio::test] +async fn test_grouped_view_join_not_merged_without_tesseract() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let error = plan_view_join(GROUPED_LEFT_JOIN, false).await.unwrap_err(); + assert!(matches!(error, CompilationError::Rewrite(..))); +} + +/// Ungrouped query (`SELECT *`): the shared-member merge only applies to +/// grouped queries, so an ungrouped join is not merged and is rejected even +/// when Tesseract is enabled. +#[tokio::test] +async fn test_ungrouped_join_two_views_on_shared_member_is_not_merged() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let error = plan_view_join( + r#" + SELECT * + FROM customers_view + LEFT JOIN orders_view + ON (orders_view.customer_city = customers_view.customer_city) + "#, + true, + ) + .await + .unwrap_err(); + assert!(matches!(error, CompilationError::Rewrite(..))); +} + +/// The join is over a dimension (`customer_city`) that is not in the GROUP BY +/// (the query groups by `status` instead). The merge requires the join key to +/// be the group-by key, so this is not merged and is rejected. +#[tokio::test] +async fn test_group_by_join_dimension_not_in_group_by_is_not_merged() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let error = plan_view_join( + r#" + SELECT c.status, measure(o.revenue), measure(c.avg_age) + FROM customers_view c + LEFT JOIN orders_view o ON o.customer_city = c.customer_city + GROUP BY 1 + "#, + true, + ) + .await + .unwrap_err(); + assert!(matches!(error, CompilationError::Rewrite(..))); +} + +/// The merge only fires when the join key is fully within dimensions. Joining +/// the two views on a measure (`o.revenue = c.avg_age`) is not a shared-member +/// dimension join, so the scans are not merged and the query is rejected. 
+#[tokio::test] +async fn test_join_two_views_on_measure_is_not_merged() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let error = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue) + FROM customers_view c + LEFT JOIN orders_view o ON (o.revenue = c.avg_age) + GROUP BY 1 + "#, + true, + ) + .await + .unwrap_err(); + assert!(matches!(error, CompilationError::Rewrite(..))); +} + +/// `RIGHT JOIN`: the right side must be present, so the merged scan carries a +/// `set` filter on the right join key. +#[tokio::test] +async fn test_group_by_right_join_two_views_on_shared_member() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue), measure(c.avg_age) + FROM customers_view c + RIGHT JOIN orders_view o ON o.customer_city = c.customer_city + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec![ + "orders_view.revenue".to_string(), + "customers_view.avg_age".to_string(), + ]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + filters: Some(vec![set_filter("orders_view.customer_city")]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// `FULL JOIN`: every key from either side is kept (default multi-fact +/// behavior), so no presence `set` filter is added. 
+#[tokio::test] +async fn test_group_by_full_join_two_views_on_shared_member() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue), measure(c.avg_age) + FROM customers_view c + FULL JOIN orders_view o ON o.customer_city = c.customer_city + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec![ + "orders_view.revenue".to_string(), + "customers_view.avg_age".to_string(), + ]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + // FULL JOIN adds no presence filter. + filters: None, + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// Joining three views on the shared key (FULL JOIN, so no presence filters) +/// merges into a single multi-fact CubeScan with all three measures. 
+#[tokio::test] +async fn test_group_by_full_join_three_views_on_shared_member() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue), measure(r.refunds) + FROM customers_view c + FULL JOIN orders_view o ON o.customer_city = c.customer_city + FULL JOIN returns_view r ON r.customer_city = c.customer_city + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec![ + "orders_view.revenue".to_string(), + "returns_view.refunds".to_string(), + ]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + join_hints: Some(vec![ + vec!["customers_view".to_string(), "orders_view".to_string()], + vec!["customers_view".to_string(), "returns_view".to_string()], + ]), + ..Default::default() + } + ) +} + +/// Joining four views on the shared key (FULL JOIN) merges into a single +/// multi-fact CubeScan with all four measures. 
+#[tokio::test] +async fn test_group_by_full_join_four_views_on_shared_member() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue), measure(r.refunds), measure(p.paid) + FROM customers_view c + FULL JOIN orders_view o ON o.customer_city = c.customer_city + FULL JOIN returns_view r ON r.customer_city = c.customer_city + FULL JOIN payments_view p ON p.customer_city = c.customer_city + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec![ + "orders_view.revenue".to_string(), + "returns_view.refunds".to_string(), + "payments_view.paid".to_string(), + ]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + join_hints: Some(vec![ + vec!["customers_view".to_string(), "orders_view".to_string()], + vec!["customers_view".to_string(), "returns_view".to_string()], + vec!["customers_view".to_string(), "payments_view".to_string()], + ]), + ..Default::default() + } + ) +} + +/// A WHERE filter on top of the join is pushed through the wrapper into the +/// merged scan and shows up as a Cube query filter alongside the join-semantics +/// `set` filter. 
+#[tokio::test] +async fn test_group_by_left_join_with_where_filter() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue) + FROM customers_view c + LEFT JOIN orders_view o ON o.customer_city = c.customer_city + WHERE c.status = 'active' + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec!["orders_view.revenue".to_string()]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + filters: Some(vec![ + set_filter("customers_view.customer_city"), + V1LoadRequestQueryFilterItem { + member: Some("customers_view.status".to_string()), + operator: Some("equals".to_string()), + values: Some(vec!["active".to_string()]), + or: None, + and: None, + }, + ]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// A filter placed in the ON clause (in addition to the shared-key equality). 
+#[tokio::test] +async fn test_group_by_left_join_with_on_filter() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue) + FROM customers_view c + LEFT JOIN orders_view o + ON o.customer_city = c.customer_city AND o.status = 'completed' + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec!["orders_view.revenue".to_string()]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + filters: Some(vec![ + set_filter("customers_view.customer_city"), + V1LoadRequestQueryFilterItem { + member: Some("orders_view.status".to_string()), + operator: Some("equals".to_string()), + values: Some(vec!["completed".to_string()]), + or: None, + and: None, + }, + ]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// A 3-way LEFT join pins the per-pass presence-filter accumulation through +/// `shared-member-join-extend-wrapper`: each LEFT join contributes a `set` +/// filter on its own left-side join key. 
+#[tokio::test] +async fn test_group_by_left_join_three_views_presence_filters() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue), measure(r.refunds) + FROM customers_view c + LEFT JOIN orders_view o ON o.customer_city = c.customer_city + LEFT JOIN returns_view r ON r.customer_city = o.customer_city + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec![ + "orders_view.revenue".to_string(), + "returns_view.refunds".to_string(), + ]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + order: Some(vec![]), + filters: Some(vec![ + set_filter("orders_view.customer_city"), + set_filter("customers_view.customer_city"), + ]), + join_hints: Some(vec![ + vec!["customers_view".to_string(), "orders_view".to_string()], + vec!["orders_view".to_string(), "returns_view".to_string()], + ]), + ..Default::default() + } + ) +} + +/// The join-key granularity must match the GROUP BY granularity: the multi-fact +/// stitch happens at the GROUP BY grain, so joining on `DATE_TRUNC('month', ...)` +/// while grouping by `DATE_TRUNC('day', ...)` must not merge (it would silently +/// stitch at day grain, diverging from the month-grain join). 
+#[tokio::test] +async fn test_join_date_trunc_granularity_mismatch_is_not_merged() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let request = plan_view_join( + r#" + SELECT DATE_TRUNC('day', c.created_at), measure(o.revenue) + FROM customers_view c + JOIN orders_view o + ON DATE_TRUNC('month', o.created_at) = DATE_TRUNC('month', c.created_at) + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan() + .find_cube_scan() + .request; + + // The merge did not happen: rather than a single grouped multi-fact scan, + // the query falls back to the raw ungrouped cross-join scan (pushed to the + // cube as SQL), so the measure is not pushed and the scan stays ungrouped. + assert_eq!( + request.ungrouped, + Some(true), + "expected month-grain join with day-grain GROUP BY not to merge, got: {:?}", + request + ); + assert_eq!(request.measures, Some(vec![]), "got: {:?}", request); +} + +/// Joining two views directly on `DATE_TRUNC('day', ...)` (which the SQL planner +/// lowers to `Filter(, CrossJoin(...))`, i.e. an INNER join) merges into a +/// single multi-fact CubeScan. Both truncated keys are marked present (INNER). 
+#[tokio::test] +async fn test_inner_join_on_date_trunc_group_by_date_trunc() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT DATE_TRUNC('day', c.created_at), measure(o.revenue) + FROM customers_view c + JOIN orders_view o + ON DATE_TRUNC('day', o.created_at) = DATE_TRUNC('day', c.created_at) + GROUP BY 1 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec!["orders_view.revenue".to_string()]), + dimensions: Some(vec![]), + segments: Some(vec![]), + time_dimensions: Some(vec![V1LoadRequestQueryTimeDimension { + dimension: "customers_view.created_at".to_string(), + granularity: Some("day".to_string()), + date_range: None, + }]), + order: Some(vec![]), + filters: Some(vec![ + set_filter("orders_view.created_at"), + set_filter("customers_view.created_at"), + ]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// Joining two views on a composite key (two shared dimensions) and grouping by +/// both merges into a single multi-fact CubeScan. The GROUP BY must cover the +/// full join key, and each LEFT-join key column is marked present. 
+#[tokio::test] +async fn test_left_join_on_multiple_dimensions_group_by_both() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT c.customer_city, c.customer_state, measure(o.revenue) + FROM customers_view c + LEFT JOIN orders_view o + ON o.customer_city = c.customer_city + AND o.customer_state = c.customer_state + GROUP BY 1, 2 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec!["orders_view.revenue".to_string()]), + dimensions: Some(vec![ + "customers_view.customer_city".to_string(), + "customers_view.customer_state".to_string(), + ]), + segments: Some(vec![]), + order: Some(vec![]), + filters: Some(vec![ + set_filter("customers_view.customer_state"), + set_filter("customers_view.customer_city"), + ]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// Grouping by only part of a composite join key must not merge: the GROUP BY +/// must cover the full join key, so this falls back to standard join handling +/// (which errors for ungrouped-style cube joins). +#[tokio::test] +async fn test_join_on_multiple_dimensions_partial_group_by_is_not_merged() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let result = plan_view_join( + r#" + SELECT c.customer_city, measure(o.revenue) + FROM customers_view c + LEFT JOIN orders_view o + ON o.customer_city = c.customer_city + AND o.customer_state = c.customer_state + GROUP BY 1 + "#, + true, + ) + .await; + + assert!( + result.is_err(), + "expected partial-group-by composite join not to merge, got: {:?}", + result.map(|p| p.as_logical_plan().find_cube_scan().request) + ); +} + +/// Joining on a mix of a `DATE_TRUNC` equality and a plain dimension equality. 
+/// The SQL planner makes the column equality the join key and keeps the +/// truncated-time equality as a filter on top; the rewrite folds the time +/// member into the join key so the whole thing merges into one multi-fact scan +/// grouped by the time dimension and the dimension. Both join (INNER) keys and +/// the absorbed time key are marked present. +#[tokio::test] +async fn test_inner_join_on_date_trunc_and_dimension() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let logical_plan = plan_view_join( + r#" + SELECT DATE_TRUNC('day', c.created_at), c.customer_city, measure(o.revenue) + FROM customers_view c + JOIN orders_view o + ON DATE_TRUNC('day', o.created_at) = DATE_TRUNC('day', c.created_at) + AND o.customer_city = c.customer_city + GROUP BY 1, 2 + "#, + true, + ) + .await + .unwrap() + .as_logical_plan(); + + assert_eq!( + logical_plan.find_cube_scan().request, + V1LoadRequestQuery { + measures: Some(vec!["orders_view.revenue".to_string()]), + dimensions: Some(vec!["customers_view.customer_city".to_string()]), + segments: Some(vec![]), + time_dimensions: Some(vec![V1LoadRequestQueryTimeDimension { + dimension: "customers_view.created_at".to_string(), + granularity: Some("day".to_string()), + date_range: None, + }]), + order: Some(vec![]), + filters: Some(vec![ + set_filter("customers_view.created_at"), + set_filter("orders_view.created_at"), + set_filter("orders_view.customer_city"), + set_filter("customers_view.customer_city"), + ]), + join_hints: Some(vec![vec![ + "customers_view".to_string(), + "orders_view".to_string(), + ]]), + ..Default::default() + } + ) +} + +/// A join on the raw time column (exact-timestamp equality, "no grain") does not +/// match a truncated `DATE_TRUNC('day', ...)` GROUP BY, so it is not merged: the +/// multi-fact stitch happens at the GROUP BY grain, which must be the grain the +/// user joined on. Truncate the join key to the grain you group by instead. 
+#[tokio::test] +async fn test_raw_time_join_with_date_trunc_group_by_is_not_merged() { + if !Rewriter::sql_push_down_enabled() { + return; + } + init_testing_logger(); + + let result = plan_view_join( + r#" + SELECT DATE_TRUNC('day', c.created_at), measure(o.revenue) + FROM customers_view c + LEFT JOIN orders_view o ON o.created_at = c.created_at + GROUP BY 1 + "#, + true, + ) + .await; + + assert!( + result.is_err(), + "expected raw-time-column join with a DATE_TRUNC GROUP BY not to merge, got: {:?}", + result.map(|p| p.as_logical_plan().find_cube_scan().request) + ); +} diff --git a/rust/cubesql/cubesql/src/config/mod.rs b/rust/cubesql/cubesql/src/config/mod.rs index d7977a5d4feb7..6dc13293ab64d 100644 --- a/rust/cubesql/cubesql/src/config/mod.rs +++ b/rust/cubesql/cubesql/src/config/mod.rs @@ -117,6 +117,8 @@ pub trait ConfigObj: DIService + Debug { fn max_sessions(&self) -> usize; fn no_implicit_order(&self) -> bool; + + fn enable_tesseract_sql_planner(&self) -> bool; } #[derive(Debug, Clone)] @@ -138,6 +140,7 @@ pub struct ConfigObjImpl { pub non_streaming_query_max_row_limit: i32, pub max_sessions: usize, pub no_implicit_order: bool, + pub tesseract_sql_planner: bool, } impl ConfigObjImpl { @@ -181,6 +184,7 @@ impl ConfigObjImpl { non_streaming_query_max_row_limit: env_parse("CUBEJS_DB_QUERY_LIMIT", 50000), max_sessions: env_parse("CUBEJS_MAX_SESSIONS", 1024), no_implicit_order: env_parse("CUBESQL_SQL_NO_IMPLICIT_ORDER", true), + tesseract_sql_planner: env_parse("CUBEJS_TESSERACT_SQL_PLANNER", false), } } } @@ -251,6 +255,10 @@ impl ConfigObj for ConfigObjImpl { fn max_sessions(&self) -> usize { self.max_sessions } + + fn enable_tesseract_sql_planner(&self) -> bool { + self.tesseract_sql_planner + } } impl Config { @@ -284,6 +292,7 @@ impl Config { non_streaming_query_max_row_limit: 50000, max_sessions: 1024, no_implicit_order: true, + tesseract_sql_planner: false, }), } }
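The config change above reads `CUBEJS_TESSERACT_SQL_PLANNER` with a default of `false`. As a rough illustration of that "boolean flag with a default" pattern, here is a minimal sketch. It assumes a hypothetical `parse_flag` helper and is not cubesql's actual `env_parse`; in particular, accepting `1`/`0` and ignoring case are assumptions made for the example, not behavior confirmed by the diff.

```rust
use std::env;

/// Hypothetical helper (NOT cubesql's `env_parse`): interpret an optional raw
/// string as a boolean flag. Unset or unrecognized values yield `default`.
fn parse_flag(raw: Option<&str>, default: bool) -> bool {
    // Normalize whitespace and case so " TRUE " and "true" are treated alike.
    let normalized = raw.map(|v| v.trim().to_ascii_lowercase());
    match normalized.as_deref() {
        Some("true") | Some("1") => true,
        Some("false") | Some("0") => false,
        _ => default,
    }
}

/// Read the flag from the process environment, defaulting to `false`
/// (the same default the diff uses for `tesseract_sql_planner`).
#[allow(dead_code)]
fn tesseract_sql_planner_enabled() -> bool {
    parse_flag(
        env::var("CUBEJS_TESSERACT_SQL_PLANNER").ok().as_deref(),
        false,
    )
}
```

Separating the pure `parse_flag` from the environment read keeps the parsing logic testable without mutating process-wide state.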