[Maria-developers] Updated (by Psergey): Push conditions down into non-mergeable VIEWs when possible (119)
----------------------------------------------------------------------- WORKLOG TASK -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- TASK...........: Push conditions down into non-mergeable VIEWs when possible CREATION DATE..: Mon, 24 May 2010, 20:52 SUPERVISOR.....: Igor IMPLEMENTOR....: COPIES TO......: Psergey, Timour CATEGORY.......: Server-RawIdeaBin TASK ID........: 119 (http://askmonty.org/worklog/?tid=119) VERSION........: Server-9.x STATUS.........: Un-Assigned PRIORITY.......: 60 WORKED HOURS...: 0 ESTIMATE.......: 0 (hours remain) ORIG. ESTIMATE.: 0 PROGRESS NOTES: -=-=(Psergey - Mon, 24 May 2010, 20:59)=-=- High-Level Specification modified. --- /tmp/wklog.119.old.25116 2010-05-24 20:59:40.000000000 +0000 +++ /tmp/wklog.119.new.25116 2010-05-24 20:59:40.000000000 +0000 @@ -1 +1,113 @@ +<contents> +HLS +1. Problems to be addressed in this WL +2. Pushdown of conditions into non-mergeable VIEWs +2.1 A note about VIEWs with ORDER BY ... LIMIT +2.2 What condition can be pushed +3. Pushdown from HAVING into WHERE +4. When to do the pushdown +5. Other things to take care of + + +</contents> + +1. Problems to be addressed in this WL +====================================== +The problem actually consists of two parts: +1. Condition on VIEW columns are not pushed down into VIEWs. +2. Even if conditions were pushed, they would have been put into VIEW's +HAVING clause, which would not give enough of speedup. In order to get a +real speedup, the optimizer must be able to move relevant part of HAVING +into WHERE (and then use it for further optimizations) in order to provide +the desired speedup. Note that HAVING->WHERE condition move is orthogonal +to VIEW processing. + +2. Pushdown of conditions into non-mergeable VIEWs +================================================== +We can push a condition into non-mergeable VIEW when VIEW's top-most operation +is selection (i.e., filtering). This is true, for example, when the VIEW is +defined as + + SELECT select_list FROM from_clause [WHERE where_cond] [HAVING having_cond] + +and not true when the VIEW is defined as + + SELECT select_list FROM from_clause [WHERE where_cond] ORDER BY expr LIMIT n + +Generalizing the above, we arrive at the following rule: + + For non-mergeable VIEWs, + - pushdown must not be done if VIEW uses ORDER BY .. LIMIT + - when pushdown is done, the pushed condition should be added to the WHERE + clause. + +Note: In scope of this WL, we will not hande VIEWs that have UNION [ALL] as +top operation. + +(TODO: what about SELECT DISTINCT?) +(TODO: pushdown down into IN subqueries?) + +2.1 A note about VIEWs with ORDER BY ... LIMIT +---------------------------------------------- +Although it is not possible to push a condition below the ORDER BY ... LIMIT +operation, there is still some benefit from checking the condition early as +that would allow to avoid writing non-matching rows into temporary table. + +We could do that if we introduced a post-ORDERBY selection operation. That +operation would also allow to support ORDER BY ... LIMIT inside subqueries +(we don't currently support those because default subquery strategy, +IN->EXISTS rewrite, also needs to push a condition into subquery). + +2.2 What condition can be pushed +-------------------------------- +Assuming simplify_joins() operation has done normalization: +* If the VIEW is in top-level join list, or inside a semi-join that's in + top-level join list, we can push parts of WHERE condition. +* If the VIEW is inside an outer join, we can push it's ON expression. + +We can reuse make_cond_for_index()/make_remainder_cond() code to extract part +of condition that can be pushed, and the remainder, respectively. + +Pushability criteria for an atomic (i.e. not AND/OR) condition is that + + the condition only uses VIEW's fields. + +(TODO: what about fields of const tables? Do we have const tables already +retrived by the time VIEW is materialized? If yes, we could push down +expressions that refer to const tables, too) + +3. Pushdown from HAVING into WHERE +================================== +The idea is: + + Parts of HAVING that refer to columns/expressions we're doing GROUP BY on + can be put into WHERE. + +(TODO: do we need to handle case of grouping over expressions?) + +(TODO: when moving expression for HAVING to WHERE, do we need +to do something with it? Replace all Item_ref objects with items that +they refer to? + - In case of referring to expression, do we get + Item_ref(where_clause_expr) or expr( Item_ref(..), .., Item_ref(..))? +) + +4. When to do the pushdown +========================== +In order to do pushdown, we must have prepare phase finished +for both parent (so that we can make sense of its WHERE condition) and +child (so that we know what it has in its select list). + +We can do pushdown before we've done join optimization (i.e. choose_plan() +call) of the parent. + +We must do pushdown before we've done JOIN::optimize() of the child +(in particular, it must be done before we do update_ref_and_keys() and +range analysis in the child). + + +5. Other things to take care of +=============================== +* Pushing down fulltext predicates (it seems one needs to "register" a + fulltext predicate when it is moved from one select from another? Ask Serg) DESCRIPTION: There are complaints (see links below) about cases with non-mergeable VIEW (because it has a GROUP BY), a query that has restrictions on the grouped column, and poor performance that is caused by VIEW processing code ignoring the restriction. This WL is about addressing this issue. links to complaints: http://code.openark.org/blog/mysql/views-better-performance-with-condition-p... http://www.mysqlperformanceblog.com/2010/05/19/a-workaround-for-the-performa... The target version is MariaDB 5.3+, because it has late optimization/execution for FROM-subqueries/non mergeable VIEWs, which makes it much more feasible to inject something into VIEW before it is optimized/executed. HIGH-LEVEL SPECIFICATION: <contents> HLS 1. Problems to be addressed in this WL 2. Pushdown of conditions into non-mergeable VIEWs 2.1 A note about VIEWs with ORDER BY ... LIMIT 2.2 What condition can be pushed 3. Pushdown from HAVING into WHERE 4. When to do the pushdown 5. Other things to take care of </contents> 1. Problems to be addressed in this WL ====================================== The problem actually consists of two parts: 1. Condition on VIEW columns are not pushed down into VIEWs. 2. Even if conditions were pushed, they would have been put into VIEW's HAVING clause, which would not give enough of speedup. In order to get a real speedup, the optimizer must be able to move relevant part of HAVING into WHERE (and then use it for further optimizations) in order to provide the desired speedup. Note that HAVING->WHERE condition move is orthogonal to VIEW processing. 2. Pushdown of conditions into non-mergeable VIEWs ================================================== We can push a condition into non-mergeable VIEW when VIEW's top-most operation is selection (i.e., filtering). This is true, for example, when the VIEW is defined as SELECT select_list FROM from_clause [WHERE where_cond] [HAVING having_cond] and not true when the VIEW is defined as SELECT select_list FROM from_clause [WHERE where_cond] ORDER BY expr LIMIT n Generalizing the above, we arrive at the following rule: For non-mergeable VIEWs, - pushdown must not be done if VIEW uses ORDER BY .. LIMIT - when pushdown is done, the pushed condition should be added to the WHERE clause. Note: In scope of this WL, we will not hande VIEWs that have UNION [ALL] as top operation. (TODO: what about SELECT DISTINCT?) (TODO: pushdown down into IN subqueries?) 2.1 A note about VIEWs with ORDER BY ... LIMIT ---------------------------------------------- Although it is not possible to push a condition below the ORDER BY ... LIMIT operation, there is still some benefit from checking the condition early as that would allow to avoid writing non-matching rows into temporary table. We could do that if we introduced a post-ORDERBY selection operation. That operation would also allow to support ORDER BY ... LIMIT inside subqueries (we don't currently support those because default subquery strategy, IN->EXISTS rewrite, also needs to push a condition into subquery). 2.2 What condition can be pushed -------------------------------- Assuming simplify_joins() operation has done normalization: * If the VIEW is in top-level join list, or inside a semi-join that's in top-level join list, we can push parts of WHERE condition. * If the VIEW is inside an outer join, we can push it's ON expression. We can reuse make_cond_for_index()/make_remainder_cond() code to extract part of condition that can be pushed, and the remainder, respectively. Pushability criteria for an atomic (i.e. not AND/OR) condition is that the condition only uses VIEW's fields. (TODO: what about fields of const tables? Do we have const tables already retrived by the time VIEW is materialized? If yes, we could push down expressions that refer to const tables, too) 3. Pushdown from HAVING into WHERE ================================== The idea is: Parts of HAVING that refer to columns/expressions we're doing GROUP BY on can be put into WHERE. (TODO: do we need to handle case of grouping over expressions?) (TODO: when moving expression for HAVING to WHERE, do we need to do something with it? Replace all Item_ref objects with items that they refer to? - In case of referring to expression, do we get Item_ref(where_clause_expr) or expr( Item_ref(..), .., Item_ref(..))? ) 4. When to do the pushdown ========================== In order to do pushdown, we must have prepare phase finished for both parent (so that we can make sense of its WHERE condition) and child (so that we know what it has in its select list). We can do pushdown before we've done join optimization (i.e. choose_plan() call) of the parent. We must do pushdown before we've done JOIN::optimize() of the child (in particular, it must be done before we do update_ref_and_keys() and range analysis in the child). 5. Other things to take care of =============================== * Pushing down fulltext predicates (it seems one needs to "register" a fulltext predicate when it is moved from one select from another? Ask Serg) ESTIMATED WORK TIME ESTIMATED COMPLETION DATE ----------------------------------------------------------------------- WorkLog (v3.5.9)
Hi Sergey On 25/05/2010, at 6:59 AM, worklog-noreply@askmonty.org wrote:
TASK...........: Push conditions down into non-mergeable VIEWs when possible TASK ID........: 119 (http://askmonty.org/worklog/?tid=119)
Will this have any effect on a VIEW containing a UNION? Regards, Arjen. -- Arjen Lentz, Exec.Director @ Open Query (http://openquery.com) Exceptional Services for MySQL at a fixed budget. Follow our blog at http://openquery.com/blog/ OurDelta: packages for MySQL and MariaDB @ http://ourdelta.org
On Tue, May 25, 2010 at 11:00:02AM +1000, Arjen Lentz wrote:
Hi Sergey
On 25/05/2010, at 6:59 AM, worklog-noreply@askmonty.org wrote:
TASK...........: Push conditions down into non-mergeable VIEWs when possible TASK ID........: 119 (http://askmonty.org/worklog/?tid=119)
Will this have any effect on a VIEW containing a UNION?
This particular task won't, as we're looking to limit the scope of the WL, and UNIONs are an easy target. It is possible to push a condition inside UNION (as long as it's not UNION ... ORDER BY), i.e. (SELECT foo UNION SELECT bar) HAVING x --> (SELECT foo HAVING x) UNION (SELECT bar HAVING h) so we could add support for UNIONs after MWL#119 is complete. Arjen, do you mention UNIONs as something that's technically possible to handle, or you're saying that there are cases with UNIONs of practical importance? BR Sergey -- Sergey Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog
participants (3)
-
Arjen Lentz
-
Sergey Petrunya
-
worklog-noreply@askmonty.org