[Maria-developers] More MWL#89 questions
Hi Timour, While merging, I also got the following questions: == exclude_expensive_cond == make_cond_for_table()'s exclude_expensive_cond parameter is not used anymore. Was it intentional that you kept it in the code? (minimizing MWL#89's diff size could not be a reason because the diff changes make_cond_for_table signature anyway) == exec_const_cond == The patch introduces JOIN::exec_const_cond. On the other hand, the code already had JOIN::outer_ref_cond which seems to be nearly the same. == join_tab_idx == make_join_select() attaches parts of WHERE/ON clauses to JOIN_TABs. It does so with a help of two functions: 1. make_cond_for_table() 2. make_cond_after_sjm() The first one does make set_join_tab_idx() calls, the second one doesn't. Why? == join_tab_idx #2 ===
@@ -7150,24 +7174,27 @@ there inside the triggers. */ { // Check const tables + join->exec_const_cond= + make_cond_for_table(thd, cond, join->const_table_map, - (table_map) 0, TRUE, FALSE); + (table_map) 0, MAX_TABLES, FALSE, FALSE);
As far as I understand the code, the MAX_TABLES part informs the constant subquery predicate that it will be evaluated O(join_output_records) times, which is not true. This is not the only such case, I see plenty of make_cond_for_table() calls which use MAX_TABLES for join_tab_idx argument, where we can actually tell for certain that the condition will be evaluated much fewer than O(join_output_records) times. BR Sergey -- Sergey Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog
Hi Sergey,
Hi Timour,
While merging, I also got the following questions:
== exclude_expensive_cond == make_cond_for_table()'s exclude_expensive_cond parameter is not used anymore. Was it intentional that you kept it in the code? (minimizing MWL#89's diff size could not be a reason because the diff changes make_cond_for_table signature anyway)
Yes, this parameter was kept on purpose, mostly per Igor's request. Keeping it allows to use it later without changing the interface too many times (first remove it, then add it).
== exec_const_cond == The patch introduces JOIN::exec_const_cond. On the other hand, the code already had JOIN::outer_ref_cond which seems to be nearly the same.
I admit I searched for a similar class member, but didn't notice the one you mention. The comment: COND *outer_ref_cond; ///<part of conds containing only outer references doesn't make the similarity obvious. The two conditions are evaluated at pretty different phases, which makes it non-obvious if outer_ref_cond can be reused: - outer_ref_cond is evaluated during do_select(), which is much later than - exec_const_cond which is evaluated close to the beginning of JOIN::optimize I will add it to my post-beta TODO to check if outer_ref_cond can be reused.
== join_tab_idx == make_join_select() attaches parts of WHERE/ON clauses to JOIN_TABs. It does so with a help of two functions: 1. make_cond_for_table() 2. make_cond_after_sjm() The first one does make set_join_tab_idx() calls, the second one doesn't. Why?
Because I was not aware of the second. Having more than one function with the same or similar purpose usually leads to such omissions. That's why using methods and overloading helps to ensure such things don't happen. That being said, I will investigate this. Currently this cannot be a problem, because join_tab_idx is used only in the case of non-flattened subqueries. However, I agree that this is potentially a bug if one decides to use join_tab_idx for other purposes. AFAIR there were other derivatives of make_cond_for_table(), so I should check those as well. Added to my post-beta TODO.
== join_tab_idx #2 ===
@@ -7150,24 +7174,27 @@ there inside the triggers. */ { // Check const tables + join->exec_const_cond= + make_cond_for_table(thd, cond, join->const_table_map, - (table_map) 0, TRUE, FALSE); + (table_map) 0, MAX_TABLES, FALSE, FALSE);
As far as I understand the code, the MAX_TABLES part informs the constant subquery predicate that it will be evaluated O(join_output_records) times, which is not true.
This is not the only such case, I see plenty of make_cond_for_table() calls which use MAX_TABLES for join_tab_idx argument, where we can actually tell for certain that the condition will be evaluated much fewer than O(join_output_records) times.
The class member Item::join_tab_idx tells us to which JOIN_TAB a condition is attached (pushed) to in a JOIN plan. Later this is used to determine the number of evaluations of an IN predicate, based on its position in the JOIN plan. My reasoning was that if the 'used_table' argument of make_cond_for_table is 0, then since there are no tables, there is no specific JOIN_TAB we push the condition to. Following this approach, below I analyzed the places where (used_table != 0), but (join_tab_idx == MAX_TABLES), and noted where there is a possible bug. Let me know if you find any other places, or if my reasoning is incorrect. 1) JOIN::exec() ..... if (sort_table_cond) .... curr_join->tmp_having= make_cond_for_table(thd, curr_join->tmp_having, ~ (table_map) 0, ~used_tables, MAX_TABLES, FALSE, FALSE); The condition is a HAVING clause, is there a valid JOIN_TAB for it? I think there isn't. 2) int JOIN_TAB::make_scan_filter() This looks like a bug, there is a valid JOIN_TAB, which is 'this'. 3) make_join_select() .... COND *outer_ref_cond= make_cond_for_table(thd, cond, OUTER_REF_TABLE_BIT, OUTER_REF_TABLE_BIT, MAX_TABLES, FALSE, FALSE); There is no valid JOIN_TAB here, but I agree that this looks like a bug because the condition is constant with respect to the current JOIN. What would be a valid JOIN_TAB in this case? I could change join_tab_idx to be a count starting from 1, and use 0 to denote this case. The question is how this 0 would affect the users of join_tab_idx. What do you think? 4) make_join_select() .... /* Push condition to storage engine if this is enabled and the condition is not guarded */ if (thd->variables.engine_condition_pushdown && !first_inner_tab) { COND *push_cond= make_cond_for_table(thd, tmp, current_map, current_map, MAX_TABLES, FALSE, FALSE); Possible bug again. Do you agree that the index of the most significant bit in current_map is the correct value for join_tab_idx instead of MAX_TABLES? 5) fix_having() This is a dead function, I deliberately didn't bother with it. Timour
participants (2)
-
Sergey Petrunya
-
Timour Katchaounov