[Maria-developers] More MWL#89 questions
Hi Timour, While merging, I also got the following questions: == exclude_expensive_cond == make_cond_for_table()'s exclude_expensive_cond parameter is not used anymore. Was it intentional that you kept it in the code? (minimizing MWL#89's diff size could not be a reason because the diff changes make_cond_for_table signature anyway) == exec_const_cond == The patch introduces JOIN::exec_const_cond. On the other hand, the code already had JOIN::outer_ref_cond which seems to be nearly the same. == join_tab_idx == make_join_select() attaches parts of WHERE/ON clauses to JOIN_TABs. It does so with a help of two functions: 1. make_cond_for_table() 2. make_cond_after_sjm() The first one does make set_join_tab_idx() calls, the second one doesn't. Why? == join_tab_idx #2 ===
@@ -7150,24 +7174,27 @@ there inside the triggers. */ { // Check const tables + join->exec_const_cond= + make_cond_for_table(thd, cond, join->const_table_map, - (table_map) 0, TRUE, FALSE); + (table_map) 0, MAX_TABLES, FALSE, FALSE);
As far as I understand the code, the MAX_TABLES part informs the constant subquery predicate that it will be evaluated O(join_output_records) times, which is not true. This is not the only such case, I see plenty of make_cond_for_table() calls which use MAX_TABLES for join_tab_idx argument, where we can actually tell for certain that the condition will be evaluated much fewer than O(join_output_records) times. BR Sergey -- Sergey Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog
Hi Sergey,
Hi Timour,
While merging, I also got the following questions:
== exclude_expensive_cond == make_cond_for_table()'s exclude_expensive_cond parameter is not used anymore. Was it intentional that you kept it in the code? (minimizing MWL#89's diff size could not be a reason because the diff changes make_cond_for_table signature anyway)
Yes, this parameter was kept on purpose, mostly per Igor's request. Keeping it allows to use it later without changing the interface too many times (first remove it, then add it).
== exec_const_cond == The patch introduces JOIN::exec_const_cond. On the other hand, the code already had JOIN::outer_ref_cond which seems to be nearly the same.
I admit I searched for a similar class member, but didn't notice the
one you mention. The comment:
COND *outer_ref_cond; /// == join_tab_idx ==
make_join_select() attaches parts of WHERE/ON clauses to JOIN_TABs. It does
so with a help of two functions:
1. make_cond_for_table()
2. make_cond_after_sjm()
The first one does make set_join_tab_idx() calls, the second one doesn't. Why? Because I was not aware of the second. Having more than one function with
the same or similar purpose usually leads to such omissions. That's why using
methods and overloading helps to ensure such things don't happen. That being
said, I will investigate this.
Currently this cannot be a problem, because join_tab_idx is used only in
the case of non-flattened subqueries. However, I agree that this is potentially
a bug if one decides to use join_tab_idx for other purposes.
AFAIR there were other derivatives of make_cond_for_table(), so I should
check those as well.
Added to my post-beta TODO. == join_tab_idx #2 === @@ -7150,24 +7174,27 @@
there inside the triggers.
*/
{ // Check const tables
+ join->exec_const_cond=
+ make_cond_for_table(thd, cond,
join->const_table_map,
- (table_map) 0, TRUE, FALSE);
+ (table_map) 0, MAX_TABLES, FALSE, FALSE); As far as I understand the code, the MAX_TABLES part informs the constant
subquery predicate that it will be evaluated O(join_output_records) times,
which is not true. This is not the only such case, I see plenty of make_cond_for_table() calls
which use MAX_TABLES for join_tab_idx argument, where we can actually tell for
certain that the condition will be evaluated much fewer than
O(join_output_records) times. The class member Item::join_tab_idx tells us to which JOIN_TAB a condition
is attached (pushed) to in a JOIN plan. Later this is used to determine
the number of evaluations of an IN predicate, based on its position in
the JOIN plan.
My reasoning was that if the 'used_table' argument of make_cond_for_table
is 0, then since there are no tables, there is no specific JOIN_TAB we push
the condition to. Following this approach, below I analyzed the places where
(used_table != 0), but (join_tab_idx == MAX_TABLES), and noted where there
is a possible bug. Let me know if you find any other places, or if my
reasoning is incorrect.
1)
JOIN::exec()
.....
if (sort_table_cond)
....
curr_join->tmp_having= make_cond_for_table(thd, curr_join->tmp_having,
~ (table_map) 0,
~used_tables, MAX_TABLES,
FALSE, FALSE);
The condition is a HAVING clause, is there a valid JOIN_TAB for it? I think
there isn't.
2)
int JOIN_TAB::make_scan_filter()
This looks like a bug, there is a valid JOIN_TAB, which is 'this'.
3)
make_join_select()
....
COND *outer_ref_cond= make_cond_for_table(thd, cond,
OUTER_REF_TABLE_BIT,
OUTER_REF_TABLE_BIT,
MAX_TABLES, FALSE, FALSE);
There is no valid JOIN_TAB here, but I agree that this looks like a bug
because the condition is constant with respect to the current JOIN.
What would be a valid JOIN_TAB in this case? I could change join_tab_idx
to be a count starting from 1, and use 0 to denote this case. The question
is how this 0 would affect the users of join_tab_idx.
What do you think?
4)
make_join_select()
....
/* Push condition to storage engine if this is enabled
and the condition is not guarded */
if (thd->variables.engine_condition_pushdown && !first_inner_tab)
{
COND *push_cond=
make_cond_for_table(thd, tmp, current_map, current_map,
MAX_TABLES, FALSE, FALSE);
Possible bug again. Do you agree that the index of the most significant bit
in current_map is the correct value for join_tab_idx instead of MAX_TABLES?
5) fix_having()
This is a dead function, I deliberately didn't bother with it.
Timour
participants (2)
-
Sergey Petrunya
-
Timour Katchaounov