Re: [Maria-developers] Regarding approach used for Early NULL filtering in the range optimizer(MDEV-15777)
On Fri, May 24, 2019 at 10:08:04AM -0700, Igor Babaev wrote:
On 05/24/2019 04:03 AM, Varun Gupta wrote:
Hi Igor, After discussing with Sergey, we came up with these conclusions as to why we used the approach of going through all the keyuses in the KEYUSE array
Cases to consider: we have an index on column a
1) a OP const where OP can be (</<=/=/between/ etc on which we do range analysis) . This case is already handled by the range optimizer. We do not create a NULL rejecting predicate for such conditions.
2) eq_join conditions (tbl1.a = tbl2.col2) This is the specific case we have tried to implement in MDEV-15777, we create NULL rejecting predicates for the keyuses for each table and then feed these predicates to range optimizer There is no logical explanation for checking null rejection of fields used in equalities with the help of array keyuse.
The logic here is as follows: We already collect the set of KEYUSE::null_rejecting attributes. Its meaning is very close to what the patch needs. Because of this, the part of the patch for MDEV-15777 which analyzes the KEYUSE array is very small: 122 lines including the comment (I counted make_null_rejecting_conds() and add_cond()). I think, the CPU overhead is small, too.
3) a < func(b,c) we do not handle this case, because:
a) It is harder to process. We would need to process the whole WHERE clause to infer non-nullability of the condition. Example:
(t1.f2+1 < t1.f1+t1.f3) OR ... OR t1.f1 is null
here the left part of the OR condition doesn't allow f1 to be NULL, but the right condition allows it. We would also need to take into account tables inside outer joins. We would need to go through Item classes and add code which say "Item_xxx() will compute to false when its argument Y is null" (there is item->not_null_tables() currently, but not not_null_columns()).
Have you actually looked at the implementations of the virtual function not_null_tables()? Do you need me to provide the implementation of the virtual function not_null_columns()?
It will require multiple implementations. At the moment we have: nm mysqld --demangle | grep 'Item.*::not_null_tables' | wc -l 21 Another question: do you intend to collect not_null_columns() for all columns, or just columns that are a part of some index? What would a good data structure to store a set of tablename.column_name ?
b) the conditions in form of arbitrary functions are not as frequently used as ref access conditions.
to sum up a) and b) - doable but will require a lot of effort.
Do you have any specific practically relevant example in mind that we should handle?
BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
participants (1)
-
Sergey Petrunia