Hello, Here is the result of discussion on Join Buffering + Static Predicate Pushdown. == Problem setting == Consider a query using join buffering +------+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 2 | | | 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 2 | Using where; Using join buffer (flat, BNL join) | | 1 | SIMPLE | t3 | ALL | NULL | NULL | NULL | NULL | 2 | Using join buffer (flat, BNL join) | +------+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ Consider a predicate "P(t2)". Join buffering works as follows: read records from table t1 and put them into buffer; for each record in table t2 { if (cond(t2)) { for each record in the join buffer { if (cond(t1, t2) { pass record for further processing } } } } cond(2) is stored in "join_tab->cache_select->cond" cond(t1, t2) is stored in join_tab->select_cond". The question is, how should Static Predicate Pushdown work with join buffering? Should the condition be put into cache_select->cond or into select_cond of some later join_tab? Currently, the server will put the predicate P(t2) into both join_tab[t2]->select_cond and join_tab[t2]->cache_select->cond. (And subquery cache will wrap them into different caches) == Solution == First observation: if the expensive subquery predicate depends on several tables (not counting the const tables) then it will never be put into cache_select->cond. join_tab[$TBL]->cache_select->cond can only have conditions that depend only on table $TBL. If a predicate P(T) depends only on table $T, it can be put into 1. join_tab[$T]->cache_select->cond 2. join_tab[$T]->select_cond 3. join_tab[$T+1]->select_cond 4. join_tab[$T+2]->select_cond ... Putting the predicate into both #1 and #2 does not make sense. We have decided that Timour will remove duplicate predicate pushdown from cache_select and select_cond. After that feature, the predicates in cache_select->cond will not have duplicates in cache_select. That way, the options for Static Predicate Pushdown will be: 1. join_tab[$T]->cache_select->cond 3. join_tab[$T+1]->select_cond 4. join_tab[$T+2]->select_cond ... === Needed adjustments in optimizer === Consider the EXPLAIN again: +------+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 2 | | | 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 2 | Using where; Using join buffer (flat, BNL join) | | 1 | SIMPLE | t3 | ALL | NULL | NULL | NULL | NULL | 2 | Using join buffer (flat, BNL join) | +------+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ and a predicate P(t2). Currently, Static Predicate Pushdown will consider putting the predicate at: t2->select_cond: fanout(t1) * fanout(t2) predicate invocations t3->select_cond: fanout(t1) * fanout(t2)* fanout(t3) predicate invocations (here, fanout takes into account selectivity, so it can be that fanout <1) If P(t2) is not attached to t2->select_cond but to t2->cache_select->cond instead, the number of invocations will change to: t2->cache_select->cond: fanout(t1) / number_of_t1_records_in_join_buffer * fanout(t2). The above is obvious for non-BKA case. ==== A note about subquery predicate cache ==== [A theoretical comment] We're not going to do this, but: if P(t2) was attached [ONLY] to t2->select_cond, execution of the query will cause these invocations: P(t2.row1) ---+ P(t2.row1) |-- number_of_t1_records_in_join_buffer times ... | P(t2.row1) ----+ P(t2.row2) ----+ P(t2.row2) |-- number_of_t1_records_in_join_buffer times ... | P(t2.row2) ----+ ... etc. That is, subquery predicate cache will reduce the number of times the predicate is executed. == Applicability for join algorithms == Basic idea: when join buffer is using BKA, join_tab->cache_select==NULL, there is no cache_select. Details: Join algorithms where join_tab->cache_selct->cond is used: - JOIN_CACHE_BNL - JOIN_CACHE_BNLH - JOIN_CACHE_BKAH BR Sergei -- Sergei Petrunia, Software Developer MariaDB | Skype: sergefp | Blog: http://s.petrunia.net/blog