[Maria-developers] GSoC [MDEV-6017]
Hello. This is the thread where I would like to publish my weekly reports every Monday. During preparatory weeks I set up development environment. I briefly examined testing framework, debugging process and now I am ready to start work. Work progress will be available on my github repository in branch "MDEV-6017" https://github.com/ZeroICQ/server/tree/MDEV-6017
вт, 28 мая 2019 г. в 06:44, Alexey Mogilyovkin
Hello. This is the thread where I would like to publish my weekly reports every Monday. During preparatory weeks I set up development environment. I briefly examined testing framework, debugging process and now I am ready to start work. Work progress will be available on my github repository in branch "MDEV-6017" https://github.com/ZeroICQ/server/tree/MDEV-6017
For the last week I was making some investigations about implementing my task. Here are my thoughts about how it can be done. I want to find all virtual columns that are used in most outer select. Then try to match expressions in where clauses with vcols and rewrite all matching ones, including ones in subselects. Then run new Item processor. When this processor encounters Item_subselect it finds all vcolumns again, but this time only that belongs to subselect. Then the process continues recursively. I currently trying to implement it. Code is available here https://github.com/ZeroICQ/server/compare/10.4...ZeroICQ:MDEV-6017 Selects with joins and unions would require some modifications. I had most troubles with getting all where clauses from subselects. I am not sure if "if ((*vf)->get_possible_keys().to_ulonglong())" is the right way to check if there are any suitable indexes for given virtual column. And I have some troubles understanding structure of LEX variable after parsing complex queries with joins, subselects and unions.
Hi, Alexey! On Jun 04, Alexey Mogilyovkin wrote:
For the last week I was making some investigations about implementing my task. Here are my thoughts about how it can be done. I want to find all virtual columns that are used in most outer select. Then try to match expressions in where clauses with vcols and rewrite all matching ones, including ones in subselects. Then run new Item processor. When this processor encounters Item_subselect it finds all vcolumns again, but this time only that belongs to subselect. Then the process continues recursively.
I'd suggest to ignore subselects now, and only rewrite the top-level select. After that'll work, you can look into subselects, but there's no need to try to do everything at once.
I currently trying to implement it. Code is available here https://github.com/ZeroICQ/server/compare/10.4...ZeroICQ:MDEV-6017 Selects with joins and unions would require some modifications.
normally, iterators are used like while (Item *item= it++) { ... } not with it.peek()
I had most troubles with getting all where clauses from subselects. I am not sure if "if ((*vf)->get_possible_keys().to_ulonglong())" is the
Better: if (!(*vf)->get_possible_keys().is_clear_all()) or simply if (!(*vf)->part_of_key.is_clear_all()) as get_possible_keys() is just confusing in this context.
right way to check if there are any suitable indexes for given virtual column. And I have some troubles understanding structure of LEX variable after parsing complex queries with joins, subselects and unions.
You might get more specific answers if you'd ask more specific questions :) Regards, Sergei Chief Architect MariaDB and security@mariadb.org
From messages in zulip I understood that some rewrites could not be used by
Week 2 report. This week was more productive than previous. I put aside complex expressions, subselection, unions etc. Took into account the comments about bad iterators usage and tried to fix it. I've also implemented two simple rewrite rules. 1) ... where 'expr' => ...where 'vfield'. 2) ... where 'expr' [>=, >, =, <=, <] 'some-value' => ... where ('expr' [>=, >, =, <=, <] 'some-value') and ('vfield' [>=, >, =, <=, <] 'some-value'). I am really not sure if rule №1 will somehow help optimizer. Probably it is useless. For the rule №2 I wrote test. Now I have failing tests gcol.gcol_keys_innodb gcol.gcol_keys_myisam. I will further research how my changes broke them. Also I've encountered strange error in travis-ci. Have no idea how to fix it. The command "env DEB_BUILD_OPTIONS="parallel=4" debian/autobake-deb.sh;" exited with 2. https://travis-ci.org/ZeroICQ/server/jobs/543019254 Initially I put my code into sql_parse, but now I think that a better approach will be to put it somewhere in optimizer. the optimizer. My first approach was to try to generalize all rewrite rules, but now I am totally confused about what rules to implement next week. Diff can be found here https://github.com/MariaDB/server/compare/10.4...ZeroICQ:MDEV-6017
Hi!
2) ... where 'expr' [>=, >, =, <=, <] 'some-value' => ... where ('expr' [>=, >, =, <=, <] 'some-value') and ('vfield' [>=, >, =, <=, <] 'some-value').
What about `where expr`, `where expr in expr` and whatnot?
My first approach was to try to generalize all rewrite rules, but now I am totally confused about what rules to implement next week.
I think your first approach was correct: just write a general substitution algorithm. The exclusion rules could be added later, on demand.
Now I have failing tests gcol.gcol_keys_innodb gcol.gcol_keys_myisam. I will further research how my changes broke them.
The optimizer is controlled by a set of switches -- try to add your own
switch and turn it off by default! You can turn it on only for your test
with SET command, like for most other system variables (don't forget to
turn it off after test)
Later it could be considered to be turned on by default, and turned off for
those two tests only (if that's not a bug ofc).
This will also partially solve the question about additional exclusion
rules. The general algorithm should always work correct, and not worse than
with no substitution (at least asymptotically). The User will have
possibility to turn it off for particular queries, if it's that dramatical
for the one.
On Tue, 11 Jun 2019 at 00:38, Alexey Mogilyovkin
Week 2 report. This week was more productive than previous. I put aside complex expressions, subselection, unions etc. Took into account the comments about bad iterators usage and tried to fix it. I've also implemented two simple rewrite rules. 1) ... where 'expr' => ...where 'vfield'. 2) ... where 'expr' [>=, >, =, <=, <] 'some-value' => ... where ('expr' [>=, >, =, <=, <] 'some-value') and ('vfield' [>=, >, =, <=, <] 'some-value').
I am really not sure if rule №1 will somehow help optimizer. Probably it is useless. For the rule №2 I wrote test. Now I have failing tests gcol.gcol_keys_innodb gcol.gcol_keys_myisam. I will further research how my changes broke them. Also I've encountered strange error in travis-ci. Have no idea how to fix it. The command "env DEB_BUILD_OPTIONS="parallel=4" debian/autobake-deb.sh;" exited with 2. https://travis-ci.org/ZeroICQ/server/jobs/543019254
Initially I put my code into sql_parse, but now I think that a better approach will be to put it somewhere in optimizer.
From messages in zulip I understood that some rewrites could not be used by the optimizer. My first approach was to try to generalize all rewrite rules, but now I am totally confused about what rules to implement next week. Diff can be found here https://github.com/MariaDB/server/compare/10.4...ZeroICQ:MDEV-6017
-- Yours truly, Nikita Malyavin
participants (3)
-
Alexey Mogilyovkin
-
Nikita Malyavin
-
Sergei Golubchik