
Hi Sergei!

Actually I completed the work on UPDATE and DELETE: they now use the index for looking up records. But I have made a lot of changes in the optimizer which may break it, and there are also lots of queries where my code does not work; fixing all of this might take a long time. So I am thinking of changing my existing code.

Suppose a table t1:

    create table t1 (a blob, b blob, c blob, unique(a,b,c));

In the current code, for a query with a WHERE clause on these columns (say, select * from t1 where a=... and b=... and c=...) there will be a KEY with only one keypart, which points to the field DB_ROW_HASH_1. That was okay for normal inserts, updates and deletes, but for the WHERE optimization I have to do a lot of work: first match the fields (like in add_key_part), then see whether all the fields in hash_str are present in the WHERE clause or not, then create keys by calculating the hash. I do this by checking the HA_UNIQUE_HASH flag in KEY, but this also makes (I think) the optimizer code bad because of too much dependence on that flag. I also need to patch the get_mm_parts and get_mm_leaf functions, which I think should not be patched.

So I am thinking of another approach to this problem, at the server level: instead of having just one keypart we can have 1+3 keyparts. The last three keyparts will be for the fields a, b and c, and the first one for DB_ROW_HASH_1. These will exist only at the server level, not at the storage engine level. key_info->key_part will point at the keypart containing field a, while the keypart holding DB_ROW_HASH_1 will be at index -1. This way I do not have to patch much more of the optimizer code.

But there is one problem: what should the length of a key_part be? I am thinking of making it equal to field->pack_length(). On its own that would not work, because while creating keys the optimizer calls get_key_image() (which copies the real data, so it can exceed pack_length() in the case of a blob). So to make this work I have to patch the optimizer where it calls get_key_image() and check whether the key is HA_UNIQUE_HASH; if yes, then instead of get_key_image() just use memcpy(key, field->ptr, field->pack_length()). This won't copy the actual blob data, but we do not need the actual data there, only the length and the pointer.

I will also patch handler methods like ha_index_read, ha_index_idx_read and multi_range_read_info_const, basically the handler methods related to index or range search. In these methods I need to calculate the hash, which I can calculate from key_ptr, but key_ptr does not have the actual data (in the case of blobs etc.). So to get the data for the hash, I will make clones of the fields (a, b, c, etc.) whose ptr points into key_ptr. Then field->val_str() will simply work and I can calculate the hash. And I can also compare the returned result with the actual key in the handler method itself (to handle hash collisions).

Below are two small toy sketches of these ideas.
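The keypart trick itself is just pointer arithmetic. Here is a toy illustration (KEY, KEY_PART_INFO and Field here are minimal stand-ins I wrote only for this mail, not the real server structures):

    #include <cstdio>

    // Minimal stand-ins for the server structures, invented for this mail.
    struct Field { const char *field_name; };
    struct KEY_PART_INFO { Field *field; };
    struct KEY
    {
      unsigned user_defined_key_parts;
      KEY_PART_INFO *key_part;          // points at the keypart for 'a'
    };

    int main()
    {
      Field hash_f = { "DB_ROW_HASH_1" };
      Field a = { "a" }, b = { "b" }, c = { "c" };

      // 1 + 3 keyparts: the hash keypart first, then the real columns.
      KEY_PART_INFO parts[4] = { { &hash_f }, { &a }, { &b }, { &c } };

      KEY key_info;
      key_info.user_defined_key_parts = 3;
      key_info.key_part = parts + 1;    // the optimizer sees a, b, c ...

      // ... while the hash keypart is still reachable at index -1:
      printf("key_part[-1] -> %s\n", key_info.key_part[-1].field->field_name);
      for (unsigned i = 0; i < key_info.user_defined_key_parts; i++)
        printf("key_part[%u]  -> %s\n", i, key_info.key_part[i].field->field_name);
      return 0;
    }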
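And here is a toy model of the key buffer part (again, everything is a simplified stand-in invented for this mail: a real blob Field stores more than this, and the real unique-constraint hash is not FNV). It shows copying only pack_length() bytes, i.e. the length plus the pointer, into the key buffer instead of the full key image, and then a field clone whose ptr points into that buffer, so that val_str() reaches the data and the hash can be calculated:

    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <string>

    // Toy stand-in for a blob Field: in the record it stores 4 length
    // bytes plus a pointer to the data (pack_length() bytes in total),
    // never the data itself. Invented for this mail, not server code.
    struct BlobField
    {
      unsigned char *ptr;   // points into the record (or key) buffer

      static const size_t PACK_LENGTH = 4 + sizeof(char *);
      size_t pack_length() const { return PACK_LENGTH; }

      // Read the value through the length+pointer representation.
      std::string val_str() const
      {
        uint32_t len;
        memcpy(&len, ptr, 4);
        char *data;
        memcpy(&data, ptr + 4, sizeof(data));
        return std::string(data, len);
      }
    };

    // Toy hash standing in for the unique-constraint hash
    // (FNV-1a, only for the demo).
    static uint64_t calc_hash(const std::string &s)
    {
      uint64_t h = 14695981039346656037ULL;
      for (unsigned char c : s) { h ^= c; h *= 1099511628211ULL; }
      return h;
    }

    int main()
    {
      const char *value = "some long blob value";
      uint32_t len = (uint32_t) strlen(value);

      // The record buffer holds length + pointer for the blob.
      unsigned char record[BlobField::PACK_LENGTH];
      memcpy(record, &len, 4);
      memcpy(record + 4, &value, sizeof(value));
      BlobField field = { record };

      // Building the key: instead of get_key_image() (which would copy
      // the whole value and overflow pack_length()), copy only the
      // in-record representation into key_ptr.
      unsigned char key_ptr[BlobField::PACK_LENGTH];
      memcpy(key_ptr, field.ptr, field.pack_length());

      // In the handler method: clone the field, but point its ptr into
      // key_ptr; val_str() now works and the hash can be calculated.
      BlobField clone = field;
      clone.ptr = key_ptr;
      printf("hash over key = %llu\n",
             (unsigned long long) calc_hash(clone.val_str()));
      return 0;
    }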
What do you think of this approach?

Regards,
Sachin

On Sat, Aug 20, 2016 at 11:16 PM, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Sachin!
On Aug 19, Sachin Setia wrote:
On Fri, Aug 19, 2016 at 2:42 PM, Sergei Golubchik <serg@mariadb.org> wrote:
First. I believe you'll need to do your final evaluation soon, and it will need to have a link to the code. Did you check the Google guidelines about it? Is everything clear there? Do you need help publishing your work in the format that Google requires?
They don't accept delays for any reason, so even if your code is not 100% complete and ready, you'd better still publish it and submit the evaluation, because otherwise Google will fail you and that'd be too sad.
If you'd like, you can publish the Google way only the unique-constraint part, without the further optimizer work. Or at least please mention that you completed the original project and went on to work on extensions. I mean, it's better than saying "the code is not 100% complete" :)
Okay, I am thinking of writing a blog post with a link to my GitHub repository. Blog link: <http://sachin1001gsoc.blogspot.in/2016/08/gsoc-2016.html> Please check this.
I think that'll do, yes.
Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org