Hi, Sachin! On Mar 23, Sachin Setia wrote:
Hello Sergei Today I made some progress related to project. MyISAM/ARIA Got clear understanding of how to implement unique index for query like create table tbl(col1 int primary key , col2 blob ,col3 blob , unique(col2,col3)) InnoDB Reading about it.Actually Sir, I want to do this project whether I will select in gsoc or not(because InnoDB is amazing).
:)
Proposal Still Writing
don't miss the deadline :)
Actually sir i have one doubt in table2myisam function definition
recinfo_out, (share->fields * 2 + 2) * sizeof(MI_COLUMNDEF), ^^^^^ ^ ^ why we allocating these many number of recinfo because we only require share->fields + 1 .
good question :) I don't know. this line apparently goes back at least to 2001, there is no revision history beyond that date. It could be that it allocates twice as much columns as necessary nowadays, even if it made sense many years ago.
One more doubt in optimizing "select distinct coloumn_name(here it is a blob coloumn) from table" query. In mi write which take one record and write it we check for unique constraint. It takes O(n^2) time. I
This isnt O(n^2), because hashes are stored in the index, in a b-tree. So, it's O(n*log(n)).
was thinking if we can optimize this by first fetching the whole table record and calculating hash for each record.Instead of comparing one hash with all other we can sort the hashes and ignore the duplicate (we can make an array of 0 and 1 and if it 1 that means record is not duplicate and for 0 it is duplicte). by doing this we can reduce the time complexity to O(nlog(n)).
As you see, we already have O(n*log(n)). But if we put these hashes into a hash table in memory (instead of just sorting them), the cost will go down to O(n). Sounds interesting :) Regards, Sergei Chief Architect MariaDB and security@mariadb.org