Hello Sergei!
Thanks for selecting me for GSoC.
Recently I was reading about the Redis database.
I found it very fast. Can we do something like that in MariaDB
to make it faster? For example, by adding a new transaction isolation level in
InnoDB which would not be ACID-compliant with respect to undo logs, or something
like that.
Regards
sachin

On Tue, Apr 19, 2016 at 7:24 PM, Sachin Setia <sachinsetia1001@gmail.com> wrote:
Hello Sergei!
I am sorry, I pressed Ctrl+Enter, so the first message is incomplete.
At first I was thinking of storing the hash in a hash table instead of a B+tree,
but then I came to know that we can have range queries on blob fields. If we use a hash table,
then a range query will take O(n) time. While reading the MySQL manual I came across
this term: it monitors queries, and if it finds that many of them use IN or equality
predicates, it builds a hash index on top of the B+tree index. I want to implement the same thing in MyISAM.

But I am not sure whether this is a good idea. What do you think, sir?
Currently I am reading about virtual columns; if I make some progress I will let you know.
Regards 
sachin

On Tue, Apr 19, 2016 at 7:10 PM, Sachin Setia <sachinsetia1001@gmail.com> wrote:
Hello Sergei!
By adaptive hashing I mean something like InnoDB's adaptive hash index.


On Tue, Apr 19, 2016 at 6:55 PM, sachin setiya <sachinsetia1001@gmail.com> wrote:
Hello


---------- Forwarded message ----------
From: Sachin Setia <sachinsetia1001@gmail.com>
To: maria-developers@lists.launchpad.net
Cc: 
Date: Mon, 11 Apr 2016 15:03:24 +0530
Subject: InnoDB blob for primary key
Hello Developers,
Hi, this is Sachin. I am currently experimenting with
blob uniques in InnoDB.
There are three main problems:
1. Unique blob
2. Blob foreign key
3. Blob primary key

1. For a unique blob we can simply store the hash in a non-clustered index.
2. Blob foreign key: I am currently working on it.
3. Blob primary key: for this I thought we could create a 4-byte column
which stores the hash of the blob primary key. Internally this column works as the
primary key and as the key for the clustered index. I already successfully tested this;
here is my output:

MariaDB [sachin]> create table t4 (abc blob primary key);
Query OK, 0 rows affected (0.10 sec)

MariaDB [sachin]> insert into t4 values('sachin');
Query OK, 1 row affected (0.01 sec)

MariaDB [sachin]> insert into t4 values('sachin');
ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
MariaDB [sachin]> insert into t4 values('sachin setiya');
Query OK, 1 row affected (0.00 sec)

MariaDB [sachin]> insert into t4 values('sachin setiya');
ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'

MariaDB [sachin]> desc t4;
+-------+------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------+------+-----+---------+-------+
| abc   | blob | NO   | PRI | NULL    |       |
+-------+------+------+-----+---------+-------+
1 row in set (0.01 sec)

MariaDB [sachin]> select * from t4;
+---------------+
| abc           |
+---------------+
| sachin        |
| sachin setiya |
+---------------+
2 rows in set (0.01 sec)

@Sergei hi! I have invested around 2 months in MariaDB, so for me it now does not matter whether I am selected for GSoC; I want to work on InnoDB blob uniques starting today. Please allow me to do this.

I am including the patch file and the t4.ibd and t4.frm files.
Regards
sachin




---------- Forwarded message ----------
From: "Jan Lindström" <jan.lindstrom@mariadb.com>
To: Sachin Setia <sachinsetia1001@gmail.com>
Cc: "maria-developers@lists.launchpad.net" <maria-developers@lists.launchpad.net>
Date: Tue, 12 Apr 2016 14:22:29 +0300
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi,

Sachin, naturally you may continue, this is open source, please read https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/  From the InnoDB point of view there is an issue if you add a new system-generated column to the row. Firstly, existing tables will not have that column, and secondly it will affect e.g. the maximum row size and external tools like backup.

R: Jan

On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <sachinsetia1001@gmail.com> wrote:
Hello Developers,
Hi, this is Sachin. I am currently experimenting with
blob uniques in InnoDB.
There are three main problems:
1. Unique blob
2. Blob foreign key
3. Blob primary key

1. For a unique blob we can simply store the hash in a non-clustered index.
2. Blob foreign key: I am currently working on it.
3. Blob primary key: for this I thought we could create a 4-byte column
which stores the hash of the blob primary key. Internally this column works as the
primary key and as the key for the clustered index. I already successfully tested this;
here is my output:

MariaDB [sachin]> create table t4 (abc blob primary key);
Query OK, 0 rows affected (0.10 sec)

MariaDB [sachin]> insert into t4 values('sachin');
Query OK, 1 row affected (0.01 sec)

MariaDB [sachin]> insert into t4 values('sachin');
ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
MariaDB [sachin]> insert into t4 values('sachin setiya');
Query OK, 1 row affected (0.00 sec)

MariaDB [sachin]> insert into t4 values('sachin setiya');
ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'

MariaDB [sachin]> desc t4;
+-------+------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------+------+-----+---------+-------+
| abc   | blob | NO   | PRI | NULL    |       |
+-------+------+------+-----+---------+-------+
1 row in set (0.01 sec)

MariaDB [sachin]> select * from t4;
+---------------+
| abc           |
+---------------+
| sachin        |
| sachin setiya |
+---------------+
2 rows in set (0.01 sec)

@Sergei hi! I have invested around 2 months in MariaDB, so for me it now does not matter whether I am selected for GSoC; I want to work on InnoDB blob uniques starting today. Please allow me to do this.

I am including the patch file and the t4.ibd and t4.frm files.
Regards
sachin



_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp




---------- Forwarded message ----------
From: Sergei Golubchik <vuvova@gmail.com>
To: Sachin Setia <sachinsetia1001@gmail.com>
Cc: 
Date: Wed, 13 Apr 2016 00:04:22 +0200
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi, Sachin!

Just a very quick comment now. We're having a company meeting
(and a meetup - http://www.meetup.com/MariaDB-Developers-Berlin-Meetup/events/230026151/)
in Berlin, I'll send you a longer reply when this is all over (in a
couple of days).

It might be an interesting option not to hack InnoDB, but to add
this feature (long UNIQUE keys) on the upper level. That is, in sql/
directory. Automatically add a new BIGINT column, but hide it from
the user (not show it in SHOW COLUMNS and in SELECT *). Create
an index for it (also hidden). Put the blob's hash into this column.
And on collisions retrieve both rows and compare blobs. Just like MyISAM
is doing, but not on the storage engine level, but in the
engine-independent fashion, something that works for all engines.

Want to look at this approach?

MySQL 5.7 has something similar (long unique constraints on the sql
layer), but they've done it only for internal temporary tables (like
MyISAM's MI_UNIQUE is only used for internal temporary tables),
and it's not directly available for the user. Still, it might give you
some hints about how to implement this feature on the sql layer.

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org

On Apr 11, Sachin Setia wrote:
> Hello Developers,
> Hi, this is Sachin. I am currently experimenting with
> blob uniques in InnoDB.
> There are three main problems:
> 1. Unique blob
> 2. Blob foreign key
> 3. Blob primary key
>
> 1. For a unique blob we can simply store the hash in a non-clustered index.
> 2. Blob foreign key: I am currently working on it.
> 3. Blob primary key: for this I thought we could create a 4-byte column
> which stores the hash of the blob primary key. Internally this column works
> as
> the primary key and as the key for the clustered index. I already
> successfully tested this;
> here is my output:
> here is my output
>
> MariaDB [sachin]> create table t4 (abc blob primary key);
> Query OK, 0 rows affected (0.10 sec)
>
> MariaDB [sachin]> insert into t4 values('sachin');
> Query OK, 1 row affected (0.01 sec)
>
> MariaDB [sachin]> insert into t4 values('sachin');
> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
> MariaDB [sachin]> insert into t4 values('sachin setiya');
> Query OK, 1 row affected (0.00 sec)
>
> MariaDB [sachin]> insert into t4 values('sachin setiya');
> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>
> MariaDB [sachin]> desc t4;
> +-------+------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +-------+------+------+-----+---------+-------+
> | abc   | blob | NO   | PRI | NULL    |       |
> +-------+------+------+-----+---------+-------+
> 1 row in set (0.01 sec)
>
> MariaDB [sachin]> select * from t4;
> +---------------+
> | abc           |
> +---------------+
> | sachin        |
> | sachin setiya |
> +---------------+
> 2 rows in set (0.01 sec)
>
> @Sergei hi! I have invested around 2 months in MariaDB, so for me it now
> does not matter whether I am selected for GSoC; I want to work on
> InnoDB blob uniques starting today. Please allow me to do this.
>
> I am including the patch file and t4.ibd and t4.frm file
> Regards
> sachin

> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index dfce503..efd6f22 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -3244,7 +3244,7 @@ mysql_prepare_create_table(THD *thd, HA_CREATE_INFO *create_info,
>       !(sql_field->charset= find_bin_collation(sql_field->charset)))
>        DBUG_RETURN(TRUE);
>
> -    /*
> +    /*
>        Convert the default value from client character
>        set into the column character set if necessary.
>      */
> @@ -3874,11 +3874,11 @@ mysql_prepare_create_table(THD *thd, HA_CREATE_INFO *create_info,
>            if (f_is_geom(sql_field->pack_flag) && sql_field->geom_type ==
>                Field::GEOM_POINT)
>              column->length= MAX_LEN_GEOM_POINT_FIELD;
> -       if (!column->length)
> -       {
> -         my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0), column->field_name.str);
> -         DBUG_RETURN(TRUE);
> -       }
> +//     if (!column->length) //change
> +//     {
> +//       my_error(ER_BLOB_KEY_WITHOUT_LENGTH, MYF(0), column->field_name.str);
> +//       DBUG_RETURN(TRUE);
> +//     }
>       }
>  #ifdef HAVE_SPATIAL
>       if (key->type == Key::SPATIAL)
> @@ -3992,9 +3992,9 @@ mysql_prepare_create_table(THD *thd, HA_CREATE_INFO *create_info,
>        }
>        else if (key_part_length == 0 && (sql_field->flags & NOT_NULL_FLAG))
>        {
> -     my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
> -                 column->field_name.str);
> -       DBUG_RETURN(TRUE);
> +//   my_error(ER_WRONG_KEY_COLUMN, MYF(0), file->table_type(),
> +//                 column->field_name.str);
> +//     DBUG_RETURN(TRUE);
>        }
>        if (key_part_length > file->max_key_part_length() &&
>            key->type != Key::FULLTEXT)
> diff --git a/storage/innobase/dict/dict0dict.cc b/storage/innobase/dict/dict0dict.cc
> index c51deb0..f6e94af 100644
> --- a/storage/innobase/dict/dict0dict.cc
> +++ b/storage/innobase/dict/dict0dict.cc
> @@ -1315,7 +1315,7 @@ dict_table_add_to_cache(
>  #define BIG_ROW_SIZE 1024
>
>       ut_ad(mutex_own(&(dict_sys->mutex)));
> -
> +
>       dict_table_add_system_columns(table, heap);
>
>       table->cached = TRUE;
> diff --git a/storage/xtradb/dict/dict0dict.cc b/storage/xtradb/dict/dict0dict.cc
> index 206038d..d8223d7 100644
> --- a/storage/xtradb/dict/dict0dict.cc
> +++ b/storage/xtradb/dict/dict0dict.cc
> @@ -1251,7 +1251,7 @@ dict_table_open_on_name(
>  Adds system columns to a table object. */
>  UNIV_INTERN
>  void
> -dict_table_add_system_columns(
> +dict_table_add_system_columns( //work
>  /*==========================*/
>       dict_table_t*   table,  /*!< in/out: table */
>       mem_heap_t*     heap)   /*!< in: temporary heap */
> @@ -1266,16 +1266,25 @@ dict_table_add_system_columns(
>       etc.) and as the last columns of the table memory object.
>       The clustered index will not always physically contain all
>       system columns. */
> -
> +//   dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
> +//   ,DATA_NOT_NULL,DATA_BLOB_HASH_LEN);
>       dict_mem_table_add_col(table, heap, "DB_ROW_ID", DATA_SYS,
>                              DATA_ROW_ID | DATA_NOT_NULL,
>                              DATA_ROW_ID_LEN);
>  #if DATA_ROW_ID != 0
>  #error "DATA_ROW_ID != 0"
>  #endif
> +//   dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_SYS,
> +//                                                   DATA_BLOB_HASH|DATA_NOT_NULL,
> +//                                                   DATA_BLOB_HASH_LEN);
> +     //just a rough trick to get it working
> +
> +//   if(*table->name=='y'){
>       dict_mem_table_add_col(table, heap, "DB_TRX_ID", DATA_SYS,
>                              DATA_TRX_ID | DATA_NOT_NULL,
>                              DATA_TRX_ID_LEN);
> +//   }
> +
>  #if DATA_TRX_ID != 1
>  #error "DATA_TRX_ID != 1"
>  #endif
> @@ -1310,7 +1319,11 @@ dict_table_add_to_cache(
>       ulint   row_len;
>
>       ut_ad(dict_lru_validate());
> -
> +//   bool x =false;//break
> +//   if(x){
> +//           dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",DATA_INT
> +//                                                           ,DATA_NOT_NULL,4);
> +//   }
>       /* The lower limit for what we consider a "big" row */
>  #define BIG_ROW_SIZE 1024
>
> @@ -3075,7 +3088,7 @@ dict_index_build_internal_clust(
>       /* Copy the fields of index */
>       dict_index_copy(new_index, index, table, 0, index->n_fields);
>
> -     if (dict_index_is_univ(index)) {
> +     if (dict_index_is_univ(index)) { //work
>               /* No fixed number of fields determines an entry uniquely */
>
>               new_index->n_uniq = REC_MAX_N_FIELDS;
> @@ -3124,7 +3137,7 @@ dict_index_build_internal_clust(
>                                                         DATA_ROLL_PTR),
>                                  0);
>
> -             for (i = 0; i < trx_id_pos; i++) {
> +             for (i = 0; i < trx_id_pos; i++) {//work i think i need to do some stuff
>
>                       ulint fixed_size = dict_col_get_fixed_size(
>                               dict_index_get_nth_col(new_index, i),
> diff --git a/storage/xtradb/dict/dict0load.cc b/storage/xtradb/dict/dict0load.cc
> index d6ed8ac..8cc59f8 100644
> --- a/storage/xtradb/dict/dict0load.cc
> +++ b/storage/xtradb/dict/dict0load.cc
> @@ -2276,7 +2276,7 @@ dictionary cache.
>  ibd_file_missing flag TRUE in the table object we return */
>  UNIV_INTERN
>  dict_table_t*
> -dict_load_table(
> +dict_load_table( //work for corruped table
>  /*============*/
>       const char*     name,   /*!< in: table name in the
>                               databasename/tablename format */
> @@ -2337,7 +2337,7 @@ dict_load_table(
>               btr_pcur_close(&pcur);
>               mtr_commit(&mtr);
>               mem_heap_free(heap);
> -
> +
>               return(NULL);
>       }
>
> diff --git a/storage/xtradb/dict/dict0mem.cc b/storage/xtradb/dict/dict0mem.cc
> index a4f6cd6..6cbe556 100644
> --- a/storage/xtradb/dict/dict0mem.cc
> +++ b/storage/xtradb/dict/dict0mem.cc
> @@ -101,11 +101,11 @@ dict_mem_table_create(
>       memcpy(table->name, name, strlen(name) + 1);
>       table->is_system_db = dict_mem_table_is_system(table->name);
>       table->space = (unsigned int) space;
> -     table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS);
> -
> +     table->n_cols = (unsigned int) (n_cols + DATA_N_SYS_COLS); //work
> +
>       table->cols = static_cast<dict_col_t*>(
>               mem_heap_alloc(heap,
> -                            (n_cols + DATA_N_SYS_COLS)
> +                            (n_cols + DATA_N_SYS_COLS)//work
>                              * sizeof(dict_col_t)));
>
>       ut_d(table->magic_n = DICT_TABLE_MAGIC_N);
> diff --git a/storage/xtradb/handler/ha_innodb.cc b/storage/xtradb/handler/ha_innodb.cc
> index e5edf76..3a8dc91 100644
> --- a/storage/xtradb/handler/ha_innodb.cc
> +++ b/storage/xtradb/handler/ha_innodb.cc
> @@ -5983,7 +5983,7 @@ ha_innobase::open(
>       ibool                   par_case_name_set = FALSE;
>       char                    par_case_name[FN_REFLEN];
>       dict_err_ignore_t       ignore_err = DICT_ERR_IGNORE_NONE;
> -
> +     ibool is_blob_primary_key=false;
>       DBUG_ENTER("ha_innobase::open");
>
>       UT_NOT_USED(mode);
> @@ -6031,13 +6031,20 @@ ha_innobase::open(
>
>       /* Get pointer to a table object in InnoDB dictionary cache */
>       ib_table = dict_table_open_on_name(norm_name, FALSE, TRUE, ignore_err);
> -
> +     //int number_of_columns = dict_table_get_n_user_cols(ib_table);
> +     //if(ib_table->)
> +     if(ib_table){
> +             int number_of_columns = dict_table_get_n_user_cols(ib_table);
> +             if(!innobase_strcasecmp(ib_table->col_names,"DB_BLOB_HASH")){
> +                     is_blob_primary_key=true;
> +                     number_of_columns--;//stodo i think we need to add flag for blob primary key to make checking easier
> +             }
>       if (ib_table
> -         && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID)
> -              && table->s->stored_fields != dict_table_get_n_user_cols(ib_table))
> +         && ((!DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID))
> +              && table->s->stored_fields != number_of_columns) //work
>               || (DICT_TF2_FLAG_IS_SET(ib_table, DICT_TF2_FTS_HAS_DOC_ID)
>                   && (table->s->fields
> -                     != dict_table_get_n_user_cols(ib_table) - 1)))) {
> +                     != dict_table_get_n_user_cols(ib_table) - 1))) {
>               ib_logf(IB_LOG_LEVEL_WARN,
>                       "table %s contains %lu user defined columns "
>                       "in InnoDB, but %lu columns in MySQL. Please "
> @@ -6054,7 +6061,7 @@ ha_innobase::open(
>               ib_table = NULL;
>               is_part = NULL;
>       }
> -
> +     }
>       if (UNIV_UNLIKELY(ib_table && ib_table->is_corrupt &&
>                         srv_pass_corrupt_table <= 1)) {
>               free_share(share);
> @@ -6254,12 +6261,12 @@ ha_innobase::open(
>       /* Looks like MySQL-3.23 sometimes has primary key number != 0 */
>       primary_key = table->s->primary_key;
>       key_used_on_scan = primary_key;
> -
> +     if(!is_blob_primary_key){
>       if (!innobase_build_index_translation(table, ib_table, share)) {
>                 sql_print_error("Build InnoDB index translation table for"
>                                 " Table %s failed", name);
>       }
> -
> +     }
>       /* Allocate a buffer for a 'row reference'. A row reference is
>       a string of bytes of length ref_length which uniquely specifies
>       a row in our table. Note that MySQL may also compare two row
> @@ -6314,7 +6321,11 @@ ha_innobase::open(
>                       for (uint i = 0; i < table->s->keys; i++) {
>                               dict_index_t*   index;
>                               index = innobase_get_index(i);
> -                             if (dict_index_is_clust(index)) {
> +                             if (dict_index_is_clust(index)) { //work
> +                             if(is_blob_primary_key){
> +                                     ref_length=4; //hash length
> +                                     continue;
> +                             }
>                                       ref_length =
>                                                table->key_info[i].key_length;
>                               }
> @@ -7795,7 +7806,8 @@ build_template_field(
>
>       //ut_ad(field == table->field[i]);
>       ut_ad(clust_index->table == index->table);
> -
> +//work here we go
> +//todo it should get the next column defs
>       col = dict_table_get_nth_col(index->table, i);
>
>       templ = prebuilt->mysql_template + prebuilt->n_template++;
> @@ -8138,7 +8150,8 @@ ha_innobase::build_template(
>                                       continue;
>                               }
>                       }
> -
> +                             if(!innobase_strcasecmp(clust_index->fields->name,"DB_BLOB_HASH"))
> +                                     i++;
>                       build_template_field(prebuilt, clust_index, index,
>                                            table, field, i);
>               }
> @@ -11065,6 +11078,7 @@ create_table_def(
>       ulint           doc_id_col = 0;
>       ibool           has_doc_id_col = FALSE;
>       mem_heap_t*     heap;
> +     bool is_blob_primary_key=false;
>
>       DBUG_ENTER("create_table_def");
>       DBUG_PRINT("enter", ("table_name: %s", table_name));
> @@ -11126,6 +11140,12 @@ create_table_def(
>                       table->fts->doc_col = doc_id_col;
>               }
>       } else {
> +             if(form->key_info[0].key_part->length==0 && //change
> +             form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
> +                     s_cols++;
> +                     is_blob_primary_key=true;
> +             }
> +
>               table = dict_mem_table_create(table_name, 0, s_cols,
>                                             flags, flags2);
>       }
> @@ -11143,6 +11163,13 @@ create_table_def(
>               table->data_dir_path = NULL;
>       }
>       heap = mem_heap_create(1000);
> +     //work
> +     //add one more column for hash
> +
> +     if(is_blob_primary_key){
> +             dict_mem_table_add_col(table,heap,"DB_BLOB_HASH",
> +     DATA_INT,1283,4);
> +     }
>
>       for (i = 0; i < n_cols; i++) {
>               Field*  field = form->field[i];
> @@ -11222,7 +11249,7 @@ create_table_def(
>                       err = DB_ERROR;
>                       goto error_ret;
>               }
> -
> +
>               dict_mem_table_add_col(table, heap,
>                       field->field_name,
>                       col_type,
> @@ -11347,7 +11374,7 @@ create_index(
>               the length of the key part versus the column. */
>
>               Field*  field = NULL;
> -
> +//work
>               for (ulint j = 0; j < form->s->fields; j++) {
>
>                       field = form->field[j];
> @@ -11396,7 +11423,12 @@ create_index(
>               }
>
>               field_lengths[i] = key_part->length;
> -
> +             if(form->key_info[0].key_part->length==0 && //change
> +             form->key_info[0].key_part->key_type|MYSQL_TYPE_BLOB){
> +                     dict_mem_index_add_field(
> +                     index,"DB_BLOB_HASH", 0);
> +                     continue;
> +             }
>               dict_mem_index_add_field(
>                       index, key_part->field->field_name, prefix_len);
>       }
> diff --git a/storage/xtradb/include/data0type.h b/storage/xtradb/include/data0type.h
> index 111664b..0f510e8 100644
> --- a/storage/xtradb/include/data0type.h
> +++ b/storage/xtradb/include/data0type.h
> @@ -147,6 +147,9 @@ be less than 256 */
>  #define      DATA_ROLL_PTR   2       /* rollback data pointer: 7 bytes */
>  #define DATA_ROLL_PTR_LEN 7
>
> +//#define DATA_BLOB_HASH  3        // hash coloumn for blob primay key
> +//#define DATA_BLOB_HASH_LEN 4       //used as a clustered index
> +
>  #define      DATA_N_SYS_COLS 3       /* number of system columns defined above */
>
>  #define DATA_FTS_DOC_ID      3       /* Used as FTS DOC ID column */
> diff --git a/storage/xtradb/que/que0que.cc b/storage/xtradb/que/que0que.cc
> index e2dc023..076785d 100644
> --- a/storage/xtradb/que/que0que.cc
> +++ b/storage/xtradb/que/que0que.cc
> @@ -1030,7 +1030,7 @@ que_thr_step(
>               que_node_print_info(node);
>       }
>  #endif
> -     if (type & QUE_NODE_CONTROL_STAT) {
> +     if (type & QUE_NODE_CONTROL_STAT) { //improve
>               if ((thr->prev_node != que_node_get_parent(node))
>                   && que_node_get_next(thr->prev_node)) {
>
> diff --git a/storage/xtradb/row/row0ins.cc b/storage/xtradb/row/row0ins.cc
> index d02e179..c58d7fb 100644
> --- a/storage/xtradb/row/row0ins.cc
> +++ b/storage/xtradb/row/row0ins.cc
> @@ -137,7 +137,7 @@ row_ins_alloc_sys_fields(
>  /*=====================*/
>       ins_node_t*     node)   /*!< in: insert node */
>  {
> -     dtuple_t*               row;
> +     dtuple_t*               row;  //work
>       dict_table_t*           table;
>       mem_heap_t*             heap;
>       const dict_col_t*       col;
> @@ -3118,7 +3118,7 @@ row_ins_index_entry_set_vals(
>
>       n_fields = dtuple_get_n_fields(entry);
>
> -     for (i = 0; i < n_fields; i++) {
> +     for (i = 0; i < n_fields; i++) {//see
>               dict_field_t*   ind_field;
>               dfield_t*       field;
>               const dfield_t* row_field;
> @@ -3169,8 +3169,8 @@ row_ins_index_entry_step(
>
>       ut_ad(dtuple_check_typed(node->row));
>
> -     row_ins_index_entry_set_vals(node->index, node->entry, node->row);
> -
> +     row_ins_index_entry_set_vals(node->index, node->entry, node->row);//explore
> +//need to see who sets the entry in row
>       ut_ad(dtuple_check_typed(node->entry));
>
>       err = row_ins_index_entry(node->index, node->entry, thr);
> @@ -3208,7 +3208,7 @@ row_ins_alloc_row_id_step(
>
>       /* Fill in row id value to row */
>
> -     row_id = dict_sys_get_new_row_id();
> +     row_id = dict_sys_get_new_row_id(); //work
>
>       dict_sys_write_row_id(node->row_id_buf, row_id);
>  }
> diff --git a/storage/xtradb/row/row0mysql.cc b/storage/xtradb/row/row0mysql.cc
> index 9427b20..b28e778 100644
> --- a/storage/xtradb/row/row0mysql.cc
> +++ b/storage/xtradb/row/row0mysql.cc
> @@ -510,7 +510,15 @@ row_mysql_store_col_in_innobase_format(
>
>       return(buf);
>  }
> -
> +//this is just dummy function
> +int dummuy_hash(byte * ptr,int length){
> +     int val=0;
> +     for(int i=0;i<length;i++){
> +             val =val*8+(int )*ptr;
> +             ptr++;
> +     }
> +     return val;
> +}
>  /**************************************************************//**
>  Convert a row in the MySQL format to a row in the Innobase format. Note that
>  the function to convert a MySQL format key value to an InnoDB dtuple is
> @@ -532,15 +540,22 @@ row_mysql_convert_row_to_innobase(
>       const mysql_row_templ_t*templ;
>       dfield_t*               dfield;
>       ulint                   i;
> -
> +     ibool                   is_blob_primary_key=false;
> +     ibool                   is_first_template=true;
>       ut_ad(prebuilt->template_type == ROW_MYSQL_WHOLE_ROW);
> -     ut_ad(prebuilt->mysql_template);
> -
> -     for (i = 0; i < prebuilt->n_template; i++) {
> +     ut_ad(prebuilt->mysql_template);//work
> +     if(!innobase_strcasecmp(prebuilt->table->col_names,"DB_BLOB_HASH")){
> +             is_blob_primary_key =true;
> +     }
> +     for (i = 0; i < prebuilt->n_template; i++) { //change add one
>
>               templ = prebuilt->mysql_template + i;
> +             if(is_blob_primary_key&&is_first_template){
> +                     dfield = dtuple_get_nth_field(row, i+1);
> +                     is_first_template=false;
> +             }else{
>               dfield = dtuple_get_nth_field(row, i);
> -
> +             }
>               if (templ->mysql_null_bit_mask != 0) {
>                       /* Column may be SQL NULL */
>
> @@ -565,6 +580,17 @@ row_mysql_convert_row_to_innobase(
>  next_column:
>               ;
>       }
> +     if(is_blob_primary_key){
> +             //get the first field and set  the hash
> +             dfield_t * hash_field=dtuple_get_nth_field(row,0);
> +             dfield_t * blob_field=dtuple_get_nth_field(row,1);
> +             byte * hash = static_cast<byte *>(mem_heap_zalloc(prebuilt->heap,4));
> +             mach_write_to_4(hash,dummuy_hash((byte *)blob_field->data,blob_field->len));
> +             //4 is hash length
> +             hash_field->len=4;
> +             hash_field->data=hash;
> +             //hash_field->type
> +     }
>
>       /* If there is a FTS doc id column and it is not user supplied (
>       generated by server) then assign it a new doc id. */
> @@ -831,7 +857,7 @@ row_create_prebuilt(
>
>       prebuilt->search_tuple = dtuple_create(heap, search_tuple_n_fields);
>
> -     ref = dtuple_create(heap, ref_len);
> +     ref = dtuple_create(heap, ref_len);//work
>
>       dict_index_copy_types(ref, clust_index, ref_len);
>
> @@ -1057,7 +1083,7 @@ row_get_prebuilt_insert_row(
>
>       dict_table_copy_types(row, table);
>
> -     ins_node_set_new_row(node, row);
> +     ins_node_set_new_row(node, row);//explore
>
>       prebuilt->ins_graph = static_cast<que_fork_t*>(
>               que_node_get_parent(
> @@ -1353,7 +1379,7 @@ row_insert_for_mysql(
>       row_get_prebuilt_insert_row(prebuilt);
>       node = prebuilt->ins_node;
>
> -     row_mysql_convert_row_to_innobase(node->row, prebuilt, mysql_rec);
> +     row_mysql_convert_row_to_innobase(node->row, prebuilt, mysql_rec);//debug
>
>       savept = trx_savept_take(trx);
>
> diff --git a/storage/xtradb/row/row0row.cc b/storage/xtradb/row/row0row.cc
> index be786f9..a1dd465 100644
> --- a/storage/xtradb/row/row0row.cc
> +++ b/storage/xtradb/row/row0row.cc
> @@ -83,7 +83,7 @@ row_build_index_entry_low(
>                       entry, dict_index_get_n_unique_in_tree(index));
>       }
>
> -     for (i = 0; i < entry_len; i++) {
> +     for (i = 0; i < entry_len; i++) { //explore need to see how it works for simple
>               const dict_field_t*     ind_field
>                       = dict_index_get_nth_field(index, i);
>               const dict_col_t*       col



---------- Forwarded message ----------
From: Sachin Setia <sachinsetia1001@gmail.com>
To: maria-developers@lists.launchpad.net
Cc: 
Date: Wed, 13 Apr 2016 21:17:03 +0530
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hello Jan!

Actually I was going through the problems you raised.

1> Firstly, existing tables will not have that column

Yes sir, you are right that existing tables wouldn't have the DB_BLOB_HASH
column, but this is fine: if they indexed their blob column, they
must have provided a blob length, and in that case the length of the key_part
would not be zero, so I need to change the patch to detect this.
By doing so I can also eliminate another problem: what if an existing
table already has a column named 'DB_BLOB_HASH'?

2> Secondly, it will affect e.g. the maximum row size
For this I can increase reclength by 4 in the pack_header function (but I
am not sure whether this will work or not).

3> And external tools like backup
Here I am assuming that by backup you mean mysqldump. I actually tried
this by creating a table with a blob column (a length was given), then
switching to the changed mariadb binary, and the .sql backup file was
working fine.

But I think there are more problems with this this prototype
1>Existing table with coloumn named 'DB_BLOB_HASH' .We can solve this
as written above but some code in patch compares first coloumn name
with 'DB_BLOB_HASH' so this will create problem Sir, can we add a flag
in dict_table_t->flag so that we can reliable check for blob hash coloumn

2>This is very bad problem , what happen if if data is different and crc
hash of data is same.And this will occur when we have medium amount of
data like 100 thousand records.To avoid this we can do like

2.1> use crc 64 reduces probability
2.2> use two different algorithm for hash and treat these two algoriths
hash for uniquesness.Make  collusion probability very very less

But the main problem is what would happen if collusion occurs to solve
this i have two ideas
2.3> like 2.2 but instead of using two algorithm use one and add column
for number of times this hash is repeated collusion frequency will be zero
i think this will work fine but need to test it
2.4> whole prototype is crap start again with some different method. Can you
please suggest some differnt way of doing this
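The collision-counter idea in 2.3 can be sketched outside the server. This is a standalone illustration, not MariaDB code: bad_hash and BlobIndex are made-up names, and a deliberately weak hash is used so a collision is easy to demonstrate.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Deliberately weak hash (sums bytes), so anagrams collide on purpose.
// A real implementation would use CRC-64 or similar, per 2.1.
static uint32_t bad_hash(const std::string& blob) {
  uint32_t h = 0;
  for (char c : blob) h += static_cast<unsigned char>(c);
  return h;
}

using BlobKey = std::pair<uint32_t, uint16_t>;  // (hash, collision counter)

class BlobIndex {
  std::map<BlobKey, std::string> rows_;  // stands in for the clustered index
 public:
  // Returns the unique key the blob was stored under. Two different blobs
  // with the same hash get distinct keys, because the second one is stored
  // with the counter bumped to 1. (A real implementation would also byte-
  // compare the stored blobs to reject true duplicates; omitted here.)
  BlobKey insert(const std::string& blob) {
    uint32_t h = bad_hash(blob);
    uint16_t n = 0;
    while (rows_.count({h, n})) ++n;  // walk past earlier blobs with this hash
    rows_[{h, n}] = blob;
    return {h, n};
  }
};
```

For example, "abc" and "cba" hash identically under bad_hash, yet receive the distinct keys (h, 0) and (h, 1).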
Regards
sachin

On Tue, Apr 12, 2016 at 4:52 PM, Jan Lindström
<jan.lindstrom@mariadb.com> wrote:
> Hi,
>
> Sachin, naturally you may continue, this is open source, please read
> https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
> From InnoDB point of view there is issue if you add a new system generated
> column to the row. Firstly, existing tables will not have that column and
> secondly it will effect e.g maximum row size and external tools like backup.
>
> R: Jan
>
> On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <sachinsetia1001@gmail.com>
> wrote:
>>
>> Hello Developers,
>> Hi, this is Sachin. Actually I am currently experimenting with
>> blob uniques in InnoDB.
>> There are three main problems:
>> 1. Unique blob
>> 2. Blob foreign key
>> 3. Blob primary key
>>
>> 1. For a unique blob we can simply store the hash in an unclustered index.
>> 2. Blob foreign key: I am currently working on it.
>> 3. Blob primary key: for this I thought we could create a 4-byte column
>> which stores the hash of the blob primary key. Internally this column
>> will work as the primary key and as the key for the clustered index.
>> I already successfully tested this; here is my output:
>>
>> MariaDB [sachin]> create table t4 (abc blob primary key);
>> Query OK, 0 rows affected (0.10 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> Query OK, 1 row affected (0.01 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin');
>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> Query OK, 1 row affected (0.00 sec)
>>
>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>>
>> MariaDB [sachin]> desc t4;
>> +-------+------+------+-----+---------+-------+
>> | Field | Type | Null | Key | Default | Extra |
>> +-------+------+------+-----+---------+-------+
>> | abc   | blob | NO   | PRI | NULL    |       |
>> +-------+------+------+-----+---------+-------+
>> 1 row in set (0.01 sec)
>>
>> MariaDB [sachin]> select * from t4;
>> +---------------+
>> | abc           |
>> +---------------+
>> | sachin        |
>> | sachin setiya |
>> +---------------+
>> 2 rows in set (0.01 sec)
>>
>> @Sergei hi! Actually I have invested around 2 months in MariaDB, so for
>> me now it does not matter whether I got selected in GSoC; I want to work
>> on InnoDB blob unique from today. Please sir allow me to do this.
>>
>> I am including the patch file and the t4.ibd and t4.frm files.
>> Regards
>> sachin
>>
>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~maria-developers
>> Post to     : maria-developers@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~maria-developers
>> More help   : https://help.launchpad.net/ListHelp
>>
>


---------- Forwarded message ----------
From: Sachin Setia <sachinsetia1001@gmail.com>
To: maria-developers@lists.launchpad.net
Cc: 
Date: Wed, 13 Apr 2016 22:19:44 +0530
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hello Sergei
Sorry, I did not see your mail. Actually I was thinking of something like
this before implementing the prototype, but the closer the implementation
is to InnoDB, the more performance I will get. I will definitely think
about it.
Regards
sachin

On Wed, Apr 13, 2016 at 9:17 PM, Sachin Setia <sachinsetia1001@gmail.com> wrote:
> Hello Jan!
>
> Actually I was going through the problems you raised:
>
> 1> Firstly, existing tables will not have that column
>
> Yes Sir, you are right that existing tables wouldn't have the DB_BLOB_HASH
> column, but this is fine: if they indexed their blob column, they must
> have provided a blob prefix length, so the key_part length would not be
> zero, and I need to change the patch to detect this. By doing so I can
> also eliminate another problem: what if an existing table already has a
> column named 'DB_BLOB_HASH'?
>
> 2> Secondly it will affect e.g. maximum row size
>
> For this I can increase reclength by 4 in the pack_header function (but I
> am not sure whether this will work or not).
>
> 3> And external tools like backup
>
> Here I am assuming that by backup you mean mysqldump. I tried this by
> creating a table with a blob column (with a length given), then switching
> to the changed mariadb binary, and the .sql backup file worked fine.
>
> But I think there are more problems with this prototype:
> 1> An existing table with a column named 'DB_BLOB_HASH'. We can solve this
> as written above, but some code in the patch compares the first column
> name with 'DB_BLOB_HASH', so this will create a problem. Sir, can we add
> a flag in dict_table_t->flag so that we can reliably check for the blob
> hash column?
>
> 2> This is a very bad problem: what happens if the data is different but
> the CRC hash of the data is the same? This will occur even with a medium
> amount of data, like 100 thousand records. To avoid this we can:
>
> 2.1> use CRC-64, which reduces the probability
> 2.2> use two different hash algorithms and treat the pair of hashes as
> the uniqueness key, making the collision probability very, very small
>
> But the main problem is what happens when a collision does occur. To
> solve this I have two ideas:
> 2.3> like 2.2, but instead of using two algorithms use one, and add a
> column for the number of times this hash has been repeated, so the
> collision frequency will be zero. I think this will work fine, but I need
> to test it.
> 2.4> the whole prototype is crap; start again with some different method.
> Can you please suggest a different way of doing this?
> Regards
> sachin
>
> On Tue, Apr 12, 2016 at 4:52 PM, Jan Lindström
> <jan.lindstrom@mariadb.com> wrote:
>> Hi,
>>
>> Sachin, naturally you may continue, this is open source, please read
>> https://mariadb.com/kb/en/mariadb/community-contributing-to-the-mariadb-project/
>> From InnoDB point of view there is issue if you add a new system generated
>> column to the row. Firstly, existing tables will not have that column and
>> secondly it will effect e.g maximum row size and external tools like backup.
>>
>> R: Jan
>>
>> On Mon, Apr 11, 2016 at 12:33 PM, Sachin Setia <sachinsetia1001@gmail.com>
>> wrote:
>>>
>>> Hello Developers,
>>> Hi, this is Sachin. Actually I am currently experimenting with
>>> blob uniques in InnoDB.
>>> There are three main problems:
>>> 1. Unique blob
>>> 2. Blob foreign key
>>> 3. Blob primary key
>>>
>>> 1. For a unique blob we can simply store the hash in an unclustered index.
>>> 2. Blob foreign key: I am currently working on it.
>>> 3. Blob primary key: for this I thought we could create a 4-byte column
>>> which stores the hash of the blob primary key. Internally this column
>>> will work as the primary key and as the key for the clustered index.
>>> I already successfully tested this; here is my output:
>>>
>>> MariaDB [sachin]> create table t4 (abc blob primary key);
>>> Query OK, 0 rows affected (0.10 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin');
>>> Query OK, 1 row affected (0.01 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin');
>>> ERROR 1062 (23000): Duplicate entry 'sachin' for key 'PRIMARY'
>>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>>> Query OK, 1 row affected (0.00 sec)
>>>
>>> MariaDB [sachin]> insert into t4 values('sachin setiya');
>>> ERROR 1062 (23000): Duplicate entry 'sachin setiya' for key 'PRIMARY'
>>>
>>> MariaDB [sachin]> desc t4;
>>> +-------+------+------+-----+---------+-------+
>>> | Field | Type | Null | Key | Default | Extra |
>>> +-------+------+------+-----+---------+-------+
>>> | abc   | blob | NO   | PRI | NULL    |       |
>>> +-------+------+------+-----+---------+-------+
>>> 1 row in set (0.01 sec)
>>>
>>> MariaDB [sachin]> select * from t4;
>>> +---------------+
>>> | abc           |
>>> +---------------+
>>> | sachin        |
>>> | sachin setiya |
>>> +---------------+
>>> 2 rows in set (0.01 sec)
>>>
>>> @Sergei hi! Actually I have invested around 2 months in MariaDB, so for
>>> me now it does not matter whether I got selected in GSoC; I want to work
>>> on InnoDB blob unique from today. Please sir allow me to do this.
>>>
>>> I am including the patch file and the t4.ibd and t4.frm files.
>>> Regards
>>> sachin
>>>
>>>
>>>
>>


---------- Forwarded message ----------
From: Sergei Golubchik <serg@mariadb.org>
To: Sachin Setia <sachinsetia1001@gmail.com>
Cc: 
Date: Thu, 14 Apr 2016 20:53:51 +0200
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi, Sachin!

On Apr 13, Sachin Setia wrote:
> Hello Sergei
> Sorry, I did not see your mail. Actually I was thinking of something like
> this before implementing the prototype, but the closer the implementation
> is to InnoDB, the more performance I will get. I will definitely think
> about it.

Great!

Could you please tell me (mailing list, that is) what you think before
next Monday (before April 18th, that is)?

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org


---------- Forwarded message ----------
From: Sachin Setia <sachinsetia1001@gmail.com>
To: maria-developers@lists.launchpad.net
Cc: 
Date: Fri, 15 Apr 2016 18:37:15 +0530
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi Sergei!

Actually I was going through the MySQL source code for long unique
constraints. In sql_tmp_table.cc, in the function create_tmp_table, they
make a new field and a new key (hash_key) and pass this table object to the
storage engine. They refer to this field as a hash field.
At insert time they call the function bool check_unique_constraint(TABLE
*table), which first calculates the hash and stores it in the field, then
looks for a duplicate hash, retrieving rows with ha_index_next_same; if the
records are not the same, the row is accepted.

We can do the same thing in MariaDB by adding one more field and key in
mysql_prepare_create_table. There we check for a blob with unlimited
length, or a varchar longer than the storage engine's internal limit. By
doing this in mysql_prepare_create_table there will be no issues of frm
file inconsistency.

On insert, we first fill the hash field in the fill_record function of
sql_base.cc by calculating the hash. Then we retrieve the matching row
using ha_index_read_map; if the returned value is zero we compare the two
records, and if they match we throw an error. I am not sure where to place
this code, either in fill_record or later. Or I can simply fill the hash
field in fill_record and then check for duplicates later on.
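The insert-time flow described above (probe by hash first, then confirm with a full value comparison, stepping through same-hash rows) can be sketched roughly like this. It is a standalone illustration, not the actual MySQL/MariaDB code: an in-memory multimap stands in for the real index, and toy_hash (FNV-1a) stands in for whatever hash the server would use.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in hash; the real server would use its own hash function.
static uint32_t toy_hash(const std::string& v) {
  uint32_t h = 2166136261u;  // FNV-1a
  for (char c : v) { h ^= static_cast<unsigned char>(c); h *= 16777619u; }
  return h;
}

class UniqueBlobChecker {
  std::unordered_multimap<uint32_t, std::string> index_;  // hash -> stored blob(s)
 public:
  // Returns false (duplicate) if an equal blob is already stored.
  bool insert(const std::string& blob) {
    uint32_t h = toy_hash(blob);
    // Look up all rows with this hash, roughly what ha_index_read_map plus
    // ha_index_next_same does over the hidden hash key.
    auto range = index_.equal_range(h);
    for (auto it = range.first; it != range.second; ++it)
      if (it->second == blob) return false;  // hash matched AND bytes matched
    index_.emplace(h, blob);                 // no true duplicate: accept the row
    return true;
  }
};
```

With this sketch, inserting 'sachin' twice is rejected the second time, while 'sachin setiya' is accepted, mirroring the t4 session shown earlier in the thread.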

Currently I am not sure how to hide columns from the user. Sir, can you
suggest where to look?

But there is one problem: we can make a unique key with this approach, but
not a primary key, because the primary key is clustered and hashes can
collide, so I think we can't use the hash field as the primary key. To
overcome this problem I have one idea: instead of storing just the hash, we
can make the hash field 10 bytes long, and in the last two bytes store a
short int which tells how many times the hash has been repeated. This makes
the hash field unique even in case of a collision. And we are not doing
more computation, because we already retrieve all records with the same
hash.
What do you think of this idea?
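The proposed 10-byte layout (8-byte hash followed by a 2-byte repeat counter) could look roughly like this; a standalone sketch, with make_hash_key a hypothetical helper. Packing big-endian means plain byte comparison orders keys first by hash, then by counter, which is what a clustered index needs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical key layout for the proposal above:
//   bytes 0..7 = 8-byte hash of the blob (most significant byte first)
//   bytes 8..9 = how many times this hash value has been seen before
std::array<uint8_t, 10> make_hash_key(uint64_t hash, uint16_t repeat) {
  std::array<uint8_t, 10> key{};
  for (int i = 0; i < 8; ++i)
    key[i] = static_cast<uint8_t>(hash >> (56 - 8 * i));  // big-endian hash
  key[8] = static_cast<uint8_t>(repeat >> 8);             // big-endian counter
  key[9] = static_cast<uint8_t>(repeat & 0xff);
  return key;
}
```

Two colliding blobs then get keys that differ only in the last two bytes, so memcmp() keeps them adjacent but distinct in the index.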
And there is one more problem: how to make it a foreign key.

I will send you prototype code tomorrow.
Regards
sachin

On Fri, Apr 15, 2016 at 12:23 AM, Sergei Golubchik <serg@mariadb.org> wrote:
> Hi, Sachin!
>
> On Apr 13, Sachin Setia wrote:
>> Hello Sergei
>> Sorry, I did not see your mail. Actually I was thinking of something
>> like this before implementing the prototype, but the closer the
>> implementation is to InnoDB, the more performance I will get. I will
>> definitely think about it.
>
> Great!
>
> Could you please tell me (mailing list, that is) what you think before
> next Monday (before April 18th, that is)?
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@mariadb.org


---------- Forwarded message ----------
From: Sachin Setia <sachinsetia1001@gmail.com>
To: maria-developers@lists.launchpad.net
Cc: 
Date: Sat, 16 Apr 2016 19:37:45 +0530
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi Sergei!
As I already told you, I was building the prototype. It is somewhat
complete apart from one thing: comparing two field values. The difficulty
is how to get the data length of a field from table->record[1]. I will try
to solve it.
One more thing: I actually found out how MySQL hides a field. For example,
consider three fields: hash, data, data. The MySQL field pointer points at
the second field, not at the hash field, and the hash field's ptr is stored
in table->hash_field.
Can we do something similar to store hash fields (an array of hashes, in
case there is more than one unique)? But will adding a member variable
cause a problem? What do you think?
Regards
sachin

On Fri, Apr 15, 2016 at 6:37 PM, Sachin Setia <sachinsetia1001@gmail.com> wrote:
> Hi Sergei!
>
> Actually I was going through the MySQL source code for long unique
> constraints. In sql_tmp_table.cc, in the function create_tmp_table, they
> make a new field and a new key (hash_key) and pass this table object to
> the storage engine. They refer to this field as a hash field.
> At insert time they call the function bool check_unique_constraint(TABLE
> *table), which first calculates the hash and stores it in the field, then
> looks for a duplicate hash, retrieving rows with ha_index_next_same; if
> the records are not the same, the row is accepted.
>
> We can do the same thing in MariaDB by adding one more field and key in
> mysql_prepare_create_table. There we check for a blob with unlimited
> length, or a varchar longer than the storage engine's internal limit. By
> doing this in mysql_prepare_create_table there will be no issues of frm
> file inconsistency.
>
> On insert, we first fill the hash field in the fill_record function of
> sql_base.cc by calculating the hash. Then we retrieve the matching row
> using ha_index_read_map; if the returned value is zero we compare the two
> records, and if they match we throw an error. I am not sure where to
> place this code, either in fill_record or later. Or I can simply fill the
> hash field in fill_record and then check for duplicates later on.
>
> Currently I am not sure how to hide columns from the user. Sir, can you
> suggest where to look?
>
> But there is one problem: we can make a unique key with this approach,
> but not a primary key, because the primary key is clustered and hashes
> can collide, so I think we can't use the hash field as the primary key.
> To overcome this problem I have one idea: instead of storing just the
> hash, we can make the hash field 10 bytes long, and in the last two bytes
> store a short int which tells how many times the hash has been repeated.
> This makes the hash field unique even in case of a collision. And we are
> not doing more computation, because we already retrieve all records with
> the same hash.
> What do you think of this idea?
> And there is one more problem: how to make it a foreign key.
>
> I will send you prototype code tomorrow.
> Regards
> sachin
>
> On Fri, Apr 15, 2016 at 12:23 AM, Sergei Golubchik <serg@mariadb.org> wrote:
>> Hi, Sachin!
>>
>> On Apr 13, Sachin Setia wrote:
>>> Hello Sergei
>>> Sorry, I did not see your mail. Actually I was thinking of something
>>> like this before implementing the prototype, but the closer the
>>> implementation is to InnoDB, the more performance I will get. I will
>>> definitely think about it.
>>
>> Great!
>>
>> Could you please tell me (mailing list, that is) what you think before
>> next Monday (before April 18th, that is)?
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@mariadb.org


---------- Forwarded message ----------
From: Sachin Setia <sachinsetia1001@gmail.com>
To: maria-developers@lists.launchpad.net
Cc: 
Date: Mon, 18 Apr 2016 20:12:39 +0530
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hello Sergei!
Hi. Actually I was thinking about how to implement a blob as a foreign key.
A foreign key has to be unique, which we can already implement. To make it
a foreign key we can either store a unique hash or the whole blob column.
But I am not sure many people want to copy such long blob data into the
referencing table.
The second option would be to use the blob hash as the reference key.
But the user cannot directly use the hash as a reference key,
because it is hidden.
What I was thinking of is a way for the programmer to use the blob hash
explicitly. Suppose the user can directly create a blob hash column and use
that column as a primary key or foreign key, like:

create table t1(abc blob, blob hash(abc)) // this just creates a blob hash column
create table t1(abc blob, unique(blob hash(abc))) // this creates a unique blob hash column

and similarly for primary key and foreign key. Users can enter the hash
value if they have some good algorithm, or if they do not give any value we
will automatically compute and store the hash. What do you think, sir?
Regards 
sachin



On Sat, Apr 16, 2016 at 7:37 PM, Sachin Setia <sachinsetia1001@gmail.com> wrote:
Hi Sergei!
As I already told you, I was building the prototype. It is somewhat
complete apart from one thing: comparing two field values. The difficulty
is how to get the data length of a field from table->record[1]. I will try
to solve it.
One more thing: I actually found out how MySQL hides a field. For example,
consider three fields: hash, data, data. The MySQL field pointer points at
the second field, not at the hash field, and the hash field's ptr is stored
in table->hash_field.
Can we do something similar to store hash fields (an array of hashes, in
case there is more than one unique)? But will adding a member variable
cause a problem? What do you think?
Regards
sachin

On Fri, Apr 15, 2016 at 6:37 PM, Sachin Setia <sachinsetia1001@gmail.com> wrote:
> Hi Sergei!
>
> Actually I was going through the MySQL source code for long unique
> constraints. In sql_tmp_table.cc, in the function create_tmp_table, they
> make a new field and a new key (hash_key) and pass this table object to
> the storage engine. They refer to this field as a hash field.
> At insert time they call the function bool check_unique_constraint(TABLE
> *table), which first calculates the hash and stores it in the field, then
> looks for a duplicate hash, retrieving rows with ha_index_next_same; if
> the records are not the same, the row is accepted.
>
> We can do the same thing in MariaDB by adding one more field and key in
> mysql_prepare_create_table. There we check for a blob with unlimited
> length, or a varchar longer than the storage engine's internal limit. By
> doing this in mysql_prepare_create_table there will be no issues of frm
> file inconsistency.
>
> On insert, we first fill the hash field in the fill_record function of
> sql_base.cc by calculating the hash. Then we retrieve the matching row
> using ha_index_read_map; if the returned value is zero we compare the two
> records, and if they match we throw an error. I am not sure where to
> place this code, either in fill_record or later. Or I can simply fill the
> hash field in fill_record and then check for duplicates later on.
>
> Currently I am not sure how to hide columns from the user. Sir, can you
> suggest where to look?
>
> But there is one problem: we can make a unique key with this approach,
> but not a primary key, because the primary key is clustered and hashes
> can collide, so I think we can't use the hash field as the primary key.
> To overcome this problem I have one idea: instead of storing just the
> hash, we can make the hash field 10 bytes long, and in the last two bytes
> store a short int which tells how many times the hash has been repeated.
> This makes the hash field unique even in case of a collision. And we are
> not doing more computation, because we already retrieve all records with
> the same hash.
> What do you think of this idea?
> And there is one more problem: how to make it a foreign key.
>
> I will send you prototype code tomorrow.
> Regards
> sachin
>
> On Fri, Apr 15, 2016 at 12:23 AM, Sergei Golubchik <serg@mariadb.org> wrote:
>> Hi, Sachin!
>>
>> On Apr 13, Sachin Setia wrote:
>>> Hello Sergei
>>> Sorry, I did not see your mail. Actually I was thinking of something
>>> like this before implementing the prototype, but the closer the
>>> implementation is to InnoDB, the more performance I will get. I will
>>> definitely think about it.
>>
>> Great!
>>
>> Could you please tell me (mailing list, that is) what you think before
>> next Monday (before April 18th, that is)?
>>
>> Regards,
>> Sergei
>> Chief Architect MariaDB
>> and security@mariadb.org



---------- Forwarded message ----------
From: Sergei Golubchik <serg@mariadb.org>
To: Sachin Setia <sachinsetia1001@gmail.com>
Cc: maria-developers@lists.launchpad.net
Date: Mon, 18 Apr 2016 22:32:25 +0200
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi, Sachin!

On Apr 15, Sachin Setia wrote:
> Hi Sergei!
>
> Actually I was going through the MySQL source code for long unique
> constraints. In sql_tmp_table.cc, in the function create_tmp_table, they
> make a new field and a new key (hash_key) and pass this table object to
> the storage engine. They refer to this field as a hash field. At insert
> time they call the function bool check_unique_constraint(TABLE *table),
> which first calculates the hash and stores it in the field, then looks
> for a duplicate hash, retrieving rows with ha_index_next_same; if the
> records are not the same, the row is accepted.

Right. Very good!

> We can do the same thing in MariaDB by adding one more field and key in
> mysql_prepare_create_table. There we check for a blob with unlimited
> length, or a varchar longer than the storage engine's internal limit. By
> doing this in mysql_prepare_create_table there will be no issues of frm
> file inconsistency.
>
> On insert, we first fill the hash field in the fill_record function of
> sql_base.cc by calculating the hash. Then we retrieve the matching row
> using ha_index_read_map; if the returned value is zero we compare the two
> records, and if they match we throw an error. I am not sure where to
> place this code, either in fill_record or later. Or I can simply fill the
> hash field in fill_record and then check for duplicates later on.

MariaDB supports "virtual columns", see
https://mariadb.com/kb/en/mariadb/virtual-computed-columns/

So, it might be enough to mark this hash column as virtual, and will be
automatically calculated when necessary (in fill_record, etc).

> Currently I am not sure how to hide columns from the user. Sir, can you
> suggest where to look?

This would need a new flag per field, like "hidden". And in all commands
that use a list of columns (SHOW, INFORMATION_SCHEMA tables, SELECT *,
INSERT table VALUE (...), etc) - this column should be skipped.

But don't worry, I don't think this will be a problem for you. You can,
of course, start by not hiding this column and implement the hidden
flag later.

> But there is one problem: we can make a unique key with this approach,
> but not a primary key, because the primary key is clustered and hashes
> can collide, so I think we can't use the hash field as the primary key.

Of course.

> To overcome this problem I have one idea: instead of storing just the
> hash, we can make the hash field 10 bytes long, and in the last two bytes
> store a short int which tells how many times the hash has been repeated.
> This makes the hash field unique even in case of a collision. And we are
> not doing more computation, because we already retrieve all records with
> the same hash. What do you think of this idea? And there is one more
> problem: how to make it a foreign key.
> make it foreign key.

I would say, there is no need to bother. Let's just say that UNIQUE for
blobs is only a constraint, can not be used as a primary key, can not be
used as a foreign key. This is a reasonable limitation, I think :)

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org


---------- Forwarded message ----------
From: Sergei Golubchik <serg@mariadb.org>
To: Sachin Setia <sachinsetia1001@gmail.com>
Cc: maria-developers@lists.launchpad.net
Date: Mon, 18 Apr 2016 22:46:42 +0200
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi, Sachin!

On Apr 16, Sachin Setia wrote:
> Hi Sergei!
> As I already told you i was building prototype.It is some what
> completed apart from one thing comparing of two field values. the
> difficulty is how to get data length of field  from table->record[1].
> I will try to solve it.

Do you mean, a length of the blob value?
Try field->get_length()

> One more thing: I actually found out how MySQL hides a field. For
> example, consider three fields: hash, data, data. The MySQL field pointer
> points at the second field, not at the hash field, and the hash field's
> ptr is stored in table->hash_field. Can we do something similar to store
> hash fields (an array of hashes, in case there is more than one unique)?
> But will adding a member variable cause a problem? What do you think?

Well, yes, that would work.
But I see a few other features where field hiding might be useful,
and in those cases this simple trick of moving the table->field pointer
will not work, unfortunately.

So we might need something more complex. But, again, this is not the most
important part of this project, so don't start with it.

In fact, I'm rather interested to know about the adaptive hashing that
you've mentioned. Can you tell me more about it, please?

> diff --git a/sql/sql_base.cc b/sql/sql_base.cc
> index ac2162b..291a3e2 100644
> --- a/sql/sql_base.cc
> +++ b/sql/sql_base.cc
> @@ -8956,6 +8964,31 @@ fill_record(THD *thd, TABLE *table, Field **ptr, List<Item> &values,
>          goto err;
>      field->set_explicit_default(value);
>    }
> +  table->file->ha_index_init(0,0);
> + res = table->file->ha_index_read_map(table->record[1],hash_field->ptr,HA_WHOLE_KEY,HA_READ_KEY_EXACT);
> +  while(!res){
> +       // compare the records; I am not sure how to compare them, so just assume it works
> +              diff = table->record[1]-table->record[0];
> +              src_length = blob_field->data_length();
> +              //dest_length = blob_field->data_length(table->record[1]); // I don't know how to get the length from record[1],
> +              // so I am unable to do this;
> +              // then we could compare records using
> +              //field->cmp_max
> +     // this is the mysql code:
> +     /*  if (!(table->distinct ?
> +          table_rec_cmp(table) :
> +          group_rec_cmp(table->group, table->record[0], table->record[1])))
> +      return false; // skip it
> +    res= table->file->ha_index_next_same(table->record[1],
> +                                         table->hash_field->ptr,
> +                                         sizeof(hash)); */
> +       //fetch the next record
> +       res= table->file->ha_index_next_same(table->record[1],
> +                                         hash_field->ptr,
> +                                         8);
> +
> +  }

Not quite: you shouldn't check unique constraints in fill_record(). This
check should happen when the row is actually inserted or updated. A good
place for it, I'd say, is handler::ha_write_row() and
handler::ha_update_row().

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org


---------- Forwarded message ----------
From: Sergei Golubchik <serg@mariadb.org>
To: Sachin Setia <sachinsetia1001@gmail.com>
Cc: maria-developers@lists.launchpad.net
Date: Mon, 18 Apr 2016 22:51:57 +0200
Subject: Re: [Maria-developers] InnoDB blob for primary key
Hi, Sachin!

On Apr 18, Sachin Setia wrote:
> Hi Actually I was thinking about how to implement blob as a foreign
> key.Foreign has to be unique which we  can already  implement. To make
> it foreign key we can either store unique hash or the whole blob
> column.  But I am not sure much people want to copy so long blob data
> in reference table.

Agree :)

> The second option would be to use the blob hash as the reference key.
> But the user cannot directly use the hash as a reference key,
> because it is hidden.
> What I was thinking of is a way for the programmer to use the blob hash
> explicitly. Suppose the user can directly create a blob hash column and
> use that column as a primary key or foreign key, like:
> create table t1(abc blob, blob hash(abc)) // this just creates a blob
> hash column
> create table t1(abc blob, unique(blob hash(abc))) // this creates a
> unique blob hash column
> and similarly for primary key and foreign key. Users can enter the hash
> value if they have some good algorithm, or if they do not give any value
> we will automatically compute and store the hash. What do you think, sir?

Mixed feelings. First, I wrote in an earlier email about using virtual
columns for that; along that line of thought, a user-specified hash
function is absolutely possible and logical.

On the other hand, I don't see why anyone would need that - a hash is
not guaranteed to be unique, no matter what algorithm one uses.

And, as I wrote in an earlier email, I don't think that primary/foreign
keys on blobs is a feature worth spending time on.

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org