[Maria-developers] write_set.diff review
Hi, Summary: 1. Three questions (are you sure, what guarantee, why) 2. A couple of suggestions (spilt a commit, move function call out of DBUG_RETURN) 3. Two suggestions for unrelated optimizations (don't allocate unneded bitmaps, don't modify read_set)
commit 2734eca05d499bcbcaf428d04d4f1acbb298589f Author: Monty <monty@mariadb.org> Date: Mon Nov 2 11:52:06 2015 +0200
diff --git a/sql/table.h b/sql/table.h index b150e4f..6ecfdef 100644 --- a/sql/table.h +++ b/sql/table.h @@ -1066,9 +1066,11 @@ struct TABLE String alias; /* alias or table name */ uchar *null_flags; MY_BITMAP def_read_set, def_write_set, def_vcol_set, tmp_set; + MY_BITMAP def_rpl_write_set;
I would rather not allocate more bitmaps directly in the TABLE. They are not always needed, and only increase the TABLE size. The same is true for def_vcol_set, in fact, it's *even more true* for vcols. Instead I'd allocate them in the TABLE's memroot as needed.
MY_BITMAP eq_join_set; /* used to mark equi-joined fields */ MY_BITMAP cond_set; /* used to mark fields from sargable conditions*/ - MY_BITMAP *read_set, *write_set, *vcol_set; /* Active column sets */ + /* Active column sets */ + MY_BITMAP *read_set, *write_set, *vcol_set, *rpl_write_set; /* The ID of the query that opened and is using this table. Has different meanings depending on the table type. diff --git a/sql/table.cc b/sql/table.cc index 76cb4b8..8816ee0 100644 --- a/sql/table.cc +++ b/sql/table.cc @@ -2884,20 +2884,28 @@ enum open_frm_error open_table_from_share(THD *thd, TABLE_SHARE *share, /* Allocate bitmaps */
bitmap_size= share->column_bitmap_size; - if (!(bitmaps= (uchar*) alloc_root(&outparam->mem_root, bitmap_size*6))) + if (!(bitmaps= (uchar*) alloc_root(&outparam->mem_root, bitmap_size*7))) goto err; my_bitmap_init(&outparam->def_read_set, (my_bitmap_map*) bitmaps, share->fields, FALSE); my_bitmap_init(&outparam->def_write_set, - (my_bitmap_map*) (bitmaps+bitmap_size), share->fields, FALSE); + (my_bitmap_map*) (bitmaps+bitmap_size), share->fields, + FALSE); my_bitmap_init(&outparam->def_vcol_set, - (my_bitmap_map*) (bitmaps+bitmap_size*2), share->fields, FALSE); + (my_bitmap_map*) (bitmaps+bitmap_size*2), share->fields, + FALSE);
This is totally wasted memory if the table has no virtual columns. (that's what I meant by *even more true* - rpl bitmap and eq_join_set, cond_set bitmaps may or may not be used for this table, one never knows. But for vcols it's always known in advance whether the table has them or not). Still, I'd allocate other bitmaps on demand too.
my_bitmap_init(&outparam->tmp_set, - (my_bitmap_map*) (bitmaps+bitmap_size*3), share->fields, FALSE); + (my_bitmap_map*) (bitmaps+bitmap_size*3), share->fields, + FALSE); my_bitmap_init(&outparam->eq_join_set, - (my_bitmap_map*) (bitmaps+bitmap_size*4), share->fields, FALSE); + (my_bitmap_map*) (bitmaps+bitmap_size*4), share->fields, + FALSE); my_bitmap_init(&outparam->cond_set, - (my_bitmap_map*) (bitmaps+bitmap_size*5), share->fields, FALSE); + (my_bitmap_map*) (bitmaps+bitmap_size*5), share->fields, + FALSE); + my_bitmap_init(&outparam->def_rpl_write_set, + (my_bitmap_map*) (bitmaps+bitmap_size*6), share->fields, + FALSE); outparam->default_column_bitmaps();
outparam->cond_selectivity= 1.0; @@ -5986,36 +5997,48 @@ void TABLE::mark_columns_needed_for_insert() we only want to log a PK and we needed other fields for execution). */ + void TABLE::mark_columns_per_binlog_row_image() { + THD *thd= in_use; DBUG_ENTER("mark_columns_per_binlog_row_image"); DBUG_ASSERT(read_set->bitmap); DBUG_ASSERT(write_set->bitmap);
- THD *thd= current_thd;
Hmm. Are you sure that TABLE::in_use is *always* set at this point? You've basically replaced - THD *thd=current_thd; - if (in_use && in_use->is_current_stmt_binlog_format_row) + THD *thd=in_use; + if (thd->is_current_stmt_binlog_format_row) And, btw, where's your MDEV for caching rbr checks per statement?
+ /* If not using row format */ + rpl_write_set= write_set;
/** If in RBR we may need to mark some extra columns, depending on the binlog-row-image command line argument. */ if ((WSREP_EMULATE_BINLOG(thd) || mysql_bin_log.is_open()) && - in_use && - in_use->is_current_stmt_binlog_format_row() && + thd->is_current_stmt_binlog_format_row() && !ha_check_storage_engine_flag(s->db_type(), HTON_NO_BINLOG_ROW_OPT)) { /* if there is no PK, then mark all columns for the BI. */ if (s->primary_key >= MAX_KEY) + { bitmap_set_all(read_set); + rpl_write_set= read_set; + } + else + { switch (thd->variables.binlog_row_image) { case BINLOG_ROW_IMAGE_FULL: - if (s->primary_key < MAX_KEY) bitmap_set_all(read_set); - bitmap_set_all(write_set); + /* Set of columns that should be written (all) */ + rpl_write_set= read_set; break; case BINLOG_ROW_IMAGE_NOBLOB: - /* for every field that is not set, mark it unless it is a blob */ + /* Only write changed columns + not blobs */ + rpl_write_set= &def_rpl_write_set; + bitmap_copy(rpl_write_set, write_set); + + /* + for every field that is not set, mark it unless it is a blob or + part of a primary key + */ for (Field **ptr=field ; *ptr ; ptr++) { Field *my_field= *ptr; @@ -6027,23 +6050,24 @@ void TABLE::mark_columns_per_binlog_row_image() If set in the AI, then the blob is really needed, there is nothing we can do about it. */ - if ((s->primary_key < MAX_KEY) && - ((my_field->flags & PRI_KEY_FLAG) || - (my_field->type() != MYSQL_TYPE_BLOB))) + if ((my_field->flags & PRI_KEY_FLAG) || + (my_field->type() != MYSQL_TYPE_BLOB)) + { bitmap_set_bit(read_set, my_field->field_index); - - if (my_field->type() != MYSQL_TYPE_BLOB) - bitmap_set_bit(write_set, my_field->field_index); + bitmap_set_bit(rpl_write_set, my_field->field_index); } + } break; case BINLOG_ROW_IMAGE_MINIMAL: - /* mark the primary key if available in the read_set */ - if (s->primary_key < MAX_KEY) + /* mark the primary key */ mark_columns_used_by_index_no_reset(s->primary_key, read_set); + /* Only write columns that have changed */ + rpl_write_set= write_set;
Where's the guarantee that the primary key is in the write set?
break;
default: DBUG_ASSERT(FALSE); + } } file->column_bitmaps_signal(); } diff --git a/sql/opt_range.cc b/sql/opt_range.cc index 860426c..6201065 100644 --- a/sql/opt_range.cc +++ b/sql/opt_range.cc @@ -11140,26 +11138,21 @@ int QUICK_RANGE_SELECT::reset() int QUICK_RANGE_SELECT::get_next() { range_id_t dummy; + int result; + DBUG_ENTER("QUICK_RANGE_SELECT::get_next"); + + if (!in_ror_merged_scan) + DBUG_RETURN(file->multi_range_read_next(&dummy));
please don't put function calls inside DBUG_RETURN, always use a variable to store the return value first
+ MY_BITMAP * const save_read_set= head->read_set; MY_BITMAP * const save_write_set= head->write_set; - - DBUG_ENTER("QUICK_RANGE_SELECT::get_next"); - if (in_ror_merged_scan) - { /* We don't need to signal the bitmap change as the bitmap is always the same for this head->file */ head->column_bitmaps_set_no_signal(&column_bitmap, &column_bitmap); - } - - int result= file->multi_range_read_next(&dummy); + result= file->multi_range_read_next(&dummy); - - if (in_ror_merged_scan) - { - /* Restore bitmaps set on entry */ head->column_bitmaps_set_no_signal(save_read_set, save_write_set); - } DBUG_RETURN(result); }
Ok, all changes in opt_range.cc are unrelated to your rpl_write_set patch, please commit them separately
diff --git a/sql/rpl_record.cc b/sql/rpl_record.cc index ee222f0..208b2b6 100644 --- a/sql/rpl_record.cc +++ b/sql/rpl_record.cc @@ -434,7 +433,7 @@ unpack_row(rpl_group_info *rgi, *master_reclength = table->s->reclength; }
- DBUG_RETURN(error); + DBUG_RETURN(0); }
/**
also unrelated to rpl_write_set. If you'd like you can put it in the same commit with opt_range.cc changes, with the sybject "cleanups" or something
diff --git a/sql/sql_class.cc b/sql/sql_class.cc index 5054357..b238e89 100644 --- a/sql/sql_class.cc +++ b/sql/sql_class.cc @@ -6259,11 +6248,11 @@ int THD::binlog_delete_row(TABLE* table, bool is_trans, DBUG_ASSERT(is_current_stmt_binlog_format_row() && ((WSREP(this) && wsrep_emulate_bin_log) || mysql_bin_log.is_open())); /** - Save a reference to the original read and write set bitmaps. - We will need this to restore the bitmaps at the end. + Save a reference to the original read bitmaps + We will need this to restore the bitmaps at the end as + binlog_prepare_row_images() may changetable->read_set. */
This doesn't make much sense. Here, the code is like that THD::binlog_delete_row: save table->read_set THD::binlog_prepare_row_images calculate table->tmp_set table->read_set=table->tmp_set pack_row(table->read_set) for every set bit in the set_arg call field->pack() restore table->read_set So, two thoughts here: 1. tmp_set must be a strict subset of read_set! other fields may not be present in the record[0] and there're no row reads between binlog_prepare_row_images and field->pack 2. One can pass table->tmp_set directly into pack_row() as an argument which means there's no need for binlog_prepare_row_images to modify table->read_set at all
MY_BITMAP *old_read_set= table->read_set; - MY_BITMAP *old_write_set= table->write_set;
/** This will remove spurious fields required during execution but diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc index 488826a..0661e70 100644 --- a/sql/sql_insert.cc +++ b/sql/sql_insert.cc @@ -2867,5 +2867,7 @@ pthread_handler_t handle_delayed_insert(void *arg) /* Tell client that the thread is initialized */ mysql_cond_signal(&di->cond_client);
+ di->table->mark_columns_needed_for_insert();
why?
+ /* Now wait until we get an insert or lock to handle */ /* We will not abort as long as a client thread uses this thread */
Regards, Sergei
participants (1)
-
Sergei Golubchik